-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to cache fetched emoji to disk for big speedup #37
base: 2.0
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @rambling-ai - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Docstrings: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
def __init__(self, disk_cache=False): | ||
super().__init__() | ||
self.disk_cache = disk_cache | ||
if self.disk_cache: | ||
self.cache_dir = Path(self.CACHE_DIR) | ||
self.cache_dir.mkdir(exist_ok=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (edge_case_not_handled): Consider handling potential exceptions when creating the cache directory.
Creating a directory can fail due to permissions or filesystem issues. It might be beneficial to catch exceptions here and handle them gracefully, possibly logging an error or disabling disk caching if the directory cannot be created.
def __init__(self, disk_cache=False): | |
super().__init__() | |
self.disk_cache = disk_cache | |
if self.disk_cache: | |
self.cache_dir = Path(self.CACHE_DIR) | |
self.cache_dir.mkdir(exist_ok=True) | |
def __init__(self, disk_cache=False): | |
super().__init__() | |
self.disk_cache = disk_cache | |
if self.disk_cache: | |
self.cache_dir = Path(self.CACHE_DIR) | |
try: | |
self.cache_dir.mkdir(exist_ok=True) | |
except OSError: | |
self.disk_cache = False |
return BytesIO(f.read()) | ||
else: | ||
url = self.BASE_EMOJI_CDN_URL + quote_plus(emoji) + '?style=' + quote_plus(self.STYLE) | ||
_to_catch = HTTPError if not _has_requests else requests.HTTPError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code_refinement): Duplicated logic for determining _to_catch
exception.
The logic for setting _to_catch
is repeated in both the caching and non-caching branches. Consider moving this outside the conditional branches to avoid code duplication.
_to_catch = HTTPError if not _has_requests else requests.HTTPError | |
_to_catch = HTTPError if not _has_requests else requests.HTTPError |
try: | ||
data = self.request(url) | ||
return BytesIO(data) | ||
except _to_catch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code_refinement): Silently catching exceptions may obscure underlying issues.
Catching exceptions without logging or handling them in any way can make debugging more difficult if something goes wrong. Consider at least logging the error to aid in troubleshooting.
except _to_catch: | |
except _to_catch as e: | |
logging.error(f"Error requesting URL: {e}") |
@@ -154,18 +155,45 @@ class EmojiCDNSource(DiscordEmojiSourceMixin): | |||
|
|||
BASE_EMOJI_CDN_URL: ClassVar[str] = 'https://emojicdn.elk.sh/' | |||
STYLE: ClassVar[str] = None | |||
CACHE_DIR: ClassVar[str] = 'emoji_cache' | |||
|
|||
def __init__(self, disk_cache=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (complexity): Consider separating the concerns of fetching and caching into different classes.
The addition of disk caching functionality directly into the EmojiCDNSource
class significantly increases its complexity and responsibilities. This class now handles both fetching emojis from a CDN and managing a disk cache, which complicates its logic and makes it harder to maintain. It's generally beneficial to adhere to the Single Responsibility Principle, ensuring that a class or module has only one reason to change.
Consider separating the concerns of fetching and caching into different classes. For instance, you could have one class dedicated to fetching emojis from the CDN and another class responsible for managing the disk cache. This separation makes the code easier to understand, test, and maintain, as each class focuses on a single responsibility.
By isolating the caching logic into its own class, you also make your code more modular and flexible. If in the future you decide to change the caching strategy or the source of emojis, you can do so with minimal impact on the rest of your codebase. This approach enhances the overall design and can lead to better software architecture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have skipped reviewing this pull request. It looks like we've already reviewed this pull request.
This PR adds an option to cache fetched emojis to disk. For my use case this provides a massive boost to performance.
To demonstrate the speed up, I called Pilmoji three times, the first time with disk_cache = False, the next two times with disk_cache=True. Here are the results:
The first two calls are roughly equivalent, but the third call benefits from the emojis already having been fetched and stored to disk locally.
And here's the speed test code:
This speed-up made it possible to use Pilmoji for my use case, so I hope it can help others too!