Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to cache fetched emoji to disk for big speedup #37

Open
wants to merge 1 commit into
base: 2.0
Choose a base branch
from

Conversation

rambling-ai
Copy link

This PR adds an option to cache fetched emojis to disk. For my use case this provides a massive boost to performance.

To demonstrate the speed up, I called Pilmoji three times, the first time with disk_cache = False, the next two times with disk_cache=True. Here are the results:

Block 1 duration: 1.4033 seconds
Block 2 duration: 1.3994 seconds
Block 3 duration: 0.0041 seconds

The first two calls are roughly equivalent, but the third call benefits from the emojis already having been fetched and stored to disk locally.

And here's the speed test code:

import time
from pilmoji import Pilmoji
from PIL import Image

test_image = Image.new("RGBA", (1080, 1920), (255, 255, 255, 0))
x = 100
y = 100
text_color = (255, 255, 255)
line = "⏱️⏳🏎️🐇🐢💨🚀⏩"

# Block 1
start_time = time.time()
with Pilmoji(test_image) as pilmoji:
    pilmoji.text((x, y), line, text_color)
end_time = time.time()
print(f"Block 1 duration: {end_time - start_time:.4f} seconds")

# Block 2
start_time = time.time()
with Pilmoji(test_image, disk_cache=True) as pilmoji:
    pilmoji.text((x, y), line, text_color)
end_time = time.time()
print(f"Block 2 duration: {end_time - start_time:.4f} seconds")

# Block 3
start_time = time.time()
with Pilmoji(test_image, disk_cache=True) as pilmoji:
    pilmoji.text((x, y), line, text_color)
end_time = time.time()
print(f"Block 3 duration: {end_time - start_time:.4f} seconds")

This speed-up made it possible to use Pilmoji for my use case, so I hope it can help others too!

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @rambling-ai - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Docstrings: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

Comment on lines +160 to +165
def __init__(self, disk_cache=False):
super().__init__()
self.disk_cache = disk_cache
if self.disk_cache:
self.cache_dir = Path(self.CACHE_DIR)
self.cache_dir.mkdir(exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (edge_case_not_handled): Consider handling potential exceptions when creating the cache directory.

Creating a directory can fail due to permissions or filesystem issues. It might be beneficial to catch exceptions here and handle them gracefully, possibly logging an error or disabling disk caching if the directory cannot be created.

Suggested change
def __init__(self, disk_cache=False):
super().__init__()
self.disk_cache = disk_cache
if self.disk_cache:
self.cache_dir = Path(self.CACHE_DIR)
self.cache_dir.mkdir(exist_ok=True)
def __init__(self, disk_cache=False):
super().__init__()
self.disk_cache = disk_cache
if self.disk_cache:
self.cache_dir = Path(self.CACHE_DIR)
try:
self.cache_dir.mkdir(exist_ok=True)
except OSError:
self.disk_cache = False

return BytesIO(f.read())
else:
url = self.BASE_EMOJI_CDN_URL + quote_plus(emoji) + '?style=' + quote_plus(self.STYLE)
_to_catch = HTTPError if not _has_requests else requests.HTTPError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code_refinement): Duplicated logic for determining _to_catch exception.

The logic for setting _to_catch is repeated in both the caching and non-caching branches. Consider moving this outside the conditional branches to avoid code duplication.

Suggested change
_to_catch = HTTPError if not _has_requests else requests.HTTPError
_to_catch = HTTPError if not _has_requests else requests.HTTPError

try:
data = self.request(url)
return BytesIO(data)
except _to_catch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code_refinement): Silently catching exceptions may obscure underlying issues.

Catching exceptions without logging or handling them in any way can make debugging more difficult if something goes wrong. Consider at least logging the error to aid in troubleshooting.

Suggested change
except _to_catch:
except _to_catch as e:
logging.error(f"Error requesting URL: {e}")

@@ -154,18 +155,45 @@ class EmojiCDNSource(DiscordEmojiSourceMixin):

BASE_EMOJI_CDN_URL: ClassVar[str] = 'https://emojicdn.elk.sh/'
STYLE: ClassVar[str] = None
CACHE_DIR: ClassVar[str] = 'emoji_cache'

def __init__(self, disk_cache=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider separating the concerns of fetching and caching into different classes.

The addition of disk caching functionality directly into the EmojiCDNSource class significantly increases its complexity and responsibilities. This class now handles both fetching emojis from a CDN and managing a disk cache, which complicates its logic and makes it harder to maintain. It's generally beneficial to adhere to the Single Responsibility Principle, ensuring that a class or module has only one reason to change.

Consider separating the concerns of fetching and caching into different classes. For instance, you could have one class dedicated to fetching emojis from the CDN and another class responsible for managing the disk cache. This separation makes the code easier to understand, test, and maintain, as each class focuses on a single responsibility.

By isolating the caching logic into its own class, you also make your code more modular and flexible. If in the future you decide to change the caching strategy or the source of emojis, you can do so with minimal impact on the rest of your codebase. This approach enhances the overall design and can lead to better software architecture.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have skipped reviewing this pull request. It looks like we've already reviewed this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants