Extend caching plugin to handle markdown and iptc metadata #443

Merged 4 commits on Jan 28, 2022

dsschult (Contributor) commented Aug 7, 2021

Turn markdown metadata in gallery.py into a property so it can be overridden.
Extend the caching plugin to handle markdown and iptc metadata (previously only handled exif).
Invalidate cache when a file timestamp changes, so we always get the latest data.
Cache file modification time lookups within a build to reduce load on filesystem.
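The invalidation and lookup-caching behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `MetadataCache`, `start_build`, `read_title` and `demo` are hypothetical names, and the parser is a stand-in that reads the first line of a file.

```python
import os
import tempfile


class MetadataCache:
    """Hypothetical cache keyed by path; an entry is reused only if mtime matches."""

    def __init__(self, parse):
        self._parse = parse      # function: path -> metadata dict
        self._entries = {}       # path -> (mtime_ns, metadata); a real plugin would persist this
        self._mtimes = {}        # path -> mtime_ns, memoized for the current build only
        self.parse_calls = 0

    def start_build(self):
        """Forget memoized mtimes so each build stats every file at most once."""
        self._mtimes.clear()

    def _mtime(self, path):
        if path not in self._mtimes:
            self._mtimes[path] = os.stat(path).st_mtime_ns
        return self._mtimes[path]

    def get(self, path):
        mtime = self._mtime(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]     # timestamp unchanged: reuse cached metadata
        self.parse_calls += 1    # missing or stale entry: parse again
        metadata = self._parse(path)
        self._entries[path] = (mtime, metadata)
        return metadata


def read_title(path):
    """Stand-in parser: treat the first line of the file as the title."""
    with open(path, encoding="utf-8") as f:
        return {"title": f.readline().strip()}


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.md")
        with open(path, "w", encoding="utf-8") as f:
            f.write("First title\n")

        cache = MetadataCache(read_title)
        cache.start_build()
        first = cache.get(path)
        again = cache.get(path)  # served from the cache, no second parse

        # Simulate an edit between builds and force a distinct timestamp.
        with open(path, "w", encoding="utf-8") as f:
            f.write("Second title\n")
        st = os.stat(path)
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))

        cache.start_build()
        updated = cache.get(path)
        return first, again, updated, cache.parse_calls
```

Memoizing the `os.stat` result per build keeps repeated lookups of the same path from hitting the filesystem again, while clearing it at the start of each build lets a changed timestamp be noticed.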

Supersedes #440.

codecov bot commented Aug 7, 2021

Codecov Report

Merging #443 (78ecbfe) into main (3b722ad) will increase coverage by 0.59%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #443      +/-   ##
==========================================
+ Coverage   87.33%   87.93%   +0.59%     
==========================================
  Files          23       23              
  Lines        1943     2014      +71     
==========================================
+ Hits         1697     1771      +74     
+ Misses        246      243       -3     
Impacted Files                       Coverage Δ
sigal/gallery.py                     91.28% <100.00%> (+0.32%) ⬆️
sigal/plugins/extended_caching.py    93.75% <100.00%> (+12.50%) ⬆️
sigal/utils.py                       96.42% <100.00%> (+0.27%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

corsmith commented

I have an album with 150K pictures and 4K videos and use the extended_caching plugin. With the 2.2 release and the plugin enabled, rebuilding the gallery takes approximately 18 minutes (assuming no thumbnails to generate), and the cache file is approximately 23MB. With the code in this PR, the rebuild takes less than 5 minutes, with a cache size of 1.03GB (43x the previous size!). The pickle.load call alone takes 60 seconds of CPU time in my setup! While that is a little crazy, the net effect is a 60% reduction in wall time for a full rebuild, even compared with the previous extended_caching in 2.2. It is faster to parse the single cache file than to read 154K files for metadata across the filesystem, as gallery.py currently does in class Media -> __init__ -> self._get_metadata().
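To get a feel for the trade-off described above (one large pickle file versus many small per-file reads), a rough measurement could look like the sketch below. The cache layout, the entry contents and the helper names `build_fake_cache` and `roundtrip` are assumptions made for illustration; they are not the plugin's actual format, and timings will vary by machine.

```python
import os
import pickle
import tempfile
import time


def build_fake_cache(n_items):
    """Synthetic stand-in for a metadata cache: one small dict per media path."""
    return {
        f"album/img_{i:06d}.jpg": {"exif": {"iso": 100, "index": i}, "iptc": {}, "markdown": {}}
        for i in range(n_items)
    }


def roundtrip(cache, path):
    """Write the cache to a single pickle file, then time loading it back."""
    with open(path, "wb") as f:
        pickle.dump(cache, f, protocol=pickle.HIGHEST_PROTOCOL)
    size = os.path.getsize(path)

    start = time.perf_counter()
    with open(path, "rb") as f:
        loaded = pickle.load(f)
    elapsed = time.perf_counter() - start
    return loaded, size, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        loaded, size, elapsed = roundtrip(
            build_fake_cache(10_000), os.path.join(tmp, "cache.pkl")
        )
        print(f"{len(loaded)} entries, {size} bytes, loaded in {elapsed:.3f}s")
```

Scaling `n_items` up toward the album size mentioned above gives an idea of how load time and file size grow with the number of cached entries.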

@saimn saimn added this to the 2.3 milestone Jan 28, 2022
saimn (Owner) commented Jan 28, 2022

Sorry, it seems I forgot to review this PR. I just had a look and it looks very good.
@dsschult - could you rebase it on main to make sure the tests still run?
If I had one comment about the code, maybe raw_metadata could be renamed to something like markdown_metadata or mkd_metadata?
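For context on why a property helps here: once the markdown metadata is exposed as a property, a plugin can swap in a cached version without touching callers. The sketch below uses the suggested name markdown_metadata; the `Media` class shown is a simplified, hypothetical stand-in, and `install_cache` is an invented helper, not sigal's plugin API.

```python
class Media:
    """Simplified, hypothetical stand-in for a gallery item."""

    def __init__(self, path):
        self.path = path
        self.reads = 0  # counts how often the "expensive" read happens

    @property
    def markdown_metadata(self):
        """Pretend to parse a sidecar markdown file on every access."""
        self.reads += 1
        return {"title": f"Title of {self.path}"}


def install_cache(cls, store):
    """Replace cls.markdown_metadata with a version backed by `store`."""
    original = cls.markdown_metadata.fget  # the uncached getter function

    def cached(self):
        if self.path not in store:
            store[self.path] = original(self)
        return store[self.path]

    cls.markdown_metadata = property(cached)
```

After `install_cache(Media, {})`, repeated accesses to `markdown_metadata` on the same item trigger the underlying read only once.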

dsschult (Contributor, Author) commented

@saimn should be good now

saimn (Owner) commented Jan 28, 2022

Excellent, thanks for the quick update!

@saimn saimn merged commit 0455096 into saimn:main Jan 28, 2022
@dsschult dsschult deleted the metadata_caching branch January 28, 2022 23:11