Add lru_cache to parsing for improved performance #25

djhoese · 2020-05-04T20:35:06Z

This adds caching, via the standard library's functools.lru_cache, to the get_convert_dict function and the RegexFormatter.format method. Both are used during parse and depend only on fmt, so their results are easy and cheap to cache. In the common case (especially in Satpy), where a single pattern is checked against many strings, this avoids redoing the same work on every call. In pytroll/satpy#1178, I noticed this change (with ~5600 test strings and 20 format strings) improved my run time from ~4s to ~2.8s.
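For readers unfamiliar with the pattern, here is a minimal sketch of the idea, not the code in this PR. `compile_fmt` and `parse` are hypothetical stand-ins: the expensive step depends only on the format string, so it can be memoized with `functools.lru_cache` and reused for every string checked against that format.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=128)
def compile_fmt(fmt):
    """Hypothetical helper: turn '{field}' placeholders into named regex groups.

    The result depends only on ``fmt``, so it is safe to cache.
    """
    # re.split with a capturing group alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", fmt)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pieces.append(re.escape(part))           # literal text
        else:
            pieces.append("(?P<%s>.+?)" % part)      # named field
    return re.compile("^" + "".join(pieces) + "$")


def parse(fmt, text):
    """Hypothetical parse: return the matched fields, or None if no match."""
    match = compile_fmt(fmt).match(text)
    return match.groupdict() if match else None


fmt = "{platform}_{time}.nc"
for name in ["noaa19_20200504.nc", "metopb_20200505.nc", "readme.txt"]:
    print(name, parse(fmt, name))

# The regex is built once; the later calls are cache hits.
print(compile_fmt.cache_info())
```

Because the cache key is the format string itself, the benefit grows with the number of strings checked per format.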

This PR also adds a purge function that lets the user clear these caches with a single call. However, we are using lru_cache's default maxsize of 128 items, which really isn't that many, so I don't think this should ever be needed.
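As an illustration only, a purge helper of this kind can be sketched with the `cache_clear()` method that `lru_cache` attaches to every wrapped function. The names `field_names`, `literal_parts`, and `purge_caches` below are hypothetical and are not the functions touched by this PR.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=128)  # 128 is also lru_cache's default maxsize
def field_names(fmt):
    """Hypothetical cached helper: names of the '{field}' placeholders."""
    return tuple(re.findall(r"\{(\w+)", fmt))


@lru_cache(maxsize=128)
def literal_parts(fmt):
    """Hypothetical cached helper: the text between placeholders."""
    return tuple(re.split(r"\{[^}]*\}", fmt))


def purge_caches():
    """Clear every cache in one call (hypothetical counterpart of purge)."""
    field_names.cache_clear()
    literal_parts.cache_clear()


field_names("{platform}_{time}.nc")
literal_parts("{platform}_{time}.nc")
print(field_names.cache_info().currsize, literal_parts.cache_info().currsize)

purge_caches()
print(field_names.cache_info().currsize, literal_parts.cache_info().currsize)
```

With a bounded cache, the least recently used entries are evicted automatically once the limit is reached, which is why an explicit purge is rarely needed.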

coveralls · 2020-05-04T21:19:03Z

Coverage increased (+0.1%) to 95.03% when pulling 65c1ff7 on djhoese:optimize-caching-parser into 6de9fc1 on pytroll:master.

mraspaud

LGTM, good idea with the memoization!

Add lru_cache to parsing for improved performance

e4a1623

djhoese added the enhancement label May 4, 2020

djhoese requested review from mraspaud and pnuu May 4, 2020 20:35

djhoese assigned mraspaud, djhoese and pnuu May 4, 2020

djhoese mentioned this pull request May 4, 2020

Optimize readers searching for matching filenames pytroll/satpy#1178

Merged

4 tasks

djhoese added this to To do in PCW Spring 2020 via automation May 4, 2020

djhoese moved this from To do to In progress in PCW Spring 2020 May 4, 2020

Fix stickler styling issues

65c1ff7

mraspaud approved these changes May 5, 2020

PCW Spring 2020 automation moved this from In progress to Ready to merge May 5, 2020

mraspaud merged commit 144b2be into pytroll:master May 5, 2020

PCW Spring 2020 automation moved this from Ready to merge to Done May 5, 2020

djhoese deleted the optimize-caching-parser branch May 5, 2020 12:44
