Add lru_cache to parsing for improved performance #25

Merged 2 commits on May 5, 2020

@djhoese djhoese commented May 4, 2020

This adds caching, via the standard library's functools.lru_cache, to the get_convert_dict function and the RegexFormatter.format method. Both are called during parse and depend only on fmt, so they are easy and cheap to cache. In the common case (especially Satpy), where a single pattern is checked against many strings, this makes a big difference. In pytroll/satpy#1178, I noticed that this change (with ~5600 test strings and 20 format strings) cut my run time from ~4s to ~2.8s.
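To illustrate the idea, here is a minimal sketch of caching a step that depends only on the format string. It is not the code in this PR: compile_pattern and parse are hypothetical stand-ins, and the conversion from fmt to a regex is deliberately simplified.

```python
import re
from functools import lru_cache


@lru_cache()  # default maxsize is 128 entries
def compile_pattern(fmt):
    """Hypothetical stand-in for the fmt-only work done during parsing."""
    # re.split with a capturing group alternates literal text and field names.
    pieces = re.split(r"\{(\w+)\}", fmt)
    regex = "".join(
        re.escape(piece) if i % 2 == 0 else "(?P<%s>.+?)" % piece
        for i, piece in enumerate(pieces)
    )
    return re.compile(regex + "$")


def parse(fmt, string):
    """Match string against fmt; the fmt conversion is served from the cache."""
    match = compile_pattern(fmt).match(string)
    return match.groupdict() if match else None


# One pattern checked against many strings: only the first call does the work.
fmt = "{platform}_{time}.nc"
results = [parse(fmt, name) for name in ["noaa19_20200504.nc", "metopb_20200505.nc"]]
info = compile_pattern.cache_info()
```

After the two calls above, cache_info() reports one miss (the first conversion of fmt) and one hit (the second lookup).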

This PR also adds a purge function, which lets the user clear both caches with a single call. However, we are using lru_cache's default maxsize of 128 entries, which is small, so I don't think this should ever be needed.
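A purge helper of this kind can be sketched as follows. The two cached functions here (field_names and literal_text) are hypothetical placeholders, not the ones touched by this PR; the only real API relied on is the cache_clear() and cache_info() methods that lru_cache attaches to a wrapped function.

```python
from functools import lru_cache
from string import Formatter


@lru_cache()
def field_names(fmt):
    """Hypothetical cached helper: names of the replacement fields in fmt."""
    return tuple(name for _, name, _, _ in Formatter().parse(fmt) if name)


@lru_cache()
def literal_text(fmt):
    """Hypothetical cached helper: the literal (non-field) text in fmt."""
    return "".join(text for text, _, _, _ in Formatter().parse(fmt))


def purge():
    """Clear every cache in one call."""
    field_names.cache_clear()
    literal_text.cache_clear()


fmt = "{platform}_{time:%Y%m%d}.nc"
names = field_names(fmt)
text = literal_text(fmt)
sizes_before = (field_names.cache_info().currsize, literal_text.cache_info().currsize)
purge()
sizes_after = (field_names.cache_info().currsize, literal_text.cache_info().currsize)
```

With maxsize left at 128, the oldest entries are evicted automatically once the cache fills, so an explicit purge is mainly useful for tests or for releasing memory on demand.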

@djhoese djhoese requested review from mraspaud and pnuu May 4, 2020 20:35
@djhoese djhoese added this to To do in PCW Spring 2020 via automation May 4, 2020
@djhoese djhoese moved this from To do to In progress in PCW Spring 2020 May 4, 2020

coveralls commented May 4, 2020

Coverage increased (+0.1%) to 95.03% when pulling 65c1ff7 on djhoese:optimize-caching-parser into 6de9fc1 on pytroll:master.

@mraspaud mraspaud left a comment

LGTM, good idea with the memoization!

PCW Spring 2020 automation moved this from In progress to Ready to merge May 5, 2020
@mraspaud mraspaud merged commit 144b2be into pytroll:master May 5, 2020
PCW Spring 2020 automation moved this from Ready to merge to Done May 5, 2020
@djhoese djhoese deleted the optimize-caching-parser branch May 5, 2020 12:44