Reduce lru_cache memory overhead. #76603
Currently, functools.lru_cache implements its own doubly-linked list, but that is less efficient than OrderedDict because each link node is a GC-tracked object. I added two private C APIs to OrderedDict and made lru_cache use them.
The reason I didn't implement a C version of move_to_end() is to avoid extra lookups. I'll benchmark it and report the results here.
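The idea can be sketched in pure Python (a hypothetical illustration of the OrderedDict approach, not the actual C patch). Note that the hit path below costs three hash lookups on the same key, which is exactly the overhead the lookup-reduction remark is about:

```python
from collections import OrderedDict

def lru_cache_od(maxsize=128):
    """Hypothetical LRU cache built on OrderedDict (illustration only)."""
    def decorator(func):
        cache = OrderedDict()
        def wrapper(*args):
            if args in cache:               # lookup 1: membership test
                result = cache[args]        # lookup 2: fetch the value
                cache.move_to_end(args)     # lookup 3: mark most-recently-used
                return result
            result = func(*args)
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)   # evict the least-recently-used entry
            return result
        wrapper.cache = cache               # exposed for inspection in this sketch
        return wrapper
    return decorator

@lru_cache_od(maxsize=2)
def square(x):
    return x * x
```

Calling `square(2); square(3); square(2); square(4)` leaves `(2,)` and `(4,)` cached: the hit on 2 refreshes it, so 3 is the least recently used entry when 4 forces an eviction.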
Current implementation (no news entry yet):
This is a duplicate of bpo-28239.
Ah, sorry, you use OrderedDict rather than just a plain ordered dict. That should have different timing and memory consumption.
Hmm, it seems my implementation is 30% slower in mostly-miss scenarios. On the other hand, GC speed looks about 2x faster, as expected.
$ ./python -m perf compare_to master.json patched.json -G
Slower (5):
- lru_1000_100: 217 ns +- 6 ns -> 302 ns +- 6 ns: 1.39x slower (+39%)
- lru_10000_1000: 225 ns +- 4 ns -> 309 ns +- 2 ns: 1.37x slower (+37%)
- lru_100_1000: 114 ns +- 5 ns -> 119 ns +- 1 ns: 1.05x slower (+5%)
- lru_100_100: 115 ns +- 6 ns -> 119 ns +- 1 ns: 1.03x slower (+3%)
- lru_1000_1000: 134 ns +- 6 ns -> 136 ns +- 1 ns: 1.02x slower (+2%)
Faster (4):
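For context, here is a plausible reconstruction of what an `lru_<a>_<b>` micro-benchmark could measure (the naming scheme and harness are my assumptions, not taken from the actual benchmark suite): cycling a cached no-op through a fixed number of distinct arguments, so the key count relative to maxsize controls the hit/miss ratio.

```python
import functools
import time

def bench_lru(nkeys, maxsize, loops=100_000):
    """Hypothetical micro-benchmark: when nkeys exceeds maxsize the
    workload is mostly misses; otherwise it is all hits after warmup."""
    @functools.lru_cache(maxsize=maxsize)
    def f(x):
        return x

    start = time.perf_counter()
    for i in range(loops):
        f(i % nkeys)                 # cycle through nkeys distinct arguments
    per_call = (time.perf_counter() - start) / loops
    return per_call, f.cache_info()  # timing plus hit/miss counters
```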
I found that odict.pop() and odict.popitem() are very inefficient. I'll try to optimize them over the holidays.
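Those calls sit on the miss path of the OrderedDict approach: every eviction from a full cache goes through popitem(last=False). A minimal sketch of that hot path, assuming a plain key-to-value OrderedDict:

```python
from collections import OrderedDict

def evict_and_insert(cache, maxsize, key, result):
    """Miss path of an OrderedDict-backed LRU cache: if the cache is
    full, drop the least-recently-used entry, then insert the new one."""
    if len(cache) >= maxsize:
        cache.popitem(last=False)   # the call the thread found slow
    cache[key] = result
    return result
```

With `cache = OrderedDict([(1, 1), (2, 4)])` and maxsize 2, inserting key 3 evicts key 1, leaving keys `[2, 3]` in recency order.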
Please stop revising every single thing you look at. The traditional design of LRU caches used doubly linked lists for a reason. In particular, when there is a high hit rate, the links can be updated without churning the underlying dictionary.
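That design is visible in the pure-Python fallback of functools.lru_cache: a hit does one dict lookup and then only pointer surgery on a circular doubly linked list, never mutating the dict. A simplified sketch (locking, eviction, and key construction omitted):

```python
# Links are 4-element lists forming a circular doubly linked list
# around a sentinel root; root[PREV] is the most recently used entry.
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3

def make_cache():
    root = []                          # sentinel of the circular list
    root[:] = [root, root, None, None]
    return {}, root                    # dict maps key -> link

def on_miss(cache, root, key, result):
    """Insert a new most-recently-used link (eviction omitted)."""
    last = root[PREV]
    link = [last, root, key, result]
    last[NEXT] = root[PREV] = cache[key] = link
    return result

def on_hit(cache, root, key):
    """One dict lookup, then pointer updates only: no dict churn."""
    link = cache[key]
    link_prev, link_next, _key, result = link
    link_prev[NEXT] = link_next        # unlink from current position
    link_next[PREV] = link_prev
    last = root[PREV]                  # re-insert as most recently used
    last[NEXT] = root[PREV] = link
    link[PREV] = last
    link[NEXT] = root
    return result
```

After inserting 'a' then 'b' and hitting 'a', the list order becomes b, a, with 'a' most recently used; the dictionary itself was only read, not modified.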
FWIW, I'm the original author and designer of this code, so it would have been appropriate to assign this to me for sign-off on any proposed changes.
Not me (the C implementation)? ;-)
I'm not proposing to remove the doubly-linked list; OrderedDict uses a doubly-linked list internally as well. On the other hand, I found a problem with OrderedDict in the mostly-miss scenario. Now I think lru_cache's own implementation is better than OrderedDict, so I'll stop trying to replace it.
PR-5008 benchmark:
$ ./python -m perf compare_to master.json patched2.json -G
Faster (9):
- gc(1000000): 98.3 ms +- 0.3 ms -> 29.9 ms +- 0.4 ms: 3.29x faster (-70%)
- gc(100000): 11.7 ms +- 0.0 ms -> 3.71 ms +- 0.03 ms: 3.14x faster (-68%)
- gc(10000): 1.48 ms +- 0.02 ms -> 940 us +- 6 us: 1.57x faster (-36%)
- lru_10_100: 149 ns +- 6 ns -> 138 ns +- 1 ns: 1.08x faster (-8%)
- lru_100_100: 115 ns +- 6 ns -> 108 ns +- 1 ns: 1.07x faster (-6%)
- lru_1000_1000: 134 ns +- 6 ns -> 127 ns +- 1 ns: 1.05x faster (-5%)
- lru_100_1000: 114 ns +- 5 ns -> 108 ns +- 1 ns: 1.05x faster (-5%)
- lru_1000_100: 217 ns +- 6 ns -> 212 ns +- 4 ns: 1.03x faster (-2%)
- lru_10000_1000: 225 ns +- 4 ns -> 221 ns +- 5 ns: 1.02x faster (-2%)
LGTM. |