
[New] Size limited LRU cache #58

Closed
ZhymabekRoman opened this issue May 27, 2022 · 11 comments · Fixed by #76
Comments

@ZhymabekRoman
Contributor

I made a small working prototype of the cache implementation on top of OrderedDict. Please let me know what can be improved.

https://gist.github.com/ZhymabekRoman/a9c7a25c155dfdea52277cc74f28fa65
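Roughly, the general idea is something like the sketch below. This is illustrative only, assuming a plain OrderedDict with a byte budget; the class and method names are not the ones from the gist, and the shallow sys.getsizeof measurement is exactly the part discussed further down.

# Minimal sketch of a size-limited LRU cache built on OrderedDict.
# Illustrative only; the real prototype is in the linked gist.
import sys
from collections import OrderedDict


class SizeLimitedLRUCache:
    def __init__(self, max_size: int = 10 * 1024 * 1024):
        self.max_size = max_size      # size budget in bytes
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)   # mark as most recently used
        # Evict least recently used entries until we fit in the budget,
        # but always keep at least the entry that was just inserted.
        while self._current_size() > self.max_size and len(self._data) > 1:
            self._data.popitem(last=False)

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)
        return value

    def _current_size(self) -> int:
        # Shallow sizes only; containers are not measured recursively.
        return sum(sys.getsizeof(v) for v in self._data.values())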

@ZhymabekRoman
Contributor Author

There is a small problem: it seems to run a little slow, and I have no idea yet how to speed it up.

@Animenosekai
Owner

I took a look at the different implementations and I think that it could be possible to adapt pympler's asizeof implementation for Python >=3.2.

Here is my slightly modified implementation:
https://gist.github.com/Animenosekai/4e5a3a980e7ed2e542003a58a54ede96
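For context, the difference between the shallow and the recursive approach shows up on any nested object. A small example, assuming pympler is installed (exact numbers vary by platform and Python version):

# sys.getsizeof is shallow: it does not follow references inside containers.
# pympler's asizeof recurses into the whole object graph, so it is more
# accurate but also more expensive.
import sys
from pympler import asizeof

nested = {"payload": [b"x" * 1024 for _ in range(100)]}

print(sys.getsizeof(nested))     # only the dict header: a few hundred bytes
print(asizeof.asizeof(nested))   # roughly the full ~100 KB of contents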

@Animenosekai
Owner

I just tested the different implementations, and yes, our implementation is quite slow; pympler's is a bit faster, but sys.getsizeof is definitely the fastest (which is expected, since it does not recurse into the object).
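A rough way to reproduce this kind of comparison; deep_sizeof here is only a stand-in for the recursive implementations from the gists above, not the project's actual code:

# Rough benchmark sketch: shallow vs. recursive size measurement.
import sys
import timeit

def deep_sizeof(obj, seen=None):
    # Simplified recursive sizing; tracks seen ids to avoid double counting.
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(i, seen) for i in obj)
    return size

data = {i: list(range(100)) for i in range(1000)}
print(timeit.timeit(lambda: sys.getsizeof(data), number=1000))  # shallow, fast
print(timeit.timeit(lambda: deep_sizeof(data), number=10))      # recursive, much slower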

@Animenosekai
Owner

Or maybe we could check whether the object is a native type, in which case we would measure it with sys.getsizeof, and use the other implementations for more complex objects.
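A minimal sketch of that idea, assuming deep_sizeof is whichever recursive implementation ends up being chosen:

# Hybrid sizing sketch: cheap shallow measurement for flat built-in types,
# a recursive fallback for containers and custom objects.
import sys

FLAT_NATIVE_TYPES = (int, float, bool, complex, str, bytes, bytearray, type(None))

def object_size(obj, deep_sizeof):
    if isinstance(obj, FLAT_NATIVE_TYPES):
        # No references to follow, so the shallow size is already exact.
        return sys.getsizeof(obj)
    # Anything that can hold references needs the recursive measurement.
    return deep_sizeof(obj)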

@ZhymabekRoman
Contributor Author

Maybe implement asynchronous checking and clearing of the cache using a ThreadPoolExecutor? This should speed up the caching process.
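A rough sketch of what offloading the size check and eviction could look like; names are illustrative and locking is kept minimal on purpose:

# Sketch of moving the size check and eviction off the caller's thread.
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor

class BackgroundTrimmedCache:
    def __init__(self, max_size, measure):
        self.max_size = max_size          # size budget in bytes
        self.measure = measure            # callable: cache dict -> size in bytes
        self._data = OrderedDict()
        self._lock = threading.Lock()
        self._executor = ThreadPoolExecutor(max_workers=1)

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
        # The expensive size check and eviction run on the worker thread,
        # so the caller returns immediately.
        self._executor.submit(self._trim)

    def _trim(self):
        with self._lock:
            while len(self._data) > 1 and self.measure(self._data) > self.max_size:
                self._data.popitem(last=False)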

@Animenosekai
Owner

Animenosekai commented May 28, 2022

Maybe implement asynchronous checking and clearing of the cache using a ThreadPoolExecutor? This should speed up the caching process.

Well, I tried to implement it too in 58521f1, but it seems to be slow as well.

I'll need to figure out why, because it seems odd...

@ZhymabekRoman
Contributor Author

So I tried to optimise the LRU cache, and it seems I've managed it. Some numbers:

Raw dictionary implementation code:
import os

if __name__ == "__main__":
    a = {}
    for i in range(1_000):
        print(i)
        a[i] = os.urandom(2_000_000)  # ~2 MB per entry

Also, in the test I changed the old LRU implementation to store os.urandom data instead of just numbers.


Just raw dictionary time:

python3 just_dict.py  0.03s user 8.59s system 99% cpu **8.643 total**

Old LRU cache implementation time:

python3 lru_old.py  48.94s user 8.59s system 86% cpu **1:06.65 total**

New LRU cache implementation time:

python3 lru.py  0.50s user 7.64s system 99% cpu **8.201 total**

time shows that the new LRU implementation has essentially zero overhead. I think that's a good result. Instead of recomputing the cache size every time, the new implementation stores each object's size in a separate class attribute and uses these stored values to free up the cache later.

New LRU implementation code:
https://gist.github.com/ZhymabekRoman/43ee515959de024416e29b8dd97e4d96
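As I understand the description above, the core trick is roughly the following bookkeeping: measure each value once on insertion, keep a running total, and evict using the stored sizes instead of re-measuring the whole cache. A sketch with illustrative names, not the gist's actual code:

# Sketch of the size-tracking approach described above.
import sys
from collections import OrderedDict

class TrackedLRUCache:
    def __init__(self, max_size: int):
        self.max_size = max_size
        self._data = OrderedDict()
        self._sizes = {}          # per-key size, measured once on insertion
        self._total = 0           # running total of all stored sizes

    def __setitem__(self, key, value):
        if key in self._data:
            self._total -= self._sizes[key]
        size = sys.getsizeof(value)
        self._data[key] = value
        self._data.move_to_end(key)
        self._sizes[key] = size
        self._total += size
        # Evict least recently used entries using the stored sizes only.
        while self._total > self.max_size and len(self._data) > 1:
            old_key, _ = self._data.popitem(last=False)
            self._total -= self._sizes.pop(old_key)

    def __getitem__(self, key):
        self._data.move_to_end(key)
        return self._data[key]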

@Animenosekai
Owner

Ohhh, I understand how you did that. Really cool numbers!

Did you try it on Python <3.7 to see if it works as well?

@ZhymabekRoman
Contributor Author

Did you try it on Python <3.7 to see if it works as well?

Hmm, no. I'll try to install Debian Stretch on my machine and test it there with Python 3.5.

@ZhymabekRoman
Contributor Author

Yes, it works great! Some debugging shows that all the logic works as expected.

@ZhymabekRoman
Contributor Author

I think this issue can be closed, since it has been merged into the next branch.
