add: cache for remote images #310
Conversation
After some profiling, I found that an in-memory cache for the ETag is enough unless the server is significantly short on I/O, in which case a cache won't help anyway. Also, after a week of field testing, the processing time (mean Convert function call) of repeat requests to my instance of webp-server dropped by ~200ms, which is exactly the latency between the server and the origin server. My Redis memory usage is around 10MiB with ~10000 entries at a 3-day TTL, and no server-side errors were observed during this period. However, my instance only serves ~20k requests with a total bandwidth of ~2.5GiB, so data from larger instances still needs to be collected. About the TTL for Redis:
Also, since webp-server now supports multiple origin servers, should the TTL config be applied to all hosts, or are separate per-host configs better? The drawback of the latter is that it will break the current config file structure. The caching logic in this PR looks OK to me and it's ready for review. Please let me know if the dev team thinks another default behavior for the Redis TTL better fits the majority, and I'll update it soon!
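For concreteness, here is one hypothetical shape the config could take if per-host TTLs were allowed, with a global value as the fallback. All key names here are illustrative and are not part of webp-server-go's actual config schema:

```json
{
  "redis_ttl": 4320,
  "hosts": {
    "img.example.com": { "redis_ttl": 1440 },
    "static.example.net": { "redis_ttl": 0 }
  }
}
```

Here redis_ttl is in minutes (4320 being the 3-day TTL mentioned above, 0 meaning no expiration), and a host entry overrides the global default. This is the structure-breaking option; the global-only variant would keep just the single top-level redis_ttl key.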
Thank you for your contribution @HolgerHuo. Yes, adding a stateful cache here would help. However, adding an external service, especially a stateful one (e.g. Redis), will greatly increase the complexity of the entire system; this is my main concern about adding features like this, including this PR and issue #305. We have similar concerns on some aspects, and as you can see, WebP Server Go can currently run without any other service dependencies (the metadata service is replaced with a static local JSON file, etc.). Speaking of your comment, you mentioned:
This part I didn't get: what do you mean by "unless the server significantly lacks on I/O, but a cache won't help this anyway"? BTW, we've included
Hi! I understand your concern over the complexity of the system.
This was a comment for my previous thinking on
Did you mean adding features depending on external stateful services? How was #305 related to this?
I see these. I'll update the PR in a little while. BTW, has the dev team considered making webp-server able to scale horizontally, i.e., work better in a multi-node environment, since I saw #296 was implemented? I've considered using Go's internal in-memory cache, but I did miss that a WriteCache was already implemented this way, and I thought an external service would help in a multi-node environment and further enable it to accommodate more requests. Certainly, other parts of the code would also need to be modified to make it cloud native, and we might make this an option without breaking its single-machine simplicity, with an opt-in Redis backend.
By the way, as for the TTL, do you think there is a better mechanism for cache expiration? Does a global TTL fit most cases?
Hi! I have refactored the code, and initial results show that go-cache is working normally. More tests are still ongoing, but it should be stable and won't interfere with the rest of webp-server's logic.
Sorry, I overlooked the 0-for-no-expiration behavior in config.go; that is fixed in the following commit.
Nice work, using
Yes, but not limited to this; issue #305 suggests adding another GET parameter, which we think might increase the complexity of the code, so we're still thinking about it.
My intuition says a global TTL is okay; feel free to discuss if you've thought of a case where a global TTL won't fit.
We haven't thought about this yet, but we are open to ideas.
Agreed. I've come up with a few things that might be solved at the same time (when considering scaling horizontally, especially in a Kubernetes environment), like:
I haven't thought this problem through very clearly; your thoughts and suggestions are welcome.
Thanks!
I also think it's okay, but a better approach may be implementing an LRU eviction policy bounded by the response's TTL. In the current design, though, it doesn't feel necessary.
I think to enable it (scaling horizontally, Kubernetes-style), a shared filesystem is better because:
If run independently, go-cache may also fall behind, since it's not persistent across restarts; and unlike a write lock's lifecycle, request caches usually live much longer. But Redis is not a must.
I don't think it's possible without implementing locks in a graceful manner, though race conditions are not easily encountered in practice because file I/O operations don't take very long. Still, to make the server reliable, locks do need to be implemented.
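Within a single process, graceful per-key locking can be sketched like this (illustrative code, not webp-server-go's; across nodes sharing a filesystem you would need file locks or another distributed mechanism instead):

```go
package main

import (
	"fmt"
	"sync"
)

// keyLocks hands out one mutex per cache key (e.g. per remote URL),
// so concurrent requests for the same image don't race on the shared
// filesystem, while different images still convert in parallel.
type keyLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyLocks() *keyLocks {
	return &keyLocks{locks: make(map[string]*sync.Mutex)}
}

// get returns the lock for key, creating it on first use.
func (k *keyLocks) get(key string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	if l, ok := k.locks[key]; ok {
		return l
	}
	l := &sync.Mutex{}
	k.locks[key] = l
	return l
}

func main() {
	kl := newKeyLocks()
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l := kl.get("cat.jpg")
			l.Lock()
			count++ // stands in for the convert-and-write step
			l.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(count) // 100: no lost updates under the per-key lock
}
```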
There are also some other thoughts regarding k8s deployment, such as managing an instance's lifecycle. If the dev team is interested, I might try to work on this in another PR. PS: a small bug is fixed in the following commit.
Nice! ❤️ I've added a comment on the code; the rest looks good to me. Please comment when this PR is ready for merge.
I think we can create an issue for this topic to make sure we are on the same page before creating the PR.
Hi! I've updated the code. I'm not very familiar with the go-cache module, but since it is an internal module and we have full knowledge of what is written to that cache, is this error handling necessary? Either way, it is better practice.
Yes, we can continue further work there. BTW, happy Chinese New Year!
Happy Lunar New Year!
This reverts commit 123c96d.
Thoughts
Currently webp-server pings every remote file to see if it has changed, but in most cases image sources are static. The extra HEAD request increases network overhead and adds latency to responses. A Redis cache can be implemented to map each URL to its ETag and skip the pinging procedure. This PR illustrates a simple implementation, and in my tests the latency created by fetching the remote image is eliminated.

Improvements to make
Above are some of my thoughts on making webp-server-go faster. I don't know if these changes are effective or well-implemented, and I would like to hear some advice from the dev team on how to best utilize caching in our scenario. I'll keep working on this PR and hopefully it will be completed soon.