BaseStorage interface is missing callback parameter to optimize for Tornado Runloop #452
+1

2 similar comments:

+1

+1
There would definitely be a performance boost when using slower storages, although this will introduce a breaking change for all third-party storages. I'll see if I can come up with a PR that works with both callback and no-callback scenarios.
Thank you for looking into this. Rob
As of 4.12 thumbor uses a threadpool to load results from storage: https://github.com/thumbor/thumbor/blob/master/thumbor/handlers/__init__.py#L258 Instead of using callbacks I'll attempt to move the storage.get to the threadpool.
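The idea of moving a blocking `storage.get` onto a thread pool can be sketched as follows. This is a minimal illustration, not thumbor's actual code: `BlockingStorage`, `fetch`, and the pool size are hypothetical, and a real Tornado handler would re-enter the IOLoop via `IOLoop.add_callback` rather than firing the callback on the worker thread.

```python
# Sketch: offload a blocking storage.get() to a thread pool so it
# does not stall the event loop. All names here are illustrative.
from concurrent.futures import ThreadPoolExecutor
import time


class BlockingStorage:
    """Toy storage whose get() blocks, simulating slow I/O."""

    def get(self, path):
        time.sleep(0.01)  # simulate slow, blocking I/O
        return "bytes-for-%s" % path


executor = ThreadPoolExecutor(max_workers=5)


def fetch(storage, path, callback):
    # Run the blocking call off the main thread, then hand the
    # result to the callback once the future completes.
    future = executor.submit(storage.get, path)
    future.add_done_callback(lambda f: callback(f.result()))
    return future
```

The point is that the event loop thread only schedules work; the slow I/O happens on a pool worker.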
@RobertBiehl have you tried enabling the thread pool?
Not yet, I will have a look. Thanks. |
@RobertBiehl I'm about to submit a patch.

Current Thumbor pretty much did not finish when the storage was blocking.
@masom which ENGINE_THREADPOOL_SIZE value is reasonable?
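For reference, enabling the pool is a single line in a thumbor config file. This is a sketch; the value 5 simply mirrors the size benchmarked later in this thread, and the right number depends on the workload:

```python
# thumbor.conf sketch (assumed): size of the thread pool used to run
# blocking work off the IOLoop. 0 keeps the single-threaded behaviour.
ENGINE_THREADPOOL_SIZE = 5
```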
We were seeing about 2 req/s and only 5% CPU usage without the threadpool. (So not that much slower than your benchmark.)
Why don't we go with the first option of putting callbacks on the storage methods and just deal with the breaking change?
@guilhermef I'd prefer that! :)
Current thumbor with slow storage:
Thumbor with ThreadPool patch:
I'll try to get the callback stuff in there. Right now the file_storage and redis_storage will likely not benefit from Tornado's async callback.
I bet S3 storage will benefit from it compared to the threadpool:
Yeah, looking at the AWS storage, the async callback should give it a big benefit. I'll open a PR; would you mind trying that PR with that thread pool size?
@masom Thanks, I will try it out and let you know how it affected performance.
@masom before I can give you feedback I need to solve an issue I have with https://github.com/willtrking/thumbor_aws. After a while, hundreds of TCP connections are stuck in CLOSE_WAIT state. Currently the throughput pauses for a long time before some requests can get through again. I'll let you know once I can get back to really trying out the threadpool patch.
@masom Ok, the problem disappeared temporarily, so I was able to do some benchmarks. With your #454 I now get 100% CPU across the board with 5 thumbor instances and an ENGINE_THREADPOOL_SIZE of 5. Concurrency is 20, tested over about 1500 unique requests.

Test 1 (first-time image requests: loading from S3, generating new thumbnails, and storing results in S3)

Test 2 (cached image requests: loading from S3 result storage)
@RobertBiehl Are these numbers better than without the thread pool? The ThreadPool should give a huge boost in performance for many I/O-blocking tasks. Ideally, at some point there should be dedicated thread pools per layer (loader, storage, result_storage, engine) to allow fine-grained control.
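The per-layer pools suggested above could be sketched like this. This is a hypothetical structure, not thumbor code; the layer names mirror thumbor's pipeline and the pool sizes are made up:

```python
# Sketch: one dedicated thread pool per pipeline layer, so a slow
# result_storage cannot starve the CPU-bound engine pool, and each
# pool can be sized independently. Sizes here are arbitrary examples.
from concurrent.futures import ThreadPoolExecutor

POOLS = {
    "loader": ThreadPoolExecutor(max_workers=4),
    "storage": ThreadPoolExecutor(max_workers=4),
    "result_storage": ThreadPoolExecutor(max_workers=4),
    "engine": ThreadPoolExecutor(max_workers=2),
}


def run_in_layer(layer, fn, *args):
    # Dispatch blocking work to the pool owned by the given layer.
    return POOLS[layer].submit(fn, *args)
```

Keeping the pools separate is what gives the "fine-grained control": each layer's concurrency can be tuned to its own bottleneck.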
Yes, it is definitely faster now. But I think the threadpool should only be used in situations where there is usually no non-blocking option (e.g. for image processing tasks). I thought using Tornado's runloop is the best practice for most I/O situations. So I'd be all for new async storage interfaces for the next major version. :)
Yes, although right now the storages shipping with thumbor are blocking the IOLoop anyway.
+1 for callbacks |
I noticed a huge bottleneck when using the thumbor storages and result storages:
The BaseStorage interface in
https://github.com/thumbor/thumbor/blob/master/thumbor/result_storages/__init__.py#L22
supplies only synchronous methods with return values, instead of callbacks like the loader interface in
https://github.com/thumbor/thumbor/blob/master/thumbor/loaders/file_loader.py
(which is perfectly fine).

https://github.com/thumbor/thumbor/blob/master/thumbor/result_storages/__init__.py
should look more like:
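The code that originally followed here did not survive; a minimal sketch of the proposed callback-based interface, modeled on the loader API the issue points to, might be the following. `InMemoryStorage` is a toy implementation added purely to show the calling convention; the `callback` parameter is the proposed addition, not thumbor's actual signature:

```python
# Hypothetical callback-based storage interface. Method names follow
# thumbor's BaseStorage; the callback parameter is the proposed change.
class BaseStorage:
    def __init__(self, context):
        self.context = context

    def put(self, image_bytes, callback=None):
        raise NotImplementedError()

    def get(self, callback=None):
        raise NotImplementedError()


class InMemoryStorage(BaseStorage):
    """Toy implementation showing how callers would use the callbacks."""

    def __init__(self, context):
        super().__init__(context)
        self._data = None

    def put(self, image_bytes, callback=None):
        self._data = image_bytes
        if callback:
            callback(image_bytes)

    def get(self, callback=None):
        # A real async storage would invoke the callback from the
        # IOLoop once the I/O completes, instead of synchronously.
        if callback:
            callback(self._data)
        return self._data
```

With this shape, handlers can pass a continuation instead of blocking on the return value, which is what lets slow storages cooperate with Tornado's runloop.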