New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yield during CPU intense commands (eg SUNIONSTORE)? #4582
Comments
Hello @Plasma, there is a simpler fix for that, you can write a module that implements something like SUNIONSTORE as a background command using the blocking command and the threaded contexts API. I plan to have some semi-official module with this kind of background stuff soon or later... |
And if the interleaved command touches one of the keys involved? Even a
read would be inconsistent. Operations that write would make ordering of
replication awkward.
At best case you could process read-only commands that don't touch any of
the same keys, but: some read-only commands can touch arbitrary keys (I'm
looking at you, SORT).
So as someone with no real knowledge of the internals - I can't see this
happing without a pretty major engine change, like: a 5.0 feature. But I
could be wrong!
…On 8 Jan 2018 9:18 a.m., "Andrew Armstrong" ***@***.***> wrote:
Background
We have a case where it can take sometimes up to 3 seconds for the use of
SUNIONSTORE to complete. I see other users have it worse too.
I'm using SUNIONSTORE (along with SCARD + DEL) to simply calculate the
cardinality of multiple unioned sets.
Problem
This blocks other commands to Redis (eg cache reads) and so stalls are
possible for other requests.
Workaround today
As a work-around to the SUNIONSTORE performance, I'm now performing
multiple SUNIONSTORE's across smaller sets (so the CPU time is less) and
then merging the sub-sets into one final SUNION.
This lets me interleave other requests by blocking less in CPU time, but
does increase overall CPU usage.
Proposal / Suggestion
Would it make sense to be able to have SUNIONSTORE have some form of "low
priority" response, to allow for interleaving of other commands?
In my case, I am performing batch processing and so do not require a real
time response, and would give up speed in a response in exchange for
allowing other operations to not be blocked behind a CPU intensive
SUNIONSTORE.
I understand redis is single threaded, so this may not be a simple
addition, but wanted to open it up for discussion.
Thanks for the great product.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#4582>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/AABDsNTdrYGiaKy4HaoZcQphcOepPiIpks5tId18gaJpZM4RWGxk>
.
|
@mgravell yes, you are right, the semantical properties of the command would be different, but this is often ok with those kind of "background low priority" unions/intersections, what happens is that the module command uses SCAN to get elements from the sets, and store them, so elements that are added or removed from set A and B that are merged into C, may be present or not into C, but all the elements that had been in A and B all the time of the operation took to run, will surely be in C. Because of that the operation is hardly random and often very usable according to the business logic of the application. |
P.S. We should be able to do it the proper way soon btw... there is already a design for bringing this to the modules. It's based on the ability of modules to lock at single keys level, perform a background operation (the clients trying to access the same keys are blocked and re-started when the operation finished), and later unlock the keys and continue. |
Another note: there are no replication breakages in the schema above, everything would be replicated perfectly on slaves and AOF. It's just a matter of the consistency semantics of the operation. Basically such a module operates just like if you implement SUNIONSTORE into a client that uses SCAN to simulate a store into another key. |
Thanks for the discussion -- agree with @mgravell that it kind of breaks because between yields keys could be changed. Is there any performance improvement to be able to SCARD the union of each set in one go, without SUNIONSTORE + SCARD + DEL combination? I imagine not, as the CPU time is likely 99% spent in actually unioning the sets, not storing the result in another set. |
If you're in the mood for slightly less accurate but more efficient cardinality estimates, you should consider using the HyperLogLog. |
But are you sure that for your business logic you need point-in-time consistency in intersections/unions? Example: intersections are often used in ecommerce sites using Redis in order to enlist all the products that have characteristic A and B, like all the TVs which are OLED and in the range 40-50 inches. Now there is a process updating the sets A and B, removing and adding TVs as the marketplace gets updated continuously via various APIs. If in the final set a given TV which is added or remove during the operation is present or not in the intersection C, is really important? Anyway even running a plain |
Thanks @itamarhaber I've actually prototyped this side-by-side my SET implementation, slowly rolling it out now to compare difference in real-world data numbers. Definitely trivial CPU usage already compared to union. @antirez I think I could lose some consistency yes, and in fact, in the application design, I "know" some sets won't be changing, so not worried about underlying data being changed between yields for some operations. |
Background
We have a case where it can take sometimes up to 3 seconds for the use of SUNIONSTORE to complete. I see other users have it worse too.
I'm using SUNIONSTORE (along with SCARD + DEL) to simply calculate the cardinality of multiple unioned sets.
Problem
This blocks other commands to Redis (eg cache reads) and so stalls are possible for other requests.
Workaround today
As a work-around to the SUNIONSTORE performance, I'm now performing multiple SUNIONSTORE's across smaller sets (so the CPU time is less) and then merging the sub-sets into one final SUNION.
This lets me interleave other requests by blocking less in CPU time, but does increase overall CPU usage.
Proposal / Suggestion
Would it make sense to be able to have SUNIONSTORE have some form of "low priority" response, to allow for interleaving of other commands?
In my case, I am performing batch processing and so do not require a real time response, and would give up speed in a response in exchange for allowing other operations to not be blocked behind a CPU intensive SUNIONSTORE.
I understand redis is single threaded, so this may not be a simple addition, but wanted to open it up for discussion.
Thanks for the great product.
The text was updated successfully, but these errors were encountered: