I've written a small Java program to copy and transform keys and their values from one Redis cluster to another, and I have to process tens of millions of keys. To do the job properly I need to send a round-trip TYPE request for each key returned from SCAN, so time and network traffic add up quickly across millions of keys.

I'd like to suggest a modification to the SCAN command that allows filtering on type or other attributes, so I can batch the copy by type and save the round trips. Something like:
```
scan 0 match * count 100 type hash
```

or

```
scan 0 match * count 100 filter type=hash
```
which would filter the scan results to return only keys of hash type, avoiding the extra TYPE request entirely. I could then parallelize the copy process, running hash, set, zset, etc. as separate tasks.
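To make the saving concrete, here is a minimal sketch (not real Redis client code) that simulates the keyspace as an in-memory map of key to type. `filterClientSide` models today's behavior, one simulated TYPE round trip per scanned key, while `filterServerSide` models the proposed server-side `type hash` filter. All class and method names here are hypothetical.

```java
import java.util.*;
import java.util.stream.*;

public class ScanTypeFilter {

    // Today: SCAN returns every key, and the client must issue one TYPE
    // call per key to find the hashes. Cost: one round trip per key.
    static List<String> filterClientSide(Map<String, String> keyspace, String wantedType) {
        List<String> result = new ArrayList<>();
        int roundTrips = 0;
        for (String key : keyspace.keySet()) {
            roundTrips++; // simulated TYPE round trip
            if (keyspace.get(key).equals(wantedType)) {
                result.add(key);
            }
        }
        System.out.println("simulated TYPE round trips: " + roundTrips);
        return result;
    }

    // Proposed: the server applies the type filter while scanning, so
    // only matching keys cross the network and no TYPE calls are needed.
    static List<String> filterServerSide(Map<String, String> keyspace, String wantedType) {
        return keyspace.entrySet().stream()
                .filter(e -> e.getValue().equals(wantedType))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> keyspace = new LinkedHashMap<>();
        keyspace.put("user:1", "hash");
        keyspace.put("queue:1", "list");
        keyspace.put("tags:1", "set");
        keyspace.put("user:2", "hash");

        // Both approaches find the same keys; the second needs no
        // per-key round trips, which is the point of the proposal.
        System.out.println(filterClientSide(keyspace, "hash")
                .equals(filterServerSide(keyspace, "hash")));
    }
}
```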
I can imagine many kinds of filters: keys that will expire within 1 hour, sets that contain more than nnn elements, lists with exactly one element, etc.
Perhaps filters could be combined: `filter type=list AND ttl<3600`.
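One way the combination could behave on the server side is plain predicate conjunction. The sketch below illustrates that semantics with hypothetical names (`KeyMeta`, `typeIs`, `ttlBelow`), again against a simulated keyspace rather than a real server; `-1` stands for "no expiry", mirroring what the TTL command returns for persistent keys.

```java
import java.util.*;
import java.util.function.Predicate;
import java.util.stream.*;

public class CombinedScanFilter {

    // Minimal stand-in for the per-key metadata the server already tracks.
    static final class KeyMeta {
        final String name;
        final String type;
        final long ttlSeconds; // -1 means no expiry, as with the TTL command
        KeyMeta(String name, String type, long ttlSeconds) {
            this.name = name;
            this.type = type;
            this.ttlSeconds = ttlSeconds;
        }
    }

    static Predicate<KeyMeta> typeIs(String t) {
        return k -> k.type.equals(t);
    }

    static Predicate<KeyMeta> ttlBelow(long seconds) {
        return k -> k.ttlSeconds >= 0 && k.ttlSeconds < seconds;
    }

    // A filtered scan is then just: keep the keys the predicate accepts.
    static List<String> scan(List<KeyMeta> keyspace, Predicate<KeyMeta> filter) {
        return keyspace.stream()
                .filter(filter)
                .map(k -> k.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<KeyMeta> keyspace = List.of(
                new KeyMeta("jobs", "list", 120),       // matches both filters
                new KeyMeta("archive", "list", -1),     // no TTL: excluded
                new KeyMeta("session:1", "hash", 120)); // wrong type: excluded

        // filter type=list AND ttl<3600
        Predicate<KeyMeta> filter = typeIs("list").and(ttlBelow(3600));
        System.out.println(scan(keyspace, filter));
    }
}
```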