ZRANGEBYLEX key min max [WITHSCORES] [LIMIT ...] [NOCASE]
How it works... To start the most important thing:
> ZRANGEBYLEX zset (aaa (bbb
-ERR Not all the elements in the sorted set have the same score.
So this is the requisite of the command, and it can do a super fast
check about this indeed, just check the first and last element of the
skiplist, and check if the score is equal.
Now assuming they are equal, what the command does is trivial, it
returns all the elements in the specified range, unless LIMIT is
specified of course.
Now how ranges are specified? I'm cloning your range structure, as
'zrangelexspec', but form the user point of view this is how it works:
The first character of the min or max specify if it is an open, closed
interval, or a string infinity or a string minus infinity:
"(foo" -> "foo" open interval
"[foo" -> "foo" closed interval
"-" -> the smallest string possible
"+" -> the biggest string possible
So for instance ZRANGEBYLEX zset - + will just return everything, and so forth.
> ZRANGEBYLEX zset foo bar
-ERR Syntax error
> ZRANGEBYLEX zset (foo +bar
-ERR Syntax error
I assume this works like SORT .. ALPHA regarding UTF-8 support?
SORT .. ALPHA
Seldaek: yes, for now we'll not support other than ASCII comparison, this is very important until we find an appropriate solution, since for instance you want to be able to put IP address in binary form, for instance, and then get ranges, without any collation doing strange things.
However if you want a different collation, you can do it in this way, for every element instead of adding the element itself you add instead:
This way you can control the collation yourself.
Sounds like a very good idea, especially for storing timeseries, which should be a breeze with this
It would be even better to be able to use long instead of string for keys, for storage purposes, but then it probably wouldn't be exactly a zset anymore.
+1 This would be an incredibly helpful feature!
Anyone using sorted sets for indexing string values (myself included) has to resort to inefficient workarounds to do range queries. Right now I'm using something along the lines of what is described here. Basically you do a ZADD and ZRANK to determine what would be the rank of the start and end string and then use ZRANGE to get the values.
Aside from the performance drawback, it's error-prone since elements could have been added/removed from the set in between the time you did ZRANK and the time you did ZRANGE. Resorting to scripting or using WATCH could avoid that issue, but that's nowhere near as easy as a builtin solution. And for sets which are changed frequently WATCH wouldn't suffice at all.
Alternative solutions involve using raw byte values or an order-preserving hash function to calculate scores. These are limited to some number of characters (I think you can get a max of 10-11 characters using some clever tricks) because of the limited precision of scores. These also usually limit the character set you can use for your strings.
A solution along the lines of what you're describing would be awesome.
Is anyone currently working on this feature and can I help?
@stephenalexbrowne could your technique be put into a lua script and run inside the server?
Yes, I haven't tried but it should be possible. However...
Haven't done much scripting with Redis personally, so correct me if I'm wrong.
Implemented in commit 3a6a1e4