-
Notifications
You must be signed in to change notification settings - Fork 694
GEODE-9329: Implement radish leaderboard data structs #6548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GEODE-9329: Implement radish leaderboard data structs #6548
Conversation
onichols-pivotal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- the inclusion of OrderStatisticTree appears to comply with the ASF's TREATMENT OF THIRD-PARTY WORKS guidelines
| import org.apache.geode.redis.internal.netty.Coder; | ||
|
|
||
| public class RedisSortedSet extends AbstractRedisData { | ||
| private Object2ObjectOpenCustomHashMap<byte[], byte[]> members; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be Object2DoubleOpenCustomHashMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hashmap is used for quick lookup of scores associated with members. The score values are not used for sorting, they are just sent back to the client. So we're using Object2Object of byte[]'s to avoid the cost of converting to/from Doubles.
(We do convert to double for the OrderStatistics[Tree|Set], because we have to for sorting.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it might be better in the long run to have the members map be <byte[], OrderedSetEntry>. My reasoning is that with the current implementation, every time we do a zrank we create a new OrderedSetEntry just to get the index of an equal but already existing OrderedSetEntry in the scoreSet. If we added the OrderedSetEntry we create when we do a zadd to both data structures, then we could avoid an extra allocation for each zrank (and possibly other future commands) at the expense of a little extra memory footprint. I think for this to work properly though, we'd need both a Double score and byte[] scoreBytes field in OrderedSetEntry otherwise the whole point of the map would be kind of defeated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. We could avoid storing the byteArray in the OrderedSetEntry, and pay the performance cost of converting from Double to byteArray for things like ZSCORE. But in this case it's probably better to spend a little memory overhead, in order to avoid every bit of performance drag we can.
|
|
||
| static class OrderedSetEntry implements Comparable<OrderedSetEntry> { | ||
| public byte[] member; | ||
| public Double score; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a double (lowercase, primitive)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These members can be private and final I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the private and final. However, when sorting we use Double's compareTo(), which is not available on the double primitive.
| this.score = makeDoubleWhileHandlingInfinity(score); | ||
| } | ||
|
|
||
| private Double makeDoubleWhileHandlingInfinity(byte[] score) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these conversions belong in the Coder class? I could see having double Coder.bytesToDouble(byte[]) and byte[] Coder.doubleToBytes(double) methods. Lower case primitive doubles please!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure; Coder would need to handle +- infinity but that kind of implementation detail is best encapsulated in a class like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coder already has bytesToDouble() and doubleToBytes() methods, so there's no need to reimplement that here. It's possible that the implementation in Coder needs a tweak, but I think Hale may be working on that as part of #6534
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not making an changes to coder. I have a spike on these changes, but they depend on this PR being merged first. So I'll pick that up once this is in.
|
This pull request introduces 1 alert when merging 5ff0749 into 44e5d4e - view on LGTM.com new alerts:
|
|
This pull request introduces 1 alert when merging 19a9fa6 into 44e5d4e - view on LGTM.com new alerts:
|
| } | ||
|
|
||
| @Override | ||
| public boolean equals(Object o) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned this earlier, but leaving this here as well. hashCode() and equals() for this class should obey the set contract. See the javadocs for Set.equals() and Set.hashCode()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're tweaked, and pass all existing tests on my machine. We'll see what LGTM has to say.
|
This pull request introduces 1 alert when merging fda8b61 into e39c4c5 - view on LGTM.com new alerts:
|
nonbinaryprogrammer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well done with all of this! as far as I can tell the tree handling stuff is solid, I've just got a couple of questions and suggestions
.../distributedTest/java/org/apache/geode/redis/internal/executor/sortedset/ZRankDUnitTest.java
Outdated
Show resolved
Hide resolved
|
|
||
| @Test | ||
| public void shouldReturnNil_givenNonexistentMember() { | ||
| jedis.zadd("key", 1.0, "member"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to pull "key" and "member" into constants since they're used all over
...st/java/org/apache/geode/redis/internal/executor/sortedset/AbstractZRankIntegrationTest.java
Outdated
Show resolved
Hide resolved
...ith-redis/src/main/java/org/apache/geode/redis/internal/executor/sortedset/ZAddExecutor.java
Outdated
Show resolved
Hide resolved
...-redis/src/test/java/org/apache/geode/redis/internal/collections/OrderStatisticTreeTest.java
Outdated
Show resolved
Hide resolved
...-redis/src/test/java/org/apache/geode/redis/internal/collections/OrderStatisticTreeTest.java
Outdated
Show resolved
Hide resolved
| this.score = makeDoubleWhileHandlingInfinity(score); | ||
| } | ||
|
|
||
| private Double makeDoubleWhileHandlingInfinity(byte[] score) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not making an changes to coder. I have a spike on these changes, but they depend on this PR being merged first. So I'll pick that up once this is in.
...compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
Outdated
Show resolved
Hide resolved
| * @author Rodion "rodde" Efremov | ||
| * @version 1.6 (Feb 11, 2016) | ||
| */ | ||
| @SuppressWarnings("all") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to be a little more targeted in suppressing warnings? this seems overly forgiving
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I would like to see this suppression removed and if there's no way to fix one of the warnings in this file, to suppress only that warning.
...compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't be able to do a full review today, but I wanted to make some initial comments. Also, for the PR title/commit message, could we follow the convention described in our wiki? (so, GEODE-9329: Implement...)
LICENSE
Outdated
| - OrderStatisticTree (https://github.com/coderodde/OrderStatisticTree) | ||
| Copyright (c) 2021 Rodion Efremov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small thing, but I was wondering if there should be a comma before Copyright to be consistent with the lines above it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
| private static final String LOCAL_HOST = "127.0.0.1"; | ||
| private static final String KEY = "key"; | ||
| private static final int SET_SIZE = 1000; | ||
| private static final int JEDIS_TIMEOUT = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are constants in RedisClusterStartupRule that could be used here in lieu of LOCAL_HOST and JEDIS_TIMEOUT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jens and I reworked the DUnit tests, and got rid of all but ZAddDUnitTest and ZRemDUnitTest. But I'm using those constants in those tests now.
| private static final int REDIS_CLIENT_TIMEOUT = | ||
| Math.toIntExact(GeodeAwaitility.getTimeout().toMillis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are constants in RedisClusterStartupRule that could be used here in lieu of REDIS_CLIENT_TIMEOUT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| @Test | ||
| public void shouldError_givenWrongNumberOfArguments() { | ||
| assertExactNumberOfArgs(jedis, Protocol.Command.ZSCORE, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be ZRANK instead of ZSCORE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dang you're good!
|
|
||
| // get the ranks of the members | ||
| Iterator<String> membersIterator = map.keySet().iterator(); | ||
| String[] members = new String[10]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we reuse the ENTRY_COUNT variable here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, in a couple places.
| rankMap.put(rank, memberName.getBytes(StandardCharsets.UTF_8)); | ||
| memberList.add(memberName.getBytes(StandardCharsets.UTF_8)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use the Coder.stringToBytes() method for these instead of .getBytes? I think we're trying to make that consistent throughout the Radish module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, updated. Thanks!
| Double.valueOf(subCommandString); | ||
| } catch (NumberFormatException nfe) { | ||
| // Use regex to validate score is a valid floating point number | ||
| if (!subCommandString.matches("^[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?$")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an advantage to using the regex over Double.parseDouble? If not, it seems clearer and less error-prone to just try parsing the double. I realize we ignore the output of that though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the intent is to avoid the whole conversion, and just do the regex. Looking at, e.g. OpenJDK's implementation of parseDouble, though, they pretty much do a regex anyway. There might possibly be one extra stack frame, but it shouldn't matter particularly. So I swapped it for just Double.parseDouble().
| case "-inf": | ||
| case "inf": | ||
| case "+inf": | ||
| scoreFound = true; | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to rebase now that @nonbinaryprogrammer's PR for infinity handling in ZADD is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did that this morning. Hopefully things will be cleaner.
DonalEvans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of comments inline, sorry if it's overwhelming. Would it also be possible to rearrange the contents of OrderStatisticsTree a little so that fields are at the top rather than a quarter of the way down the class, and so that inner classes appear at the bottom? it would improve the readability a lot, I think.
.../distributedTest/java/org/apache/geode/redis/internal/executor/sortedset/ZRankDUnitTest.java
Outdated
Show resolved
Hide resolved
.../distributedTest/java/org/apache/geode/redis/internal/executor/sortedset/ZRankDUnitTest.java
Outdated
Show resolved
Hide resolved
.../distributedTest/java/org/apache/geode/redis/internal/executor/sortedset/ZRankDUnitTest.java
Outdated
Show resolved
Hide resolved
.../distributedTest/java/org/apache/geode/redis/internal/executor/sortedset/ZRankDUnitTest.java
Outdated
Show resolved
Hide resolved
...st/java/org/apache/geode/redis/internal/executor/sortedset/AbstractZRankIntegrationTest.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| @Test | ||
| public void testConcurrentOrIllegalStateOnRemove() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test name seems to imply the test is doing something different from what it's actually doing. The test throws an IllegalStateException because remove() is called before next() has been called, not because of the call to add(). If next() is called on each iterator before the add() call, then the calls to remove() throw ConcurrentModificationException, which is what I think the test was originally intended to show.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, but the subsequent test does the ConcurrentModificationException test.
...-redis/src/test/java/org/apache/geode/redis/internal/collections/OrderStatisticTreeTest.java
Outdated
Show resolved
Hide resolved
...-redis/src/test/java/org/apache/geode/redis/internal/collections/OrderStatisticTreeTest.java
Outdated
Show resolved
Hide resolved
...-redis/src/test/java/org/apache/geode/redis/internal/collections/OrderStatisticTreeTest.java
Outdated
Show resolved
Hide resolved
...-redis/src/test/java/org/apache/geode/redis/internal/collections/OrderStatisticTreeTest.java
Outdated
Show resolved
Hide resolved
DonalEvans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few more things to clean up.
| new ConcurrentLoopingThreads(SET_SIZE, | ||
| (i) -> jedis.zadd(key, memberScoreMap1), | ||
| (i) -> jedis.zadd(key, memberScoreMap2)).runInLockstep(); | ||
| clusterStartUp.crashVM(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this 3 be tied directly to the 4 used to initialize the RedisClusterStartupRule?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now relative to a constant.
|
|
||
| server1.stop(); | ||
| server2.stop(); | ||
| server3.stop(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't added by this PR, but since you're changing this file, it's not necessary to call stop() on the servers, as this is handled automatically by RedisClusterStartupRule. With these calls removed, the server1, server2 and server3 fields are no longer needed and can be removed entirely.
| assertThat(score).isEqualTo(memberScoreMap2.get(member2)); | ||
| } | ||
|
|
||
| throw lastException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could hit an NPE if maxRetries is less than 1. A check before the for loop can fix this:
if (maxRetries < 1) {
maxRetries = 1;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled with an assert. If someone ever fed it anything less than 1, it should be a failure condition.
| jedis.zadd(KEY, map); | ||
|
|
||
| // get the ranks of the members | ||
| Iterator<String> membersIterator = map.keySet().iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable is no longer used and can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditched.
| if (!c.getClass().equals(HashSet.class)) { | ||
| c = new HashSet<>(c); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason that we're copying into a HashSet here? It should be fine to call contains() on any Collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't seem to be needed. We've already made enough changes from the upstream implementation, once this PR is accepted I'm hoping to do a PR back to the original dev.
| if (insertionMode) { | ||
| // Whenever fixing after insertion, at most one rotation is | ||
| // required in order to maintain the balance. | ||
| return true; | ||
| } | ||
| return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified to just return insertionMode;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, but I kept the comment. :)
| return getSortedSetSize() - initialSize + changesCount; | ||
| } | ||
|
|
||
| private void validateScoreIsDouble(byte[] score) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is no longer used and can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a leftover from the multiple rebases, thanks for catching it!
DonalEvans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing all the feedback!
Adding the OrderedStatisticsTree from https://github.com/coderodde/OrderStatisticTree to the geode codebase. Adding the appropriate information to the LICENSE file. Adding a new quick check test and benchmark for this class
…ations/conversions
DonalEvans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of small comments after the rebase.
| if (isPositiveInfinity(value)) { | ||
| return POSITIVE_INFINITY; | ||
| } | ||
| if (isNegativeInfinity(value)) { | ||
| return NEGATIVE_INFINITY; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to move this implementation into Coder.bytesToDouble() so that it's available everywhere?
| if (bytes == null) { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This null check is not necessary, as equalsIgnoreCaseBytes() handles nulls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it therefore also not be necessary in isPositiveInfinity() and isNegativeInfinity() below it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good catch, thanks for cleaning that up.
…update CoderTest for NaN handling
| return NEGATIVE_INFINITY; | ||
| } | ||
| if (isNaN(bytes)) { | ||
| throw new NumberFormatException(ERROR_NOT_A_VALID_FLOAT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking, I think this should return Double.NaN rather than throwing, as that is a valid value that a Double can take. It'll be the responsibility of the caller to determine what to do with the NaN value in that case.
Approving reviews available
Adds the OrderStatisticsTree/Set data type to support Redis-style Sorted Set data types and operations. To facilitate testing, this also implements the ZRANK command. (And thus, will remove the need for GEODE-9183.)