Contributor

@ringles ringles commented Jun 2, 2021

Adds the OrderStatisticsTree/Set data type to support Redis-style Sorted Set data types and operations. To facilitate testing, this also implements the ZRANK command, which removes the need for GEODE-9183.

Contributor

@onichols-pivotal onichols-pivotal left a comment

import org.apache.geode.redis.internal.netty.Coder;

public class RedisSortedSet extends AbstractRedisData {
private Object2ObjectOpenCustomHashMap<byte[], byte[]> members;
Contributor

Should this be Object2DoubleOpenCustomHashMap?

Contributor Author

This hashmap is used for quick lookup of the scores associated with members. The score values stored here are not used for sorting; they are just sent back to the client. So we're using an Object2Object map of byte[] to byte[] to avoid the cost of converting to and from Doubles.

(We do convert to double for the OrderStatistics[Tree|Set], because we have to for sorting.)

Contributor

I wonder if it might be better in the long run to have the members map be <byte[], OrderedSetEntry>. My reasoning is that with the current implementation, every time we do a zrank we create a new OrderedSetEntry just to get the index of an equal but already existing OrderedSetEntry in the scoreSet. If we added the OrderedSetEntry we create when we do a zadd to both data structures, then we could avoid an extra allocation for each zrank (and possibly other future commands) at the expense of a slightly larger memory footprint. I think for this to work properly, though, we'd need both a Double score and a byte[] scoreBytes field in OrderedSetEntry; otherwise the whole point of the map would be kind of defeated.

Contributor Author

Hmm. We could avoid storing the byteArray in the OrderedSetEntry, and pay the performance cost of converting from Double to byteArray for things like ZSCORE. But in this case it's probably better to spend a little memory overhead, in order to avoid every bit of performance drag we can.
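The layout proposed in this thread can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `SortedSetSketch` and its `Entry` class are hypothetical names, `java.util.TreeSet` stands in for the OrderStatisticsTree (so the rank lookup here is linear rather than logarithmic), `ByteBuffer` keys stand in for the custom-hash fastutil map, and scores are assumed to be plain numeric strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: one entry object shared by the lookup map and the ordered set.
final class SortedSetSketch {
  static final class Entry implements Comparable<Entry> {
    final byte[] member;
    final byte[] scoreBytes; // kept as received, so ZSCORE needs no conversion
    final double score; // parsed once, used only for ordering

    Entry(byte[] member, byte[] scoreBytes) {
      this.member = member;
      this.scoreBytes = scoreBytes;
      // Assumption: plain numeric text; "inf" handling is out of scope here.
      this.score = Double.parseDouble(new String(scoreBytes, StandardCharsets.US_ASCII));
    }

    @Override
    public int compareTo(Entry other) {
      int byScore = Double.compare(score, other.score);
      // Tie-break on member bytes (ByteBuffer compares bytes as signed values).
      return byScore != 0 ? byScore
          : ByteBuffer.wrap(member).compareTo(ByteBuffer.wrap(other.member));
    }
  }

  // ByteBuffer gives content-based equals/hashCode for byte[] keys.
  private final Map<ByteBuffer, Entry> members = new HashMap<>();
  private final TreeSet<Entry> scoreSet = new TreeSet<>();

  void zadd(byte[] member, byte[] scoreBytes) {
    Entry entry = new Entry(member, scoreBytes);
    Entry old = members.put(ByteBuffer.wrap(member), entry);
    if (old != null) {
      scoreSet.remove(old); // drop the entry stored under the previous score
    }
    scoreSet.add(entry);
  }

  byte[] zscore(byte[] member) {
    Entry entry = members.get(ByteBuffer.wrap(member));
    return entry == null ? null : entry.scoreBytes;
  }

  long zrank(byte[] member) {
    // The existing entry is reused; no new Entry is allocated for the lookup.
    Entry entry = members.get(ByteBuffer.wrap(member));
    return entry == null ? -1 : scoreSet.headSet(entry).size();
  }
}
```

The trade-off is the one named in the thread: each member carries both representations of its score, in exchange for no conversion on reads and no allocation on rank lookups.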


static class OrderedSetEntry implements Comparable<OrderedSetEntry> {
public byte[] member;
public Double score;
Contributor

Should this be a double (lowercase, primitive)?

Contributor

These members can be private and final I think.

Contributor Author

Good call on the private and final. However, when sorting we use Double's compareTo(), which is not available on the double primitive.
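For reference, the static method `Double.compare(double, double)` yields the same ordering as `Double.compareTo()` without boxing, so a primitive field can still be sorted. A minimal sketch, assuming a simplified entry class (`EntrySketch` and its member tie-break are illustrative, not the PR's implementation):

```java
// Hypothetical, simplified stand-in for OrderedSetEntry.
final class EntrySketch implements Comparable<EntrySketch> {
  private final byte[] member;
  private final double score; // primitive: nothing to box or unbox

  EntrySketch(byte[] member, double score) {
    this.member = member;
    this.score = score;
  }

  @Override
  public int compareTo(EntrySketch other) {
    // Same total order as Double.compareTo: -0.0 sorts before 0.0, NaN sorts last.
    int byScore = Double.compare(score, other.score);
    if (byScore != 0) {
      return byScore;
    }
    // Assumed tie-break: unsigned lexicographic order of the member bytes.
    return compareBytes(member, other.member);
  }

  private static int compareBytes(byte[] a, byte[] b) {
    int common = Math.min(a.length, b.length);
    for (int i = 0; i < common; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```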

this.score = makeDoubleWhileHandlingInfinity(score);
}

private Double makeDoubleWhileHandlingInfinity(byte[] score) {
Contributor

I wonder if these conversions belong in the Coder class? I could see having double Coder.bytesToDouble(byte[]) and byte[] Coder.doubleToBytes(double) methods. Lower case primitive doubles please!

Contributor Author

Sure; Coder would need to handle +- infinity but that kind of implementation detail is best encapsulated in a class like that.

Contributor

Coder already has bytesToDouble() and doubleToBytes() methods, so there's no need to reimplement that here. It's possible that the implementation in Coder needs a tweak, but I think Hale may be working on that as part of #6534

Contributor

I am not making any changes to Coder. I have a spike on these changes, but they depend on this PR being merged first. So I'll pick that up once this is in.
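The kind of conversion discussed in this thread might look like the sketch below. It is not Geode's `Coder` implementation: `ScoreBytesSketch` and both method bodies are assumptions for illustration, and only the three infinity spellings that appear elsewhere in this PR (`inf`, `+inf`, `-inf`) are handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; the real Coder.bytesToDouble()/doubleToBytes() may differ.
final class ScoreBytesSketch {
  private ScoreBytesSketch() {}

  static double bytesToDouble(byte[] bytes) {
    String text = new String(bytes, StandardCharsets.US_ASCII).toLowerCase(Locale.ROOT);
    switch (text) {
      case "inf":
      case "+inf":
        return Double.POSITIVE_INFINITY;
      case "-inf":
        return Double.NEGATIVE_INFINITY;
      default:
        // Throws NumberFormatException for anything that is not a number.
        // Because the text was lower-cased, "NaN" is rejected here as well.
        return Double.parseDouble(text);
    }
  }

  static byte[] doubleToBytes(double value) {
    String text;
    if (value == Double.POSITIVE_INFINITY) {
      text = "inf";
    } else if (value == Double.NEGATIVE_INFINITY) {
      text = "-inf";
    } else {
      text = Double.toString(value);
    }
    return text.getBytes(StandardCharsets.US_ASCII);
  }
}
```

Both methods use the primitive `double`, as requested above, which keeps callers free to box only where a collection requires it.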

lgtm-com bot commented Jun 2, 2021

This pull request introduces 1 alert when merging 5ff0749 into 44e5d4e - view on LGTM.com

new alerts:

  • 1 for Inconsistent equals and hashCode

lgtm-com bot commented Jun 2, 2021

This pull request introduces 1 alert when merging 19a9fa6 into 44e5d4e - view on LGTM.com

new alerts:

  • 1 for Inconsistent equals and hashCode

}

@Override
public boolean equals(Object o) {
Contributor

I mentioned this earlier, but leaving this here as well. hashCode() and equals() for this class should obey the set contract. See the javadocs for Set.equals() and Set.hashCode()

Contributor Author

They're tweaked, and pass all existing tests on my machine. We'll see what LGTM has to say.
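For context, the `java.util.Set` contract says two sets are equal when they have the same size and each contains all elements of the other, and that a set's hash code is the sum of its elements' hash codes. A minimal sketch of methods that obey it, using a hypothetical `ContractSetSketch` backed by a `TreeSet` rather than the PR's OrderStatisticsSet:

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical wrapper used only to show contract-compliant equals() and hashCode().
final class ContractSetSketch<T extends Comparable<T>> extends AbstractCollection<T>
    implements Set<T> {
  private final TreeSet<T> backing = new TreeSet<>();

  @Override
  public boolean add(T element) {
    return backing.add(element);
  }

  @Override
  public Iterator<T> iterator() {
    return backing.iterator();
  }

  @Override
  public int size() {
    return backing.size();
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    }
    if (!(o instanceof Set)) {
      return false; // only another Set can be equal, whatever its implementation
    }
    Set<?> other = (Set<?>) o;
    return other.size() == size() && containsAll(other);
  }

  @Override
  public int hashCode() {
    int sum = 0;
    for (T element : this) {
      sum += element == null ? 0 : element.hashCode(); // sum of element hash codes
    }
    return sum;
  }
}
```

Extending `java.util.AbstractSet` provides equivalent `equals()` and `hashCode()` implementations without writing them by hand, which may also be the simplest way to keep the two methods consistent with each other.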

lgtm-com bot commented Jun 2, 2021

This pull request introduces 1 alert when merging fda8b61 into e39c4c5 - view on LGTM.com

new alerts:

  • 1 for Inconsistent equals and hashCode

@jdeppe-pivotal jdeppe-pivotal added the redis Issues related to the geode-for-redis module label Jun 3, 2021
Contributor

@nonbinaryprogrammer nonbinaryprogrammer left a comment

well done with all of this! as far as I can tell the tree handling stuff is solid, I've just got a couple of questions and suggestions


@Test
public void shouldReturnNil_givenNonexistentMember() {
jedis.zadd("key", 1.0, "member");
Contributor

it would be nice to pull "key" and "member" into constants since they're used all over

* @author Rodion "rodde" Efremov
* @version 1.6 (Feb 11, 2016)
*/
@SuppressWarnings("all")
Contributor

Is there any way to be a little more targeted in suppressing warnings? this seems overly forgiving

Contributor

Agreed. I would like to see this suppression removed and if there's no way to fix one of the warnings in this file, to suppress only that warning.
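As an illustration of the narrower suppression requested here, the annotation can name a single warning category and sit on the smallest element that needs it. The class and method below are hypothetical and are not taken from OrderStatisticsTree:

```java
import java.util.List;

// Hypothetical example of a targeted suppression.
final class SuppressionSketch {
  private SuppressionSketch() {}

  // Only the unchecked cast is silenced, and only inside this one method;
  // every other warning in the file is still reported by the compiler.
  @SuppressWarnings("unchecked")
  static <T> List<T> castList(Object rawList) {
    return (List<T>) rawList;
  }
}
```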

Member

@sabbey37 sabbey37 left a comment

I won't be able to do a full review today, but I wanted to make some initial comments. Also, for the PR title/commit message, could we follow the convention described in our wiki? (so, GEODE-9329: Implement...)

LICENSE Outdated
Comment on lines 292 to 293
- OrderStatisticTree (https://github.com/coderodde/OrderStatisticTree)
Copyright (c) 2021 Rodion Efremov
Member

Small thing, but I was wondering if there should be a comma before Copyright to be consistent with the lines above it.

Contributor Author

Good catch!

Comment on lines 41 to 44
private static final String LOCAL_HOST = "127.0.0.1";
private static final String KEY = "key";
private static final int SET_SIZE = 1000;
private static final int JEDIS_TIMEOUT =
Member

There are constants in RedisClusterStartupRule that could be used here in lieu of LOCAL_HOST and JEDIS_TIMEOUT.

Contributor Author

Jens and I reworked the DUnit tests, and got rid of all but ZAddDUnitTest and ZRemDUnitTest. But I'm using those constants in those tests now.

Comment on lines 46 to 47
private static final int REDIS_CLIENT_TIMEOUT =
Math.toIntExact(GeodeAwaitility.getTimeout().toMillis());
Member

There are constants in RedisClusterStartupRule that could be used here in lieu of REDIS_CLIENT_TIMEOUT.

Contributor Author

Done


@Test
public void shouldError_givenWrongNumberOfArguments() {
assertExactNumberOfArgs(jedis, Protocol.Command.ZSCORE, 2);
Member

Should this be ZRANK instead of ZSCORE?

Contributor Author

Dang you're good!


// get the ranks of the members
Iterator<String> membersIterator = map.keySet().iterator();
String[] members = new String[10];
Member

Could we reuse the ENTRY_COUNT variable here?

Contributor Author

Yeah, in a couple places.

Comment on lines 118 to 119
rankMap.put(rank, memberName.getBytes(StandardCharsets.UTF_8));
memberList.add(memberName.getBytes(StandardCharsets.UTF_8));
Member

Could we use the Coder.stringToBytes() method for these instead of .getBytes? I think we're trying to make that consistent throughout the Radish module.

Contributor Author

Yeah, updated. Thanks!

Double.valueOf(subCommandString);
} catch (NumberFormatException nfe) {
// Use regex to validate score is a valid floating point number
if (!subCommandString.matches("^[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?$")) {
Member

Is there an advantage to using the regex over Double.parseDouble? If not, it seems clearer and less error-prone to just try parsing the double. I realize we ignore the output of that though.

Contributor Author

Well, the intent is to avoid the whole conversion, and just do the regex. Looking at, e.g. OpenJDK's implementation of parseDouble, though, they pretty much do a regex anyway. There might possibly be one extra stack frame, but it shouldn't matter particularly. So I swapped it for just Double.parseDouble().
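A parse-only validity check along these lines could look like the sketch below; `ScoreValidationSketch.isValidScore` is a hypothetical name, not the method in the PR. One caveat: `Double.parseDouble()` does not accept exactly the same inputs as the regex shown above. It also allows `NaN`, `Infinity`, a trailing `d` or `f`, and forms such as `1.`, so a caller that must reject those needs an extra check.

```java
// Hypothetical validation helper based on Double.parseDouble().
final class ScoreValidationSketch {
  private ScoreValidationSketch() {}

  static boolean isValidScore(String text) {
    if (text == null) {
      return false; // parseDouble(null) throws NullPointerException, not NumberFormatException
    }
    try {
      Double.parseDouble(text); // result intentionally ignored; only validity matters
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }
}
```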

Comment on lines 80 to 93
case "-inf":
case "inf":
case "+inf":
scoreFound = true;
break;
Member

It would be good to rebase now that @nonbinaryprogrammer's PR for infinity handling in ZADD is merged.

Contributor Author

Did that this morning. Hopefully things will be cleaner.

Contributor

@DonalEvans DonalEvans left a comment

A lot of comments inline, sorry if it's overwhelming. Would it also be possible to rearrange the contents of OrderStatisticsTree a little so that fields are at the top rather than a quarter of the way down the class, and so that inner classes appear at the bottom? It would improve the readability a lot, I think.

}

@Test
public void testConcurrentOrIllegalStateOnRemove() {
Contributor

This test name seems to imply the test is doing something different from what it's actually doing. The test throws an IllegalStateException because remove() is called before next() has been called, not because of the call to add(). If next() is called on each iterator before the add() call, then the calls to remove() throw ConcurrentModificationException, which is what I think the test was originally intended to show.

Contributor Author

Yup, but the subsequent test does the ConcurrentModificationException test.
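The distinction drawn in this thread can be reproduced with a standard collection. The sketch below uses `java.util.ArrayList` as a stand-in, on the assumption that the OrderStatisticsTree iterator behaves the same way; `IteratorRemoveSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo of which exception Iterator.remove() throws, and why.
final class IteratorRemoveSketch {
  private IteratorRemoveSketch() {}

  // Returns the simple class name of the exception thrown by remove(), or "none".
  static String removeOutcome(boolean callNextFirst) {
    List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
    Iterator<Integer> iterator = list.iterator();
    if (callNextFirst) {
      iterator.next(); // gives the iterator a current element to remove
    }
    list.add(4); // structural modification made outside the iterator
    try {
      iterator.remove();
      return "none";
    } catch (RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }
}
```

With `callNextFirst` set to `false`, the failure is an `IllegalStateException` caused by the missing `next()` call, regardless of the `add()`. With `true`, the same `add()` leads to a `ConcurrentModificationException`.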

@nonbinaryprogrammer nonbinaryprogrammer changed the title Geode 9329 implement radish leaderboard data structs GEODE-9329: implement radish leaderboard data structs Jun 8, 2021
@ringles ringles changed the title GEODE-9329: implement radish leaderboard data structs GEODE-9329: Implement radish leaderboard data structs Jun 9, 2021
Contributor

@DonalEvans DonalEvans left a comment

Just a few more things to clean up.

new ConcurrentLoopingThreads(SET_SIZE,
(i) -> jedis.zadd(key, memberScoreMap1),
(i) -> jedis.zadd(key, memberScoreMap2)).runInLockstep();
clusterStartUp.crashVM(3);
Contributor

Could this 3 be tied directly to the 4 used to initialize the RedisClusterStartupRule?

Contributor Author

Now relative to a constant.


server1.stop();
server2.stop();
server3.stop();
Contributor

It wasn't added by this PR, but since you're changing this file, it's not necessary to call stop() on the servers, as this is handled automatically by RedisClusterStartupRule. With these calls removed, the server1, server2 and server3 fields are no longer needed and can be removed entirely.

assertThat(score).isEqualTo(memberScoreMap2.get(member2));
}

throw lastException;
Contributor

This could hit an NPE if maxRetries is less than 1. A check before the for loop can fix this:

    if (maxRetries < 1) {
      maxRetries = 1;
    }

Contributor Author

Handled with an assert. If someone ever fed it anything less than 1, it should be a failure condition.
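The NPE arises because `throw lastException` with a `null` reference throws a `NullPointerException` when the loop body never runs. A minimal sketch of a retry helper that fails fast instead; `RetrySketch.retry` is a hypothetical shape, not the test's actual method, and it uses an explicit exception rather than the `assert` chosen in the PR so that the check also runs when assertions are disabled:

```java
import java.util.function.Supplier;

// Hypothetical retry helper illustrating the guard discussed above.
final class RetrySketch {
  private RetrySketch() {}

  static <T> T retry(int maxRetries, Supplier<T> action) {
    if (maxRetries < 1) {
      // Without this guard the loop would be skipped and "throw null" would raise an NPE.
      throw new IllegalArgumentException("maxRetries must be at least 1");
    }
    RuntimeException lastException = null;
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      try {
        return action.get();
      } catch (RuntimeException e) {
        lastException = e; // remember the failure and try again
      }
    }
    // Non-null here: the loop ran at least once and every attempt failed.
    throw lastException;
  }
}
```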

jedis.zadd(KEY, map);

// get the ranks of the members
Iterator<String> membersIterator = map.keySet().iterator();
Contributor

This variable is no longer used and can be removed.

Contributor Author

Ditched.

Comment on lines 121 to 123
if (!c.getClass().equals(HashSet.class)) {
c = new HashSet<>(c);
}
Contributor

Is there a reason that we're copying into a HashSet here? It should be fine to call contains() on any Collection.

Contributor Author

No, it doesn't seem to be needed. We've already made a number of changes from the upstream implementation; once this PR is accepted, I'm hoping to do a PR back to the original dev.

Comment on lines 580 to 585
if (insertionMode) {
// Whenever fixing after insertion, at most one rotation is
// required in order to maintain the balance.
return true;
}
return false;
Contributor

This can be simplified to just return insertionMode;

Contributor Author

Fair enough, but I kept the comment. :)

return getSortedSetSize() - initialSize + changesCount;
}

private void validateScoreIsDouble(byte[] score) {
Contributor

This method is no longer used and can be removed.

Contributor Author

Yeah, that's a leftover from the multiple rebases, thanks for catching it!

Contributor

@DonalEvans DonalEvans left a comment

Thanks for addressing all the feedback!

Contributor

@DonalEvans DonalEvans left a comment

Just a couple of small comments after the rebase.

Comment on lines 365 to 370
if (isPositiveInfinity(value)) {
return POSITIVE_INFINITY;
}
if (isNegativeInfinity(value)) {
return NEGATIVE_INFINITY;
}
Contributor

Would it make sense to move this implementation into Coder.bytesToDouble() so that it's available everywhere?

Comment on lines 427 to 429
if (bytes == null) {
return false;
}
Contributor

This null check is not necessary, as equalsIgnoreCaseBytes() handles nulls.

Contributor Author

Would it therefore also not be necessary in isPositiveInfinity() and isNegativeInfinity() below it?

Contributor

Yeah, good catch, thanks for cleaning that up.

return NEGATIVE_INFINITY;
}
if (isNaN(bytes)) {
throw new NumberFormatException(ERROR_NOT_A_VALID_FLOAT);
Contributor

Strictly speaking, I think this should return Double.NaN rather than throwing, as that is a valid value that a Double can take. It'll be the responsibility of the caller to determine what to do with the NaN value in that case.

@ringles ringles requested review from nonbinaryprogrammer and removed request for nonbinaryprogrammer June 22, 2021 18:10
@ringles ringles dismissed nonbinaryprogrammer’s stale review June 22, 2021 18:37

Approving reviews available

@ringles ringles merged commit 4f106c6 into apache:develop Jun 22, 2021
Labels

redis Issues related to the geode-for-redis module

8 participants