TokenAware search order #47

Closed
JohnBatali opened this issue May 2, 2012 · 1 comment

I'm looking at the implementation of the getPartition method in TokenPartitionedTopology.

This method takes a BigInteger "token" as its argument and is supposed to return the HostConnectionPoolPartition that "owns" the token.

It seems to me that the method, as implemented, is inconsistent with the description on the Cassandra wiki page:

Each Cassandra server [node] is assigned a unique Token that determines what keys it is the first replica for. If you sort all nodes' Tokens, the Range of keys each is responsible for is (PreviousToken, MyToken], that is, from the previous token (exclusive) to the node's token (inclusive). The machine with the lowest Token gets both all keys less than that token, and all keys greater than the largest Token; this is called a "wrapping Range."

The implementation of TokenPartitionedTopology.getPartition returns the partition whose id is the maximum value, over all partition ids, less than or equal to the argument token.

For example, if the partitions have the ids [0, 14, 25]:

getPartition(9) returns the partition whose id = 0 ( 0 is the maximum value in {0, 14, 25} less than 9)
getPartition(19) returns the partition whose id = 14 (14 is the maximum value in {0, 14, 25} less than 19)
getPartition(29) returns the partition whose id = 25 (25 is the maximum value in {0, 14, 25} less than 29)
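
The rule described above can be sketched with a sorted set. This is only an illustration of the behavior as stated in this issue (largest id less than or equal to the token), not the actual TokenPartitionedTopology code; the class name FloorLookupSketch and the helpers ring and owner are hypothetical.

```java
import java.math.BigInteger;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch, not the Astyanax source: models the rule described
// in this issue, where a token maps to the largest partition id <= token.
class FloorLookupSketch {

    // Builds a sorted set of partition ids from plain longs (example helper).
    static NavigableSet<BigInteger> ring(long... ids) {
        NavigableSet<BigInteger> sorted = new TreeSet<>();
        for (long id : ids) {
            sorted.add(BigInteger.valueOf(id));
        }
        return sorted;
    }

    // Returns the largest id <= token, or null when the token is below every id.
    static BigInteger owner(NavigableSet<BigInteger> ids, BigInteger token) {
        return ids.floor(token);
    }

    public static void main(String[] args) {
        NavigableSet<BigInteger> ids = ring(0, 14, 25);
        for (long t : new long[] {9, 19, 29}) {
            System.out.println("getPartition(" + t + ") -> "
                    + owner(ids, BigInteger.valueOf(t)));
        }
    }
}
```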

This doesn't seem consistent with the description above, where, as I read it, the assignments in the examples should be:

getPartition(9) should return the partition whose id = 14 ( 0 < 9 <= 14)
getPartition(19) should return the partition whose id = 25 (14 < 19 <= 25)
getPartition(29) should return the partition whose id = 0 (because of "wrapping range")
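
For comparison, here is a minimal sketch of the lookup as I read the wiki's (PreviousToken, MyToken] rule: pick the smallest id greater than or equal to the token, and wrap to the lowest id when the token is above every id. The class name WrappingRangeSketch and its helpers are hypothetical, and the sketch assumes a non-empty ring.

```java
import java.math.BigInteger;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch of the wiki's (PreviousToken, MyToken] ownership rule.
class WrappingRangeSketch {

    // Builds a sorted set of partition ids from plain longs (example helper).
    static NavigableSet<BigInteger> ring(long... ids) {
        NavigableSet<BigInteger> sorted = new TreeSet<>();
        for (long id : ids) {
            sorted.add(BigInteger.valueOf(id));
        }
        return sorted;
    }

    // Returns the smallest id >= token. When the token is greater than every
    // id, wraps around to the lowest id (the "wrapping Range").
    // Assumes ids is non-empty; first() throws on an empty set.
    static BigInteger owner(NavigableSet<BigInteger> ids, BigInteger token) {
        BigInteger id = ids.ceiling(token);
        return (id != null) ? id : ids.first();
    }

    public static void main(String[] args) {
        NavigableSet<BigInteger> ids = ring(0, 14, 25);
        for (long t : new long[] {9, 19, 29}) {
            System.out.println("getPartition(" + t + ") -> "
                    + owner(ids, BigInteger.valueOf(t)));
        }
    }
}
```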

(If I have something confused, let me know; the rest of this message just builds on the confusion.)

The Cassandra wiki page suggests, but does not seem to explicitly require, that the smallest token assigned to a node have the value 0.
(In particular, the passage I quoted above suggests that the lowest token might be greater than zero.)

If the smallest partition id is greater than 0, the implementation of TokenPartitionedTopology.getPartition assigns tokens smaller than that id to the partition with the highest id:

For partitions with ids: [ 3, 14, 25]:

getPartition(1) returns the partition whose id = 25 (?? should be the partition whose id = 3 ("wrapping range"))
getPartition(9) returns the partition whose id = 3 (?? should be the partition whose id = 14 (3 < 9 <= 14))
getPartition(19) returns the partition whose id = 14 (?? should be the partition whose id = 25 (14 < 19 <= 25))
getPartition(29) returns the partition whose id = 25 (?? should be the partition whose id = 3 ("wrapping range"))
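
The two readings can be put side by side for the ring [3, 14, 25]. This is a hypothetical sketch, not the library code: describedOwner models the behavior as described in this issue (largest id at or below the token, falling back to the highest id for tokens below every id), and wikiOwner models the wiki rule, under which the wrapping range belongs to the node with the lowest token, which is 3 in this ring. All names are made up for illustration.

```java
import java.math.BigInteger;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical side-by-side of the two lookup rules discussed in this issue.
class RingComparisonSketch {

    // Builds a sorted set of partition ids from plain longs (example helper).
    static NavigableSet<BigInteger> ring(long... ids) {
        NavigableSet<BigInteger> sorted = new TreeSet<>();
        for (long id : ids) {
            sorted.add(BigInteger.valueOf(id));
        }
        return sorted;
    }

    // Behavior as described in the issue (an assumption, not the real code):
    // largest id <= token, else the highest id when the token is below all ids.
    static BigInteger describedOwner(NavigableSet<BigInteger> ids, BigInteger token) {
        BigInteger id = ids.floor(token);
        return (id != null) ? id : ids.last();
    }

    // Wiki rule: smallest id >= token, else wrap to the lowest id.
    static BigInteger wikiOwner(NavigableSet<BigInteger> ids, BigInteger token) {
        BigInteger id = ids.ceiling(token);
        return (id != null) ? id : ids.first();
    }

    public static void main(String[] args) {
        NavigableSet<BigInteger> ids = ring(3, 14, 25);
        for (long t : new long[] {1, 9, 19, 29}) {
            BigInteger token = BigInteger.valueOf(t);
            System.out.println("token " + t
                    + ": described -> " + describedOwner(ids, token)
                    + ", wiki -> " + wikiOwner(ids, token));
        }
    }
}
```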

(Actually the first example in this last set will throw an exception as I describe in my next issue.)

John Batali
Knewton

elandau commented Jul 13, 2012

Fixed in 1.0.4

elandau closed this as completed Jul 13, 2012