Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Round robin fixes #1336

Closed
wants to merge 2 commits into from
Closed

Conversation

NWilson
Copy link

@NWilson NWilson commented Jul 30, 2019

There were some nasty issues with the round-robin host policies!

The round-robin host policy wasn't actually returning all the nodes to each iterator, you'd get queries where it would try some nodes twice, some not at all, because the iterators were using a shared index.

The DC-aware round-robin policy was similarly broken, and wouldn't even return any remote nodes: as long as there were any local nodes, it would never query a remote node, so if you were in a cluster where the local nodes went down the driver wouldn't fall back to anything else!

I've closely copied the strategy used by the Datastax Java driver, so I think these fixes are right.

Added tests (which fail without the code changes). See individual commit messages for further details.

Added a test which fails: creating two iterators and iterating in
order ABAB would return two copies of node-1 for A, two copies of
node-2 for B. This was pretty useless, since each iterator shouldn't
return the same node twice, and should return all nodes eventually.
The bug was that the index used to return the next node was global
and shared between all iterators!

The code change causes the new test to pass.
… any remote hosts at all!

Added tests to cover missing cases and fix code.  The old code simply
never fell back to remote entries after exhausting local nodes.  The new
code does round-robin over local, then when it runs out of local, does
round-robin over remote nodes.

We also don't return remote nodes for LOCAL_ONE and LOCAL_QUORUM
queries, to mimic the datastax Java driver's behaviour.
Copy link
Member

@alourie alourie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, only a small nitpick. Would like @Zariel to have a look as well.

}

hosts := r.hosts.get()
var startIdx int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd think it would be cleaner here to return nil immediately if len(hosts) == 0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a stylistic change I'd be happy to do.

But are you sure? None of the existing Pick() implementations returns nil - the function they return can return nil, but the NextHost function itself is always non-nil.

I tried making the change, but one of the tests currently asserts that Pick() returns a non-nil function even when there are no hosts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, ok, that's a valid point. Ignore then ;-)

}

hosts := r.hosts.get()
var startIdx int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, ok, that's a valid point. Ignore then ;-)

@Zariel
Copy link
Member

Zariel commented Sep 22, 2019

fixed via #1351

@Zariel Zariel closed this Sep 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants