Skip to content

Conversation

kw217
Copy link
Contributor

@kw217 kw217 commented Nov 25, 2019

When a host comes back up after being down, a request processor (unlike a control connection) only notifies the load balancing policy if the host is not ignored.

Sadly some policies ignore remote hosts that are down.

In combination, this means that remote hosts can never be used by a request (processor) after having once gone down.

This fix resolves this, by always notifying a policy of all host-up events, regardless of the distance of that host.

I've tested this locally, and it has the desired effect. My test case is:

  • Start nodes in two datacenters.
  • Set up a token-aware, dc-aware policy.
  • Take down the local DC. Observe traffic still goes to the remote DC.
  • Take down the remote DC. Observe traffic fails (as expected).
  • Bring up the remote DC. Expect traffic will start going to the remote DC.

In the last step, in 2.14.0, instead I observe that traffic continues to fail. Logs and netstat reveal the control connection is up to the remote DC, but it is not used by the request processor. This PR fixes this, so that traffic does go to the remote DC.

When a host comes back up after being down, a request processor
(unlike a control connection) only notifies the load balancing
policy if the host is not ignored.

Sadly some policies ignore remote hosts that are down.

In combination, this means that remote hosts can never be used by
a request (processor) after having once gone down.

This fix resolves this, by always notifying a policy of all
host-up events, regardless of the distance of that host.
@rwincewicz
Copy link

Nice, simple fix. LGTM!

@kw217 kw217 changed the title Allow remote hosts to come back up CPP-879 Allow remote hosts to come back up Nov 27, 2019
@mpenick
Copy link
Contributor

mpenick commented Dec 9, 2019

Thanks for the PR. Sorry for the delay, I've been out. I'll work on investigating this issue today.

@mpenick mpenick merged commit 5336f35 into datastax:master Dec 9, 2019
@accelerated
Copy link
Contributor

Hi @mpenick, could we have a minor release with this PR. Seems like an important one.

@mpenick
Copy link
Contributor

mpenick commented Dec 9, 2019

Change looks great. Thanks again.

@mpenick
Copy link
Contributor

mpenick commented Dec 9, 2019

@accelerated yes, highly likely. You're right. This is release worthy.

@kw217
Copy link
Contributor Author

kw217 commented Dec 9, 2019

Brilliant - thank you both!

@mpenick
Copy link
Contributor

mpenick commented Dec 11, 2019

@kw217 @accelerated This fix is in the recently posted release v2.14.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants