NEW_NODE should be sent after listening for CQL clients has started #7301

Closed
avelanarius opened this issue Sep 29, 2020 · 2 comments

@avelanarius
Member

After adding a new node to the cluster, Scylla sends a NEW_NODE event. Some clients (for example, Java Driver 3.x, both the Scylla and DataStax versions) immediately try to connect to the new node, but the connection sometimes fails because Scylla on that node has not started listening for CQL clients yet.

There is a discrepancy between Scylla and Cassandra: Cassandra waits for CQL listening to start on the new node before sending NEW_NODE to clients. Therefore, in Cassandra, clients can immediately connect to the node described in the NEW_NODE event. I verified this waiting behaviour manually by adding a 20-second sleep in Cassandra 3.11.8 before CQL listening starts (Thread.sleep(20000) before line 161 in Server.java).
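
To make the difference between the two orderings concrete, here is a minimal toy model in Python. It is only a sketch: `Node`, `Client`, `simulate`, and the address `10.0.0.2` are hypothetical names invented for illustration, and neither Scylla nor Cassandra nor any driver is involved. It also models the worst case deterministically, whereas the real failure is timing-dependent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a joining node; cql_listening flips once the CQL server starts."""
    address: str
    cql_listening: bool = False


@dataclass
class Client:
    """Toy driver that tries to connect as soon as it receives NEW_NODE."""
    results: list = field(default_factory=list)

    def on_new_node(self, node: Node) -> None:
        # A real driver would open a connection here; this model only records
        # whether the attempt could succeed at the moment the event arrives.
        self.results.append("connected" if node.cql_listening else "refused")


def simulate(notify_before_listening: bool) -> str:
    node = Node("10.0.0.2")  # hypothetical address
    client = Client()
    if notify_before_listening:
        client.on_new_node(node)   # ordering reported in this issue
        node.cql_listening = True
    else:
        node.cql_listening = True
        client.on_new_node(node)   # ordering observed in Cassandra
    return client.results[0]


print(simulate(notify_before_listening=True))   # refused
print(simulate(notify_before_listening=False))  # connected
```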

Installation details
Scylla version (or git commit hash): 666.development-0.20200929.1adf2cc84
Cluster size: 1 (2 after adding second node)
OS (RHEL/CentOS/Ubuntu/AWS AMI): Fedora 32

@avelanarius avelanarius self-assigned this Sep 29, 2020
@haaawk
Contributor

haaawk commented Sep 29, 2020

This is the reason why the on-Node-Added notification sometimes wasn't working in the replicator.

avelanarius added a commit to avelanarius/scylla that referenced this issue Sep 30, 2020
After adding a new node to the cluster, Scylla sends a NEW_NODE event
to CQL clients. Some clients immediately try to connect to the new node,
however it fails as the node has not yet started listening to CQL
requests.

In contrast, Apache Cassandra waits for the new node to start its CQL
server before sending NEW_NODE event. In practice this means that
NEW_NODE and UP events will be sent "jointly" after new node is UP.

This change is implemented in the same manner as in Apache Cassandra
code.

Fixes scylladb#7301.
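
The deferral described in the commit message above can be sketched roughly as follows. This is an illustrative Python model, not the actual Scylla or Cassandra code: the class `DeferredTopologyNotifier`, its methods `on_node_joined` and `on_cql_ready`, and the address are all assumptions made up for the example. The idea is that a joined node is only remembered, and NEW_NODE is held back until that node's CQL server is reported ready, at which point NEW_NODE and UP go out together.

```python
from typing import Callable, List, Set, Tuple

Event = Tuple[str, str]  # (event type, node address)


class DeferredTopologyNotifier:
    """Holds back NEW_NODE until the node's CQL server is ready (hypothetical sketch)."""

    def __init__(self, send: Callable[[Event], None]) -> None:
        self._send = send
        self._pending_new: Set[str] = set()

    def on_node_joined(self, address: str) -> None:
        # The node is in the cluster, but its CQL server may not be listening
        # yet: remember it and send nothing to clients.
        self._pending_new.add(address)

    def on_cql_ready(self, address: str) -> None:
        # Clients can connect now, so flush the deferred NEW_NODE (if any)
        # and follow it with UP.
        if address in self._pending_new:
            self._pending_new.discard(address)
            self._send(("NEW_NODE", address))
        self._send(("UP", address))


sent: List[Event] = []
notifier = DeferredTopologyNotifier(sent.append)
notifier.on_node_joined("10.0.0.2")  # hypothetical address
print(sent)  # []
notifier.on_cql_ready("10.0.0.2")
print(sent)  # [('NEW_NODE', '10.0.0.2'), ('UP', '10.0.0.2')]
```

A later `on_cql_ready` call for the same address (for example after a restart) would emit only UP in this model, since the node is no longer pending.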
avelanarius added a commit to avelanarius/scylla that referenced this issue Oct 1, 2020
@slivne slivne added this to the 4.3 milestone Oct 4, 2020
@nyh
Contributor

nyh commented Jun 20, 2021

This is an old fix, already in branches 4.3 - 4.5, so nothing remains to backport.
