transport: Delay NEW_NODE until CQL listen started
After a new node is added to the cluster, Scylla sends a NEW_NODE event
to CQL clients. Some clients immediately try to connect to the new node,
but the connection fails because the node has not yet started listening
for CQL requests.

In contrast, Apache Cassandra waits for the new node to start its CQL
server before sending the NEW_NODE event. In practice, this means that
the NEW_NODE and UP events are sent "jointly" once the new node is UP.

This change is implemented in the same manner as in Apache Cassandra
code.
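
The deferral described above can be sketched in isolation. This is a minimal illustration, not the Scylla implementation: `deferred_join_notifier`, the string endpoints, the `is_cql_ready` callback (standing in for the gossiper query) and the `sent()` log (standing in for writes to client connections) are all hypothetical names introduced here. It also leaves out the `was_up` check that the real `on_up` performs before sending UP.

```cpp
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for cql_server::event_notifier.
// Endpoints are plain strings and "sending" an event appends to a log.
class deferred_join_notifier {
    // Assumption: replaces gms::get_local_gossiper().is_cql_ready(endpoint).
    std::function<bool(const std::string&)> _is_cql_ready;
    // Endpoints whose NEW_NODE event is held back until their CQL server is up.
    std::unordered_set<std::string> _pending_joined;
    // Events that would have been pushed to registered clients, in order.
    std::vector<std::string> _sent;
public:
    explicit deferred_join_notifier(std::function<bool(const std::string&)> is_cql_ready)
        : _is_cql_ready(std::move(is_cql_ready)) {}

    void on_join_cluster(const std::string& endpoint) {
        if (!_is_cql_ready(endpoint)) {
            // Too early: remember the endpoint and send nothing yet.
            _pending_joined.insert(endpoint);
            return;
        }
        _sent.push_back("NEW_NODE " + endpoint);
    }

    void on_up(const std::string& endpoint) {
        // erase(key) returns the number of elements removed (0 or 1 for a set),
        // so the branch is taken only if a NEW_NODE event was deferred earlier.
        if (_pending_joined.erase(endpoint)) {
            on_join_cluster(endpoint);
        }
        // Simplification: the UP event is always recorded here.
        _sent.push_back("UP " + endpoint);
    }

    const std::vector<std::string>& sent() const { return _sent; }
};
```

With a readiness callback that initially returns false, calling `on_join_cluster` records nothing. Once the callback returns true, `on_up` replays the deferred join first, so the NEW_NODE entry lands immediately before the UP entry.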

Fixes scylladb#7301.
avelanarius committed Sep 30, 2020
1 parent fd1dd0e commit d440842
Showing 2 changed files with 14 additions and 0 deletions.
9 changes: 9 additions & 0 deletions transport/event_notifier.cc
@@ -248,6 +248,11 @@ void cql_server::event_notifier::on_drop_aggregate(const sstring& ks_name, const

void cql_server::event_notifier::on_join_cluster(const gms::inet_address& endpoint)
{
    if (!gms::get_local_gossiper().is_cql_ready(endpoint)) {
        _endpoints_pending_joined_notification.insert(endpoint);
        return;
    }

    for (auto&& conn : _topology_change_listeners) {
        using namespace cql_transport;
        if (!conn->_pending_requests_gate.is_closed()) {
@@ -268,6 +273,10 @@ void cql_server::event_notifier::on_leave_cluster(const gms::inet_address& endpo

void cql_server::event_notifier::on_up(const gms::inet_address& endpoint)
{
    if (_endpoints_pending_joined_notification.erase(endpoint)) {
        on_join_cluster(endpoint);
    }

    bool was_up = _last_status_change.contains(endpoint) && _last_status_change.at(endpoint) == event::status_change::status_type::UP;
    _last_status_change[endpoint] = event::status_change::status_type::UP;
    if (!was_up) {
5 changes: 5 additions & 0 deletions transport/server.hh
@@ -271,6 +271,11 @@ class cql_server::event_notifier : public service::migration_listener,
    std::unordered_map<gms::inet_address, event::status_change::status_type> _last_status_change;
    service::migration_notifier& _mnotifier;
    bool _stopped = false;

    // We want to delay sending NEW_NODE CQL event to clients until the new node
    // has started listening for CQL requests.
    // See https://github.com/scylladb/scylla/issues/7301
    std::unordered_set<gms::inet_address> _endpoints_pending_joined_notification;
public:
    future<> stop();
    event_notifier(service::migration_notifier& mn);
