fix: close old discovery proxy before rebinding in NlbSimulator#852
Open
dkropachev wants to merge 2 commits intoscylladb:scylla-4.xfrom
Open
fix: close old discovery proxy before rebinding in NlbSimulator#852dkropachev wants to merge 2 commits intoscylladb:scylla-4.xfrom
dkropachev wants to merge 2 commits intoscylladb:scylla-4.xfrom
Conversation
eb87f70 to
484ad59
Compare
aa8e3d1 to
893a686
Compare
NlbSimulator.addNode() was tearing down and recreating the discovery RoundRobinProxy on every call, which failed with "Address already in use" because the new proxy tried to bind while the old one still held the port. Add addTarget()/removeTarget() to RoundRobinProxy so the discovery proxy is created once (on the first addNode) and updated in-place as nodes join or leave. Fixes scylladb#851
893a686 to
986581d
Compare
ClientRoutesTopologyMonitor.savePort() was saving the NLB proxy port (e.g. 29043) as the fallback for broadcastRpcAddress construction. This caused Metadata.findNode() to fail matching TOPOLOGY_CHANGE REMOVED_NODE events (which carry the real native transport port 9042), so decommissioned nodes were never removed from metadata. Add a nativeTransportPort field to ClientRoutesConfig (default 9042, configurable via advanced.client-routes.native-transport-port) and set it in the ClientRoutesTopologyMonitor constructor. The savePort() override is now a no-op since the port is already initialized. broadcastRpcAddress is used only for event matching — connections go through ClientRoutesEndPoint which resolves via NLB proxy addresses in the routes cache.
986581d to
2342a2b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
NlbSimulator.rebuildDiscoveryProxy()creating a newRoundRobinProxyon the same port before closing the old one, causingBindException: Address already in use. AddaddTarget()/removeTarget()toRoundRobinProxyso the discovery proxy is created once and updated in-place.ClientRoutesTopologyMonitor.savePort()saving the NLB proxy port (e.g. 29043) as the fallback forbroadcastRpcAddress. This causedMetadata.findNode()to fail matchingTOPOLOGY_CHANGE REMOVED_NODEevents (which carry the real native transport port 9042), so decommissioned nodes were never removed from metadata. Fix by adding anativeTransportPortfield toClientRoutesConfig(default 9042, configurable viaadvanced.client-routes.native-transport-port) and setting it in the constructor.broadcastRpcAddressis used only for event matching — connections go throughClientRoutesEndPointwhich resolves via NLB proxy addresses in the routes cache.Test plan
ClientRoutesIT.should_survive_full_node_replacement_through_nlbpasses (previously failed with BindException ataddNode(2), then timeout after decommission)ClientRoutesTopologyMonitorTest.savePort_should_use_native_transport_port_from_configpassesFixes #851