[ADDED] $SYS server request to 'kick' or 'LDM' a client connection #4298
Let's think a bit more about all of this for 2.11; there are a lot more moving parts and we want to have a cohesive answer across the board.
On the whole I think this looks good, but it needs at least a unit test to prove it works (probably using nats.go's LameDuckModeHandler) and preferably a second unit test to prove it's only reachable on the system account.
This doesn't look like it gives feedback about what was done: whether a server disconnected a client or not, or which servers handled the request.
Looking at the recently added reloadConfig example, it uses the zReq() helper to craft responses showing which servers handled it, and errors are sent back to the caller.
From the perspective of using this (CLI), I can't see how to make this a friendly UX. Getting the server info and affected changes back means I can tell a user success/fail etc.
LGTM. Docs on exported structs and funcs would be nice; in Go, a function / struct doc starts with its name.
Ok should be all good now. @neilalexander please merge if you are good with those last changes and the new test.
Test & race fix look good, please squash down to one commit and I'll hit merge in the morning!
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
LGTM!
Hi, which NATS version will have this feature?
2.10.x will have that feature.
Hi, I used the 2.10.3-alpine image for my NATS StatefulSet, and used the nats-pub tool to send this message: "Published [$SYS.REQ.SERVER.NAVK5EQIMAJ4S7SMDSERITSXY2Q4MCDAMSFJYCURWDQC4OQTVVJ6A.KICK] : '{"name":"rebalance-worker"}'"
This is an internal feature and works by CID.
Oh, does that mean there is no released image with it yet? Is there an estimated time for the release?
It's in there. You have to supply the CID.
Sorry, could you elaborate? By CID, do you mean container id? I was following the above "$SYS.REQ.SERVER.%s.KICK (%s is the server_id)".
The payload of the request should be Where xxx is the connection id (a number) for that connection to the server. The initial version of the PR also allowed kicking by name, but that was removed before the PR was merged as it's too much of a potential footgun (since it would kick all of the connections with that name).
You can use "nats server request kick" and its help will ask for the required info. Add --trace to see how it works.
Oh, I was looking to kick all connections to a server, given the server id, not just a particular connection between a client and a server. The reason is that after spinning up new NATS server replicas, I want to disconnect all the existing connections to the original replica and have them automatically distribute evenly across all replicas. Is this feature not supported yet?
Just shut down the server? If you don't want connections on it at all, there is no point in it hanging around. So shut it down by sending the LDM signal to the process.
Since these are system wide operations, the controls at an individual server level will be very low level and raw. The plan is to provide system level directives through the NATS cli to balance things like connections and JetStream leaders.
Hi team, do we handle the case where a client still reconnects to the previous server?
Clients who have a list of servers, either in config or using topology discovery, will connect to a random next server but not the one they were just on.
Resolves #1556
Changes proposed in this pull request:
Adds two new $SYS server API endpoints:
- $SYS.REQ.SERVER.%s.KICK (where %s is the server_id), which 'kicks' a client connection. This effectively 'rebalances' it, as the client application reconnects right away (potentially to another server in the cluster). The service takes a JSON payload containing either an "id" or a "name" field: "id" disconnects the client connection with that id, "name" disconnects all of the clients connected to the server with that name.
- $SYS.REQ.SERVER.%s.LDM (where %s is the server_id), which takes a JSON payload containing either an "id" or a "name" field: "id" sends an LDM Info message to the client connection with that id, "name" sends an LDM Info message to all of the clients connected to the server with that name.

These features allow administrators to manually 're-balance' client connections between the servers in the cluster (e.g. after a rolling upgrade of the servers where one server ends up with no client connections after the upgrade), by kicking some of the client connections from one of the 'overloaded' (in comparison to other servers) servers in the cluster, causing them to re-establish their connection to (hopefully) another server.