Web terminal does not re-connect when agent is restarted (spot VM) #5292

Closed
Tracked by #5318
bpmct opened this issue Dec 5, 2022 · 2 comments · Fixed by #5886

Comments

bpmct commented Dec 5, 2022

Summary from @kylecarbs:

  1. When an agent disconnects, we can't close all Coder server -> workspace connections, because the disconnect could be intermittent. For example, if you have multiple Coder replicas and the one the agent was connected to shuts off for an upgrade, we don't want to lose all other client connections.
  2. The agent should have an "instance ID" associated with itself so we can identify when a new instance has popped up. When one does, we should publish to all replicas to close connections that belong to the old identifier.
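
The two points above can be sketched in Go as a small per-replica registry. This is only an illustration of the idea, not Coder's implementation: `registry`, `conn`, `dial`, and `announce` are hypothetical names, and the cross-replica publish step is assumed to end in a call to `announce` on each replica.

```go
package main

import "sync"

// conn is a hypothetical stand-in for one server -> workspace connection.
type conn struct {
	instanceID string // agent instance this connection was dialed against
	closed     bool
}

// registry tracks, per agent, the latest instance ID and the open connections.
// All names here are illustrative assumptions, not Coder's actual types.
type registry struct {
	mu       sync.Mutex
	instance map[string]string  // agentID -> latest known instance ID
	conns    map[string][]*conn // agentID -> open connections
}

func newRegistry() *registry {
	return &registry{
		instance: map[string]string{},
		conns:    map[string][]*conn{},
	}
}

// dial records a new connection bound to the agent's current instance ID.
func (r *registry) dial(agentID string) *conn {
	r.mu.Lock()
	defer r.mu.Unlock()
	c := &conn{instanceID: r.instance[agentID]}
	r.conns[agentID] = append(r.conns[agentID], c)
	return c
}

// announce is what each replica would run when it learns that an agent
// connected with the given instance ID. A repeat of the same ID (an
// intermittent reconnect) closes nothing. A new ID closes every connection
// that was dialed against an older instance. It returns the number closed.
func (r *registry) announce(agentID, instanceID string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.instance[agentID] == instanceID {
		return 0
	}
	r.instance[agentID] = instanceID
	closed := 0
	kept := r.conns[agentID][:0]
	for _, c := range r.conns[agentID] {
		if c.instanceID != instanceID {
			c.closed = true
			closed++
			continue
		}
		kept = append(kept, c)
	}
	r.conns[agentID] = kept
	return closed
}
```

With this shape, an agent that drops and reconnects with the same instance ID leaves existing client connections alone, while a recreated pod or rebooted VM (which would report a new instance ID) invalidates only the connections tied to the old instance.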

bpmct commented Dec 5, 2022

I also ran into this with a template that creates workspaces as Kubernetes deployments. After I deleted the pod, it was re-created and the agent connected, but the web terminal never connected. I was able to connect over SSH okay. Agent logs

bpmct mentioned this issue Dec 6, 2022

bpmct commented Jan 25, 2023

I'm also noticing this when I run sudo reboot on an AWS VM.

coadler added a commit that referenced this issue Jan 26, 2023
If an agent went away and reconnected, the wsconncache connection would
be polluted for about 10m because there would be two peers with the
same IP. The old peer always had priority, which caused the dashboard to
try and always dial the old peer until it was removed.

Fixes: #5292
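
The failure mode in the commit message can be modeled with a toy peer table. This is a sketch under stated assumptions: `peerTable`, `addKeepingOld`, `addReplacingOld`, and `dialTarget` are hypothetical names, and evicting the stale entry is just one possible remedy. The commit message does not say how the actual fix works.

```go
package main

// peer is a hypothetical, simplified view of a network peer.
type peer struct {
	id string // distinguishes the old agent process from the new one
	ip string // both processes end up with the same IP
}

// peerTable maps an IP to the peers that claim it, oldest first.
type peerTable struct {
	byIP map[string][]peer
}

func newPeerTable() *peerTable {
	return &peerTable{byIP: map[string][]peer{}}
}

// addKeepingOld models the reported behavior: a reconnecting agent is
// appended next to its stale predecessor, so two peers share one IP.
func (t *peerTable) addKeepingOld(p peer) {
	t.byIP[p.ip] = append(t.byIP[p.ip], p)
}

// addReplacingOld models one possible remedy (an assumption, not the
// confirmed fix): a peer arriving on an IP already in use evicts the
// stale entries for that IP.
func (t *peerTable) addReplacingOld(p peer) {
	t.byIP[p.ip] = []peer{p}
}

// dialTarget returns the peer a dial to ip would reach. The oldest entry
// has priority, mirroring "the old peer always had priority".
func (t *peerTable) dialTarget(ip string) (peer, bool) {
	ps := t.byIP[ip]
	if len(ps) == 0 {
		return peer{}, false
	}
	return ps[0], true
}
```

In this model, adding the old and new agent with `addKeepingOld` makes `dialTarget` keep returning the old peer until something removes it, which matches the roughly 10 minute window described above. With `addReplacingOld`, the same lookup returns the new peer immediately.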
bpmct pushed a commit that referenced this issue Jan 26, 2023