[supervisor] Trigger instance update if missing after port exposure #7058
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Associated issue: #6778
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov Report
@@             Coverage Diff              @@
##             main    #7058       +/-    ##
===========================================
+ Coverage   19.04%   36.94%   +17.89%
===========================================
  Files           2       19      +17
  Lines         168     4623    +4455
===========================================
+ Hits           32     1708    +1676
- Misses        134     2781    +2647
- Partials        2      134     +132
It would be nice to add a couple of tests for it by mocking the API service in ports_test.go.
errchan = make(chan error, 1)
)
errchan := make(chan error, 1)
g.exposedPorts = make(chan []ExposedPort)
it would imply that now we can have only one listener
if err != nil {
	return err
}
res := getExposedPorts(wsInfo.LatestInstance)
I think it is better if we do the sync at the level of APIoverJSONRPC; that way we don't need to worry about changing the semantics of its clients, i.e. add APIoverJSONRPC.SyncInstance.
You can look at how we apply similar logic in the frontend. There are 2 important things:
- avoid multiple concurrent syncs, i.e. last one wins, for instance like here:
gitpod/components/gitpod-protocol/src/gitpod-service.ts
Lines 529 to 549 in b4fa0fc
    private sync(): void {
        this.cancelSync();
        this.syncTokenSource = new CancellationTokenSource();
        const token = this.syncTokenSource.token;
        this.syncQueue = this.syncQueue.then(async () => {
            if (token.isCancellationRequested) {
                return;
            }
            try {
                const info = await this.service.server.getWorkspace(this._info.workspace.id);
                if (token.isCancellationRequested) {
                    return;
                }
                this._info = info;
                this.source = 'sync';
                this.onDidChangeEmitter.fire(undefined);
            } catch (e) {
                console.error('failed to sync workspace instance:', e)
            }
        })
    }
- resolve conflicts between instance updates and sync: if the sync has seen newer data, then we should ignore the new instance update. It can be done based on phases like here:
gitpod/components/gitpod-protocol/src/gitpod-service.ts
Lines 568 to 571 in b4fa0fc
    if (instance.id !== this.info.latestInstance?.id) {
        return false;
    }
    return phasesOrder[instance.status.phase] < phasesOrder[this.info.latestInstance.status.phase];
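For the supervisor side, the two points above could look roughly like the following Go sketch. It is only an illustration of the idea, not the actual APIoverJSONRPC code: `Instance`, `Syncer`, `phaseOrder` and the `fetch` callback are hypothetical names, and the phase list is an assumed ordering. Unlike the frontend snippet, it does not queue syncs; it uses a generation counter so that the result of a superseded sync is dropped.

```go
package main

import "sync"

// Instance is a hypothetical, reduced view of a workspace instance.
type Instance struct {
	ID    string
	Phase string
}

// phaseOrder is an assumed ordering of phases, used only to compare recency.
var phaseOrder = map[string]int{
	"unknown": 0, "preparing": 1, "pending": 2, "creating": 3,
	"initializing": 4, "running": 5, "interrupted": 6,
	"stopping": 7, "stopped": 8,
}

// isStale reports whether update is older than latest for the same instance.
// Updates for a different instance are never considered stale.
func isStale(update Instance, latest *Instance) bool {
	if latest == nil || update.ID != latest.ID {
		return false
	}
	return phaseOrder[update.Phase] < phaseOrder[latest.Phase]
}

// Syncer keeps the latest known instance (hypothetical helper type).
type Syncer struct {
	mu     sync.Mutex
	gen    uint64 // incremented on every Sync call; the highest one wins
	latest *Instance
}

// Sync fetches the instance and stores it unless a newer Sync call was
// started in the meantime (last one wins).
func (s *Syncer) Sync(fetch func() (*Instance, error)) error {
	s.mu.Lock()
	s.gen++
	mine := s.gen
	s.mu.Unlock()

	inst, err := fetch() // done without holding the lock
	if err != nil {
		return err
	}

	s.mu.Lock()
	defer s.mu.Unlock()
	if mine != s.gen {
		return nil // superseded by a newer sync; drop this result
	}
	s.latest = inst
	return nil
}

// OnInstanceUpdate applies a pushed update unless it is stale compared to
// what a sync has already seen. It reports whether the update was applied.
func (s *Syncer) OnInstanceUpdate(update Instance) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if isStale(update, s.latest) {
		return false
	}
	s.latest = &update
	return true
}

// Latest returns the most recent instance known to the syncer.
func (s *Syncer) Latest() *Instance {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.latest
}
```

A caller would pass something like a wrapper around getWorkspace as `fetch` and route pushed instance updates through `OnInstanceUpdate`, so that clients keep seeing a single, consistent stream.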
Anton and I discussed that we want to postpone this change for now and start with adding more logging (#7083).
@akosyakov I had a look into
Do you mean forcing a getWorkspace call on reconnect, similar to how we do it in the local companion?
Still not 100% sure this caused the "port exposure" troubles, but it qualifies as the general error handler we talked about here.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm closing it for now. We would like to add some syncing, but on reconnection to the Gitpod server.
Description
When for some reason the server stops sending instance updates, ports get stuck in detecting. This PR checks, 1 minute after port exposure, whether we received the port exposure information from the server; if not, it actively asks for an instance update.
@akosyakov @csweichel After implementing this, I'm not sure if this is 100% what you had in mind. I'm very open to any suggestions for other implementations.
Related Issue(s)
Mitigate/Fixes #6778
See also: #7054
How to test
I tested this PR by adding this change:
This change ignores all instance updates from server. With this, we see the same behavior as in prod: the ports get stuck in “detecting”. After 1 minute, supervisor logs “we haven't seen an instance update with port exposure info after 1 minute” and “detecting” disappears.
Release Notes