Skip to content

fix(frontend): clean up websocket state when returning to the dashboard#5565

Merged
kunwp1 merged 8 commits into
apache:mainfrom
kunwp1:fix/websocket-state-cleanup-3120
Jun 9, 2026
Merged

fix(frontend): clean up websocket state when returning to the dashboard#5565
kunwp1 merged 8 commits into
apache:mainfrom
kunwp1:fix/websocket-state-cleanup-3120

Conversation

@kunwp1

@kunwp1 kunwp1 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

Websocket-derived front-end state (the connection itself, plus the execution status, console output, and results built from its events) lives in singletons outside the workspace. It was never torn down in two cases, so stale state carried over:

  1. Returning to the dashboard and re-entering a workflow reused the previous socket. The connection-tracking fields (currentConnectedWid / currentConnectedCuid) also survived, so the reconnect guard saw them unchanged, skipped reconnecting, and reused the stale socket. (Websocket state does not clean up properly #3120 — the case Prevent re-opening websocket connections on metadata update #3093 did not cover.)
  2. Switching computing units inside the workspace left the previous unit's console, results, and execution status on screen.

This PR clears that state at both points.

Workspace exit: WorkspaceComponent.ngOnDestroy() now tears everything down:

Call (new) Resets
ComputingUnitStatusService.disconnect() closes the socket, clears operator status, stops the unit poll, resets the connection-tracking fields and the selected unit
ExecuteWorkflowService.resetExecutionAndWorkers() execution status and worker assignments
WorkflowConsoleService.clearConsoleMessages() console output
WorkflowResultService.clearResults() result caches and table stats

Unit switch: ComputingUnitStatusService emits a reset signal when it reconnects to a different unit, and WorkspaceComponent clears the same execution / console / result state in response. As a result, switching units now discards the previous unit's results and console instead of leaving them on screen.

The remaining websocket-event consumers need no teardown: OperatorReuseCacheStatusService is stateless, and udf-debug.service's state lives in the TexeraGraph, already reset by clearWorkflow().

Any related issues, documentation, discussions?

Closes #3120. Related: #3093 (earlier partial fix for the in-canvas socket re-open).

How was this PR tested?

Test with this workflow
Untitled workflow (14).json

Screen.Recording.2026-06-07.at.11.48.54.PM.mov

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

@github-actions github-actions Bot added fix frontend Changes related to the frontend GUI labels Jun 8, 2026
@codecov-commenter

codecov-commenter commented Jun 8, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.28%. Comparing base (564ccdb) to head (d7f62eb).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5565      +/-   ##
============================================
+ Coverage     52.17%   52.28%   +0.10%     
  Complexity     2482     2482              
============================================
  Files          1068     1068              
  Lines         41311    41353      +42     
  Branches       4439     4441       +2     
============================================
+ Hits          21556    21622      +66     
+ Misses        18490    18461      -29     
- Partials       1265     1270       +5     
Flag Coverage Δ *Carryforward flag
access-control-service 64.61% <ø> (ø) Carriedforward from 564ccdb
agent-service 33.76% <ø> (ø) Carriedforward from 564ccdb
amber 53.28% <ø> (ø) Carriedforward from 564ccdb
computing-unit-managing-service 1.65% <ø> (ø) Carriedforward from 564ccdb
config-service 56.06% <ø> (ø) Carriedforward from 564ccdb
file-service 38.32% <ø> (ø) Carriedforward from 564ccdb
frontend 46.69% <100.00%> (+0.27%) ⬆️
pyamber 90.72% <ø> (ø) Carriedforward from 564ccdb
python 90.75% <ø> (ø) Carriedforward from 564ccdb
workflow-compiling-service 58.69% <ø> (ø) Carriedforward from 564ccdb

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…rd (apache#3120)

WorkflowWebsocketService, ComputingUnitStatusService, and the execution-session services (WorkflowStatusService, ExecuteWorkflowService, WorkflowConsoleService, WorkflowResultService) are all root-scoped singletons, so their state outlived the workspace. WorkspaceComponent.ngOnDestroy cleared the workflow graph but never tore down the websocket session, leaving the socket open and the derived state (connection tracking, worker count, execution status, console output, result caches) populated. Re-entering a workflow then reused the stale session; for the same wid (the wid -> null -> wid pattern) the reconnect guard even short-circuited, the case PR apache#3093's in-canvas guard did not cover.

Tear down every piece of websocket-derived state on workspace destroy: ComputingUnitStatusService.disconnect() (close socket, clear status, stop polling, reset connection tracking and selected unit), WorkflowWebsocketService.closeWebsocket() (reset numWorkers), ExecuteWorkflowService.resetState() (execution state + worker assignments), WorkflowConsoleService.clearConsoleMessages(), and WorkflowResultService.clearResults() (result caches + table-stats ReplaySubject). WorkspaceComponent.ngOnDestroy invokes all of them so re-entering any workflow (same wid included) starts from a clean session.
@kunwp1 kunwp1 force-pushed the fix/websocket-state-cleanup-3120 branch from 46c2b1f to 19ffa9a Compare June 8, 2026 06:16
…nit switch

Follow-up to apache#3120 review feedback:

- Extend the session-state teardown (execution status, console output, results)
  to the in-canvas computing-unit switch, which previously cleared only operator
  status and left the prior unit's console/results/execution badge on screen.
  ComputingUnitStatusService now emits getConnectionResetStream() when it tears
  down an active connection to reconnect to a different unit; WorkspaceComponent
  subscribes and runs a shared resetWorkflowSessionState() helper that
  ngOnDestroy also uses.
- Rename ExecuteWorkflowService.resetState() to resetExecutionAndWorkers() to
  distinguish it from the existing resetExecutionState().
@kunwp1 kunwp1 requested a review from aglinxinyuan June 8, 2026 06:54
kunwp1 added 2 commits June 7, 2026 23:59
…ache#3120)

Tighten the verbose doc comments (and one test comment) added in this PR.
No behavior change.
@Yicong-Huang Yicong-Huang changed the title fix(frontend): clean up websocket state when returning to the dashboard (#3120) fix(frontend): clean up websocket state when returning to the dashboard Jun 8, 2026
@aglinxinyuan aglinxinyuan requested a review from Copilot June 9, 2026 08:19

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to prevent stale websocket-derived frontend state (socket connection, execution status, console output, and cached results) from persisting across (a) leaving the workspace back to the dashboard and (b) switching computing units within the workspace, addressing #3120.

Changes:

  • Adds teardown/reset APIs across websocket, execution, console, and result services (e.g., disconnect(), clearResults(), clearConsoleMessages(), resetExecutionAndWorkers()).
  • Hooks workspace lifecycle (WorkspaceComponent.ngOnDestroy) and computing-unit switching (connectionResetSubject) to clear websocket-derived session state.
  • Adds/extends unit tests to cover the new reset/teardown behaviors.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
frontend/src/app/workspace/service/workflow-websocket/workflow-websocket.service.ts Resets cached numWorkers when the websocket is closed.
frontend/src/app/workspace/service/workflow-websocket/workflow-websocket.service.spec.ts Tests that cached worker count is reset on close.
frontend/src/app/workspace/service/workflow-result/workflow-result.service.ts Introduces clearResults() to drop cached results and reset table stats.
frontend/src/app/workspace/service/workflow-result/workflow-result.service.spec.ts Adds tests for clearResults() cache/stat resets.
frontend/src/app/workspace/service/workflow-console/workflow-console.service.ts Introduces clearConsoleMessages() to clear console state and notify subscribers.
frontend/src/app/workspace/service/workflow-console/workflow-console.service.spec.ts Adds tests for clearConsoleMessages().
frontend/src/app/workspace/service/execute-workflow/execute-workflow.service.ts Introduces resetExecutionAndWorkers() to clear execution + worker assignment state.
frontend/src/app/workspace/service/execute-workflow/execute-workflow.service.spec.ts Adds test for resetExecutionAndWorkers().
frontend/src/app/workspace/component/workspace.component.ts Resets websocket-derived state on workspace destroy and computing-unit switches.
frontend/src/app/workspace/component/workspace.component.spec.ts Adds tests asserting teardown/reset on destroy and unit switch.
frontend/src/app/common/service/computing-unit/computing-unit-status/computing-unit-status.service.ts Adds reset signal stream and disconnect() to fully clear connection-tracking + polling.
frontend/src/app/common/service/computing-unit/computing-unit-status/computing-unit-status.service.spec.ts Adds coverage for disconnect/reconnect and connection-reset signaling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

kunwp1 and others added 3 commits June 9, 2026 02:06
…itch

Address review feedback on the websocket-state-cleanup change:

- WorkflowResultService.clearResults() now emits a cleared signal that
  ResultPanelComponent uses to tear down stale result/console frames; clearing
  the caches alone does not re-render a still-highlighted operator.
- ExecuteWorkflowService.resetExecutionAndWorkers() broadcasts the reset on
  executionStateStream so MenuComponent and ResultPanelComponent drop the
  previous unit's execution status instead of leaving it on screen.
- ComputingUnitStatusService clears status and emits the connection-reset
  signal when switching units even if the socket already dropped (e.g. the
  prior unit was terminated), not only while currently connected.
- Tear down the service in the computing-unit-status spec so the interval poll
  started by selectComputingUnit() doesn't outlive the test.

@aglinxinyuan aglinxinyuan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — verified the teardown covers both paths (returning to the dashboard and switching computing units in-canvas). Routing the reset through updateExecutionState and the new resultClearedStream is the right call: clearing the caches alone wouldn't re-render an operator that stays highlighted across a switch, so the previous unit's frame would otherwise linger.

One optional follow-up, not blocking: the emit side of resultClearedStream is covered, but nothing tests the consumer in ResultPanelComponent.registerResultClearedHandler() — a small spec asserting it clears currentOperatorId and tears down the panel frames would pin the part that actually fixes the bug. Fine to do as a follow-up.

@kunwp1

kunwp1 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Thank you for the review. Added the test you mentioned.

@kunwp1 kunwp1 enabled auto-merge June 9, 2026 09:37
@kunwp1 kunwp1 added this pull request to the merge queue Jun 9, 2026
Merged via the queue into apache:main with commit ad10b7f Jun 9, 2026
16 checks passed
@kunwp1 kunwp1 deleted the fix/websocket-state-cleanup-3120 branch June 9, 2026 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix frontend Changes related to the frontend GUI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Websocket state does not clean up properly

4 participants