Skip to content

Conversation

@emosbaugh
Copy link
Member

@emosbaugh emosbaugh commented Nov 12, 2025

What this PR does / why we need it:

This PR refactors the state machine implementation to fix state management issues in the v3 API. The key changes include:

  • Enhanced state transitions: State transitions now accept event data as a third parameter, allowing contextual information (errors, preflight output) to be passed to event handlers
  • Improved status management: Status retrieval moved from manager interfaces to store implementations for better consistency and separation of concerns
  • Better error handling: Error messages now propagate through the state machine to event handlers, enabling accurate metrics reporting
  • Consistent defer patterns: Fixed goroutine patterns to ensure locks are released and status is updated properly on both success and failure paths
  • Preflight result clearing: Added logic to clear stale preflight results before running new checks
  • Enhanced test coverage: Added comprehensive tests for reporting handlers across install and upgrade flows

Which issue(s) this PR fixes:

[sc-130867]

Does this PR require a test?

Yes - Added new test files:

  • api/controllers/kubernetes/install/reporting_handlers_test.go
  • api/controllers/linux/install/reporting_handlers_test.go
  • api/controllers/linux/upgrade/reporting_handlers_test.go

Also updated existing tests to align with the new state machine signature and status management patterns.

Does this PR require a release note?

NONE

Does this PR require documentation?

NONE

@github-actions
Copy link

github-actions bot commented Nov 13, 2025

This PR has been released (on staging) and is available for download with a embedded-cluster-smoke-test-staging-app license ID.

Online Installer:

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci/appver-dev-c0b0e96" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Airgap Installer (may take a few minutes before the airgap bundle is built):

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci-airgap/appver-dev-c0b0e96?airgap=true" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Happy debugging!

@emosbaugh emosbaugh marked this pull request as ready for review November 13, 2025 21:25
@emosbaugh emosbaugh requested a review from sgalsaleh November 13, 2025 21:27

if err := c.setAppUpgradeStatus(types.StateSucceeded, "Upgrade complete"); err != nil {
return fmt.Errorf("set status to succeeded: %w", err)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Error Handler Fails Post-Success

If setAppUpgradeStatus fails after successfully transitioning to StateSucceeded, the error-handling defer attempts to transition from StateSucceeded to StateAppUpgradeFailed. This transition is likely invalid in the state machine, causing the error handler itself to fail. The status update should either not return an error that triggers the defer, or the defer should check the current state before attempting transitions.

Fix in Cursor Fix in Web

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good feedback

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never mind. im reverting

sgalsaleh
sgalsaleh previously approved these changes Nov 13, 2025
sgalsaleh
sgalsaleh previously approved these changes Nov 13, 2025
@emosbaugh emosbaugh changed the title chore(v3): fix state management chore(v3): fix state management race condition Nov 13, 2025
@emosbaugh emosbaugh merged commit cc85f17 into main Nov 14, 2025
139 of 152 checks passed
@emosbaugh emosbaugh deleted the emosbaugh/sc-130867/fix-state-management-2 branch November 14, 2025 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants