
Replace PauseState flag with PAUSE_REQUESTED state machine state #10265

Open
dandavison wants to merge 3 commits into feature/activity-operator-cmds from feature/activity-operator-cmds-pause-requested
@dandavison dandavison commented May 14, 2026

This is a proposed change to #10106.

See https://github.com/temporalio/temporal/pull/10106/changes#r3238854955


Note

Medium Risk
Changes core activity state machine and task validation around pausing, retries, and timeouts; subtle regressions could leave activities stuck or incorrectly timed out/canceled. Covered by new standalone activity tests, but behavior changes touch multiple transitions and token validation paths.

Overview
Refactors activity pausing to use a new internal status, ACTIVITY_EXECUTION_STATUS_PAUSE_REQUESTED, instead of relying on PauseState as a flag while the activity is STARTED.

Updates transitions and validators so PAUSE_REQUESTED keeps worker tokens/attempt-scoped timers valid (heartbeat/start-to-close, completion/failure/cancel paths), and so retries/timeouts while pause-requested land the activity in PAUSED rather than re-dispatching. API surfaces are adjusted accordingly (PendingActivityState mapping, DescribeActivityExecution run state, and RecordActivityTaskHeartbeat.ActivityPaused).

Adds standalone activity tests covering start-to-close timeout, heartbeat timeout, and options updates while PAUSE_REQUESTED, and updates cancel/unpause expectations under the new state model.

Reviewed by Cursor Bugbot for commit 055b7ed.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 3b0b77a.

Comment thread chasm/lib/activity/activity_tasks.go
a.emitOnUnpausedMetrics(event.metricsHandler)
return nil
},
)
TransitionUnpausedToStarted doesn't clear PauseState field

Low Severity

TransitionUnpausedToStarted (PAUSE_REQUESTED → STARTED) only emits metrics but never sets a.PauseState = nil. The sibling TransitionUnpaused (PAUSED → SCHEDULED) clears PauseState via the unpause() helper. After unpausing a PAUSE_REQUESTED activity, stale pause metadata (identity, reason, time, request ID) remains in the persisted ActivityState proto despite the activity no longer being paused.
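
The asymmetry described in this report can be illustrated with a minimal, self-contained sketch. Everything below is hypothetical: `Status`, `PauseInfo`, `Activity`, and both functions are simplified stand-ins for illustration, not the code in chasm/lib/activity. The `clearPauseState` parameter exists only to contrast the reported behavior with the suggested fix.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a hypothetical stand-in for the internal activity status enum.
type Status int

const (
	Scheduled Status = iota
	Started
	PauseRequested
	Paused
)

// PauseInfo is a hypothetical stand-in for the pause metadata
// (identity, reason, request ID) stored on the activity.
type PauseInfo struct {
	Identity  string
	Reason    string
	RequestID string
}

// Activity is a simplified model of the persisted activity state.
type Activity struct {
	Status     Status
	PauseState *PauseInfo
}

// unpauseFromPaused models PAUSED -> SCHEDULED, which clears the metadata.
func unpauseFromPaused(a *Activity) error {
	if a.Status != Paused {
		return errors.New("activity is not PAUSED")
	}
	a.Status = Scheduled
	a.PauseState = nil
	return nil
}

// unpauseFromPauseRequested models PAUSE_REQUESTED -> STARTED.
// clearPauseState=false mirrors the reported behavior (metadata left behind);
// clearPauseState=true mirrors the suggested fix.
func unpauseFromPauseRequested(a *Activity, clearPauseState bool) error {
	if a.Status != PauseRequested {
		return errors.New("activity is not PAUSE_REQUESTED")
	}
	a.Status = Started
	if clearPauseState {
		a.PauseState = nil
	}
	return nil
}

func main() {
	for _, clear := range []bool{false, true} {
		a := &Activity{
			Status:     PauseRequested,
			PauseState: &PauseInfo{Identity: "operator", Reason: "maintenance"},
		}
		if err := unpauseFromPauseRequested(a, clear); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("clear=%v stale metadata present=%v\n", clear, a.PauseState != nil)
	}
}
```

Whether the metadata should be cleared is a design choice; a model that treats PauseState purely as audit data could deliberately keep it.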


@dandavison dandavison force-pushed the feature/activity-operator-cmds-pause-requested branch from 3b0b77a to e28990c Compare May 14, 2026 12:35
Eliminate the dual representation of "paused" (status PAUSED vs. PauseState
flag on STARTED/SCHEDULED) by introducing PAUSE_REQUESTED as a real internal
status. Status becomes the single source of truth for whether the activity
is paused; PauseState remains as audit data (identity, reason, time,
request_id) but no longer drives behavior.

Key changes:
- Add ACTIVITY_EXECUTION_STATUS_PAUSE_REQUESTED to the proto enum.
- New TransitionPauseRequested (STARTED -> PAUSE_REQUESTED), no stamp bump.
- New TransitionUnpauseRequested (PAUSE_REQUESTED -> STARTED).
- New TransitionRescheduledPaused (PAUSE_REQUESTED -> PAUSED) for the retry-
  while-paused path; eliminates the zone-2 (SCHEDULED + flag) intermediate.
- Add PAUSE_REQUESTED to source sets of Completed, Failed, Terminated,
  CancelRequested, TimedOut.
- validateActivityTaskToken accepts PAUSE_REQUESTED so the worker's token
  stays valid while the activity is pause-requested but still running.
- Drop the PauseState == nil check from the dispatch and schedule-to-start
  task validators; their pause guard is now redundant with the status check.
- Heartbeat response: ActivityPaused = (status == PAUSE_REQUESTED).
- buildActivityExecutionInfo: drop the synthetic PAUSE_REQUESTED runState
  derivation; the status maps directly via internalStatusToRunState.
- handleReset: PAUSE_REQUESTED is treated like STARTED (defer via flags).
  Drops the SCHEDULED+PauseState special case.
- Remove the post-terminal PauseState = nil cleanups; PauseState is now
  audit data and may persist into terminal states.

Tests not updated.
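
The transitions listed in this commit message can be sketched as a small table-driven state machine. This is a rough illustration under stated assumptions, not the repository's implementation: the `Transition` type, its `Apply` method, and the helpers `tokenValid` and `heartbeatActivityPaused` are hypothetical, the status set is reduced to four values, and the source set shown for Completed is a simplification.

```go
package main

import "fmt"

// Status is a hypothetical, reduced model of the internal activity status.
type Status string

const (
	StatusStarted        Status = "STARTED"
	StatusPauseRequested Status = "PAUSE_REQUESTED"
	StatusPaused         Status = "PAUSED"
	StatusCompleted      Status = "COMPLETED"
)

// Transition is a hypothetical model: a set of allowed source
// statuses and a single destination status.
type Transition struct {
	Name    string
	Sources []Status
	Dest    Status
}

// Apply returns the destination status if current is an allowed source.
func (t Transition) Apply(current Status) (Status, error) {
	for _, s := range t.Sources {
		if s == current {
			return t.Dest, nil
		}
	}
	return current, fmt.Errorf("%s: invalid source status %s", t.Name, current)
}

var (
	TransitionPauseRequested    = Transition{Name: "PauseRequested", Sources: []Status{StatusStarted}, Dest: StatusPauseRequested}
	TransitionUnpauseRequested  = Transition{Name: "UnpauseRequested", Sources: []Status{StatusPauseRequested}, Dest: StatusStarted}
	TransitionRescheduledPaused = Transition{Name: "RescheduledPaused", Sources: []Status{StatusPauseRequested}, Dest: StatusPaused}
	// Simplified: only the two sources relevant to this sketch are modeled.
	TransitionCompleted = Transition{Name: "Completed", Sources: []Status{StatusStarted, StatusPauseRequested}, Dest: StatusCompleted}
)

// tokenValid models a worker token staying valid while the attempt is
// still running, including after a pause has been requested.
func tokenValid(s Status) bool {
	return s == StatusStarted || s == StatusPauseRequested
}

// heartbeatActivityPaused models ActivityPaused = (status == PAUSE_REQUESTED).
func heartbeatActivityPaused(s Status) bool {
	return s == StatusPauseRequested
}

func main() {
	s := StatusStarted
	for _, t := range []Transition{TransitionPauseRequested, TransitionRescheduledPaused} {
		next, err := t.Apply(s)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: %s -> %s (token valid: %v, heartbeat paused: %v)\n",
			t.Name, s, next, tokenValid(next), heartbeatActivityPaused(next))
		s = next
	}
}
```

In this model the status alone answers every behavioral question, which is the single-source-of-truth property the commit message describes.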
@dandavison dandavison force-pushed the feature/activity-operator-cmds-pause-requested branch from e28990c to 3d8f7cf Compare May 14, 2026 12:41
Adds three sub-tests to TestPauseActivityExecution that exercise the
PAUSE_REQUESTED + worker-does-not-yield interaction. None of these
scenarios was covered before; each test was verified by reverting the
corresponding fix and observing that specific test fail:

- StartToCloseTimeoutWhilePauseRequested:
  Pause a STARTED activity; the worker stops responding; the
  StartToCloseTimeoutTask must still fire (validator must accept
  PAUSE_REQUESTED) and the retry must land in PAUSED.

- HeartbeatTimeoutWhilePauseRequested:
  Same pattern for HeartbeatTimeoutTask.

- UpdateOptionsPreservesTimeoutsWhilePauseRequested:
  Pause a STARTED activity, call UpdateActivityExecutionOptions with a
  shorter StartToCloseTimeout; the handler must re-emit a fresh timeout
  task for PAUSE_REQUESTED activities, not only STARTED/CANCEL_REQUESTED.
  Otherwise the stamp bump silently strips timeout enforcement from the
  running worker.
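
The third scenario can be sketched with a toy model of stamp-validated timeout tasks. All names here (`Activity`, `TimeoutTask`, `validTimeoutTask`, `updateStartToClose`, `enforceableTasks`) are hypothetical, and the model assumes that a timeout task is honored only when its stamp matches the activity's current stamp and the attempt is still in flight. The `includePauseRequested` parameter exists only to contrast behavior with and without the fix.

```go
package main

import "fmt"

// Status is a hypothetical, reduced model of the internal activity status.
type Status int

const (
	Started Status = iota
	CancelRequested
	PauseRequested
	Paused
)

// TimeoutTask is a hypothetical timer task tagged with the stamp it was emitted under.
type TimeoutTask struct {
	Stamp          int
	TimeoutSeconds int
}

// Activity is a simplified model holding the current stamp and pending timer tasks.
type Activity struct {
	Status              Status
	Stamp               int
	StartToCloseSeconds int
	Tasks               []TimeoutTask
}

// attemptInFlight reports whether a worker may still be running the attempt.
func attemptInFlight(s Status) bool {
	return s == Started || s == CancelRequested || s == PauseRequested
}

// validTimeoutTask models the validator: the attempt must be in flight
// and the task must carry the current stamp.
func validTimeoutTask(a *Activity, t TimeoutTask) bool {
	return attemptInFlight(a.Status) && t.Stamp == a.Stamp
}

// updateStartToClose models an options update: it bumps the stamp, which
// invalidates older tasks, and re-emits a fresh task for running attempts.
// includePauseRequested=false models the behavior before the fix.
func updateStartToClose(a *Activity, seconds int, includePauseRequested bool) {
	a.StartToCloseSeconds = seconds
	a.Stamp++
	reemit := a.Status == Started || a.Status == CancelRequested ||
		(includePauseRequested && a.Status == PauseRequested)
	if reemit {
		a.Tasks = append(a.Tasks, TimeoutTask{Stamp: a.Stamp, TimeoutSeconds: seconds})
	}
}

// enforceableTasks counts the pending tasks the validator would still accept.
func enforceableTasks(a *Activity) int {
	n := 0
	for _, t := range a.Tasks {
		if validTimeoutTask(a, t) {
			n++
		}
	}
	return n
}

func main() {
	for _, fixed := range []bool{false, true} {
		a := &Activity{
			Status:              PauseRequested,
			Stamp:               1,
			StartToCloseSeconds: 60,
			Tasks:               []TimeoutTask{{Stamp: 1, TimeoutSeconds: 60}},
		}
		updateStartToClose(a, 10, fixed)
		fmt.Printf("fix applied=%v enforceable timeout tasks=%d\n", fixed, enforceableTasks(a))
	}
}
```

Without the re-emit for PAUSE_REQUESTED, the only pending task carries a stale stamp, so nothing in this model would time out the still-running worker.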