
Fix CLI Ctrl+C/SIGTERM shutdown: responsive cancellation and double-signal bug #17588

Open
JamesNK wants to merge 7 commits into main from fix/cli-ctrlc-shutdown-17569

JamesNK (Member) commented May 28, 2026

Description

Fixes the CLI's Ctrl+C/SIGTERM handling to be responsive during all phases and prevents a double-signal bug that caused immediate force-kill instead of graceful shutdown.

Problems Fixed

  1. Sluggish Ctrl+C during AppHost startup: When a user pressed Ctrl+C during AppHost startup (before the AppHost connects back to the CLI), the CLI would wait for the full 5-second startup timeout before exiting.

  2. Double-signal bug causing force-kill: Both PosixSignalRegistration(SIGINT) and Console.CancelKeyPress were registered simultaneously. On Linux, when SIGINT arrives both handlers fire for the same signal, each calling Cancel(), so _cancelCalled reaches 2 immediately — triggering the force-kill path instead of graceful shutdown.

  3. No way to force-exit: If graceful shutdown was taking too long, there was no mechanism for a second Ctrl+C to force immediate termination.
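
The failure mode in item 2 can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: `cancelCalled`, `Cancel`, and the `forced` flag are hypothetical stand-ins for the manager's counter and its force-termination path. It shows that when two handlers are wired to the same signal, a single Ctrl+C already reaches a count of 2.

```csharp
using System;
using System.Threading;

// Hypothetical stand-ins for the manager's state; names are illustrative only.
int cancelCalled = 0;
bool forced = false;
using var cts = new CancellationTokenSource();

void Cancel()
{
    // Interlocked keeps the count correct when handlers run on different threads.
    int count = Interlocked.Increment(ref cancelCalled);
    if (count == 1)
    {
        cts.Cancel();   // first signal: cooperative cancellation
    }
    else if (count == 2)
    {
        forced = true;  // second signal: force-termination path
    }
}

// Buggy wiring: two handlers registered for the same SIGINT each call Cancel().
Action[] handlersForOneSigint = { Cancel, Cancel };
foreach (Action handler in handlersForOneSigint)
{
    handler();
}

Console.WriteLine($"cancelled={cts.IsCancellationRequested}, forced={forced}");
// prints: cancelled=True, forced=True
```

With only one handler per signal, the same single Ctrl+C leaves `forced` false and the graceful path gets a chance to run.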

Changes

src/Aspire.Cli/ConsoleCancellationManager.cs (new file in this PR):

  • Central signal handler managing Ctrl+C, SIGINT, SIGTERM, and SIGQUIT with a shared CancellationTokenSource
  • First signal requests cooperative cancellation with a configurable timeout
  • Second signal forces immediate termination via ProcessTerminationCompletionSource
  • Fix: Console.CancelKeyPress is now only registered on platforms without PosixSignalRegistration (Android, iOS, tvOS, Browser) — eliminates the double-signal bug
  • Added SIGQUIT registration for Windows Ctrl+Break parity
  • Supports SetLogger() for diagnostic observability and SetStartedHandler() for timeout tracking
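
The registration rule described above can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `RegisterSignalHandlers` and `onSignal` are hypothetical names, and the platform check simply mirrors the platform list given in the bullet. The point is that each process gets exactly one source of Ctrl+C notifications.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper: picks one notification mechanism per platform, never both.
List<IDisposable> RegisterSignalHandlers(Action onSignal)
{
    var registrations = new List<IDisposable>();

    bool posixSignalsAvailable =
        !OperatingSystem.IsAndroid() && !OperatingSystem.IsBrowser() &&
        !OperatingSystem.IsIOS() && !OperatingSystem.IsTvOS();

    if (posixSignalsAvailable)
    {
        void Handle(PosixSignalContext context)
        {
            context.Cancel = true; // suppress the runtime's default termination
            onSignal();
        }

        registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGINT, Handle));
        registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGTERM, Handle));
        registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGQUIT, Handle));
    }
    else
    {
        // Fallback branch as described in the PR text; only reached when
        // PosixSignalRegistration is not used, so SIGINT is never handled twice.
        Console.CancelKeyPress += (_, e) =>
        {
            e.Cancel = true;
            onSignal();
        };
    }

    return registrations;
}

List<IDisposable> active = RegisterSignalHandlers(() => Console.WriteLine("signal received"));
foreach (IDisposable registration in active)
{
    registration.Dispose(); // unregister when the CLI is done
}
```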

src/Aspire.Cli/Program.cs:

  • Creates ConsoleCancellationManager early in startup and wires it through the command pipeline
  • Uses Task.WhenAny(handlerTask, processTerminationCompletionSource.Task) so forced termination is observed immediately
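
A minimal sketch of that race, assuming both tasks carry an exit code: `RunUntilDoneOrForcedAsync` is a hypothetical helper, and the exit code 130 is an illustrative value rather than one taken from this PR. The handler task here never completes, standing in for a command that ignores cancellation.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper mirroring the race described above.
async Task<int> RunUntilDoneOrForcedAsync(Task<int> handlerTask, Task<int> forcedTerminationTask)
{
    // WhenAny completes as soon as either task finishes, so forced termination
    // is not stuck behind a handler that never returns.
    Task<int> first = await Task.WhenAny(handlerTask, forcedTerminationTask);
    return await first;
}

var processTermination = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);
Task<int> stuckHandler = new TaskCompletionSource<int>().Task; // never completes

processTermination.SetResult(130); // assumed exit code, for illustration only
int exitCode = await RunUntilDoneOrForcedAsync(stuckHandler, processTermination.Task);
Console.WriteLine(exitCode); // prints 130
```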

src/Aspire.Cli/Commands/RunCommand.cs:

  • CancelAppHostStartupAsync now receives the cancellation token and passes it to WaitAsync, so the startup-timeout wait exits immediately on Ctrl+C
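
The effect of passing the token to `WaitAsync` can be shown with a small sketch. `WaitForAppHostExitAsync` is a hypothetical stand-in for the startup-cancellation wait, and the 50 ms `CancelAfter` simulates a Ctrl+C arriving shortly after the wait begins. Without the token argument, the same call would only return after the 5-second timeout.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the wait inside CancelAppHostStartupAsync.
async Task<string> WaitForAppHostExitAsync(Task appHostExit, TimeSpan timeout, CancellationToken cancellationToken)
{
    try
    {
        // Completes on whichever comes first: the task, the timeout, or cancellation.
        await appHostExit.WaitAsync(timeout, cancellationToken);
        return "exited";
    }
    catch (TimeoutException)
    {
        return "timed out";
    }
    catch (OperationCanceledException)
    {
        return "cancelled";
    }
}

Task neverExits = new TaskCompletionSource().Task;
using var cts = new CancellationTokenSource();
cts.CancelAfter(TimeSpan.FromMilliseconds(50)); // simulated Ctrl+C

var stopwatch = Stopwatch.StartNew();
string outcome = await WaitForAppHostExitAsync(neverExits, TimeSpan.FromSeconds(5), cts.Token);
Console.WriteLine($"{outcome} after {stopwatch.ElapsedMilliseconds} ms");
```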

tests/Aspire.Cli.Tests/ConsoleCancellationManagerTests.cs (new):

  • Unit tests for single-signal cooperative cancellation, double-signal force termination, and timeout behavior

tests/Aspire.Cli.Tests/Commands/RunCommandTests.cs:

  • Tests verifying Ctrl+C during AppHost startup cancels immediately

tests/Shared/Hex1bAutomatorTestHelpers.cs:

  • Removed old WaitForSuccessPromptAsync (which would hang on errors instead of failing fast)
  • Renamed WaitForSuccessPromptFailFastAsync → WaitForSuccessPromptAsync (now the only variant, always fails fast on ERR prompts)
  • Removed duplicate RunCommandFailFastAsync (was identical to RunCommandAsync)
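
The fail-fast behavior can be sketched generically. This is not the helper from tests/Shared/Hex1bAutomatorTestHelpers.cs: `WaitForPromptFailFastAsync`, the `readScreen` delegate, and the `[OK]` / `[ERR]` markers are all assumptions made for illustration. The idea is that an error prompt ends the wait at once instead of letting the timeout run out.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: polls a terminal snapshot until a success or error marker appears.
// The "[OK]" and "[ERR]" markers are assumed, not the real prompt format.
async Task WaitForPromptFailFastAsync(Func<string> readScreen, TimeSpan timeout)
{
    DateTime deadline = DateTime.UtcNow + timeout;
    while (DateTime.UtcNow < deadline)
    {
        string screen = readScreen();
        if (screen.Contains("[ERR]"))
        {
            // Fail immediately instead of waiting out the timeout.
            throw new InvalidOperationException("Command failed: error prompt detected.");
        }
        if (screen.Contains("[OK]"))
        {
            return;
        }
        await Task.Delay(TimeSpan.FromMilliseconds(10));
    }
    throw new TimeoutException("No success prompt appeared before the timeout.");
}

try
{
    // Even with a 500-second budget, the error prompt is reported on the first poll.
    await WaitForPromptFailFastAsync(() => "[ERR] $ ", TimeSpan.FromSeconds(500));
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // prints: Command failed: error prompt detected.
}
```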

E2E test files (59 files):

  • Updated all call sites from WaitForSuccessPromptFailFastAsync → WaitForSuccessPromptAsync
  • Updated all call sites from RunCommandFailFastAsync → RunCommandAsync

Fixes #17569

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
    • No

Copilot AI review requested due to automatic review settings May 28, 2026 07:31
github-actions Bot (Contributor) commented May 28, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17588

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17588"

Copilot AI (Contributor) left a comment

Pull request overview

Improves Aspire CLI termination responsiveness when users press Ctrl+C during run/AppHost startup, aiming to avoid multi-second waits and provide a “second signal” forced termination path.

Changes:

  • Refactors ConsoleCancellationManager so the first signal triggers non-blocking cancellation and schedules forced termination asynchronously; subsequent signals trigger immediate forced-termination signaling.
  • Threads the command cancellation token into RunCommand.CancelAppHostStartupAsync so its startup-cancellation wait exits promptly on Ctrl+C.
  • Adds unit coverage for the cancellation manager behavior and a regression test ensuring run exits promptly when cancelled during the startup-timeout cancellation path.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/Aspire.Cli.Tests/ConsoleCancellationManagerTests.cs | Adds tests for first/second signal behavior, async timeout forcing, and non-blocking cancel. |
| tests/Aspire.Cli.Tests/Commands/RunCommandTests.cs | Adds regression test to ensure Ctrl+C cancels out of the startup-cancellation timeout promptly. |
| src/Aspire.Cli/Program.cs | Wires a logger into cancellation handling for diagnostics. |
| src/Aspire.Cli/ConsoleCancellationManager.cs | Makes signal handling non-blocking; adds async forced-termination timeout and second-signal behavior; adds logging hook. |
| src/Aspire.Cli/Commands/RunCommand.cs | Passes the command cancellation token through to WaitAsync during startup cancellation. |

Comment thread src/Aspire.Cli/ConsoleCancellationManager.cs
Comment thread src/Aspire.Cli/Program.cs
Comment thread src/Aspire.Cli/Program.cs
JamesNK added 3 commits May 28, 2026 20:28
- Make ConsoleCancellationManager.Cancel() non-blocking so Ctrl+C handler
  returns immediately
- Pass cancellation token through to WaitAsync in CancelAppHostStartupAsync
  so Ctrl+C exits promptly instead of waiting for the full 5s timeout
- Support second Ctrl+C for immediate force exit (Environment.Exit)
- Add logging support to ConsoleCancellationManager via SetLogger()
- Add comprehensive unit tests for ConsoleCancellationManager
- Add integration test for RunCommand cancellation during startup timeout

Fixes #17569
JamesNK force-pushed the fix/cli-ctrlc-shutdown-17569 branch from 06a5d7d to 754458f May 28, 2026 12:29
Replace WaitForSuccessPromptAsync with WaitForSuccessPromptFailFastAsync
across all CLI E2E tests. The fail-fast variant detects error prompts
immediately and throws instead of hanging for up to 500s waiting for a
success prompt that will never arrive. This prevents 10+ minute CI
timeouts when a command fails with a non-zero exit code.
JamesNK requested review from eerhardt and radical as code owners May 28, 2026 14:07
- Fix ConsoleCancellationManager double-signal bug: move Console.CancelKeyPress
  to else branch so it only registers on platforms without PosixSignalRegistration.
  Previously both handlers fired for the same SIGINT, causing immediate force-kill.
- Add SIGQUIT/Ctrl+Break registration for Windows parity.
- Remove old WaitForSuccessPromptAsync (no fail-fast) and rename
  WaitForSuccessPromptFailFastAsync to WaitForSuccessPromptAsync.
- Remove duplicate RunCommandFailFastAsync (identical to RunCommandAsync).
JamesNK changed the title from "Fix CLI Ctrl+C shutdown taking too long during AppHost startup" to "Fix CLI Ctrl+C/SIGTERM shutdown: responsive cancellation and double-signal bug" May 28, 2026
JamesNK added this to the 13.4 milestone May 28, 2026
JamesNK (Member, Author) commented May 28, 2026

PR Testing Report

PR Information

CLI Version Verification

  • Installed Version: 13.5.0-dev (local debug build from PR branch)
  • Status: ✅ Verified (CI native artifact still building; used local build from checked-out PR branch)

Test Scenarios Executed

Scenario 1: Normal Start and Stop

Objective: Verify basic aspire start / aspire stop lifecycle works correctly with the refactored signal handler
Status: ✅ Passed

Steps:

  1. Created new Aspire starter project (aspire new aspire-starter)
  2. Ran aspire start --apphost <path>
  3. Verified AppHost started successfully with dashboard URL
  4. Ran aspire stop --apphost <path>
  5. Verified clean shutdown

Observations:

  • AppHost started and reported dashboard URL normally
  • Stop command sent signal and confirmed successful shutdown
  • No regressions in the normal (non-signal) path

Scenario 2: ConsoleCancellationManager Unit Tests

Objective: Verify all signal handling logic — first signal cooperative cancellation, second signal force-kill, timeout behavior
Status: ✅ Passed (10/10 tests)

Tests executed:

  • FirstSignal_RequestsCancellation
  • FirstSignal_TokenIsCancelled
  • SecondSignal_ForcesImmediateTermination
  • FirstSignal_WithNoHandler_ForcesTerminationAfterTimeout
  • FirstSignal_HandlerCompletesWithinTimeout_DoesNotForceTermination
  • FirstSignal_HandlerExceedsTimeout_ForcesTermination
  • Cancel_IsNonBlocking
  • MultipleSignals_OnlyFirstAndSecondHaveEffect
  • Dispose_AllowsSubsequentCancelWithoutException
  • Token_RemainsAccessibleAfterDispose

Scenario 3: Ctrl+C During Startup Timeout

Objective: Verify that when Ctrl+C fires during AppHost startup, the CLI exits promptly rather than blocking for the full 5-second CancelAppHostStartupAsync timeout
Status: ✅ Passed

Test: RunCommand_WhenCancelledDuringStartupTimeout_ExitsWithoutWaitingForFullTimeout

Observations:

  • Confirms the cancellationToken plumbing to WaitAsync exits immediately on cancellation
  • Without the fix, this would block for the full 5-second timeout

Scenario 4: Startup Timeout Regression Tests

Objective: Ensure existing startup timeout behavior is preserved
Status: ✅ Passed (2/2 tests)

Tests executed:

  • RunCommand_WhenAppHostStartupTimesOut_DisplaysTimeoutGuidance
  • RunCommand_StartupTimeoutBudgetIncludesBuildAndBackchannelWaits

Summary

| Scenario | Status | Notes |
| --- | --- | --- |
| Normal start/stop | ✅ Passed | No regressions |
| ConsoleCancellationManager (10 tests) | ✅ Passed | All signal logic verified |
| Ctrl+C during startup | ✅ Passed | Prompt exit confirmed |
| Startup timeout regression | ✅ Passed | Existing behavior preserved |

Overall Result

✅ PR VERIFIED — All 13 targeted tests pass. The normal start/stop lifecycle works correctly. Signal handling changes are well-covered by unit tests that directly exercise the Cancel() method and verify timing constraints.

/// <summary>
/// Sets the logger instance used for diagnostic messages during signal handling.
/// Call this once the logging infrastructure is available.
/// </summary>
internal void SetLogger(ILogger logger) => Volatile.Write(ref _logger, logger);
Contributor

Why is this a volatile write?

JamesNK (Member, Author), May 28, 2026

AI was worried about it being used at the same time it is set, and memory barrier issues from ARM. I humoured it.

// Schedule the forced-completion timeout asynchronously so the signal handler
// returns immediately. This allows Program.Main's Task.WhenAny to observe
// handlerTask completion without being blocked by the signal handler thread.
_ = ForceTerminationAfterTimeoutAsync(forcedTerminationExitCode);
Contributor

Are you sure returned Task is running (hot)?

Member, Author

It either completes immediately or Task.Delay is called. It won't sync block.
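
The behavior described in this reply can be checked with a small sketch. `ForceAfterTimeoutAsync` is a hypothetical stand-in for ForceTerminationAfterTimeoutAsync, and the flag and completion source are illustrative. An `async` method starts executing on the caller's thread and only returns to the caller at its first incomplete `await`, so the discarded task is already running when the assignment finishes.

```csharp
using System;
using System.Threading.Tasks;

bool startedSynchronously = false;
var forced = new TaskCompletionSource<int>();

// Hypothetical stand-in for ForceTerminationAfterTimeoutAsync.
async Task ForceAfterTimeoutAsync(TimeSpan timeout, int exitCode)
{
    startedSynchronously = true;   // runs on the caller's thread, before the first await
    await Task.Delay(timeout);     // yields here; the caller continues without blocking
    forced.TrySetResult(exitCode); // runs later, once the delay elapses
}

_ = ForceAfterTimeoutAsync(TimeSpan.FromMilliseconds(50), 1); // discard: fire and forget

Console.WriteLine(startedSynchronously); // prints True
Console.WriteLine(await forced.Task);    // prints 1
```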

github-actions (Contributor)

CLI E2E Tests unknown — 107 passed, 0 failed, 2 unknown (commit c864c12)

View all recordings
Status Test Recording Job Artifacts
AddPackageInteractiveWhileAppHostRunningDetached Recording #78335154414 Logs
AddPackageWhileAppHostRunningDetached Recording #78335154414 Logs
AgentCommands_AllHelpOutputs_AreCorrect Recording #78335153506 Logs
AgentInitCommand_DefaultSelection_InstallsDefaultSkills Recording #78335153506 Logs
AgentInitCommand_MigratesDeprecatedConfig Recording #78335153506 Logs
AgentMcpListStructuredLogsReturnsLogsFromStarterApp Recording #78335153918 Logs
AgentMcpListStructuredLogsReturnsLogsFromStarterApp_DevLocalhost Recording #78335153918 Logs
AgentMcpListStructuredLogsReturnsLogsFromStarterApp_Isolated Recording #78335153918 Logs
AllPublishMethodsBuildDockerImages Recording #78335153731 Logs
AspireAddAndStartWorkAgainstLegacyAppHostTs Recording #78335154201 Logs
AspireAddPackageVersionToDirectoryPackagesProps Recording #78335153379 Logs
AspireInitSingleFileAppHostRunsViaDotnetRunAppHost Recording #78335155009 Logs
AspireInitWithExistingAppHostDirRecreatesMissingNuGetConfigAndPreservesFiles Recording #78335154710 Logs
AspireInitWithSolutionFileGeneratesAppHostThatBuildsAgainstChannelHive Recording #78335154710 Logs
AspireStartUpdatesStaleTypeScriptAppHostPath Recording #78335154638 Logs
AspireUpdateRemovesAppHostPackageVersionFromDirectoryPackagesProps Recording #78335153379 Logs
AspireUpdateRemovesOrphanAppHostPackageVersionWhenSdkAlreadyCurrent Recording #78335153379 Logs
Banner_DisplayedOnFirstRun Recording #78335153682 Logs
Banner_DisplayedWithExplicitFlag Recording #78335153682 Logs
Banner_NotDisplayedWithNoLogoFlag Recording #78335153682 Logs
CertificatesClean_RemovesCertificates Recording #78335154403 Logs
CertificatesTrust_WithNoCert_CreatesAndTrustsCertificate Recording #78335154403 Logs
CertificatesTrust_WithUntrustedCert_TrustsCertificate Recording #78335154403 Logs
ConfigSetGet_CreatesNestedJsonFormat Recording #78335154519 Logs
CreateAndRunAspireStarterProject Recording #78335154250 Logs
CreateAndRunAspireStarterProjectWithBundle Recording #78335153746 Logs
CreateAndRunEmptyAppHostProject Recording #78335154928 Logs
CreateAndRunJavaEmptyAppHostProject Recording #78335153859 Logs
CreateAndRunJsReactProject Recording #78335153643 Logs
CreateAndRunPythonReactProject Recording #78335155042 Logs
CreateAndRunTypeScriptEmptyAppHostProject Recording #78335154029 Logs
CreateAndRunTypeScriptStarterProject Recording #78335153088 Logs
CreateJavaAppHostWithViteApp Recording #78335154187 Logs
CreateTypeScriptAppHostWithViteApp_AllowsGuestAppPackageManagerToDiffer Recording #78335153776 Logs
CreateTypeScriptAppHostWithViteApp_UsesConfiguredToolchain Recording #78335153776 Logs
DashboardRunWithAgentMcpListTracesReturnsNoTraces Recording #78335154691 Logs
DashboardRunWithAgentMcpListTracesReturnsNoTraces_DevLocalhost Recording #78335154691 Logs
DashboardRunWithOtelTracesReturnsNoTraces Recording #78335154691 Logs
DashboardRunWithOtelTracesReturnsNoTraces_DevLocalhost Recording #78335154691 Logs
DeployK8sBasicApiService Recording #78335154275 Logs
DeployK8sWithExternalHelmChart Recording #78335154589 Logs
DeployK8sWithGarnet Recording #78335154712 Logs
DeployK8sWithMongoDB Recording #78335153985 Logs
DeployK8sWithMySql Recording #78335153089 Logs
DeployK8sWithPostgres Recording #78335153095 Logs
DeployK8sWithRabbitMQ Recording #78335154625 Logs
DeployK8sWithRedis Recording #78335153548 Logs
DeployK8sWithSqlServer Recording #78335154848 Logs
DeployK8sWithValkey Recording #78335153371 Logs
DeployTypeScriptAppToKubernetes Recording #78335153851 Logs
DescribeCommandResolvesReplicaNames Recording #78335153906 Logs
DescribeCommandShowsRunningResources Recording #78335153906 Logs
DetachFormatJsonProducesValidJson Recording #78335154605 Logs
DetachFormatJsonProducesValidJsonWhenRestartingExistingInstance Recording #78335154605 Logs
DoPublishAndDeployListStepsWork Recording #78335154730 Logs
DocsCommand_RendersInteractiveMarkdownFromLocalSource Recording #78335153645 Logs
DoctorCommand_DetectsDeprecatedAgentConfig Recording #78335153506 Logs
DoctorCommand_TypeScriptAppHostReportsMissingConfiguredToolchain Recording #78335154837 Logs
DoctorCommand_WithSslCertDir_ShowsTrusted Recording #78335154837 Logs
DoctorCommand_WithoutSslCertDir_ShowsPartiallyTrusted Recording #78335154837 Logs
GatewayWithoutExternalEndpoint_FailsPublishWithGuidance Recording #78335153320 Logs
GeneratedAspireDevScript_StartsWatchMode_WithConfiguredToolchain Recording #78335153776 Logs
GlobalMigration_HandlesCommentsAndTrailingCommas Recording #78335154519 Logs
GlobalMigration_HandlesMalformedLegacyJson Recording #78335154519 Logs
GlobalMigration_PreservesAllValueTypes Recording #78335154519 Logs
GlobalMigration_SkipsWhenNewConfigExists Recording #78335154519 Logs
GlobalSettings_MigratedFromLegacyFormat Recording #78335154519 Logs
IngressWithoutExternalEndpoint_FailsPublishWithGuidance Recording #78335153320 Logs
InitTypeScriptAppHost_AugmentsExistingViteRepoInWorkspaceSubdirectory Recording #78335153776 Logs
InteractiveCSharpInitCreatesExpectedFiles Recording #78335154303 Logs
InvalidAppHostPathWithComments_IsHealedOnRun Recording #78335154222 Logs
JavaScriptHostingApisRunFromTypeScriptAppHost Recording #78335153731 Logs
LatestCliCanStartStableChannelAppHost Recording #78335154250 Logs
LatestCliCanStartStableChannelTypeScriptAppHost Recording #78335154250 Logs
LegacySettingsMigration_AdjustsRelativeAppHostPath Recording #78335154638 Logs
LogsCommandShowsResourceLogs Recording #78335153798 Logs
OtelLogsReturnsStructuredLogsFromStarterApp Recording #78335153733 Logs
OtelLogsReturnsStructuredLogsFromStarterAppIsolated Recording #78335153733 Logs
PsCommandListsRunningAppHost Recording #78335153183 Logs
PsFormatJsonOutputsOnlyJsonToStdout Recording #78335153183 Logs
PublishJavaScriptPatternsGeneratesExpectedDockerComposeArtifacts Recording #78335153973 Logs
PublishWithConfigureEnvFileUpdatesEnvOutput Recording #78335153973 Logs
PublishWithDockerComposeServiceCallbackSucceeds Recording #78335153973 Logs
PublishWithoutOutputPathUsesAppHostDirectoryDefault Recording #78335153973 Logs
ResourceCommand_FailedExecution_DisplaysAppHostLogPathAndLogContainsEntries Recording #78335154933 Logs
ResourceCommand_SetAndDeleteParameterUpdatesDescribeOutput Recording #78335154933 Logs
RestoreGeneratesSdkFiles Recording #78335153700 Logs
RestoreGeneratesSdkFiles_WithConfiguredToolchain Recording #78335154920 Logs
RestoreRefreshesGeneratedSdkAfterAddingIntegration Recording #78335154920 Logs
RestoreSupportsConfigOnlyHelperPackageAndCrossPackageTypes Recording #78335154190 Logs
RunFromParentDirectory_UsesExistingConfigNearAppHost Recording #78335154003 Logs
RunReportsSyntaxErrorsForDotNetAppHost Recording #78335155124 Logs
RunReportsSyntaxErrorsForTypeScriptAppHost Recording #78335155124 Logs
SecretCrudOnDotNetAppHost Recording #78335154527 Logs
SecretCrudOnTypeScriptAppHost Recording #78335154806 Logs
StagingChannel_ConfigureAndVerifySettings_ThenSwitchChannels Recording #78335154002 Logs
StartAndWaitForTypeScriptSqlServerAppHostWithNativeAssets Recording #78335153600 Logs
StartReportsSyntaxErrorsForDotNetAppHost Recording #78335155124 Logs
StartReportsSyntaxErrorsForTypeScriptAppHost Recording #78335155124 Logs
StopAllAppHostsFromAppHostDirectory Recording #78335153903 Logs
StopJavaPolyglotAppHostUsingApphostDirectory Recording #78335154851 Logs
StopNonInteractiveSingleAppHost Recording #78335153903 Logs
StopTypeScriptPolyglotAppHostUsingApphostDirectory Recording #78335154795 Logs
StopWithNoRunningAppHostExitsSuccessfully Recording #78335154414 Logs
UnAwaitedChainsCompileWithAutoResolvePromises Recording #78335154920 Logs
UpdateProjectChannelToStable_CSharpEmptyAppHost_PreservesAspireConfigChannel Recording #78335153928 Logs
UpdateProjectChannelToStable_CSharpSingleFileInit_PreservesAspireConfigChannel Recording #78335153928 Logs
UpdateProjectChannelToStable_TypeScriptSingleFileInit_PreservesAspireConfigChannel Recording #78335153928 Logs
UpdateProjectChannelToStable_TypeScript_PreviewsStablePackagesAndPreservesChannel Recording #78335153928 Logs

📹 Recordings uploaded automatically from CI run #26584075410

davidfowl (Contributor) commented May 28, 2026

Tried this, still takes forever (5 seconds is forever):

Screen.Recording.2026-05-28.104823.mp4

It also feels like forever because there's no feedback.

alexTr3 commented May 28, 2026

@davidfowl
With this will we be able to trigger a ResourceKnownCommand stop(graceful, sigterm) and kill(ungraceful) on any resources when testing??

davidfowl (Contributor)

This is purely for the CLI itself, nothing to do with graceful shutdown of resources, which happens when you stop a resource from the dashboard or cli, but isn't graceful in the debugger in VS.

alexTr3 commented May 28, 2026

> This is purely for the CLI itself, nothing to do with graceful shutdown of resources, which happens when you stop a resource from the dashboard or cli, but isn't graceful in the debugger in VS.

but it should be supported... We should be able to trigger a stop cmd that would send a sigterm to a resource when testing our distributed app..

You should at least rename KnownRessourcesCommand to Kill instead of Stop.. since this is what you truly are doing.. killing the process

davidfowl (Contributor)

> but it should be supported... We should be able to trigger a stop cmd that would send a sigterm to a resource when testing our distributed app..

aspire resource stop.

Let's not use this PR to discuss feature requests, though.

adamint (Member) commented May 28, 2026

Will this fix #17625?

Development

Successfully merging this pull request may close these issues.

CLI Ctrl+C shutdown can take too long during AppHost startup

6 participants