Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent operations from getting stuck in the Starting state #6686

Merged
merged 2 commits into from
May 21, 2024

Conversation

schmittjoseph
Copy link
Member

@schmittjoseph schmittjoseph commented May 20, 2024

Summary

When an operation is first created it enters a Starting state and typically signals a task completion source once it has started and the operation transitions to a Running state. If an error occurs while running the operation transitions to the Failed state.

There are 2 edge cases that could cause an operation to get indefinitely stuck in the Starting state:

  1. If the operation encounters an error (i.e. throws an exception) before it has a chance to transition to the Running state, the operation will be stuck in the Starting state and never transition to Failed.
  2. If the operation has a bug and completes without ever signaling the started task completion source (e.g. it should transition from Starting -> Completed), the operation will be stuck in the Starting state.

This PR fixes these issues and adds test coverage for EgressOperationService. Since there previously was no test coverage, I've had to do some minor refactoring to enable easier testing.

Closes #6683

Release Notes Entry

Fix an issue where operations could finish, or fail, but their state would be still be reported as Starting (issue #6683).

@schmittjoseph schmittjoseph marked this pull request as ready for review May 20, 2024 20:50
@schmittjoseph schmittjoseph requested a review from a team as a code owner May 20, 2024 20:50
@schmittjoseph schmittjoseph added the update-release-notes Pull requests that should be mentioned in the release notes label May 20, 2024
@schmittjoseph
Copy link
Member Author

schmittjoseph commented May 20, 2024

NOTE: 6.X.X does not have the concept of the Starting operation state so this doesn't need to be backported there.

@schmittjoseph schmittjoseph added the bug Something isn't working label May 20, 2024
…eration/EgressOperationServiceTests.cs

Co-authored-by: Justin Anderson <jander-msft@users.noreply.github.com>
@schmittjoseph schmittjoseph merged commit 3cb7701 into dotnet:main May 21, 2024
26 checks passed
@schmittjoseph
Copy link
Member Author

/backport to release/8.0

@schmittjoseph schmittjoseph deleted the fix-state-transition branch May 21, 2024 18:35
Copy link
Contributor

Started backporting to release/8.0: https://github.com/dotnet/dotnet-monitor/actions/runs/9179706258

Copy link
Contributor

@schmittjoseph backporting to release/8.0 failed, the patch most likely resulted in conflicts.

Please backport manually using one of the below commands, followed by git am --continue once the merge conflict has been resolved.

PowerShell

(Invoke-WebRequest "https://github.com/dotnet/dotnet-monitor/commit/3cb77016018922dac35906436ff9cf4601e8f1cc.patch").Content | git am --3way --ignore-whitespace --exclude="documentation/**.md" --keep-non-patch

Bash

curl -sSL "https://github.com/dotnet/dotnet-monitor/commit/3cb77016018922dac35906436ff9cf4601e8f1cc.patch" | git am --3way --ignore-whitespace --exclude="documentation/**.md" --keep-non-patch

git am error output:

$ git am --3way --ignore-whitespace --exclude="documentation/**.md" --keep-non-patch changes.patch

Applying: Prevent operations from getting stuck in the `Starting` state (#6686)
Using index info to reconstruct a base tree...
M	src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs
M	src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagnosticsControllerBase.cs
Falling back to patching base and 3-way merge...
Auto-merging src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagnosticsControllerBase.cs
CONFLICT (content): Merge conflict in src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagnosticsControllerBase.cs
Auto-merging src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 Prevent operations from getting stuck in the `Starting` state (#6686)
Error: The process '/usr/bin/git' failed with exit code 128

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working update-release-notes Pull requests that should be mentioned in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Operations can indefinitely be stuck in Starting state
2 participants