Add timeout support to SilentProcessRunner via special variable to terminate stuck processes #468
…riable "Octopus.Action.Script.Timeout"
source/Calamari.Shared/Integration/Processes/SilentProcessRunner.cs
Thanks for the contribution. A few changes please.
source/Calamari.Tests/Fixtures/Integration/Process/CommandLineRunnerFixture.cs
These kinds of timeouts have a habit of not working on Linux; a test should uncover that, though.
…on workflow. Add *Nix test to CommandLineRunnerFixture that is functional equivalent of Windows timeout test.
Do you mean the timeout test command or the use of timeout in the Process.WaitForExit call? On *Nix the timeout command does indeed behave differently, so the test would have failed. If needed, I'll have to spin up a Linux tentacle container to test Process.WaitForExit, though.
Thanks for taking the time @droyad :) Really appreciate it.
Made requested changes
Closing in favor of #688
Oops, I commented but didn't submit
@@ -149,11 +152,21 @@ static SilentProcessRunner()
process.BeginOutputReadLine();
process.BeginErrorReadLine();

process.WaitForExit();
// TimeSpan.TotalMilliseconds can have a value > int.MaxValue, so we just assume wait for ever if this happens.
var timeoutMilliseconds = timeout == null || timeout.Value.TotalMilliseconds > int.MaxValue ? -1 : (int)(timeout.Value.TotalMilliseconds);
This should work:
- var timeoutMilliseconds = timeout == null || timeout.Value.TotalMilliseconds > int.MaxValue ? -1 : (int)(timeout.Value.TotalMilliseconds);
+ var timeoutMilliseconds = (int) (timeout ?? Timeout.InfiniteTimeSpan).TotalMilliseconds;
Timeout.InfiniteTimeSpan is -1 milliseconds.
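For reference, here is a minimal sketch of how the null-coalescing form suggested above could be combined with the overflow guard from the original line. The one-line suggestion maps a null timeout to -1 via Timeout.InfiniteTimeSpan, but it no longer checks for values above int.MaxValue, and in C# an unchecked cast of an out-of-range double to int yields an unspecified value. The class and method names (TimeoutConversion, ToWaitMilliseconds) are hypothetical and not part of Calamari; treating every negative value as "wait forever" is likewise an assumption made for this sketch.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only; not code from the PR.
public static class TimeoutConversion
{
    // Converts an optional TimeSpan into the int that Process.WaitForExit(int) expects.
    // Returns Timeout.Infinite (-1) when no timeout is given or when the value cannot
    // be represented as a non-negative int number of milliseconds.
    public static int ToWaitMilliseconds(TimeSpan? timeout)
    {
        // null becomes Timeout.InfiniteTimeSpan, which is -1 milliseconds.
        var milliseconds = (timeout ?? Timeout.InfiniteTimeSpan).TotalMilliseconds;

        // Assumption: negative or oversized values mean "wait forever" rather than an error.
        if (milliseconds < 0 || milliseconds > int.MaxValue)
        {
            return Timeout.Infinite;
        }

        return (int)milliseconds;
    }
}
```

Clamping negative values matters because Process.WaitForExit(int) throws ArgumentOutOfRangeException for any negative argument other than -1.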
Could use a documentation update if accepted?
Issue
Tentacles wait indefinitely on actions which run scripts. For the most part, this is desirable by design and handled here. There are many valid cases where a timeout on scripts should be supported to prevent tentacles from being stuck indefinitely processing action scripts (and thus blocking other operations such as clean-up via the Script Console, Runbooks, or other deployments).
One example of this is described in this SO post and this IIS post when calling CommitChanges on IIS's ServerManager objects. This is important for Octopus scripts such as IISWebSite_BeforePostDeploy that utilize the WebAdministration PowerShell module and are run on systems with high load. Cmdlets from this module internally call appcmd, which then calls the CommitChanges method of the ServerManager object. CommitChanges and appcmd may exit before the Set-ItemProperty method has called a Wait on it. As a result, random deployments can get stuck waiting for appcmd.
We have seen this behavior in many site deployments when using virtual paths.
Samples
We captured some sample dumps and images showing this behavior. Images are included below:
Debugging Calamari dump showing that it gets stuck in SilentProcessRunner
PowerShell dump showing that it gets stuck in Set-ItemProperty setting a virtual path
ProcessExplorer investigation to confirm IISWebSite_BeforePostDeploy is stuck on appcmd.
PR Changes
This PR adds the special variable Octopus.Action.Script.Timeout to allow a millisecond timeout to be defined via a process's variables that will be used to limit the amount of time a process can wait. If no timeout is defined, the default behavior of waiting indefinitely will be used. When it is defined, a message will be displayed on the Verbose log indicating a timeout value was used. If the timeout is hit
Tests
A test was added using Windows' Timeout command to test that the timeout values are honored. We also deployed a Calamari package with the changes internally to our deployment server and confirmed it honors the timeout values when present.
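The wait-with-timeout behavior that such a test exercises can be sketched roughly as follows. This is an illustration, not the PR's implementation: TimedRunner and Run are hypothetical names, output redirection is omitted, and killing the process on timeout is an assumption about what "terminate stuck processes" involves. Only Process.WaitForExit(int) and Process.Kill() from System.Diagnostics are relied on.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration only; not code from the PR.
public static class TimedRunner
{
    // Runs a command and waits up to timeoutMilliseconds for it to exit.
    // Pass -1 to wait indefinitely.
    // Returns the exit code, or null if the process was killed after timing out.
    public static int? Run(string fileName, string arguments, int timeoutMilliseconds)
    {
        using (var process = new Process())
        {
            process.StartInfo = new ProcessStartInfo(fileName, arguments)
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };
            process.Start();

            // WaitForExit(int) returns false when the process is still running
            // after the given number of milliseconds.
            if (!process.WaitForExit(timeoutMilliseconds))
            {
                try
                {
                    process.Kill();
                }
                catch (InvalidOperationException)
                {
                    // The process exited between the wait and the kill.
                }

                // Block until the killed process has actually gone away.
                process.WaitForExit();
                return null;
            }

            return process.ExitCode;
        }
    }
}
```

A test along these lines would start a command that outlives the timeout (for example `sleep 30` on *Nix) and check that the call returns well before the command would have finished on its own.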
Output after timeout (redacted):