Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for agent hang when process cannot be killed #4074

Merged
merged 10 commits into from Dec 7, 2022
Merged

Fix for agent hang when process cannot be killed #4074

merged 10 commits into from Dec 7, 2022

Conversation

merlynomsft
Copy link
Contributor

@merlynomsft merlynomsft commented Dec 2, 2022

This PR fixes an issue where the agent will hang and fail to upload logs if the process executed cannot be killed during cancellation. The pipeline hangs until timeoutInMinutes + cancelTimeoutInMinutes

The log fails to upload and a blank log is shown in the log viewer UX:
image

Analysis
Killing process during cancellation is 'best effort' and may fail if, for some reason, the process cannot be killed. In that case, the agent waits forever in a loop for the process to exit. Fix is to exit the loop after attempting to kill the process.

Testing
This issue is not easy to reproduce, because it requires the agent to launch a process that cannot be killed
I simulated this condition by modifying ProcessInvoker to skip killing the process

**Knob / Enabling **
The behavior is disabled by default. This change is enabled by setting knob: VSTSAGENT_CONTINUE_AFTER_CANCEL_PROCESSTREEKILL_ATTEMPT

VSO 2009025

@merlynomsft merlynomsft changed the title Log upload hang fix Fix for agent hang when process cannot be killed Dec 2, 2022
@merlynomsft merlynomsft added the bug label Dec 2, 2022
@merlynomsft merlynomsft marked this pull request as ready for review December 2, 2022 07:47
@merlynomsft merlynomsft requested review from a team as code owners December 2, 2022 07:47
src/Agent.Sdk/ProcessInvoker.cs Show resolved Hide resolved
src/Agent.Sdk/ProcessInvoker.cs Show resolved Hide resolved
src/Agent.Sdk/ProcessInvoker.cs Show resolved Hide resolved
src/Agent.Sdk/Knob/AgentKnobs.cs Show resolved Hide resolved
src/Agent.Sdk/ProcessInvoker.cs Outdated Show resolved Hide resolved
Copy link
Contributor

@DenisRumyantsev DenisRumyantsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@merlynomsft thank you so much for your work!
I artificially reproduced the issue to test these changes.
The agent is working properly and the issue is resolved.

@LiliaSabitova LiliaSabitova merged commit 35cc15c into microsoft:master Dec 7, 2022
@merlynomsft
Copy link
Contributor Author

Please note currently this behavior is opt-in. To do so, set the following environment variable on the agent:

VSTSAGENT_CONTINUE_AFTER_CANCEL_PROCESSTREEKILL_ATTEMPT: true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants