Agents very slow #8380
Comments
Can you please provide repro steps? What is "fetch step-template commit id"?
Hey, from what can be seen, this is most likely a networking issue rather than an image issue. Could you provide repro steps?
@ilia-shipitsin @mikhailkoliada
@alminveh, I added a step like the one you described to my pipeline; the GUI shows ~35s.
@ilia-shipitsin is that with the 20230918.1.0 image? All the runs I have with 20230912.1.0 took either one or two seconds.
I collected logs. I see that "windows update" takes 15 sec. That is not supposed to happen, because we disable Windows Update: https://github.com/actions/runner-images/blob/main/images/win/scripts/Installers/Finalize-VM.ps1#L94 The interesting thing is the "windows update medic service", which seems to re-enable Windows Update.
I second that (alminveh's filing). Previously (20230910 runner) it would take 10-20 sec to run, 8 times, a PowerShell script that changes the ACL of a file, but since Sep 21 the new runner 20230918 takes 3 min (and times out) running the same tests. Thank you
@iglendd, sorry, I cannot say whether your issue is similar or not. Feel free to open a separate issue with full repro steps. We'll investigate and mention it if the issues are related.
Hi team, is there any ETA for fixing this issue? Thanks.
Let me summarize the findings.
What was found: something has changed in the behaviour of powershell.exe (which is PowerShell 5.1). It "analyzes" all possible modules during startup and builds a cache. Somehow this started to take ~20 sec (we did not change the modules much, maybe a few minor changes, if any at all). See https://learn.microsoft.com/en-us/powershell/scripting/windows-powershell/wmf/whats-new/release-notes?view=powershell-7.3#module-analysis-cache. This is what is eating ~20 sec. If you can switch to "pwsh", that would be a workaround. In short: if you use GHA, you can; in ADO, tasks are built on top of "powershell.exe", so switching is not possible. We'll try to escalate to the PowerShell team. No idea (yet) how to fix that.
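One rough way to check whether a given agent shows the startup delay described above is to time a no-op launch of each shell. The following is only a minimal sketch, not something taken from this thread: the helper names `time_startup` and `compare_shells` are hypothetical, and it assumes `powershell` and/or `pwsh` can be found on `PATH` (shells that are missing are simply skipped).

```python
import shutil
import subprocess
import time


def time_startup(command, runs=3):
    """Launch `command` `runs` times and return the wall-clock seconds of each launch."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output; only the time to start, run and exit is of interest.
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


def compare_shells(shells=("powershell", "pwsh"), runs=3):
    """Time a no-op command in every shell from `shells` that exists on PATH."""
    results = {}
    for shell in shells:
        exe = shutil.which(shell)
        if exe is None:
            continue  # shell not installed on this machine
        # "exit 0" does no real work, so the measured time is mostly startup cost.
        command = [exe, "-NoProfile", "-NonInteractive", "-Command", "exit 0"]
        results[shell] = time_startup(command, runs)
    return results


if __name__ == "__main__":
    for shell, timings in compare_shells().items():
        print(shell, [round(t, 2) for t in timings])
```

If the cache build is the cause, the first launch of `powershell` on a fresh agent may be noticeably slower than later ones, so it is probably worth looking at the individual timings rather than only an average.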
I compared the current image against the previous one and haven't found anything that looks suspicious.
What we see in our runs is not that something adds 20-30 seconds (or minutes) to execution time, but that overall execution is just way slower. Our test run tasks (WebDriverIO tests) that used to take between 2.5 and 3 hours now consistently run for 6+ hours (and fail with an agent timeout every time), and shorter test runs have doubled in the time needed to complete (~10 min -> ~20 min, ~1 hour -> 2+ hours, etc.).
@alminveh, initially you reported that "even simple Powershell task takes 20-30 sec instead of 1 sec". I confirm that. If your tests invoke PowerShell many times under the hood, I suspect it is the same issue. If not, please provide repro steps (preferably in a separate issue).
Good news,
I confirm that everything is back to normal with 20230924.1.0. Thank you, @ilia-shipitsin.
Current image version '20230924.1.0' gives me random slowdowns of tasks. I've seen:
Explicitly stopping StorSVC prior to checkout normalizes things:
If StorSVC wants to stop, it usually does, but I'm seeing a few instances of:
I understand from @ilia-shipitsin that stopping Windows Update will take a bit more force. See: https://jessehouwing.net/az-cli-performance-azure-pipelines-and-github-runner/
Feel free to play with disabling StorSvc for measurements. We plan to release an image with StorSvc disabled next week: #8388
The underlying problem should now be solved, so I am closing this. Feel free to open separate tickets for anything else.
Description
Since September 21 agents are very slow. Even simple tasks that usually take 1-2 seconds now take 25-30 seconds. A comparison of some tasks from two runs can be seen below:
On the left side we have a run before September 21 (simple tasks taking 1-2 seconds to complete); on the right side we have the same tasks running on September 25 (same tasks taking 25-30 seconds to complete).
Platforms affected
Runner images affected
Image version and build link
Issue started from version 20230918.1.0
Is it regression?
20230912.1.0
Expected behavior
Tasks are fast to execute
Actual behavior
Task execution is very slow
Repro steps