GHA/windows: mitigate drastic runtests perf drop under MSYS2 #16217
Switching back to the default shell doesn't help. Maybe we could install yet another copy of MSYS2 for vcpkg and dl-mingw jobs? I don't know. This is extremely exhausting and time-consuming, and apparently never-ending. edit:
msys2/msys2-runtime#230 (Perl pipe issue report from October, still open) Possibly related, this fixes ARM deadlocks fixed by GfW 2.47.1(1), but for x86_64, on a quick glance: Possibly interesting: |
Seen with MSVC 2019:
```
D:\a\curl\curl\lib\vtls\wolfssl.c(773): error C2220: the following warning is treated as an error [D:\a\curl\curl\bld\lib\libcurl_object.vcxproj]
D:\a\curl\curl\lib\vtls\wolfssl.c(773): warning C4706: assignment within conditional expression [D:\a\curl\curl\bld\lib\libcurl_object.vcxproj]
```
https://github.com/curl/curl/actions/runs/13190321645/job/36821938202?pr=16217#step:9:30
Silence checksrc where it detects it as EQUALSNULL.
Follow-up to 1bf774d curl#16217 Follow-up to 5f9411f curl#15380
The reason I chose Git for Windows for the workaround was that it's quick to install It may be interesting that the MSYS2 installed via I tried diffing the runtime sources listed in the PR message, e.g. 0bc1222b and 2644508f
I think the 0bc1222b revision was a red herring (probably a bug in the packaging). I'll have to go by the date to figure out exactly which revision was used. Not horribly difficult, just an extra step.
The revision of the 2024-12-05 build of msys2-runtime was really 8d847f4 |
I seem to have hit a wall now. I think I upset GitHub Actions by running that workflow so many times. Now I can't run any workflows (at least not on Windows runners), I get a message about
But in my account settings, all usages show 0. (I don't have any payments set up). What I found so far is that I have isolated the slowdown to msys2-runtime (I ran the 2024-12-08 msys2 installer and it was "fast", I updated all packages except msys2-runtime and it stayed "fast", and I updated all packages including msys2-runtime and it slowed down). BTW, updating perl resulted in a couple of test failures. Do you by chance use |
Whatever limit I hit seems to have reset now 😌 . Might have to give this a rest for a while so I can do other things with GHA 😁 |
Based on the timing (affected Git for Windows when it first updated to Cygwin 3.5, and upstream MSYS2 only when it updated from 3.5.4 to 3.5.7) my first suspect commit was msys2/msys2-runtime@c7fe29f. I was about to build msys2-runtime with that commit reverted, and inject the built artifacts into the workflow when all my GHA runs started failing. |
Yep
And looking for that I found another bug waiting to happen:

    #######################################################################
    # return Cygwin pid from virtual pid
    #
    sub winpid_to_pid {
        my $vpid = $_[0];
        if(($^O eq 'cygwin' || $^O eq 'msys') && $vpid > 65536) {
            my $pid = Cygwin::winpid_to_pid($vpid - 65536);
            if($pid) {
                return $pid;
            } else {
                return $vpid
            }
        }
        return $vpid;
    }

Cygwin recently increased its max pid (cygwin/cygwin@363357c), so the hardcoded 65536 will no longer be correct as of 3.6.0. It should be getting the max pid by calling C api
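
To make the hazard concrete, here is a minimal Python sketch of the check the Perl helper performs. It assumes a "virtual PID" is a native Windows PID plus a fixed offset, and that anything above the offset is treated as such. The function name `classify_pid` is hypothetical, and the 4M limit for Cygwin 3.6 is taken from the discussion on this page rather than from the Cygwin sources.

```python
OLD_OFFSET = 65536              # hardcoded value in the Perl helper
NEW_MAX_PID = 4 * 1024 * 1024   # assumption: "4M" limit as of Cygwin 3.6


def classify_pid(vpid, offset):
    """Mirror the `$vpid > 65536` test with a configurable offset.

    Returns ("windows", pid) when vpid looks like an offset Windows PID,
    otherwise ("cygwin", vpid).
    """
    if vpid > offset:
        return ("windows", vpid - offset)
    return ("cygwin", vpid)


# A Cygwin PID that could not occur with the old 64K limit,
# but becomes possible once the limit is raised.
cygwin_pid = 70000

misread = classify_pid(cygwin_pid, OLD_OFFSET)
correct = classify_pid(cygwin_pid, NEW_MAX_PID)

print(misread)
print(correct)
```

With the old offset, a genuine Cygwin PID above 65536 is mistaken for an offset Windows PID, which is the collision the comment warns about. Using the platform's actual maximum as the offset avoids it.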
I think it may help to delete in your fork all workflows/jobs besides just one dl-mingw job for testing. |
Nice, looks like we'll have to do some fixing there. /cc @mback2k @dfandrich edit: The C code complementing the Perl PID logic: (this is for native Windows, so no way to query Lines 281 to 296 in d1fc1c4
PR for the tests (and some more): #16411 |
To throw another fun wrinkle into your world, I think GfW is contemplating dropping perl at some point. git-for-windows/git#5393 |
Splendid :) Thanks for the heads up! For curl we are happy with MSYS2 and don't need GfW except for this workaround |
Maybe you could take your issue to cygwin@cygwin.com with this knowledge, since we pretty well know now it is a regression in an upstream Cygwin commit? I don't know that they'll jump right on it at the moment though, they're kind of focused on regressions between 3.5.7 and 3.6.0 right now. (the commit I reverted is necessary to avoid a different bug so it's not just a matter of reverting it) |
Another option, I don't know if it's any better for you as a workaround, is to just downgrade msys2-runtime: |
Wow, nice! I hope your revert makes its way to the binaries soon. Taking to Cygwin, I'm not sure I'd like to do that. I have little insight Hopefully someone else can take this based on your research and Thanks for looking into this, much appreciated, in the name of all |
I like it better than the GfW workaround. Shorter, safer, I'll give it a spin once the double quote PR is merged.
You can squeeze more performance out of pacman if you
before calling it. That check is surprisingly expensive under MSYS2, and all it does is make sure the operation is aborted early if there's not enough free disk space for it. Very unlikely to be the case in this CI scenario, and if it is, it's no great loss if you end up with a corrupted install vs failing early. setup-msys2 action does this same change for the same reason. |
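
The exact command from the comment above did not survive in this copy of the page. Purely as an illustration of the idea, the sketch below comments out an active `CheckSpace` line in the text of a pacman configuration file. It assumes the option appears on a line of its own, as in a stock `pacman.conf`; the function name `disable_checkspace` is hypothetical.

```python
import re


def disable_checkspace(conf_text):
    """Comment out an active `CheckSpace` line; leave all other lines alone."""
    return re.sub(
        r"^[ \t]*CheckSpace[ \t]*$",
        "#CheckSpace",
        conf_text,
        flags=re.MULTILINE,
    )


# Small stand-in for the [options] section of a pacman.conf file.
sample = "[options]\nHoldPkg = pacman\nCheckSpace\n#VerbosePkgLists\n"
patched = disable_checkspace(sample)

print(patched)
```

Running the function on already patched text changes nothing, so it is safe to apply more than once. Applying it to a real install would mean reading and rewriting the configuration file (assumed to be `/etc/pacman.conf` inside the MSYS2 environment) before invoking `pacman`.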
Excellent, I'll try it! |
MSYS/MSYS2 and Cygwin are the same platform. Adjust code where they were treated differently.

- drop separate `MSYS` from buildinfo flags. Our code is using the `CYGWIN` variable and CMake (since v3.21) sets it also for `MSYS`.
- fix test1158 and test1186 to exclude them for all Win32 targets, instead of just MSYS test envs. To align behavior between MSYS and Cygwin envs. Required for recent MSYS2 releases which report themselves as Cygwin, and no longer MSYS, which broke the previous exclusion logic.
- follow Cygwin bumping its `MAX_PID` value, to avoid PID collisions.
  https://cygwin.com/git/?p=newlib-cygwin.git;a=commit;h=363357c023ce01e936bdaedf0f479292a8fa4e0f

Reported-by: Jeremy Drake
Bug: #16217 (comment)
Ref: https://www.msys2.org/news/#2025-02-14-moving-msys2-closer-to-cygwin
Closes #16411
I was just trying to merge cygwin's main branch with git-for-windows, since they are planning to release a new Cygwin 3.6.0 in the next few weeks and I wanted to see what breaks. It looks like the pipe code implicated in the commit I found has changed a lot between 3.5 and 3.6, hopefully the new code performs better (and works correctly in the case the commit was trying to fix in the first place) I saw 2:50 on the run tests step with my hack-merged 3.6 runtime: https://github.com/jeremyd2019/curl/actions/runs/13464176764/job/37626204204 (jeremyd2019@0a7c629) |
We recently switched to a known good version of Git for Windows to avoid the MSYS2/Cygwin runtime performance regression.

MSYS2 is closer to the source of the MSYS2/Cygwin projects. Its known good version is newer. Installing the downgrade is faster and safer. It also allows restoring the scripts to their original iteration, making the workaround easier to drop once the perf issue is fixed upstream.

Therefore, switch back to using MSYS2, and install the runtime downgrade before running curl tests.

Also disable `pacman`'s `CheckSpace` for best performance.

Jeremy identified the root cause of the perf regression in this Cygwin commit (from 2024-09-17):
https://cygwin.com/git/?p=newlib-cygwin.git;a=commit;h=c7fe29f5cb85242ae2607945762f7e0b9af02513

Co-authored-by: Jeremy Drake
Patch: jeremyd2019@95a404e
Ref: #16217 (comment)
Ref: #16217 (comment)
Follow-up to 116950a #16265
Follow-up to 1bf774d #16217
Follow-up to 5f9411f #15380
Closes #16424
Cygwin MAX_PID is 64K until 3.6 bumps it to 4M like Linux. Required for orgs running new AMD EPYC servers with hundreds of MT cores
@BrianInglis: Thanks! We're hopefully covered after merging 4842f22.
@jeremyd2019: Looking good! I'm glad for the results you're getting. We're also experiencing some flakiness and crashes/hangs in |
I run the Cygwin curl tests in CI and local with unlimited job slots |
The flakiness affects native Windows builds with test run parallelism enabled. I think You may also try configuring with |
Thanks for those tips which I will try out in some tests, but |
You're welcome! A missing entry means test parallelism is not enabled. Try the |
Today GHA Windows runner images (all versions) deployed an upgrade
(20250127.1.0 -> 20250203.1.0) that upgraded the default MSYS2, which
now seems to feature the October 2024 issue that caused curl runtests
run times to increase ~2.5x. It also causes test987 to fail, and vcpkg
jobs to hit their time limits and fail. Reliability also took a hit.

In October this issue came with a Git for Windows upgrade, and likely
the MSYS2 runtime update within it. It affected vcpkg jobs only, and
I mitigated it by switching them to use the default MSYS2 shell and
runtime (at `C:\msys64`): 5f9411f #15380

After today's update this mitigation no longer works. The issue also
affects `dl-mingw` jobs now, though to a lesser extent than vcpkg ones.

Tried switching back to Git for Windows which received several updates
since October, but the performance issue is still present.

I managed to mitigate the slowdown in vcpkg by lowering test parallelism
to `-j4` (from `-j8`), after which the jobs run at about half their
previous speed, and fit their time limits. `dl-mingw` builds run slower
by 1-1.5 minutes per job; they were already using `-j4`.

Example jobs:
Before (ALL GOOD):
https://github.com/curl/curl/actions/runs/13167230443/job/36750175428 installed MSYS2, mingw (-j8): 3m50s (OK)
https://github.com/curl/curl/actions/runs/13167230443/job/36750158662 default MSYS2, dl-mingw (-j4): 4m22s (OK)
https://github.com/curl/curl/actions/runs/13167230443/job/36750163392 default MSYS2, vcpkg (-j8): 3m27s (OK)
runner: https://github.com/actions/runner-images/blob/win22/20250127.1/images/windows/Windows2022-Readme.md
C:\msys64:
System: MSYS_NT-10.0-20348 fv-az1115-916 3.5.4-0bc1222b.x86_64 2024-12-05 09:27 UTC x86_64 Msys
msys2/msys2-runtime@0bc1222b
edit: really → msys2/msys2-runtime@8d847f4
After:
https://github.com/curl/curl/actions/runs/13186498273/job/36809747078 installed MSYS2, mingw (-j8): 3m48s (OK)
https://github.com/curl/curl/actions/runs/13186498273/job/36809728481 default MSYS2, dl-mingw (-j4): 5m56s (SLOW)
https://github.com/curl/curl/actions/runs/13186498273/job/36809736429 default MSYS2, vcpkg (-j8): 9m1s (SLOW)
runner: https://github.com/actions/runner-images/blob/win22/20250203.1/images/windows/Windows2022-Readme.md
C:\msys64:
System: MSYS_NT-10.0-20348 fv-az1115-498 3.5.7-2644508f.x86_64 2025-01-30 09:08 UTC x86_64 Msys
msys2/msys2-runtime@2644508f
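
As a quick arithmetic check of the before/after figures listed above, the following sketch converts the reported durations to seconds and computes the slowdown ratio per job. The helper name `to_seconds` is hypothetical; the durations are copied from the job lists.

```python
import re


def to_seconds(duration):
    """Convert a duration string such as '9m1s' to whole seconds."""
    m = re.fullmatch(r"(\d+)m(\d+)s", duration)
    if m is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    return int(m.group(1)) * 60 + int(m.group(2))


# (before, after) run times as reported in the example jobs.
timings = {
    "installed MSYS2, mingw (-j8)": ("3m50s", "3m48s"),
    "default MSYS2, dl-mingw (-j4)": ("4m22s", "5m56s"),
    "default MSYS2, vcpkg (-j8)": ("3m27s", "9m1s"),
}

ratios = {
    job: to_seconds(after) / to_seconds(before)
    for job, (before, after) in timings.items()
}

for job, ratio in ratios.items():
    print(f"{job}: {ratio:.2f}x")
```

The vcpkg job comes out at roughly 2.6x, in line with the "~2.5x" figure quoted for the October regression, while the job using the separately installed MSYS2 is unchanged.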
windows-2025 image:
C:\msys64:
System: MSYS_NT-10.0-26100 fv-az2043-515 3.5.7-2644508f.x86_64 2025-01-30 09:08 UTC x86_64 Msys
windows-2019 image:
C:\msys64:
System: MSYS_NT-10.0-17763 fv-az1434-677 3.5.7-2644508f.x86_64 2025-01-30 09:08 UTC x86_64 Msys
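
The `System:` lines above carry the runtime version and a short commit hash in a field such as `3.5.7-2644508f.x86_64`. The sketch below pulls those parts out so two runner images can be compared. It assumes the field layout seen in these lines and is not a general `uname` parser; `parse_runtime` is a hypothetical helper.

```python
import re

# Layout assumed from the lines above:
# <sysname> <host> <version>-<commit>.<arch> <build date> ...
UNAME_RE = re.compile(
    r"^(?P<sysname>\S+) (?P<host>\S+) "
    r"(?P<version>\d+\.\d+\.\d+)-(?P<commit>[0-9a-f]+)\.(?P<arch>\w+) "
    r"(?P<built>\d{4}-\d{2}-\d{2}) "
)


def parse_runtime(line):
    """Extract the runtime version, embedded commit hash and build date."""
    m = UNAME_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    version = tuple(int(part) for part in m["version"].split("."))
    return {"version": version, "commit": m["commit"], "built": m["built"]}


before = parse_runtime(
    "MSYS_NT-10.0-20348 fv-az1115-916 3.5.4-0bc1222b.x86_64 "
    "2024-12-05 09:27 UTC x86_64 Msys"
)
after = parse_runtime(
    "MSYS_NT-10.0-20348 fv-az1115-498 3.5.7-2644508f.x86_64 "
    "2025-01-30 09:08 UTC x86_64 Msys"
)

print(before)
print(after)
```

As noted elsewhere in this thread, the embedded hash is only a hint: the `0bc1222b` value did not correspond to the revision actually built, so the build date is the more reliable key.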
This PR:
final: https://github.com/curl/curl/actions/runs/13186498273/job/36809736429 GfW, vcpkg (-j4): ~7m (SLOW)
test: https://github.com/curl/curl/actions/runs/13187992987/job/36814644852?pr=16217, GfW, vcpkg (-j8): ~11m (SLOWER)
Before and after (unused) Git for Windows (SLOW as tested in this PR):
C:\Program Files\Git
System: MINGW64_NT-10.0-20348 fv-az1760-186 3.5.4-395fda67.x86_64 2024-11-25 09:49 UTC x86_64 Msys
msys2/msys2-runtime@395fda67 (fork)
Before and after (used) MSYS2 installed via msys2/setup-msys2 (OK):
D:\a_temp\msys64
System: MINGW64_NT-10.0-20348 fv-az836-378 3.5.4-0bc1222b.x86_64 2024-12-05 09:27 UTC x86_64 Msys
Perl pipe issue report from October, still open:
msys2/msys2-runtime#230
ARM deadlock fixed by GfW 2.47.1(1), but for x86_64, on a quick glance:
msys2/msys2-runtime@290bea9
Possibly interesting:
msys2/msys2-autobuild#62