CI, MAINT: Windows 3.11 CI failure with file access issue #19852
Adding a partial log:
Doesn't look familiar, I'm afraid.
Quoting myself from #19797
Basically, there seems to be a combination of factors such that several paths in our Meson build try to independently and simultaneously build scipy-openblas. Or perhaps it's an outright Meson bug.
There's nothing to build there; there's detection but no building. The paths don't really make sense either: why would CMake be used here all of a sudden? And this version is old and used to work fine:
This is the one regular CI job that still uses the openblas tarballs rather than the wheels. Detection actually looks fine:
What seems to be happening is that the sdist gets created successfully, but on cleaning up the build dir that meson-python creates, something is holding on to a file handle and then things go haywire. Maybe a Meson 1.3.1 issue, or due to a change in GHA CI setup for Windows, or ... What changed here most recently is that gh-19724 disabled build isolation for this job 3 days ago. But CI passed on that PR, so if it's the cause, then it's intermittent. EDIT: the failures on this job only started yesterday.
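To illustrate the failure mode described above (a directory cleanup tripping over a file handle that another process still holds), here is a minimal Python sketch of a retrying delete. It is only an illustration, not the actual Meson or meson-python fix. The function name `remove_tree_with_retries` and its `attempts` and `delay` parameters are hypothetical. It also assumes that a held handle on Windows typically surfaces as `PermissionError`.

```python
import shutil
import time


def remove_tree_with_retries(path, attempts=5, delay=0.5):
    """Delete `path`, retrying if a file in it is briefly locked.

    Hypothetical helper for illustration only. Returns the number of the
    attempt on which the directory was confirmed gone.
    """
    for attempt in range(1, attempts + 1):
        try:
            shutil.rmtree(path)
            return attempt
        except FileNotFoundError:
            # Nothing left to remove, so treat this as success.
            return attempt
        except PermissionError:
            # Assumed symptom of another process holding a file handle.
            if attempt == attempts:
                raise
            # Back off a little longer on each retry.
            time.sleep(delay * attempt)
```

A retry loop like this only papers over a short-lived lock. If the handle is never released, the last attempt still raises, which at least fails loudly instead of leaving a half-deleted build dir.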
I'll note that GHA did have a significant outage earlier today, so perhaps it was that. I re-ran the failed job linked above (https://github.com/scipy/scipy/actions/runs/7477725340/job/20364680808), and it's past the point now where it failed last time around. Yet another suspect: a new Windows GHA runner image: https://github.com/actions/runner-images/blob/main/images/windows/Windows2019-Readme.md. That's the newest change; none of our build deps had a release recent enough.
As an aside, the CI definition seems a bit... broken. Well, really, PowerShell is a bit broken. If a command fails that other commands rely on, the job shouldn't continue by trying to delvewheel + install the result.
Yes indeed. I can never remember the bad PowerShell syntax for anything, and that includes halt-on-error. When we had CI jobs on Azure it was figured out at some point, but the whole thing was unreadable.
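For what it's worth, the halt-on-error behaviour being asked for can be sketched independently of the shell. Below is a minimal Python version, assuming the CI steps can be expressed as a list of argv lists. The name `run_steps` is hypothetical and this is not what the SciPy workflow actually does.

```python
import subprocess
import sys


def run_steps(steps):
    """Run commands in order and stop at the first one that fails.

    `steps` is a list of argv lists. Returns the commands that completed.
    Raises subprocess.CalledProcessError on the first non-zero exit code,
    so later steps never run against a broken intermediate result.
    """
    completed = []
    for cmd in steps:
        # check=True turns a non-zero exit code into an exception.
        subprocess.run(cmd, check=True)
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    ok = [sys.executable, "-c", "print('build step')"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    later = [sys.executable, "-c", "print('should never run')"]
    try:
        run_steps([ok, bad, later])
    except subprocess.CalledProcessError as exc:
        print(f"halted: exit code {exc.returncode}")
```

The point is just that the step after the failing one is never started, which is the behaviour missing from the job as described.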
Still happening intermittently, so frequency is probably related to both the size of the build dir/definitions and the details of the Windows image (responsiveness etc.). Fix in mesonbuild/meson#12726 looks promising, thanks @thalassemia! I'll see if I can change the job to avoid this problem in the first place in the meantime.
This avoids looking for scipy-openblas with CMake (which we never want), and avoids looking for it twice (that was an oversight). As a result, we should be robust to whatever the underlying problem behind the CI failures reported in scipygh-19852 is. Closes scipygh-19852 [skip cirrus] [skip circle]
If anyone is curious, the underlying issue was probably mesonbuild/meson-python#559.
Awesome, thanks for getting to the bottom of that.
This is affecting both the maintenance/1.12.x branch (discussed at: #19797 (comment); sample log: https://github.com/scipy/scipy/actions/runs/7467279245/job/20323369865?pr=19797) and the main branch: #19849 (log: https://github.com/scipy/scipy/actions/runs/7477725340/job/20351079843?pr=19849).
From a release management standpoint, this actually gives me some confidence that the matter is not a blocker to proceed with 1.12.0 RC2 (i.e., not specific to that branch).