Skip to content

CI: Rebase NumPy compiled extension test modules on Cygwin#23073

Merged
mattip merged 18 commits intonumpy:mainfrom
DWesl:patch-2
Jan 30, 2023
Merged

CI: Rebase NumPy compiled extension test modules on Cygwin#23073
mattip merged 18 commits intonumpy:mainfrom
DWesl:patch-2

Conversation

@DWesl
Copy link
Copy Markdown
Contributor

@DWesl DWesl commented Jan 23, 2023

This seems to be causing fork failures on Cygwin. Let's see if this reduces the number of failures.

Fixes #23070?

This seems to be causing fork failures on Cygwin.  Let's see if this reduces the number of failures.
@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 23, 2023

Four hours to see if all extension modules can find their dependencies. That step really shouldn't take longer than the test that loads all the modules, and definitely not more than twice as long.

@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 24, 2023

Six hundred errors on the last two attempts, which is less than ideal given that the original problem was eight. (EDIT: 8-41 in the last few PRs)

DWesl added 2 commits January 23, 2023 19:25
Let's see if this shows why everything's got a fork() failure now.
DWesl added 2 commits January 26, 2023 08:29
It should be plenty of time; each of these commands should complete in maybe five seconds per file on a slow day.
Let's see if this fixes the 8-50 fork failures.
@seberg
Copy link
Copy Markdown
Member

seberg commented Jan 27, 2023

On the cygwin help it still mentions to check for bad "bloda" software running (mostly anti-virus stuff), no idea if that can help...

But besides that, I am wondering if we have to opt for some hot-fix to get CI green again soon.

@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 27, 2023

On the cygwin help it still mentions to check for bad "bloda" software running (mostly anti-virus stuff), no idea if that can help...

It can, but I don't see anything obvious:
https://github.com/actions/runner-images/blob/main/images/win/Windows2022-Readme.md

But besides that, I am wondering if we have to opt for some hot-fix to get CI green again soon.

pytestmark = pytest.mark.xfail(sys.platform == "cygwin", message="Random fork() failures", error=BlockingIOError)

at module level in a couple of places should work to get CI green, but I haven't tested that.

Forgot to check this earlier.
DWesl added 2 commits January 27, 2023 12:19
Hopefully this will keep the main memory space clear enough to allow all tests to succeed.
This should help debugging a bit.
@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 27, 2023

The equivalent of -j 4 may separate the extension modules enough to work without the overhead of --forked (almost five hours to run just the F2Py tests seems like too much).

DWesl added 4 commits January 27, 2023 19:02
Hopefully this helps, since --forked takes almost five hours.
Let's see if this eliminates the fork failures.

It's back to around where it was before I started tinkering, so this approach may not work.
Tests hang, which is less than ideal.
Hopefully this one works well.
@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 29, 2023

The tests for every submodule but f2py are working, it's just that last submodule causing problems. We could drop those tests from CI runs, or I could port over the F2Py xfail list from #23114 and see if the combination allows CI to pass.

@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 29, 2023

On the cygwin help it still mentions to check for bad "bloda" software running (mostly anti-virus stuff), no idea if that can help...

It can, but I don't see anything obvious: https://github.com/actions/runner-images/blob/main/images/win/Windows2022-Readme.md

I found the image description:
https://github.com/actions/runner-images/blob/win22/20230123.1/images/win/Windows2022-Readme.md
The changes on Jan 24 (vcpkg, Kubectl, and Pulumi are the ones that catch my attention, the others don't look relevant) don't have obvious antivirus or DLL-handling changes.
The changes on Jan 19 (Vcpkg and Pester are the ones I don't know) also doesn't show anything obvious.
The Jan 12 changes are probably too far back to be relevant to this, since it got noticed on Jan 22

But besides that, I am wondering if we have to opt for some hot-fix to get CI green again soon.

pytestmark = pytest.mark.xfail(sys.platform == "cygwin", message="Random fork() failures", error=BlockingIOError)

at module level in a couple of places should work to get CI green, but I haven't tested that.

#23114 now up: it takes a few more places than I thought it would if running all the tests together.

Also adjust CI so they don't immediately collide with NumPy.
I forgot to do that last time, which caused problems.
@DWesl
Copy link
Copy Markdown
Contributor Author

DWesl commented Jan 29, 2023

Or write the locations of the NumPy extension modules to the rebase database so the F2Py tests don't put their modules right on top of those, guaranteeing fork() failures.

Copy link
Copy Markdown
Member

@mattip mattip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, unbreaks the cygwin build. I am a little concerned about listing the non-f2py directories to test, but maybe this is good enough for now?

# Not sure if that will run the top-level tests.
shell: "C:\\tools\\cygwin\\bin\\bash.exe -o igncr -eo pipefail {0}"
run: |
for submodule in array_api compat core distutils fft lib linalg ma matrixlib polynomial random tests typing;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what is the best path forward here:

  • a list of directories to use, which will break the next time we add/remove a top-level directory
  • some ugly path parsing like find numpy -maxdepth 1 -type d | cut -f2 -d/ | grep -v f2py | grep -v numpy | grep -v "^_" to get all the directories except f2py
  • add a command line switch to runtests.py to allow skipping a subdirectory
  • move the build to meson, and add a command line option to dev.py to skip a subdirectory

What you have is the simplest, but will be fragile. On the other hand, how often do we change the top-level directories?

Maybe is there a way to verify that your list is complete with a helper script?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might work runtests.py ... -- --deselect f2py

Copy link
Copy Markdown
Contributor Author

@DWesl DWesl Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for name in numpy/*; 
do 
    if [ -d "${name}" -a "${name}" != numpy/f2py -a "${name:6:1}" != "_" ] ;
    then
        echo ${name##numpy/}; 
    fi;
done

or

find numpy/* -maxdepth 0 -type d -a ! -name f2py -a ! -name _\* | cut -f2 -d/

would also work. I think a helper script to check the list is complete would need the same path parsing.

I don't think this separation is needed anymore, so I'll go back to the simple invocation and see if that works.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All submodule tests run at the same time again; PR title changed to match.

Ideally this works nicely and I can change the PR name.
If not, I put the split back, then the parallelization if that still doesn't work.
This assumes NumPy is rebased before tests run,
but does not assume the locations are in the database.
@DWesl DWesl changed the title CI: Split up NumPy compiled extension test modules CI: Rebase NumPy compiled extension test modules Jan 30, 2023
@DWesl DWesl changed the title CI: Rebase NumPy compiled extension test modules CI: Rebase NumPy compiled extension test modules on Cygwin Jan 30, 2023
Copy link
Copy Markdown
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks OK to me, not super pretty, but also reasonably little additional code in the test.

Thanks for so valiantly tracking this down!

@mattip mattip merged commit 9d2c019 into numpy:main Jan 30, 2023
@mattip
Copy link
Copy Markdown
Member

mattip commented Jan 30, 2023

Thanks @DWesl

@DWesl DWesl deleted the patch-2 branch January 8, 2025 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cygwin CI job fails with f2py errors

3 participants