Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Test NumPy against OpenBLAS weekly builds #24250

Merged
merged 11 commits into from
Jul 31, 2023

Conversation

honno
Copy link
Contributor

@honno honno commented Jul 24, 2023

This PR tests NumPy against OpenBLAS weekly builds, as well as stable builds like before.

Things to note:

cc @mattip @steppi @rgommers

@honno honno force-pushed the test-openblas-nightlies branch 2 times, most recently from 90034ad to 0373773 Compare July 24, 2023 13:20
@rgommers rgommers added this to the 2.0.0 release milestone Jul 24, 2023
@rgommers
Copy link
Member

@honno the failures look relevant.

@mattip
Copy link
Member

mattip commented Jul 24, 2023

The macos-arm64 wheel build failure is due to the new code, it seems the getopts parsing is not right on macOS.

@steppi
Copy link
Contributor

steppi commented Jul 24, 2023

The macos-arm64 wheel build failure is due to the new code, it seems the getopts parsing is not right on macOS.

Right, this is why I reverted using getopt in the SciPy version of this PR scipy/scipy#18889 (comment)

@honno
Copy link
Contributor Author

honno commented Jul 24, 2023

The macos-arm64 wheel build failure is due to the new code, it seems the getopts parsing is not right on macOS.

Right, this is why I reverted using getopt in the SciPy version of this PR scipy/scipy#18889 (comment)

Fixed, all green now 😅

.github/workflows/linux_meson.yml Show resolved Hide resolved
tools/openblas_support.py Outdated Show resolved Hide resolved
.github/workflows/linux_meson.yml Outdated Show resolved Hide resolved
tools/openblas_support.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@honno honno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per Matti's suggestions I've simplified a few things... and we should actually be testing against the weekly builds of OpenBLAS now 😅

.github/workflows/linux_meson.yml Show resolved Hide resolved
tools/openblas_support.py Outdated Show resolved Hide resolved
@honno honno requested a review from mattip July 25, 2023 16:24
@mattip
Copy link
Member

mattip commented Jul 26, 2023

Two last things:

  • the jobs are called "meson_spin (true)" and "meson_spin (false)". Could you think of a way to get better names?
  • Is there a check you can point to that the correct OpenBLAS version is being used? The azure builds use this, which might need to grow a --nightly option that would verify that the OpenBLAS version is not the one in the ``OPENBLAS_LONG` value:

    numpy/azure-pipelines.yml

    Lines 180 to 185 in 511f77d

    - bash: |
    python -m pip install threadpoolctl
    python tools/openblas_support.py --check_version
    displayName: 'Verify OpenBLAS version'
    condition: eq(variables['USE_OPENBLAS'], '1')

@honno
Copy link
Contributor Author

honno commented Jul 26, 2023

the jobs are called "meson_spin (true)" and "meson_spin (false)". Could you think of a way to get better names?

Yep, so TIL name attribute in jobs determines the... name, and we can do a hacky condition in a GitHub expression (see actions/runner#409 (comment)) to change it depending on say matrix values. Currently we get

Test Linux (stable OpenBLAS)
Test Linux (nightly OpenBLAS)

Is there a check you can point to that the correct OpenBLAS version is being used?

I played around with things, no success so far. Notably openblas_support.py::test_version() uses import numpy, which gives us an ImportError, seemingly by virtue of how NumPy is built in this workflow. Also its a pain to explore/debug how test_version() works locally. Might write a separate script for now that just prints the version.

@mattip
Copy link
Member

mattip commented Jul 26, 2023

Notably openblas_support.py::test_version() uses import numpy, which gives us an ImportError, seemingly by virtue of how NumPy is built in this workflow.

Since this is using the spin workflow, you will need to use spin python ... to execute the check

@honno
Copy link
Contributor Author

honno commented Jul 27, 2023

which might need to grow a --nightly option that would verify that the OpenBLAS version is not the one in the OPENBLAS_LONG value:

The OpenBLAS nightly build seemingly has the same "version" as the stable build, looking at this run

ValueError: nightly OpenBLAS version should not be 0.3.23.dev, got [{'user_api': 'blas', 'internal_api': 'openblas', 'num_threads': 2, 'prefix': 'libopenblas', 'filepath': '/usr/lib/libopenblasp-r0.3.23.dev.so', 'version': '0.3.23.dev', 'threading_layer': 'pthreads', 'architecture': 'SkylakeX'}]

I checked against OPENBLAS_V, as thats what openblas_support.py::test_version() uses—I might of misunderstand you though. I assume the version attribute (however threadpoolctl gets it) would be whatever the latest released version is anywho. I'm not sure what to look for really...

Reproducing building with nightly OpenBLAS builds locally has been a huge time sink already, been hard to experiment 🙃

@rgommers
Copy link
Member

That test_version check is horrible, I suggest we ignore it. Looking at the CI logs, it's clear that the "stable" job does python tools/openblas_support.py while the "nightly" job does python tools/openblas_support.py --nightly. If we'd just uncomment a print statement in download_openblas to verify the exact artifact we're downloading, we should be all good here.

@honno
Copy link
Contributor Author

honno commented Jul 31, 2023

If we'd just uncomment a print statement in download_openblas to verify the exact artifact we're downloading, we should be all good here.

Introduced a new comment as wasn't sure what to do with # print(f"Downloading {length} from {filename}", file=sys.stderr). Looks like we're downloading the right files!

https://github.com/numpy/numpy/actions/runs/5716123665/job/15486923504?pr=24250#step:4:63
https://github.com/numpy/numpy/actions/runs/5716123665/job/15486923255?pr=24250#step:4:63

@rgommers
Copy link
Member

The Circle CI build wasn't touched in this PR and is only failing because this branch is behind. The Cirrus CI failures are unrelated too. Still, a few too many failures to confidently merge - let's retry that now (I merged main into this PR).

@rgommers
Copy link
Member

Okay, the intel_spr_sde_test failures are clearly unrelated, and the rest all looks good now. In it goes. Thanks @honno, @mattip & @steppi!

@rgommers rgommers merged commit d99dd6d into numpy:main Jul 31, 2023
43 of 53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants