Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Windows NumPy 2.x int shims #19724

Merged

Conversation

tylerjereddy
Copy link
Contributor

[skip cirrus]

@tylerjereddy tylerjereddy added needs-work Items that are pending response from the author maintenance Items related to regular maintenance tasks labels Dec 21, 2023
@@ -27,7 +27,7 @@ class TRLIBQuadraticSubproblem(BaseQuadraticSubproblem):
if fwork_view.shape[0] > 0:
fwork_ptr = &fwork_view[0]
ctrlib.trlib_krylov_prepare_memory(itmax, fwork_ptr)
self.iwork = np.zeros([iwork_size], dtype=int)
self.iwork = np.zeros([iwork_size], dtype=np.dtype("long"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks right to me.

@andyfaff
Copy link
Contributor

I'm happy with the optimize changes.

@andyfaff
Copy link
Contributor

Although we're not going to get visibility in CI on whether this all works unless we amend a windows job to install numpy2.0.

@andyfaff
Copy link
Contributor

andyfaff commented Dec 21, 2023

How about we amend the Windows job full_build_sdist_wheel to install numpy as:

python -m pip install --pre --upgrade --timeout=60 -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy

?

I think that may be worth doing in this PR. We can always change it later.

@@ -141,4 +143,7 @@ def lambertw(z, k=0, tol=1e-8):
>>> -lambertw(-np.log(0.5)) / np.log(0.5)
(0.64118574450498589+0j)
"""
# TODO: special expert should inspect this
# interception; better place to do it?
k = np.asarray(k, dtype=np.dtype("long"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the function signature in special/functions.json and lambertw.h k is definitely expected to be a long. Therefore this change seems to be correct to me. It's easier to do this kind of manipulation in Python rather than in lower level code.

**not a special expert.

Copy link
Contributor

@steppi steppi Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That it's only an issue for lambertw and not any of the other special ufuncs which have an input of type long seems to relate to lambertw being the only of these written in C++, the others are all written in Cython.

There's 32 such functions.

[['_lambertw', '_special.h++', 'Dld->D'],
 ['_spherical_in', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_in', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_in_d', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_in_d', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_jn', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_jn', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_jn_d', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_jn_d', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_kn', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_kn', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_kn_d', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_kn_d', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_yn', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_yn', '_spherical_bessel.pxd', 'ld->d'],
 ['_spherical_yn_d', '_spherical_bessel.pxd', 'lD->D'],
 ['_spherical_yn_d', '_spherical_bessel.pxd', 'ld->d'],
 ['eval_chebyc', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_chebys', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_chebyt', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_chebyu', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_gegenbauer', 'orthogonal_eval.pxd', 'ldd->d'],
 ['eval_genlaguerre', 'orthogonal_eval.pxd', 'ldd->d'],
 ['eval_hermite', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_hermitenorm', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_jacobi', 'orthogonal_eval.pxd', 'lddd->d'],
 ['eval_laguerre', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_legendre', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_sh_chebyt', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_sh_chebyu', 'orthogonal_eval.pxd', 'ld->d'],
 ['eval_sh_jacobi', 'orthogonal_eval.pxd', 'lddd->d'],
 ['eval_sh_legendre', 'orthogonal_eval.pxd', 'ld->d']]

Is Cython doing something to handle this? The plan though is to have all of these translated into C++ before we branch for 1.13. lambertw and the spherical bessel functions have convenient wrappers so they can support kwargs with default arguments, but the orthogonal polynomial evaluation functions are called directly. We could write wrappers for those and do the conversion in Python like you've done here, but the functions would lose ufunc methods and attributes, which some might complain about.

I think we could change the ufunc generation infrastructure to explicitly use int64 types instead of long but I don't know offhand how to put everything together to make that work. I think it's fine to just do the conversion in Python for now, and I'll work on updating the ufunc infrastructure before rewriting orthogonal_eval.pxd.

@lucascolley lucascolley added this to the 1.13.0 milestone Dec 21, 2023
@tylerjereddy
Copy link
Contributor Author

tylerjereddy commented Dec 23, 2023

I pushed a commit to add the suggested NumPy pre-release install line to one of the Windows CI tests (didn't check that on my fork...), and also another shim that brings this branch down to a single failure alongside NumPy pre-release for me locally on Windows:

scipy\linalg\tests\test_decomp.py::TestSchur::test_simple

I think Schur decompositions may have tricky numerical properties, but the tolerance is quite far off:

(looks like the copy-paste from Windows terminal is formatted a bit weird...)

________________________________________________________________________ TestSchur.test_simple _________________________________________________________________________ scipy\linalg\tests\test_decomp.py:1798: in test_simple                                                                                                                       self.check_schur(a, tc, zc, rtol=1e-14, atol=5e-15)                                                                                                                          a          = [[8, 12, 3], [2, 9, 3], [10, 3, 6]]                                                                                                                         self       = <scipy.linalg.tests.test_decomp.TestSchur object at 0x000001F4FDEB4490>                                                                                     t          = array([[17.78252605,  6.20963206,  5.1693123 ],                                                                                                            [ 0.        ,  2.60873697,  7.65954228],                                                                                                                                 [ 0.        , -1.49024897,  2.60873697]])                                                                                                                                 tc         = array([[17.782528 +0.j        ,  6.0448318-0.30250442j,                                                                                                     -2.1454806-4.903747j  ],                                                                                                                                                [ 0.       +0.j       ....4206657j ],                                                                                                                                    [ 0.       +0.j        ,  0.       +0.j        ,                                                                                                                           2.6087368-3.3785524j ]], dtype=complex64)                                                                                                                               z          = array([[-0.65829284, -0.5572288 ,  0.50610928],                                                                                                            [-0.3732151 , -0.34228403, -0.86229469],                                                                                                                                 [-0.65372855,  0.75653005, -0.01735692]])                                                                                                                                 zc         = array([[-0.60887337-0.2502455j , -0.5073556 +0.21033552j,                                                                                                   -0.42224124-0.2944693j ],                                                                                                                                               [-0.34519696-0.1418...606346j],                                                                                                                                          [-0.6046518 -0.24851048j,  0.6920515 -0.01528251j,                                                                                                                         0.27425295-0.1351129j ]], dtype=complex64)                                                                                                                      scipy\linalg\tests\test_decomp.py:1785: in check_schur                                                                                                                       assert_allclose(u @ t @ u.conj().T, a, rtol=rtol, atol=atol,                                                                                                                 a          = [[8, 12, 3], [2, 9, 3], [10, 3, 6]]                                                                                                                         atol       = 5e-15                                                                                                                                                       rtol       = 1e-14                                                                                                                                                       self       = <scipy.linalg.tests.test_decomp.TestSchur object at 0x000001F4FDEB4490>                                                                                     t          = array([[17.782528 +0.j        ,  6.0448318-0.30250442j,                                                                                                     -2.1454806-4.903747j  ],                                                                                                                                                [ 0.       +0.j       ....4206657j ],                                                                                                                                    [ 0.       +0.j        ,  0.       +0.j        ,                                                                                                                           2.6087368-3.3785524j ]], dtype=complex64)                                                                                                                               u          = array([[-0.60887337-0.2502455j , -0.5073556 +0.21033552j,                                                                                                   -0.42224124-0.2944693j ],                                                                                                                                               [-0.34519696-0.1418...606346j],                                                                                                                                          [-0.6046518 -0.24851048j,  0.6920515 -0.01528251j,                                                                                                                         0.27425295-0.1351129j ]], dtype=complex64)                                                                                                                      ..\..\..\..\..\miniconda3\envs\py_311_new\Lib\contextlib.py:81: in inner                                                                                                     return func(*args, **kwds)                                                                                                                                           E   AssertionError:                                                                                                                                                      E   Not equal to tolerance rtol=1e-14, atol=5e-15                                                                                                                        E   Schur decomposition does not match 'a'                                                                                                                               E   Mismatched elements: 9 / 9 (100%)                                                                                                                                    E   Max absolute difference among violations: 4.91512492e-06                                                                                                             E   Max relative difference among violations: 1.43931471e-06                                                                                                             E    ACTUAL: array([[ 8.000003+1.430511e-06j, 12.000004-1.668930e-06j,                                                                                                   E            3.000004-4.768372e-07j],                                                                                                                                    E          [ 2.000002+1.072884e-06j,  9.000001+4.768372e-07j,...                                                                                                         E    DESIRED: array([[ 8, 12,  3],                                                                                                                                       E          [ 2,  9,  3],                                                                                                                                                 E          [10,  3,  6]])                                                                                                                                                        args       = (<function assert_allclose.<locals>.compare at 0x000001F4FDF584A0>, array([[ 8.000003 +1.4305115e-06j, 12.000004 -1.66...         6.0000043+0.0000000e+00j]], dtype=complex64), array([[ 8, 12,  3],                                                                                                                                [ 2,  9,  3],                                                                                                                                                            [10,  3,  6]]))                                                                                                                                                           func       = <function assert_array_compare at 0x000001F4FBA58400>                                                                                                       kwds       = {'equal_nan': True, 'err_msg': "Schur decomposition does not match 'a'", 'header': 'Not equal to tolerance rtol=1e-14, atol=5e-15', 'strict': False, ...}                                                                                                                                                                            self       = <contextlib._GeneratorContextManager object at 0x000001F4FBA46DD0>    

Could be my version of scipy-openblas32 alongside NumPy pre-release on Windows? Didn't check too closely on that yet, if we're worried about it..

@tylerjereddy
Copy link
Contributor Author

tylerjereddy commented Dec 24, 2023

To allow testing with NumPy pre-release on the Windows CI job that a reviewer requested be modified, I've additionally disabled build isolation for PyPA build step there. I'm not sure if that drifts away from the original intention of that job, since we're basically ignoring pyproject.toml there for the sdist/wheel build now, but if folks want pre-release testing that's probably what I have to do.

@tylerjereddy tylerjereddy force-pushed the treddy_issue_19605_win_int_types branch 2 times, most recently from f516118 to 34f454c Compare December 24, 2023 18:06
@tylerjereddy
Copy link
Contributor Author

Ok, now CI is reproducing the 1 Windows failure with NumPy pre-release that I reproduced locally in linalg/tests/test_decomp.py::TestSchur::test_simple. Perhaps we should ping a linalg person on that one, but I think I'll hold off on the ping given time of year/Holiday. That's progress at least. I'd say the changes to the sdist/wheel test job are not ideal, but seems "ok" for the current temporary purpose to repro for NumPy 2.x testing for now I suppose.

@tylerjereddy tylerjereddy changed the title WIP, MAINT: Windows NumPy 2.x int shims MAINT: Windows NumPy 2.x int shims Dec 29, 2023
@tylerjereddy tylerjereddy removed the needs-work Items that are pending response from the author label Dec 29, 2023
@tylerjereddy
Copy link
Contributor Author

CI looks good here now (Circle failure is just rebase/merge mechanics for gh-19750 I think), includes full test suite passing on Windows with NumPy pre-release.

I'll remove the WIP, though may still benefit from review/comment clean up a bit I suppose.

@tylerjereddy
Copy link
Contributor Author

@andyfaff how is this looking now? Maybe I should check with @ilayn on the additional Schur decomposition shim?

@ilayn
Copy link
Member

ilayn commented Jan 2, 2024

We don't even need to map it to long integer since all will be mapped to LAPACK flavors, d, D, f , F afterwards by f2py. Hence if this is needed for NumPy 2.0 looks good, otherwise we can cast it to inexact type of np.float64. Both are valid to me.

@andyfaff
Copy link
Contributor

andyfaff commented Jan 2, 2024

The cirrus cron job that ran on Jan 1 worked for all the wheels. However the last GH Cron job failed for all the windows wheels. Whilst I think it would work now (because you added the windows check for numpy2 in this PR) I still would like to see a complete wheel build for this PR. Either that, or we merge and see if all the Cron jobs now go green.

@andyfaff
Copy link
Contributor

andyfaff commented Jan 2, 2024

Is this PR marked for backporting?

@tylerjereddy tylerjereddy force-pushed the treddy_issue_19605_win_int_types branch from 9430f02 to 28b71a8 Compare January 2, 2024 21:00
@tylerjereddy
Copy link
Contributor Author

I didn't mark the other Windows int ones for backporting, but if you really want we could try to backport them all. The main reason I wasn't too aggressive on that was that NumPy 2.x support was slated for 1.13.0, which should happen soon after 1.12.0 anyway. But we can do it if you think that's better to support right away for RC2.

I've pushed an empty commit after rebasing to start the wheel builds here.

@tylerjereddy
Copy link
Contributor Author

Half the Windows wheel builds pass and the other half fail, depending on Python version, but it doesn't look related to my shims and I'm also not convinced we should ever be using pytest release candidates in our wheel builds like here (pytest-8.0.0rc1).

I think I'm currently -0.5 on the backporting, just because the surface area of the disruption might increase to two separate minor releases with the same integer shims, and Windows stuff is annoying to debug, so isolating any surprises related to that to 1.13.0 might be helpful, but I don't really mind either way.

@andyfaff
Copy link
Contributor

andyfaff commented Jan 2, 2024

Are the 1.12 release wheels going to be built against numpy 2.0? If yes, then won't the entire test suite have to pass, or will we disable/ignore the failing tests?

@tylerjereddy
Copy link
Contributor Author

Are the 1.12 release wheels going to be built against numpy 2.0?

No, we're using the stable release of NumPy and the usual approach of pinning to an oldest supported version, for one final release. Then in 1.13.0 we'll switch to the "new way."

* related to scipygh-19605

* this drops the number of test suite failure from 213 to 11
on SciPy `main` alongside NumPy `main`, but has only been
tested in that one platform/scenario, and contains `TODO` comments
where I felt I was cheating a little too much

[skip cirrus]
* add a `stats` integer type shim that solves most of
the remaining failures on Windows with NumPy `main`

* I only see one remaining test failure locally:
`scipy\linalg\tests\test_decomp.py::TestSchur::test_simple`

* address a reviewer request to test pre-release NumPy
on Windows in CI

[skip cirrus]
* disable build isolation in the Windows CI job
that a reviewer requested be modified to test
with NumPy pre-release

[skip cirrus]
* shim the handling of integer arrays in `schur`
to restore the old Windows long behavior with
NumPy 2.x

[skip cirrus]
* rebase the PR and build wheels as test

[wheel build]
@tylerjereddy tylerjereddy force-pushed the treddy_issue_19605_win_int_types branch from 28b71a8 to b7a2f35 Compare January 6, 2024 20:15
@tylerjereddy
Copy link
Contributor Author

There was a merge conflict that directly affected one of my integer shims, so I've tried to resolve that, but it does affect the meaning so I'm going to lean on the new CI job to check that we're still ok on Windows with NumPy pre-release..

I don't think there are any substantive reviewer concerns outstanding here, so I may be tempted to merge if we're still good on Windows with NumPy 2 dev. I think the CI changes should be reverted when NumPy 2.0.0 comes out, which shouldn't be that long anyway..

@tylerjereddy
Copy link
Contributor Author

the merge conflict was in scipy/stats/_stats_py.py, fyi

@tylerjereddy
Copy link
Contributor Author

My current impression is that those two failures are not related, probably some Windows + NumPy pre-release stuff a bit like #19804?

@h-vetinari
Copy link
Member

probably some Windows + NumPy pre-release stuff a bit like #19804?

It's possible that something got stricter in pytest 8 of course, but it looks very different from the multiple-warnings issue in #19804:

      assert mod not in mods
  E   AssertionError
          mod        = <module 'scipy.conftest' from 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\cibw-run-xjxg7tr2\\cp39-win_amd64\\venv-test\\lib\\site-packages\\scipy\\conftest.py'>
          mods       = [<module 'scipy.conftest' from 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\cibw-run-xjxg7tr2\\cp39-win_amd64\\venv-test\\lib\\site-packages\\scipy\\conftest.py'>]

These two are the same but apparently shouldn't be. I didn't see any obviously related changes in pytest for that, but found that this is pytest-dev/pytest#9765.

@tylerjereddy
Copy link
Contributor Author

Ok, I guess my point is that I'm probably not causing the problems with my diff.


- name: Build
run: |
python -m build -Csetup-args="-Duse-pythran=false"
python -m build --no-isolation -x -Csetup-args="-Duse-pythran=false"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I guess my point is that I'm probably not causing the problems with my diff.

Actually, it's quite likely that you are causing this with this change; I only understood this relationship now after reading some more of the pytest issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow, your comment is on a diff for a CI job that is passing, and has been passing for some time in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two CI jobs here are failing (though in jobs that aren't always run, so I don't know if they fail without the diff here; tried and failed to check when they were last run, and if pytest 8 was present then).

Both failing jobs are pulling in pytest 8.0.0rc1, and the pytest issue describes that the problem is related to not using isolation, which this PR does. These are the ingredients for my suspicion.

In the pytest issue you mention that the job without isolation is passing, but perhaps it's not using the pytest rc? Also, my suspicion would be easy to test by simply removing --no-isolation here and seeing if the wheel builds pass.

The alternative would be to simply restrict to pytest <8 in the wheel builds now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I get your point now; the wheel builds are not built by this workflow, right? I got tripped up by the naming, as .github/workflows/windows.yml is pretty generic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I didn't touch the wheel builds in this PR.

To be fair, you aren't entirely wrong to suspect disabled build isolation, it should be off in the wheel builds too, because we are intentionally pulling in the NumPy pre-release, both in the wheel builds and in the CI job that I modified at a reviewer's request here.

As for your ideas, yeah you'll see above that I say:

... I'm also not convinced we should ever be using pytest release candidates in our wheel builds like here (pytest-8.0.0rc1).

Perhaps I'll try pushing that in, though I'm not sure it should really be needed to move forward with fixing the few hundred test failures here.

It isn't clear to me that build isolation itself is involved, but the NumPy pre-release with Python versions below 3.11 seems to be a common issue.

* pin the version of `pytest` used for wheel builds,
because a release candidate is causing trouble on
Windows

[wheel build]
@tylerjereddy
Copy link
Contributor Author

Ok, I pushed a commit that attempts to hard-pin pytest version used for cibuildwheel via the pyproject.toml section. If CI is green after that, hopefully we can just open an issue to investigate later and move on, since that's a release candidate of pytest only. Fingers crossed...

@tylerjereddy tylerjereddy merged commit 2d6be1c into scipy:main Jan 7, 2024
45 checks passed
@tylerjereddy tylerjereddy deleted the treddy_issue_19605_win_int_types branch January 7, 2024 17:02
@tylerjereddy
Copy link
Contributor Author

Ok, CI was passing including wheel builds, and there are no outstanding reviewer comments. I'll open an issue with follow-up actions next. Ping me directly if there are concerns.. but I think it makes sense to get our wheel builds passing with NumPy pre-release for community testing on Windows now.

@tylerjereddy
Copy link
Contributor Author

I squash-merged in case we really want to backport, though intention is for 1.13.0 I think

thalassemia pushed a commit to thalassemia/scipy that referenced this pull request Jan 10, 2024
* related to scipygh-19605

* fixes 213 test failures on Windows alongside NumPy `main`
by restoring the old behavior for the long/int dtype with NumPy

* temporarily adds a Windows CI job with NumPy pre-release
to guard against regressions
thalassemia pushed a commit to thalassemia/scipy that referenced this pull request Jan 11, 2024
* related to scipygh-19605

* fixes 213 test failures on Windows alongside NumPy `main`
by restoring the old behavior for the long/int dtype with NumPy

* temporarily adds a Windows CI job with NumPy pre-release
to guard against regressions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Items related to regular maintenance tasks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants