add patch to SciPy-bundle 2024.05 that fixes numpy test failures on RISC-V #20847

Merged: 1 commit, Jul 3, 2024

Conversation

bedroge
Contributor

@bedroge bedroge commented Jun 18, 2024

See numpy/numpy#26743.

Before:

=========================== short test summary info ============================
FAILED core/tests/test_numeric.py::TestBoolCmp::test_float - AssertionError:
FAILED core/tests/test_umath.py::TestFPClass::test_fpclass[-4] - AssertionErr...
FAILED core/tests/test_umath.py::TestFPClass::test_fpclass[-2] - AssertionErr...
FAILED core/tests/test_umath.py::TestFPClass::test_fpclass[-1] - AssertionErr...
FAILED core/tests/test_umath.py::TestFPClass::test_fpclass[1] - AssertionError:
FAILED core/tests/test_umath.py::TestFPClass::test_fp_noncontiguous[f] - Asse...
= 6 failed, 32076 passed, 1746 skipped, 1305 deselected, 31 xfailed, 3 xpassed, 18 warnings in 1008.04s (0:16:48) =

After:

= 32082 passed, 1746 skipped, 1305 deselected, 31 xfailed, 3 xpassed, 18 warnings in 997.29s (0:16:37) =
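
As a quick sanity check on the two summaries above, the outcome counts can be compared programmatically. This is only a minimal sketch: the helper name `parse_pytest_summary` is hypothetical, and it assumes the summary line keeps the `<count> <label>` layout shown here.

```python
import re


def parse_pytest_summary(line: str) -> dict:
    """Map outcome labels to counts, e.g. {'failed': 6, 'passed': 32076, ...}."""
    # Each outcome appears as "<count> <label>", e.g. "6 failed" or "18 warnings".
    # The timing part ("in 1008.04s (0:16:48)") has no "<digits> <word>" pair,
    # so it is ignored by this pattern.
    return {label: int(count) for count, label in re.findall(r"(\d+) ([a-z]+)", line)}


before = parse_pytest_summary(
    "= 6 failed, 32076 passed, 1746 skipped, 1305 deselected, "
    "31 xfailed, 3 xpassed, 18 warnings in 1008.04s (0:16:48) ="
)
after = parse_pytest_summary(
    "= 32082 passed, 1746 skipped, 1305 deselected, "
    "31 xfailed, 3 xpassed, 18 warnings in 997.29s (0:16:37) ="
)

# The six previously failing tests should show up as six additional passes,
# with every other category unchanged.
newly_passing = after["passed"] - before["passed"]
unchanged = all(
    before[key] == after[key]
    for key in ("skipped", "deselected", "xfailed", "xpassed", "warnings")
)
print(f"failed before: {before['failed']}, newly passing: {newly_passing}, rest unchanged: {unchanged}")
```

The counts line up: the 6 failures from the first run account for the difference between 32076 and 32082 passed tests.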

@bedroge bedroge added bug fix EESSI Related to EESSI project riscv labels Jun 18, 2024
@bedroge
Contributor Author

bedroge commented Jun 18, 2024

@boegelbot please test @ jsc-zen3

@boegelbot
Collaborator

@bedroge: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20847 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20847 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 4413

Test results coming soon (I hope)...

- notification for comment with ID 2176370352 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jsczen3c2.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/16360268607328a10a159ba9a5a67c7a for a full test report.

@bedroge
Contributor Author

bedroge commented Jun 18, 2024

@boegelbot please test @ generoso

@boegelbot
Collaborator

@bedroge: Request for testing this PR well received on login1

PR test command 'EB_PR=20847 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_20847 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 13770

Test results coming soon (I hope)...

- notification for comment with ID 2176527637 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@bedroge
Contributor Author

bedroge commented Jun 18, 2024

Test report by @bedroge
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
starfive - Linux Debian GNU/Linux n/a, RISC-V-64, UNKNOWN, Python 3.10.9
See https://gist.github.com/bedroge/70e6640bae6d152d57947a439aa8418d for a full test report.

Build and tests of numpy succeeded, but now scipy seems to have test failures.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
cns2 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/d5757b2d61d34d06807c47ad305e0643 for a full test report.

@bedroge
Contributor Author

bedroge commented Jun 19, 2024

> Test report by @bedroge
> FAILED
> Build succeeded for 0 out of 1 (1 easyconfigs in total)
> starfive - Linux Debian GNU/Linux n/a, RISC-V-64, UNKNOWN, Python 3.10.9
> See https://gist.github.com/bedroge/70e6640bae6d152d57947a439aa8418d for a full test report.
>
> Build and tests of numpy succeeded, but now scipy seems to have test failures.

Ah, the output just stopped here:

scipy/linalg/tests/test_matfuncs.py::TestSignM::test_defective2 PASSED   [ 19%]
scipy/linalg/tests/test_matfuncs.py::TestSignM::test_defective3 PASSED   [ 19%]
scipy/linalg/tests/test_matfuncs.py::TestLogM::test_nils

and I couldn't find any clues in the log itself, but it turns out the process ran out of memory:

[Tue Jun 18 17:30:32 2024] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),task=python,pid=1215221,uid=1000
[Tue Jun 18 17:30:32 2024] Out of memory: Killed process 1215221 (python) total-vm:9440708kB, anon-rss:7407096kB, file-rss:8kB, shmem-rss:0kB, UID:1000 pgtables:14872kB oom_score_adj:0

I tried it a second time, but the same thing happened again.
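
To put the numbers in the kernel message above into perspective, here is a small sketch that pulls the memory fields out of the "Out of memory" line and converts the resident anonymous memory to GiB. The helper name `parse_oom_kill` is hypothetical, and the regular expressions assume the exact field layout shown in this log.

```python
import re

OOM_LINE = (
    "[Tue Jun 18 17:30:32 2024] Out of memory: Killed process 1215221 (python) "
    "total-vm:9440708kB, anon-rss:7407096kB, file-rss:8kB, shmem-rss:0kB, "
    "UID:1000 pgtables:14872kB oom_score_adj:0"
)


def parse_oom_kill(line: str) -> dict:
    """Extract pid, process name, and the kB-sized memory fields from an OOM kill line."""
    proc = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if proc is None:
        raise ValueError("not an 'Out of memory: Killed process' line")
    # Memory fields look like "anon-rss:7407096kB"; fields without a kB suffix
    # (UID, oom_score_adj) are deliberately skipped.
    sizes_kb = {name: int(kb) for name, kb in re.findall(r"([a-z-]+):(\d+)kB", line)}
    return {"pid": int(proc.group(1)), "name": proc.group(2), "kb": sizes_kb}


info = parse_oom_kill(OOM_LINE)
anon_rss_gib = info["kb"]["anon-rss"] / 1024**2  # kB -> GiB
total_vm_gib = info["kb"]["total-vm"] / 1024**2
print(
    f"{info['name']} (pid {info['pid']}): "
    f"anon-rss {anon_rss_gib:.2f} GiB, total-vm {total_vm_gib:.2f} GiB"
)
```

For the line above this gives roughly 7.06 GiB of resident anonymous memory for the python process at the time it was killed.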

@SebastianAchilles
Member

On a HiFive Unmatched (SiFive U740) I am getting this error:

Time for 1000x1000 matrix dot product: 594 msec >= 500 msec => ERROR

This might be okay, since the SoC is slower. Setting --try-amend=blas_test_time_limit=650 did not work for me.
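
For illustration, the kind of check behind that message can be sketched as follows. This is not EasyBuild's actual implementation: the function names `check_dot_time` and `time_dot_msec` are hypothetical, the "OK" wording is an assumption, and only the "ERROR" message format is taken from the output quoted above.

```python
import timeit


def check_dot_time(measured_msec: float, limit_msec: float, size: int = 1000):
    """Compare a measured dot-product time against a limit; return (ok, message)."""
    ok = measured_msec < limit_msec
    relation = "<" if ok else ">="
    verdict = "OK" if ok else "ERROR"  # "OK" wording is assumed, "ERROR" is from the log
    message = (
        f"Time for {size}x{size} matrix dot product: "
        f"{measured_msec:.0f} msec {relation} {limit_msec:.0f} msec => {verdict}"
    )
    return ok, message


def time_dot_msec(size: int = 1000, repeats: int = 3) -> float:
    """Best-of-N wall time in msec for a size x size numpy dot product."""
    import numpy  # imported lazily so check_dot_time works without numpy installed

    x = numpy.random.random((size, size))
    best_sec = min(timeit.repeat(lambda: numpy.dot(x, x.T), number=1, repeat=repeats))
    return best_sec * 1000.0


# With the 500 msec limit implied by the error message, 594 msec fails;
# with a limit of 650 msec the same measurement would pass.
print(check_dot_time(594, 500)[1])
print(check_dot_time(594, 650)[1])
```

On a machine with numpy installed, `check_dot_time(time_dot_msec(), 650)` would run the comparison against a freshly measured time instead of the value reported here.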

@SebastianAchilles SebastianAchilles added this to the release after 4.9.2 milestone Jul 2, 2024
Member

@boegel boegel left a comment

This PR is a step in the right direction for RISC-V and should be fine elsewhere, so I'll go ahead and merge it (even if it's not sufficient to let all tests pass on RISC-V).

@boegel
Member

boegel commented Jul 3, 2024

Going in, thanks @bedroge!

@boegel boegel merged commit bfdf5fe into easybuilders:develop Jul 3, 2024
9 checks passed
@SebastianAchilles SebastianAchilles added the 2024a issues & PRs related to 2024a common toolchains label Jul 3, 2024
@bedroge bedroge deleted the fix_numpy_test_failures_riscv branch July 4, 2024 09:18