{tools}[foss/2023a] jax v0.4.24 w/ CUDA 12.1.1 #19841

ThomasHoffmann77
Contributor

(created using eb --new-pr)

@ThomasHoffmann77 marked this pull request as draft February 13, 2024 10:57
@ThomasHoffmann77 changed the title from "{tools]{foss/2023a} jax v0.4.24 w/ CUDA 12.1.1" to "{tools}[foss/2023a] jax v0.4.24 w/ CUDA 12.1.1" Feb 13, 2024
@ThomasHoffmann77 marked this pull request as ready for review February 13, 2024 13:05
@branfosj
Member

branfosj commented Feb 17, 2024

Test report by @branfosj
SUCCESS
Build succeeded (with --ignore-test-failure) for 1 out of 1 (2 easyconfigs in total)
bask-pg0309u36a - Linux RHEL 8.9, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-40GB, 535.154.05, Python 3.6.8
See https://gist.github.com/branfosj/3f0ab6328e3d8187d05b5303d8d1a9a7 for a full test report.

FAILED tests/host_callback_test.py::HostCallbackTapTest::test_tap_scan_custom_jvp - AssertionError: Found
FAILED tests/host_callback_test.py::HostCallbackTapTest::test_tap_transforms_doc - AssertionError: Found
FAILED tests/logging_test.py::LoggingTest::test_no_log_spam - AssertionError:

test_tap_scan_custom_jvp: compares 2.0999999999999996 to 2.1, which differ only by floating-point rounding.

test_tap_transforms_doc: compares ( 0.1 0.6 ) to ( 0.1 0.6000000000000001 ), again a rounding-level difference.

test_no_log_spam: this looks like a false positive.
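
To illustrate why the two host_callback failures look like rounding noise rather than real regressions, here is a minimal sketch in plain Python. It only uses the values quoted above; the `compare` helper is hypothetical and is not how the jax test suite performs its checks.

```python
import math

# Value pairs quoted in the test report above.
pairs = [
    (2.0999999999999996, 2.1),
    (0.6000000000000001, 0.6),
]


def compare(a, b, rel_tol=1e-9):
    """Return (exactly_equal, close_within_relative_tolerance)."""
    return a == b, math.isclose(a, b, rel_tol=rel_tol)


results = [compare(a, b) for a, b in pairs]

for (a, b), (exact, close) in zip(pairs, results):
    print(f"{a!r} vs {b!r}: exact={exact}, isclose={close}")
```

Both pairs fail an exact comparison but pass a tolerance-based one, which is consistent with the failures being harmless on this hardware.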

ThomasHoffmann77 and others added 3 commits February 22, 2024 14:11
Co-authored-by: Jasper Grimm <65227842+jfgrimm@users.noreply.github.com>
Co-authored-by: Jasper Grimm <65227842+jfgrimm@users.noreply.github.com>
Co-authored-by: Jasper Grimm <65227842+jfgrimm@users.noreply.github.com>
@jfgrimm
Member

jfgrimm commented Feb 22, 2024

Test report by @jfgrimm
SUCCESS
Build succeeded (with --ignore-test-failure) for 4 out of 4 (2 easyconfigs in total)
gpu22.viking2.yor.alces.network - Linux Rocky Linux 8.8, x86_64, AMD EPYC 7413 24-Core Processor, 1 x NVIDIA NVIDIA H100 PCIe, 535.86.10, Python 3.6.8
See https://gist.github.com/jfgrimm/e0177f296ddf6abe4e468a88536d9fb2 for a full test report.

I get the following test failures (same as @branfosj):

tests/host_callback_test.py::HostCallbackTapTest::test_tap_scan_custom_jvp FAILED 
tests/host_callback_test.py::HostCallbackTapTest::test_tap_transforms_doc FAILED
tests/logging_test.py::LoggingTest::test_no_log_spam FAILED

but also, the test suite crashes before completion for me (~91%):

tests/sparse_bcoo_bcsr_test.py::BCOOTest::test_bcoo_dense_round_trip6 Fatal Python error: Aborted
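
One way to check whether an abort like this is reproducible is to run the single test in a fresh interpreter and inspect the exit status. The sketch below assumes pytest is installed and that it is invoked from the jax source tree; `describe_exit` and `run_isolated` are hypothetical helper names, not part of EasyBuild or jax.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode == 0:
        return "passed"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"failed with exit code {returncode}"


def run_isolated(node_id):
    """Run one pytest node ID in its own interpreter (assumes pytest is available)."""
    cmd = [sys.executable, "-X", "faulthandler", "-m", "pytest", "-x", node_id]
    return describe_exit(subprocess.run(cmd).returncode)


# Example call (not executed here, since it needs the jax test tree):
# run_isolated("tests/sparse_bcoo_bcsr_test.py::BCOOTest::test_bcoo_dense_round_trip6")
```

If the isolated run also ends with a signal such as SIGABRT, the crash is tied to that test rather than to state left behind by earlier tests.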

@ThomasHoffmann77
Contributor Author

Test report by @ThomasHoffmann77
SUCCESS
Build succeeded (with --ignore-test-failure) for 2 out of 2 (2 easyconfigs in total)
srv-mahamid-01.embl.de - Linux AlmaLinux 8.8, x86_64, AMD EPYC 7513 32-Core Processor, 2 x NVIDIA NVIDIA GeForce RTX 3090, 535.113.01, Python 3.6.8
See https://gist.github.com/ThomasHoffmann77/ab46a20f75c0934cb4a2958b2912e02d for a full test report.

@ThomasHoffmann77
Contributor Author

Test report by @ThomasHoffmann77
SUCCESS
Build succeeded (with --ignore-test-failure) for 2 out of 2 (2 easyconfigs in total)
login02.cluster.embl.de - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz, 4 x NVIDIA Tesla M60, 535.129.03, Python 3.6.8
See https://gist.github.com/ThomasHoffmann77/c32991f2923614090554d2ee5db8c5a7 for a full test report.

easyblock = 'PythonBundle'

name = 'jax'
version = '0.4.24'
Member

@ThomasHoffmann77 There's a jax 0.4.25 release now, can we try updating to that and see if we're still seeing failing tests?

Contributor Author

@boegel I'll give 0.4.25 a try today

Member

I need this, so happy to review once you have updated.

Contributor Author

> I need this, so happy to review once you have updated.

@verdurin see #20119. (My local test runs have not finished yet.)

@akesandgren
Contributor

We should probably close this in favor of #20119
