add patches to fix test issues for PyTorch 2.1.2 with foss/2023a
+ CUDA 12.1.1
#20156
easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
Test report by @casparvl |
Test report by @Flamefire |
Test report by @jfgrimm |
Test report by @casparvl |
Looks like we need to increase the allowed failures to ~10. Your report shows 8. 6 of them are captured in detail: test_Conv1d_pad_same_cuda_tf32, test_constant_specialization, test_delayed_optim_step_offload_true_no_shard, test_file_reader_no_memory_leak, test_file_reader_no_memory_leak, test_file_reader_no_memory_leak. The first is from test_nn, which is kind of known; the last 3 are known on your machine from other runs. The other 2 (4) I don't know. The full test log might be useful to enhance the RegEx to capture the other 2 tests too. I think it really helps to have the individual tests listed conveniently in a single place to judge the failures (see e.g. the last 3, where you can see that it is the same cause, instead of only seeing that the "test_jit_foo", "test_jit_bar", and "test_jit" files failed). |
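For illustration, here is a minimal sketch of the idea of listing failing tests by name and comparing the total against an allowed number of failures. It is not the regular expression used by the PyTorch easyblock. It assumes the log contains standard unittest failure headers such as "FAIL: test_foo (module.Class)"; the sample log, the class names in it, the function name summarize_failures and the default limit of 10 are all made up for this example.

```python
import re
from collections import Counter

# Assumption: failures appear as standard unittest headers, e.g.
# "FAIL: test_foo (__main__.SomeClass)". Real PyTorch test logs may
# use other formats that this pattern does not cover.
FAILED_TEST_RE = re.compile(
    r"^(?:FAIL|ERROR): (test_\w+) \(([\w.]+)\)", re.MULTILINE
)


def summarize_failures(log_text, max_failed_tests=10):
    """Count failures per test name and check them against a limit.

    Returns (Counter of test names, True if the total is within the limit).
    """
    names = [match.group(1) for match in FAILED_TEST_RE.finditer(log_text)]
    return Counter(names), len(names) <= max_failed_tests


# Invented sample log; only the test names are taken from the discussion.
sample_log = """\
FAIL: test_Conv1d_pad_same_cuda_tf32 (__main__.ExampleSuiteA)
ERROR: test_file_reader_no_memory_leak (__main__.ExampleSuiteB)
ERROR: test_file_reader_no_memory_leak (__main__.ExampleSuiteC)
ERROR: test_file_reader_no_memory_leak (__main__.ExampleSuiteD)
"""

counts, within_limit = summarize_failures(sample_log)
for name, count in sorted(counts.items()):
    print(f"{name}: {count}")
print("within allowed failures:", within_limit)
```

Grouping by test name makes it visible when several failing test files share one underlying cause, as with the three test_file_reader_no_memory_leak entries above.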
@boegelbot please test @ generoso |
@casparvl: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2016786483 processed Message to humans: this is just bookkeeping information for me, |
Detailed log
test_jit
test_proxy_tensor
distributed/fsdp/test_fsdp_core
test_jit_legacy
test_jit_profiling
test_nn
|
@boegelbot please test @ jsc-zen3 |
@casparvl: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2016792987 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegelbot |
Test report by @boegelbot |
@casparvl Can you upload the compressed log of the failures (#20156 (comment)) so that the easyblock can be enhanced to also detect the 2 other failing tests by name? |
I'll send it to you in a DM. I don't assume there to be much privacy-sensitive info in there, but just to be safe I'll not share it with the world ;-) |
Thanks. As for the failures:
|
I think I know: I probably didn't rebuild I'll merge this PR: there are sufficient successful test reports, and on my system, we have a reasonable understanding of the failing tests. |
Lgtm
Going in, thanks @Flamefire! |
Test report by @jfgrimm |
foss/2023a + CUDA 12.1.1 (created using eb --new-pr)
Fixes #19946