{ai}[foss/2022a] PyTorch v2.1.2 #19444

Flamefire
Contributor

@Flamefire Flamefire commented Dec 19, 2023

(created using eb --new-pr)

Update over #19086

test_sympy_utils failure requires rebuilding with #19414
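
A minimal sketch of that rebuild order, assuming #19414 is an easyconfigs PR that can be installed with `--from-pr` (the thread does not say what it contains). The `run` wrapper and the `DRY_RUN` variable are hypothetical helpers for illustration; by default the commands are only printed:

```shell
#!/bin/sh
# PR numbers are taken from this thread.
DEP_PR=19414      # assumption: easyconfigs PR with the fix needed by test_sympy_utils
PYTORCH_PR=19444  # this PR

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1. Rebuild the easyconfigs from the dependency PR, even if already installed.
run eb --from-pr "$DEP_PR" --robot --rebuild

# 2. Build this PR and upload a test report to the pull request.
run eb --from-pr "$PYTORCH_PR" --robot --upload-test-report
```

Setting `DRY_RUN=0` would run the `eb` commands for real; that requires a working EasyBuild setup with GitHub integration configured.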

…2.0.1_avoid-test_quantization-failures.patch, PyTorch-2.0.1_fix-skip-decorators.patch, PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch, PyTorch-2.0.1_fix-vsx-loadu.patch, PyTorch-2.0.1_no-cuda-stubs-rpath.patch, PyTorch-2.0.1_skip-failing-gradtest.patch, PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch, PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch, PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch, PyTorch-2.1.0_fix-validationError-output-test.patch, PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch, PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch, PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch, PyTorch-2.1.0_remove-test-requiring-online-access.patch, PyTorch-2.1.0_skip-diff-test-on-ppc.patch, PyTorch-2.1.0_skip-dynamo-test_predispatch.patch, PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch, PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch, PyTorch-2.1.0_skip-test_wrap_bad.patch
@Flamefire Flamefire marked this pull request as draft December 19, 2023 12:11
@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
i8023 - Linux Rocky Linux 8.7, x86_64, AMD EPYC 7352 24-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/Flamefire/cb55407b7d0cefa033c8cf016488af43 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
n1099 - Linux RHEL 8.7 (Ootpa), x86_64, Intel(R) Xeon(R) Platinum 8470 (icelake), Python 3.8.13
See https://gist.github.com/Flamefire/e49db6dd7c419a682fba7ba11cf1245b for a full test report.

@casparvl
Contributor

Test report by @casparvl
FAILED
Build succeeded for 4 out of 5 (1 easyconfigs in total)
gcn6.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 535.104.12, Python 3.6.8
See https://gist.github.com/casparvl/ca999bc0ff3db36bdf4cba2a8016595b for a full test report.

@Flamefire
Contributor Author

@casparvl I assume the failures are the same as in #19445 (comment)?

Except for test_sympy_utils which requires rebuilding #19414

Do you see the same failures in #19086 / #19087 or are those test_jit failures new in 2.1.2?

@Flamefire
Contributor Author

Marking this as ready: it doesn't seem to fail more tests than 2.1.0, so it might be a better choice than #19086.

@Flamefire Flamefire marked this pull request as ready for review December 21, 2023 15:55
@SebastianAchilles SebastianAchilles added this to the 4.x milestone Dec 22, 2023
@SebastianAchilles
Member

@boegelbot please test @ generoso
CORE_CNT=16

@boegelbot
Collaborator

@SebastianAchilles: Request for testing this PR well received on login1

PR test command 'EB_PR=19444 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_19444 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 12460

Test results coming soon (I hope)...

- notification for comment with ID 1867587873 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@casparvl
Contributor

> @casparvl I assume the failures are the same as in #19445 (comment)?
>
> Except for test_sympy_utils which requires rebuilding #19414
>
> Do you see the same failures in #19086 / #19087 or are those test_jit failures new in 2.1.2?

Yeah, sorry, I've had little time to follow up from my side. The test failures were indeed the same, except for test_sympy_utils. I missed your instruction in the opening post to rebuild #19414, sorry.

I'll try to upload a test report for #19086 and #19087. I'll be going on holiday after today, so I'll just turn on those builds, and hope for the best... I'm afraid I can't look into the build failures and report more specifically then, but at least you should be able to see the summary of failures in the gist.

@SebastianAchilles
Member

Test report by @SebastianAchilles
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
zen2-rockylinux-89 - Linux Rocky Linux 8.9, x86_64, AMD EPYC 7452 32-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/SebastianAchilles/64bdb54f7560e7806c2265a47ca4bc26 for a full test report.

@SebastianAchilles
Member

@boegelbot please test @ jsc-zen2
CORE_CNT=16

@boegelbot
Collaborator

@SebastianAchilles: Request for testing this PR well received on jsczen2l1.int.jsc-zen2.easybuild-test.cluster

PR test command 'EB_PR=19444 EB_ARGS= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --mem-per-cpu=4000M --job-name test_PR_19444 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen2.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 3962

Test results coming soon (I hope)...

- notification for comment with ID 1867921830 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
cnx2 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/ee91495887dad50930874a9420fdd55e for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jsczen2c1.int.jsc-zen2.easybuild-test.cluster - Linux Rocky Linux 8.5, x86_64, AMD EPYC 7742 64-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/boegelbot/253830054217e3f78136dfa7f1b65279 for a full test report.

@bedroge
Contributor

bedroge commented Dec 24, 2023

Test report by @bedroge
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
v100gpu26 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz (skylake_avx512), 1 x NVIDIA GRID V100D-32Q, 535.104.05, Python 3.6.8
See https://gist.github.com/bedroge/31fa6679fdcae36dc5eddca8c5aa8112 for a full test report.

@SebastianAchilles
Member

Test report by @SebastianAchilles
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jscclxc1.int.jsc-clx.fz-juelich.de - Linux Rocky Linux 9.3, x86_64, Intel Xeon Processor (Cascadelake) (cascadelake), Python 3.9.18
See https://gist.github.com/SebastianAchilles/33cab1423a36db1925ca9e797fe07aa1 for a full test report.

@SebastianAchilles
Member

Test report by @SebastianAchilles
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.3, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/SebastianAchilles/4b0be8e363477f529813d6346299150f for a full test report.

@boegel boegel modified the milestones: 4.x, next release (4.9.0?) Dec 27, 2023
Member

@SebastianAchilles SebastianAchilles left a comment

lgtm

@SebastianAchilles
Member

Going in, thanks @Flamefire!

@SebastianAchilles SebastianAchilles merged commit b552640 into easybuilders:develop Dec 27, 2023
9 checks passed
@Flamefire Flamefire deleted the 20231219130107_new_pr_PyTorch212 branch December 27, 2023 10:09
@robogast
Contributor

robogast commented Jan 2, 2024

legend 🎉

7 participants