
{devel}[foss/2021a] PyTorch v1.12.1 w/ Python 3.9.5 + CUDA 11.3.1 #16453

Merged

Conversation

@Flamefire (Contributor)

(created using eb --new-pr)

@boegelbot left two comments, both since marked as outdated.
@branfosj (Member) previously approved these changes Nov 24, 2022

lgtm

@branfosj added this to the next release (4.7.0) milestone on Nov 24, 2022
@boegel changed the title from "{devel}[foss/2021a] PyTorch v1.12.1 w/ Python 3.9.5" to "{devel}[foss/2021a] PyTorch v1.12.1 w/ Python 3.9.5 + CUDA 11.3.1" on Nov 24, 2022
@boegel (Member) commented Nov 24, 2022

@Flamefire Do you think it makes sense to add this explicitly in these easyconfigs?

# there should be no failing tests thanks to the included patches
# if you do see failing tests, please open an issue at https://github.com/easybuilders/easybuild-easyconfigs/issues
max_failed_tests = 0
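
For context, a minimal sketch of where such a parameter would sit among an easyconfig's top-level options (names and versions below are illustrative, not the exact file in this PR):

# illustrative easyconfig fragment, not the one under review
name = 'PyTorch'
version = '1.12.1'
versionsuffix = '-CUDA-%(cudaver)s'

toolchain = {'name': 'foss', 'version': '2021a'}

# there should be no failing tests thanks to the included patches
# if you do see failing tests, please open an issue at
# https://github.com/easybuilders/easybuild-easyconfigs/issues
max_failed_tests = 0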

@branfosj (Member), quoting the suggestion above:

@Flamefire Do you think it makes sense to add this explicitly in these easyconfigs?

# there should be no failing tests thanks to the included patches
# if you do see failing tests, please open an issue at https://github.com/easybuilders/easybuild-easyconfigs/issues
max_failed_tests = 0

Two points here:

@branfosj (Member)

Test report by @branfosj
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0105u36b.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/ec24560002d510ba2fafa9f209f2fe72 for a full test report.

@branfosj (Member) commented Nov 24, 2022

FAILED (skipped=3602, expected failures=80, unexpected successes=2)
test_ops_gradients failed!
test_forward_mode_AD_nn_functional_max_unpool2d_cpu_float64 (__main__.TestGradientsCPU) ... unexpected success
test_forward_mode_AD_nn_functional_max_unpool3d_cpu_float64 (__main__.TestGradientsCPU) ... unexpected success

@branfosj (Member)

Test report by @branfosj
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0103u11a.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz (icelake), 1 x NVIDIA NVIDIA A100-PCIE-40GB, 470.57.02, Python 3.6.8
See https://gist.github.com/ffb50411c59c26ecc2819a3b38eb81cf for a full test report.

@boegel (Member) commented Nov 25, 2022

Test report by @boegel
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
node3902.accelgor.os - Linux RHEL 8.4, x86_64, AMD EPYC 7413 24-Core Processor (zen3), 1 x NVIDIA NVIDIA A100-SXM4-80GB, 520.61.05, Python 3.6.8
See https://gist.github.com/e0ba4d67a744cd1c9e89a478e4834354 for a full test report.

  • PyTorch-1.12.1-foss-2021a-CUDA-11.3.1.eb => 248 test failures, 0 test error (out of 89459)
  • PyTorch-1.12.1-foss-2021a.eb => 1 test failure, 0 test error (out of 88942) (distributed/test_c10d_gloo failed!)

@Flamefire For PyTorch-1.12.1-foss-2021a-CUDA-11.3.1.eb, could it be related to only having a single GPU available?

@boegel (Member) commented Nov 25, 2022

Details on failing distributed/test_c10d_gloo for CPU-only installation:

Test exited with non-zero exitcode 1. Command to reproduce: /user/gent/400/vsc40023/eb_scratch/RHEL8/zen3-ampere-ib/software/Python/3.9.5-GCCcore-10.3.0/bin/python distributed/test_c10d_gloo.py -v ProcessGroupGlooTest.test_round_robin
test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2339063
INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2339064
INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2339065
INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2339066
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
ERROR:torch.testing._internal.common_distributed:Caught exception:
Traceback (most recent call last):
  File "/tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/testing/_internal/common_distributed.py", line 622, in run_test
    getattr(self, test_name)()
  File "/tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/testing/_internal/common_distributed.py", line 503, in wrapper
    fn()
  File "/tmp/vsc40023/easybuild_build/PyTorch/1.12.1/foss-2021a/pytorch-v1.12.1/test/distributed/test_c10d_gloo.py", line 1462, in test_round_robin_create_destroy
    pg = create(num=num_process_groups, prefix=i)
  File "/tmp/vsc40023/easybuild_build/PyTorch/1.12.1/foss-2021a/pytorch-v1.12.1/test/distributed/test_c10d_gloo.py", line 1448, in create
    [
  File "/tmp/vsc40023/easybuild_build/PyTorch/1.12.1/foss-2021a/pytorch-v1.12.1/test/distributed/test_c10d_gloo.py", line 1449, in <listcomp>
    c10d.ProcessGroupGloo(
RuntimeError: Wait timeout
Exception raised from wait at ../torch/csrc/distributed/c10d/FileStore.cpp:452 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x57 (0x14c6aedc2197 in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xd9 (0x14c6aed9898c in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x3dc3b4a (0x14c6b2ec6b4a in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #3: c10d::PrefixStore::wait(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::chrono::duration<long, std::ratio<1l, 1000l> > const&) + 0x2f (0x14c6b2eccf6f in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #4: gloo::rendezvous::PrefixStore::wait(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::chrono::duration<long, std::ratio<1l, 1000l> > const&) + 0x121 (0x14c6b4a12021 in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #5: gloo::rendezvous::Context::connectFullMesh(gloo::rendezvous::Store&, std::shared_ptr<gloo::transport::Device>&) + 0x14ea (0x14c6b4a0fc5a in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #6: c10d::ProcessGroupGloo::ProcessGroupGloo(c10::intrusive_ptr<c10d::Store, c10::detail::intrusive_target_default_null_type<c10d::Store> > const&, int, int, c10::intrusive_ptr<c10d::ProcessGroupGloo::Options, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroupGloo::Options> >) + 0x447 (0x14c6b2edeac7 in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0x9f3cbd (0x14c6b9302cbd in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0x36be8a (0x14c6b8c7ae8a in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #14: <unknown function> + 0x369559 (0x14c6b8c78559 in /tmp/eb-8zn00kw2/tmpreav4c99/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #56: __libc_start_main + 0xf3 (0x14c6ba2e8493 in /lib64/libc.so.6)
frame #57: _start + 0x2e (0x4006ce in /user/gent/400/vsc40023/eb_scratch/RHEL8/zen3-ampere-ib/software/Python/3.9.5-GCCcore-10.3.0/bin/python)

 exiting process 0 with exit code: 10
Process 0 terminated with exit code 10, terminating remaining processes.
ERROR

@branfosj (Member), quoting @boegel:

@Flamefire For PyTorch-1.12.1-foss-2021a-CUDA-11.3.1.eb, could it be related to only having a single GPU available?

I am suspicious about this as well. The errors I am seeing on the CUDA build all look similar to:

Running distributed/fsdp/test_distributed_checkpoint ... [2022-11-24 11:32:28.143789]
Executing ['/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/Python/3.9.5-GCCcore-10.3.0/bin/python', 'distributed/fsdp/test_distributed_checkpoint.py', '-v'] ... [2022-11-24 11:32:28.143928]
test_distributed_checkpoint_state_dict_type_StateDictType_LOCAL_STATE_DICT (__main__.TestDistributedCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 733462
INFO:torch.testing._internal.common_distributed:Started process 1 with pid 733463
dist init r=0, world=2
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
dist init r=1, world=2
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
Process process 0:
Traceback (most recent call last):
  File "/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/Python/3.9.5-GCCcore-10.3.0/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/Python/3.9.5-GCCcore-10.3.0/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/dev/shm/branfosj/tmp-up-EL8/eb-usbph1a0/tmp_jxl19_q/lib/python3.9/site-packages/torch/testing/_internal/common_fsdp.py", line 427, in _run
    dist.barrier()
  File "/dev/shm/branfosj/tmp-up-EL8/eb-usbph1a0/tmp_jxl19_q/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 2784, in barrier
    work = default_pg.barrier(opts=opts)
RuntimeError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1191, invalid usage, NCCL version 2.10.3
ncclInvalidUsage: This usually reflects invalid usage of NCCL library (such as too many async ops, too many collectives at once, mixing streams in a group, etc).
Process process 1:
Traceback (most recent call last):
  File "/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/Python/3.9.5-GCCcore-10.3.0/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/Python/3.9.5-GCCcore-10.3.0/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/dev/shm/branfosj/tmp-up-EL8/eb-usbph1a0/tmp_jxl19_q/lib/python3.9/site-packages/torch/testing/_internal/common_fsdp.py", line 427, in _run
    dist.barrier()
  File "/dev/shm/branfosj/tmp-up-EL8/eb-usbph1a0/tmp_jxl19_q/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 2784, in barrier
    work = default_pg.barrier(opts=opts)
RuntimeError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1191, invalid usage, NCCL version 2.10.3
ncclInvalidUsage: This usually reflects invalid usage of NCCL library (such as too many async ops, too many collectives at once, mixing streams in a group, etc).
FAIL

@boegel (Member) commented Nov 25, 2022

Test report by @boegel
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
node3304.joltik.os - Linux RHEL 8.4, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), 1 x NVIDIA Tesla V100-SXM2-32GB, 520.61.05, Python 3.6.8
See https://gist.github.com/3bf0c77ca66537f6ddb66c757d831684 for a full test report.

@Flamefire (Contributor, author)

@branfosj your log looks interesting:

Running distributed/fsdp/test_distributed_checkpoint

Did you run this on a node with only 1 GPU? Because this test is guarded by @skip_if_lt_x_gpu(2). Can you ignore the failure and run torch.cuda.is_available() and torch.cuda.device_count() on that node in Python after import torch?

@branfosj (Member), quoting @Flamefire:

@branfosj your log looks interesting:

Running distributed/fsdp/test_distributed_checkpoint

Did you run this on a node with only 1 GPU? Because this test is guarded by @skip_if_lt_x_gpu(2). Can you ignore the failure and run torch.cuda.is_available() and torch.cuda.device_count() on that node in Python after import torch?

It is a node with 2 GPUs; however, I was in a cgroup that only had access to 1 of them. I'm currently testing a build with access to both GPUs, and after that I'll check what those torch functions return in the various cases.
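
For reference, the requested check amounts to something like this (a minimal sketch; run it inside the job/cgroup in question, since that is what limits GPU visibility):

import torch

# With cgroup- or scheduler-restricted GPU access, device_count() can report fewer
# GPUs than are physically in the node, which is what the skip decorator keys off.
print("CUDA available:", torch.cuda.is_available())
print("Visible GPUs:  ", torch.cuda.device_count())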

@Flamefire (Contributor, author) commented Nov 25, 2022

The single GPU is indeed the issue. I debugged it a bit and found that the test harness forks, waits in the barrier, and only then calls the wrapped test function, which is where the number of GPUs is checked.
So by the time the check runs, the test has already tried to use 2 GPUs.

Opened a bug with PyTorch: pytorch/pytorch#89686

I'll add a patch for the easyconfigs next week. Edit: did a quick test and scheduled builds to run over the weekend.
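
A toy illustration of that ordering (deliberately not PyTorch's code; plain multiprocessing stands in for the distributed test harness): the skip guard wraps the test body, so it only fires inside the already-forked worker, after the point where the real harness has set up the process group.

import multiprocessing as mp
from functools import wraps

def skip_if_lt_x_gpu(x, visible_gpus):
    # Stand-in for the real decorator: the check lives in the wrapper around the test body.
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if visible_gpus >= x:
                return fn(*args, **kwargs)
            print(f"skip: need {x} GPUs, have {visible_gpus}")
        return wrapper
    return decorator

@skip_if_lt_x_gpu(2, visible_gpus=1)
def test_body():
    print("multi-GPU test body runs")

def worker(rank):
    # In the real harness this is where the process group is initialised and the barrier
    # is entered -- i.e. where the NCCL failure above happens, before any skip check.
    print(f"rank {rank}: forked, joined the (pretend) barrier")
    test_body()  # only now does the skip guard look at the GPU count

if __name__ == "__main__":
    procs = [mp.Process(target=worker, args=(r,)) for r in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()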

@branfosj (Member)

Test report by @branfosj
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0103u14a.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz (icelake), 2 x NVIDIA NVIDIA A30, 470.57.02, Python 3.6.8
See https://gist.github.com/3a5df9ad56295a169e807a0afb7a9c0f for a full test report.

@branfosj (Member)

Test report by @branfosj
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0103u14a.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz (icelake), 1 x NVIDIA NVIDIA A30, 470.57.02, Python 3.6.8
See https://gist.github.com/b453024903678edfac52cc6a9c3dab27 for a full test report.

@branfosj (Member) left a review comment

The failure was:

Traceback (most recent call last):
  File "/dev/shm/branfosj/build-up-EL8/PyTorch/1.12.1/foss-2021a-CUDA-11.3.1/pytorch-v1.12.1/test/distributed/fsdp/test_fsdp_multiple_forward.py", line 48, in <module>
    class TestMultiForward(FSDPTest):
  File "/dev/shm/branfosj/build-up-EL8/PyTorch/1.12.1/foss-2021a-CUDA-11.3.1/pytorch-v1.12.1/test/distributed/fsdp/test_fsdp_multiple_forward.py", line 73, in TestMultiForward
    @skip_if_lt_x_gpu(2)
  File "/dev/shm/branfosj/tmp-up-EL8/eb-b6ygbfpo/tmp4zw2uabe/lib/python3.9/site-packages/torch/testing/_internal/common_distributed.py", line 132, in skip_if_lt_x_gpu
    TEST_SKIPS[f"multi-gpu-{n}"].message)
NameError: name 'n' is not defined

@Flamefire force-pushed the 20221020165400_new_pr_PyTorch1121 branch from 168ff80 to ec03d0c on November 28, 2022 at 11:58
@Flamefire (Contributor, author)

@branfosj Yes, a copy-and-paste mistake from quickly hacking out the fix before the weekend. I've now done a proper fix, sent a PR upstream, and updated the patch here.
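
A minimal sketch of what the corrected guard boils down to (an assumption on my part, not the actual upstream patch): the NameError above came from the quick hack referencing `n` where the decorator's own parameter is `x`.

import unittest
from functools import wraps

import torch

# Standalone sketch of a skip_if_lt_x_gpu-style guard; the real helper in
# torch.testing._internal.common_distributed exits the worker via its skip table,
# while this version simply raises SkipTest.
def skip_if_lt_x_gpu(x):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if torch.cuda.is_available() and torch.cuda.device_count() >= x:
                return fn(*args, **kwargs)
            # the guard must use the decorator's own parameter x, not an undefined n
            raise unittest.SkipTest(f"need at least {x} GPUs, found {torch.cuda.device_count()}")
        return wrapper
    return decorator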

@branfosj (Member)

Test report by @branfosj
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0105u36b.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/7c9058e9bfbac9496484f928e0c389ff for a full test report.

@branfosj (Member)

Test report by @branfosj
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0103u04a.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz (icelake), 2 x NVIDIA NVIDIA A30, 470.57.02, Python 3.6.8
See https://gist.github.com/0097203763ffd7652f47d81376f2c891 for a full test report.

@Flamefire (Contributor, author)

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
taurusml5 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/5c2b1431843d31e5d1da8a6604cbcd0b for a full test report.

@Flamefire (Contributor, author)

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
taurusa12 - Linux CentOS Linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz (broadwell), 3 x NVIDIA GeForce GTX 1080 Ti, 460.32.03, Python 2.7.5
See https://gist.github.com/c279166a7c9894585837888ab0e6c1f3 for a full test report.

@branfosj (Member)

Going in, thanks @Flamefire!

@branfosj merged commit 2913f3d into easybuilders:develop on Nov 29, 2022
@Flamefire deleted the 20221020165400_new_pr_PyTorch1121 branch on November 30, 2022 at 08:21
@Flamefire (Contributor, author)

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
taurusi8016 - Linux CentOS Linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 470.57.02, Python 2.7.5
See https://gist.github.com/47a8be219082efeece3d5bf3c4469843 for a full test report.
