Resolving pytorch-kernels failing CI tests - Issue #321 #332
Is torch 2.10 the correct version?
danielholanda
left a comment
Need confirmation from @adamlam2-amd that this is indeed the recommended install strategy before merging.
ROCM_VERSION="$(python -c 'import importlib.metadata as m; print(m.version("rocm"))')"
python -m pip install --index-url https://rocm.nightlies.amd.com/v2/gfx1151/ "rocm[libraries,devel]==${ROCM_VERSION}"
@adamlam2-amd can you confirm that this is the recommended path?
@sdevinenamd, can you please confirm?
So the Windows version of this playbook doesn't work, and the commands haven't really been validated either.
Please wait for #246 to be merged.
It should actually be the other way around: #246 depends on this PR (#332). Right now, CI is passing only because of the changes in this PR. The main fix here was reordering the install steps:
Previously, we were installing
Example of the mismatch we were seeing:
With the updated order, everything stays aligned.
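The pinning step from the reviewed snippet can be sketched in Python as follows. This is a minimal sketch, not the playbook's actual script: it assumes the `rocm` distribution is already present in the environment (installed alongside torch), and the helper names `installed_rocm_version` and `pinned_rocm_requirement` are hypothetical.

```python
import importlib.metadata as metadata
from typing import Optional


def installed_rocm_version() -> Optional[str]:
    """Return the version of the installed `rocm` distribution, or None if absent."""
    try:
        return metadata.version("rocm")
    except metadata.PackageNotFoundError:
        return None


def pinned_rocm_requirement(version: str) -> str:
    """Build a requirement string that pins the extras to one exact version."""
    return f"rocm[libraries,devel]=={version}"


if __name__ == "__main__":
    version = installed_rocm_version()
    if version is None:
        # Assumption: torch is installed first and brings the rocm package with it.
        print("rocm is not installed; install torch before pinning the extras")
    else:
        # Pass this string to `python -m pip install --index-url <index> ...`
        print(pinned_rocm_requirement(version))
```

Reading the version after torch is installed, rather than hard-coding it, is what keeps the extras aligned with whatever ROCm build torch brought in.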
I was able to run this on Windows; at least, the CI tests are all passing there. For this playbook, CI is actually pretty close to real user steps since there’s no GUI involved, so CI. Please take a look at PR #265 to see exactly what the tests are validating.
Do you have a Windows halobox? Please test there to confirm. Otherwise, neither Satya nor I could get it working.
We are also using torch 2.11.
Documentation/Instruction updates
- powershell wherever appropriate
- ROCM_BIN environment variable
- Python 3.12
- rocm-sdk-devel matches the ROCm version that torch==2.10.0 actually installs
- torch==2.10.0 torchaudio torchvision
- rocm package version
- rocm[libraries,devel] pinned to that exact version
env-setup-rocm-pytorch-windows
- hiprtc*.dll exists in $ROCM_BIN, then copies it to the name PyTorch expects
- ROCM_BIN to environment
- pip command
- rocm-sdk-devel matches the ROCm version that torch==2.10.0 actually installs
vector-addition-jit-windows
- hiprtc*.dll exists in $ROCM_BIN, then copies it to the name PyTorch expects.
matmul-jit-windows
- hiprtc*.dll exists in $ROCM_BIN, then copies it to the name PyTorch expects.
env-setup-rocm-pytorch-linux
- rocm-sdk-devel matches the ROCm version that torch==2.10.0 actually installs
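The hiprtc check-and-copy step listed for the Windows tests could look roughly like the sketch below. It is an illustration under stated assumptions, not the playbook's implementation: the function name `copy_hiprtc_dll` is hypothetical, and the DLL name PyTorch expects is not given in this thread, so it is left as a parameter with a placeholder value.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def copy_hiprtc_dll(rocm_bin: Path, expected_name: str) -> Optional[Path]:
    """Copy the first hiprtc*.dll found in rocm_bin to expected_name.

    Returns the destination path, or None when no matching DLL exists.
    `expected_name` is a placeholder; the real file name is not stated here.
    """
    candidates = sorted(rocm_bin.glob("hiprtc*.dll"))
    if not candidates:
        return None
    destination = rocm_bin / expected_name
    # Leave an existing file alone so repeated runs are harmless.
    if not destination.exists():
        shutil.copyfile(candidates[0], destination)
    return destination


if __name__ == "__main__":
    rocm_bin = os.environ.get("ROCM_BIN")
    if not rocm_bin:
        print("ROCM_BIN is not set")
    else:
        # "hiprtc-placeholder.dll" is a made-up name used only for illustration.
        result = copy_hiprtc_dll(Path(rocm_bin), "hiprtc-placeholder.dll")
        print(result if result else "no hiprtc*.dll found in ROCM_BIN")
```

Returning None instead of raising lets a setup script decide whether a missing DLL should fail the run or only produce a warning.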