W6800 and ROCm #1595
Comments
According to their docs, the W6800 is supported now.
@zrzrv5 It's not included here: https://github.com/RadeonOpenCompute/ROCm#Hardware-and-Software-Support

@KurtStineAI2 If waiting for that support costs you money and opportunities, don't wait. I've been waiting for over a year already, regretting it every day.
@ROCmSupport You committed to Navi support before the end of the year. What is going on?
I just installed ROCm 4.5 on my 6800 XT. It kind of works (I guess); the latest TensorFlow docker works fine.
Confirmed, the missing hip binary issue is resolved. Simple repro:

```python
import tensorflow as tf
from tensorflow.keras import Sequential

model = Sequential()
```

This now passes with GPU devices created.
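A slightly fuller sanity check along the same lines (a sketch, not an official test; it assumes a ROCm build of TensorFlow is installed) lists the visible GPUs and then actually runs a tiny model, since kernel compilation is where `hipErrorNoBinaryForGPU` tends to surface:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Empty list here means TensorFlow does not see any GPU at all.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)

# Running a forward pass forces kernel compilation; on an unsupported
# GPU this is where hipErrorNoBinaryForGPU would show up.
model = Sequential([Dense(8, activation="relu", input_shape=(4,)), Dense(1)])
model.compile(optimizer="adam", loss="mse")
print(model.predict(tf.zeros((2, 4))).shape)  # (2, 1)
```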
@ianferreira I've never used the dockers. Can you point me to instructions for that?
Yes, there is a rocm/tensorflow image. The run command is a bit messy, but here is the link: https://github.com/ROCmSoftwarePlatform/tensorflow-upstream I also confirmed it works on bare metal! Many thanks @ROCmSupport! After buying two RX 6800 cards at "scalper prices", I can now finally put them to use!
@ianferreira You've put some hope into me so i tried a clean reinstall just to be sure, but i'm getting "hipErrorNoBinaryforGPU" again on baremetal. Any hints? Edit: |
@aoolmay Make sure to uninstall the amdgpu-pro driver, then follow the steps to reinstall the new version. E.g. for Ubuntu 20.04:

```shell
sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/21.40/ubuntu/focal/amdgpu-install-21.40.40500-1_all.deb
sudo apt-get install ./amdgpu-install-21.40.40500-1_all.deb
sudo apt-get update
```

Reboot. You have to run the drun alias for Docker. If you're running without Docker, remember to add /opt/rocm-4.5.0/lib to your LD_LIBRARY_PATH.
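For the bare-metal case, a quick stdlib-only check (a sketch; the ROCm path is the one from the comment above and should be adjusted to your installed version) can confirm the library directory is actually on `LD_LIBRARY_PATH` before you launch TensorFlow:

```python
import os

# Path from the comment above; adjust to your installed ROCm version.
ROCM_LIB = "/opt/rocm-4.5.0/lib"

def rocm_lib_on_path(env=os.environ):
    """Return True if the ROCm library directory is on LD_LIBRARY_PATH."""
    paths = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return ROCM_LIB in paths

if __name__ == "__main__":
    if not rocm_lib_on_path():
        print(f"Warning: {ROCM_LIB} is not on LD_LIBRARY_PATH; "
              "bare-metal TensorFlow may fail to find the HIP runtime.")
```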
@zrzrv5 Does PyTorch work yet?
```
(tf2) ian@ian-TRX40-AORUS-PRO-WIFI:~/Documents$ /home/ian/.venvs/tf2/bin/python /home/ian/Documents/tensorflow/mnist.py
```
@ianferreira Thanks for the previous comment; I'm fine with using Docker now that I've learned about it. I'll get around to finding out what's wrong with my bare-metal setup in time. It's working, which is all I cared about. About that example, though: two GPUs are shown. Are both used by the process automatically? I'm guessing it's just debugging/info output, but maybe there's progress I missed.
I have not gotten Docker or bare metal to work with PyTorch. Same hipErrorNoBinaryForGpu error.
The script I used did not do multi-GPU; let me try it and make sure RCCL works...
Just used MirroredStrategy, and it seems both GPUs are working...
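For anyone wanting to reproduce this, a minimal `MirroredStrategy` sketch looks like the following (an illustration, not the exact script used above; it assumes a working TensorFlow install and falls back to a single CPU replica if no GPU is visible):

```python
import tensorflow as tf

# MirroredStrategy replicates the model onto every visible GPU.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Variables created inside the scope are mirrored across replicas.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer="sgd", loss="mse")

# model.fit() then splits each batch across the replicas automatically.
x = tf.random.normal((32, 4))
y = tf.random.normal((32, 1))
model.fit(x, y, epochs=1, verbose=0)
```

With two RX 6800s visible, `num_replicas_in_sync` should report 2.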
Seems both are working... but not 100%, given I don't have an "NVLINK" equivalent. rocm-smi output (the table rows did not survive the paste; only the section headers remain):

```
ian@ian-TRX40-AORUS-PRO-WIFI:~/Documents$ rocm-smi
======================= ROCm System Management Interface =======================
============================ Hops between two GPUs =============================
========================== Link Type between two GPUs ==========================
================================== Numa Nodes ==================================
```
@ianferreira Really appreciate your input. I, and probably quite a few other people, would have missed the Navi-enabled docker. Thanks, man!
Hi all,

Is there an ETA on when the W6800 will be supported by ROCm? We'd like to trial ROCm for our ML applications, but Vega chips are getting more and more difficult to find. Instinct chips are not an option at this point, as we would only be trialing the hardware, not investing yet.