cifar10 - multi GPU training #162
@D1abloRUS please refer to https://github.com/RadeonOpenCompute/ROCm#supported-cpus . The RX470 is of the GFX8 family, so we don't support it on x1 PCIe yet.
@whchung Hmm, OK. What about gfx7xx?
Unfortunately the Hawaii (GFX7) family is not on the roadmap. Quite a few DNN algorithms in
What about x8?
x8 should work. Please check:
/cc @jlgreathouse to confirm the supported hardware list.
Hi @D1abloRUS. When you say "x8", "x1", etc., the main question is how these GPUs are connected to your CPU. In particular, gfx8 GPUs require PCIe Gen 3 atomics at every step between the CPU and the GPUs.

Many people running multiple GPUs through x1 lanes are using PCIe switches to split off multiple ports from a single port. One of the major impediments here is that your PCIe switches must know how to properly forward PCIe atomic commands. Note that this is true for "x1" or "x8": if your "PCIe x8" solution also has a switch between your CPU and your GPU(s), you will need to make sure this switch properly handles atomics as well. Towards that end, I'll ask:
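One way to check the atomics requirement described above is to look for the `AtomicOpsCap` field that `lspci -vvv` prints (from pciutils) for every port on the path from the CPU to the GPU. A minimal sketch follows; the dump below is an invented sample, not output from this system — on a live machine you would run `sudo lspci -vvv > lspci.txt` instead:

```shell
# Save a sample lspci dump (illustrative devices, not from this thread).
cat > lspci.txt <<'EOF'
00:01.0 PCI bridge: Example Root Port
    DevCap2: Completion Timeout: Range ABCD, AtomicOpsCap: Routing+ 32bit+ 64bit+ 128bitCAS+
01:00.0 VGA compatible controller: Example GPU
    DevCap2: Completion Timeout: Not Supported, AtomicOpsCap: 32bit+ 64bit+ 128bitCAS+
EOF

# Print each device header followed by its AtomicOpsCap line.
# Bridges/switches should report "Routing+"; otherwise atomics
# cannot be forwarded to GPUs behind them.
grep -E '^[0-9a-f]{2}:[0-9a-f]{2}\.[0-9]|AtomicOpsCap' lspci.txt
```

A bridge that shows `AtomicOpsCap: Routing-` (or no `AtomicOpsCap` at all) would be a red flag for gfx8 GPUs attached behind it.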
Thanks.
Closing this ticket as there is no more feedback.
This PR is a stepping stone towards supporting generic multi-store source loop nests in affine loop fusion. It extends the algorithm to support fusion of multi-store loop nests that:
1. have only one store that writes to a function-local live out, and
2. the remaining stores are involved in loop nest self dependences or no dependences within the function.

Closes #162

COPYBARA_INTEGRATE_REVIEW=tensorflow/mlir#162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
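The two legality conditions above can be sketched as a toy Python model (this is not the MLIR implementation; the data representation and field names are invented for illustration):

```python
# Toy model of the fusion-candidate test: a source loop nest is fusible
# when exactly one store writes to a function-local live-out buffer and
# every other store has only self dependences, or no dependences, within
# the function. Field names here are illustrative, not MLIR's.

def is_fusible(stores):
    """stores: list of dicts with keys:
       'writes_liveout' -> bool
       'dependence'     -> one of 'none', 'self', 'other'
    """
    liveout_stores = [s for s in stores if s["writes_liveout"]]
    if len(liveout_stores) != 1:          # condition 1: exactly one live-out store
        return False
    others = [s for s in stores if not s["writes_liveout"]]
    # condition 2: remaining stores have only self or no dependences
    return all(s["dependence"] in ("none", "self") for s in others)

# Example: one live-out store plus a self-dependent store -> fusible
print(is_fusible([
    {"writes_liveout": True, "dependence": "none"},
    {"writes_liveout": False, "dependence": "self"},
]))  # True
```

A nest with two live-out stores, or with a non-live-out store that depends on another nest, would return `False` under this model.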
Hi, I have 5 AMD cards on an X470 board. I run
python3 ./cifar10_multi_gpu_train.py --num_gpus=5
and only one card is visible, the one in the x16 PCIe slot. How can the other cards in x1 PCIe slots work?