Is your feature request related to a problem? Please describe.
I work on a supercomputer that has separate login and compute nodes. This architecture is typical in the HPC world.
Login nodes are where you land with ssh, and often where you compile, prepare your environment, and launch compute tasks (via SLURM).
There is no guarantee that the login node offers the same GPUs (if any) and CPUs as the ones you'll find on the compute nodes.
How can one specify the CPU architecture for which the CPU ops are built?
There is already PYTORCH_ROCM_ARCH for the GPUs; I'd say we need something similar for the CPU, instead of -march=native.
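To make the request concrete, here is a minimal sketch of what such an override could look like on the build side. The environment variable name `CPU_ARCH_OVERRIDE` and the helper `cpu_arch_flag` are hypothetical, invented for illustration; they are not an existing option of any library. The idea is only that the build falls back to `native` when nothing is set.

```python
import os


def cpu_arch_flag(env=None, var="CPU_ARCH_OVERRIDE", default="native"):
    """Return the -march flag to pass to the compiler.

    `var` is a hypothetical environment variable name used for
    illustration only. When it is unset or empty, the current
    behaviour (-march=native) is kept.
    """
    env = os.environ if env is None else env
    arch = env.get(var, "").strip() or default
    return f"-march={arch}"


if __name__ == "__main__":
    # On a login node, one would export the compute node's target, e.g.
    #   CPU_ARCH_OVERRIDE=znver3 python this_script.py
    print(cpu_arch_flag())
```

With something like this, building on a login node for a different compute-node CPU would only require exporting the variable before the build, much like PYTORCH_ROCM_ARCH does for GPUs.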
Note that I used a prebuilt DS, and megatron was failing somewhere while it was looking for the prebuilt ops. It crashed "hard": no warning, no error, just a -4 return code.
If -march=native has stood the test of time, it probably means I'm an outlier. But at the very least, I'd like a warning telling me that I'm running on an architecture that supports fewer SIMD extensions than the one the ops were built for. Preferably, I'd also like an injection point to specify which arch to build for.
Compiling on compute nodes is also seen as bad practice, and software that relies on -march=native is seen as a footgun, because it breaks easily and complicates things on heterogeneous clusters.
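The kind of warning I have in mind could be sketched roughly as follows. This is an assumption-laden illustration, not existing behaviour: the function names are hypothetical, the list of required extensions would have to be recorded at build time, and the `flags` line of `/proc/cpuinfo` is specific to Linux on x86 (other platforms expose this differently).

```python
import warnings


def parse_cpu_flags(cpuinfo_text):
    """Extract the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_simd(required, cpuinfo_text):
    """Return the required extensions that the CPU does not advertise."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


def warn_if_unsupported(required, cpuinfo_path="/proc/cpuinfo"):
    """Warn when the running CPU lacks extensions the ops were built with."""
    try:
        with open(cpuinfo_path) as f:
            text = f.read()
    except OSError:
        # No cpuinfo available (non-Linux system): nothing to check.
        return []
    missing = missing_simd(required, text)
    if missing:
        warnings.warn(
            "CPU ops were built with extensions this CPU does not "
            "advertise: " + ", ".join(missing)
        )
    return missing


if __name__ == "__main__":
    # Hypothetical list of extensions recorded when the ops were built.
    warn_if_unsupported(["avx2", "avx512f"])
```

Running a check like this when the prebuilt ops are loaded would turn a silent crash into an explicit message about the mismatch between build host and run host.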