[REQUEST] How can one specify the CPU architecture to target. #5451

etiennemlb · 2024-04-23T13:58:29Z

Is your feature request related to a problem? Please describe.
I work on a supercomputer where there are login nodes and compute nodes. This architecture is typical in the HPC world.
Login nodes are where you land with ssh, and often where you compile, prepare your environment and launch compute tasks (via SLURM).
There is no guarantee that the login node offers the same GPUs (if any) and CPUs as the ones you'll find on the compute nodes.

How can one specify the CPU architecture for which the CPU ops are built?
There is already PYTORCH_ROCM_ARCH for the GPUs; I'd say we need something similar for the CPU too, instead of -march=native.
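
To make the request concrete, here is a minimal sketch of what such an injection point could look like in a build script. The environment variable name DS_CPU_ARCH and the helper cpu_arch_flag are hypothetical, invented for illustration; they are not an existing DeepSpeed option. The sketch only assumes a GCC/Clang-style -march= flag.

```python
import os


def cpu_arch_flag(env=None, var="DS_CPU_ARCH"):
    """Return the -march flag to pass to the compiler.

    `var` (DS_CPU_ARCH) is a hypothetical variable name, not a real
    DeepSpeed setting. If it is unset or empty, fall back to the
    current behavior, -march=native.
    """
    env = os.environ if env is None else env
    arch = env.get(var, "").strip()
    return f"-march={arch or 'native'}"


# Example: building on a login node for compute nodes with Zen 3 CPUs.
print(cpu_arch_flag({"DS_CPU_ARCH": "znver3"}))  # -march=znver3
print(cpu_arch_flag({}))                         # -march=native
```

With something along these lines, the default stays unchanged for users who build and run on the same machine, while heterogeneous clusters can pin the target explicitly.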


loadams · 2024-04-23T16:08:48Z

@etiennemlb - are you able to use JIT compilation? That would then use the arch of the compute nodes. Or do you need to prebuild the ops for some reason?

etiennemlb · 2024-04-23T17:14:24Z

Clearly, I could, and will use the "JIT" method.

Note that I used a prebuilt DS, and Megatron was failing somewhere when it was looking for the prebuilt ops. It was crashing "hard": no warning, no error, just a -4 return code.
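
As a side note on that -4: when a return code is reported in the Python subprocess convention, a negative value -N means the process was killed by signal N, and signal 4 on Linux is SIGILL (illegal instruction). That would be consistent with ops built with -march=native on one CPU and run on another, though the thread does not confirm the cause. A small sketch for decoding such codes (describe_returncode is an illustrative helper, not part of any library):

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Translate a subprocess-style return code into a readable string."""
    if returncode >= 0:
        return f"exited with status {returncode}"
    signum = -returncode
    try:
        # signal.Signals maps a signal number to its symbolic name.
        name = signal.Signals(signum).name
    except ValueError:
        name = f"unknown signal {signum}"
    return f"killed by {name}"


print(describe_returncode(-4))  # on Linux: killed by SIGILL
```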

If -march=native has stood the test of time, it probably means I'm an outlier, but at the very least I'd like a warning telling me I'm running on an architecture that supports fewer SIMD extensions than the one the ops were built for. And, preferably, an injection point to specify which arch I'd like to build for.

Compiling on compute nodes is also seen as bad practice, and software that relies on -march=native is seen as a footgun because it breaks easily and complicates things on heterogeneous clusters.
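
The kind of warning asked for above could be sketched as a comparison between the CPU flags of the build machine and those of the machine running the ops. This assumes Linux x86, where /proc/cpuinfo exposes a "flags" line; the helper names and the list of watched flags are illustrative choices, not DeepSpeed code.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_simd_flags(build_flags, run_flags,
                       watched=("avx", "avx2", "avx512f", "fma")):
    """List watched flags present at build time but absent at run time."""
    return sorted(f for f in watched if f in build_flags and f not in run_flags)


# Sample inputs standing in for the login node and a compute node.
build = parse_cpu_flags("flags\t\t: fpu sse2 avx avx2 avx512f fma")
run = parse_cpu_flags("flags\t\t: fpu sse2 avx avx2 fma")
missing = missing_simd_flags(build, run)
if missing:
    print("warning: ops were built with extensions this CPU lacks:", missing)
```

On a real system, the run-time text would come from reading /proc/cpuinfo on the compute node, and the build-time flags would have to be recorded when the ops are compiled.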

etiennemlb added the "enhancement" (New feature or request) label on Apr 23, 2024
