for cc in ccs:
    num = cc[0] + cc[1].split('+')[0]
    args.append(f'-gencode=arch=compute_{num},code=sm_{num}')
    if cc[1].endswith('+PTX'):
        args.append(f'-gencode=arch=compute_{num},code=compute_{num}')
adds -gencode= flags for the current devices. As
if "TORCH_CUDA_ARCH_LIST" in os.environ:
    torch_arch_list = os.environ.get("TORCH_CUDA_ARCH_LIST")
    os.environ["TORCH_CUDA_ARCH_LIST"] = ""
clears $TORCH_CUDA_ARCH_LIST, PyTorch will compile for the current device(s) and add the appropriate flags again.
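To make the duplication concrete, here is a standalone sketch of the flag-generation loop above. It assumes each entry in ccs is a compute capability already split on '.', e.g. '8.0+PTX' becomes ['8', '0+PTX'], which matches the cc[0]/cc[1] indexing in the snippet; the gencode_flags helper name is my own, not DeepSpeed's:

```python
# Standalone reproduction of the -gencode flag-generation loop.
# Assumption: ccs entries are pre-split capabilities like ['8', '0+PTX'].
def gencode_flags(ccs):
    args = []
    for cc in ccs:
        # '8' + '0' -> '80'
        num = cc[0] + cc[1].split('+')[0]
        args.append(f'-gencode=arch=compute_{num},code=sm_{num}')
        # '+PTX' additionally requests PTX output for forward compatibility
        if cc[1].endswith('+PTX'):
            args.append(f'-gencode=arch=compute_{num},code=compute_{num}')
    return args

print(gencode_flags([['8', '0+PTX']]))
# -> ['-gencode=arch=compute_80,code=sm_80',
#     '-gencode=arch=compute_80,code=compute_80']
```

Since PyTorch emits the same pair of flags for the same device when TORCH_CUDA_ARCH_LIST is empty, each arch ends up on the command line twice.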
This results in an overly long command line with duplicated flags:
nvcc [...] -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 [...] -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 [...]/deepspeed/ops/csrc/fp_quantizer/fp_quantize_impl.cu -o fp_quantize_impl.cuda.o
Depending on how nvcc handles the repetition, this may lead to redundant compilation.
A better approach would be to not pass CUDA compute-capability flags at all, and instead set TORCH_CUDA_ARCH_LIST appropriately and let PyTorch generate the flags itself. This also avoids the warning:
UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
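A minimal sketch of that suggestion, assuming the capabilities are available in human-readable form (e.g. '8.0+PTX'); the set_arch_list helper is hypothetical, not existing DeepSpeed code. TORCH_CUDA_ARCH_LIST accepts a semicolon-separated list of capabilities, which torch.utils.cpp_extension then turns into -gencode flags exactly once per arch:

```python
# Hypothetical sketch: publish the desired archs via TORCH_CUDA_ARCH_LIST
# instead of emitting -gencode flags and clearing the variable.
import os

def set_arch_list(ccs):
    # ccs as human-readable capabilities, e.g. ['8.0+PTX', '9.0']
    os.environ['TORCH_CUDA_ARCH_LIST'] = ';'.join(ccs)

set_arch_list(['8.0+PTX'])
print(os.environ['TORCH_CUDA_ARCH_LIST'])  # -> 8.0+PTX
```

With this, PyTorch's extension builder owns the flag generation, so no duplication can arise and the UserWarning is suppressed.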
(Snippets above: DeepSpeed/op_builder/builder.py, lines 667 to 671 and lines 571 to 573, at 0ba2352.)