forked from bitsandbytes-foundation/bitsandbytes
-
Notifications
You must be signed in to change notification settings - Fork 12
Closed
Labels
Description
System Info
Working on a kubernetes deployment with debian + pytorch 2.4.0 + ROCm 6.1.
The deployment is using the multiple backend alpha release available in the parent bitsandbytes repo.
Reproduction
Trying to load a model with bitsandbytes fails because there is no access to rocminfo.
def get_rocm_gpu_arch() -> str:
logger = logging.getLogger(__name__)
try:
if torch.version.hip:
result = subprocess.run(["rocminfo"], capture_output=True, text=True)
match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", result.stdout)
ERROR:bitsandbytes.cuda_specs:Could not detect ROCm GPU architecture: [Errno 2] No such file or directory: 'rocminfo'
WARNING:bitsandbytes.cuda_specs:
ROCm GPU architecture detection failed despite ROCm being available.
bitsandbytes/bitsandbytes/cuda_specs.py
Line 54 in 4aad810
| result = subprocess.run(["rocminfo"], capture_output=True, text=True) |
Expected behavior
I would prefer if I could set the architecture via an environment variable and rocminfo would be the fallback option if the env var is not set.
Here is the related cope snippet.
Happy to work on this if other people feel it is a good workaround.
Reactions are currently unavailable