Baichuan fine-tune.py bug #5393

Open
Mac-hawk opened this issue Apr 10, 2024 · 1 comment
Labels
bug (Something isn't working), deepspeed-chat (Related to DeepSpeed-Chat)

Comments

@Mac-hawk

When I try to fine-tune Baichuan, I pass some arguments and get the errors below.

[real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/sunmy/anaconda3/envs/gra/lib/python3.9/site-packages/torch/cuda/__init__.py:141: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
Unable to find hostfile, will proceed with training with local resources only.
/home/sunmy/anaconda3/envs/gra/lib/python3.9/site-packages/torch/cuda/__init__.py:628: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/home/sunmy/anaconda3/envs/gra/bin/deepspeed", line 6, in <module>
    main()
  File "/home/sunmy/anaconda3/envs/gra/lib/python3.9/site-packages/deepspeed/launcher/runner.py", line 418, in main
    raise RuntimeError("Unable to proceed, no GPU resources available")
RuntimeError: Unable to proceed, no GPU resources available

Mac-hawk added the bug and deepspeed-chat labels on Apr 10, 2024
@xuanhua

xuanhua commented Apr 29, 2024

Error 803: system has unsupported display driver / cuda driver combination

It looks like your GPU driver version and your CUDA version do not match. You should fix that first, since DeepSpeed cannot see any GPU until CUDA initializes correctly.
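One way to check for such a mismatch is to compare the driver version reported by `nvidia-smi` with the CUDA version your PyTorch build was compiled against. The snippet below is only a rough sketch: the helper name `cuda_diagnostics` is made up for illustration, and it assumes `nvidia-smi` is on your `PATH` and that PyTorch is installed in the active environment (it reports `None` for anything it cannot find).

```python
import importlib.util
import shutil
import subprocess


def cuda_diagnostics():
    """Collect the driver version and the CUDA version PyTorch was built with.

    Hypothetical helper for illustration; not part of DeepSpeed or PyTorch.
    """
    info = {"driver_version": None, "torch_cuda_build": None, "cuda_available": None}

    # Driver side: ask nvidia-smi for the installed driver version, if present.
    if shutil.which("nvidia-smi"):
        try:
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            # On a broken driver install nvidia-smi prints its error instead.
            info["driver_version"] = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.TimeoutExpired) as exc:
            info["driver_version"] = f"nvidia-smi failed: {exc}"

    # PyTorch side: the CUDA version of the installed build and whether
    # CUDA initializes at all (False when Error 803 is raised).
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch_cuda_build"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()

    return info


if __name__ == "__main__":
    for key, value in cuda_diagnostics().items():
        print(f"{key}: {value}")
```

If `cuda_available` is `False` while `torch_cuda_build` is set, the driver is likely too old for that CUDA version or its kernel module does not match the installed user-space libraries; updating the driver (or rebooting after a driver update) is usually the first thing to try.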
