CUDA error during training #6
Comments
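./config/accelerate_config/default_config.yaml specifies single-machine training on 8 GPUs. Check whether you actually have that many GPUs.
Does it really need that many GPUs? What can I do if I don't have that many resources?
It doesn't have to be 8 GPUs. Set it to however many GPUs you have.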
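The advice above can be sketched as a small check. This is only an illustration, not the repository's own tooling: it assumes the config stores the process count under the top-level `num_processes` key (as accelerate config files normally do), and the helper names `read_num_processes` and `fit_num_processes` are hypothetical. The GPU count is passed in explicitly; on a real machine it could come from `torch.cuda.device_count()`.

```python
import re
from typing import Optional

# Pattern for a top-level "num_processes: <int>" line (assumed key name).
_NUM_PROCESSES = re.compile(r"^num_processes:[ \t]*(\d+)[ \t]*$", flags=re.MULTILINE)


def read_num_processes(config_text: str) -> Optional[int]:
    """Return the value of `num_processes`, or None if the key is absent."""
    match = _NUM_PROCESSES.search(config_text)
    return int(match.group(1)) if match else None


def fit_num_processes(config_text: str, available_gpus: int) -> str:
    """Lower `num_processes` to the number of GPUs that are actually present.

    The text is returned unchanged when the key is missing or the requested
    count already fits.
    """
    requested = read_num_processes(config_text)
    if requested is None or requested <= available_gpus:
        return config_text
    # Keep at least one process so the config stays valid.
    target = max(available_gpus, 1)
    return _NUM_PROCESSES.sub(f"num_processes: {target}", config_text)


if __name__ == "__main__":
    # Hypothetical config excerpt; the real file may contain more keys.
    sample = "compute_environment: LOCAL_MACHINE\nnum_processes: 8\nmixed_precision: fp16\n"
    print(fit_num_processes(sample, available_gpus=2))
```

Editing the YAML by hand works just as well; the point is only that `num_processes` should not exceed the number of GPUs the machine exposes.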
Traceback (most recent call last):
I did not run into this problem when training. Vision-CAIR/MiniGPT-4#237 (comment) may be a useful reference.
! sh scripts/finetune_model_TaiyiXL_data_catwoman.sh
Running this script produces a CUDA error, and specifying the device does not help. Why?
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
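"invalid device ordinal" generally means a process asked for a GPU index that does not exist among the devices visible to it. One common way this happens even after "specifying the device" is that `CUDA_VISIBLE_DEVICES` renumbers the visible GPUs from 0. The sketch below models that renumbering under simplifying assumptions: only integer IDs are handled (not UUIDs), and the helper names are hypothetical. It does not query any real hardware.

```python
from typing import List, Optional


def visible_physical_ids(env_value: Optional[str], physical_gpu_count: int) -> List[int]:
    """Physical GPU IDs exposed to a process for a given CUDA_VISIBLE_DEVICES value.

    Simplified model: integer IDs only. Parsing stops at the first invalid
    entry, so entries after it are not visible.
    """
    if env_value is None:
        # Variable unset: every physical GPU is visible.
        return list(range(physical_gpu_count))
    ids: List[int] = []
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_gpu_count:
            break
        ids.append(int(token))
    return ids


def is_valid_ordinal(ordinal: int, env_value: Optional[str], physical_gpu_count: int) -> bool:
    """True if `cuda:<ordinal>` would refer to a visible device."""
    return 0 <= ordinal < len(visible_physical_ids(env_value, physical_gpu_count))


if __name__ == "__main__":
    # With CUDA_VISIBLE_DEVICES="2,3" on a 4-GPU machine, the valid ordinals
    # are 0 and 1, so asking for ordinal 2 would fail.
    print(is_valid_ordinal(1, "2,3", 4), is_valid_ordinal(2, "2,3", 4))
```

Under this model, a launcher configured for 8 processes on a machine with fewer visible GPUs will request ordinals that fail this check, which is consistent with the error reported here.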