TESLA V100 Training LORA not use 100%GPU #2154

Closed
littleyeson opened this issue Mar 25, 2024 · 2 comments

Comments

@littleyeson

littleyeson commented Mar 25, 2024

During LoRA training, my GPU utilization stays at around 30-40%, and power draw relative to TDP is similarly low. Training is extremely slow. How can I resolve this? Additionally, my machine has two GPUs, a V100 and a 2080 Ti 22G, but training always defaults to the V100 and I am unable to select the other one.
[Two screenshots attached: WeChat image 20240325152246, WeChat image 20240325152226]

@littleyeson
Author

I solved the GPU selection problem. Run setup.bat, choose option 4 (accelerate config), and at the final prompt, where it asks which GPUs to use, enter the GPU IDs instead of 'all'. For example, if there are two GPUs numbered 0 and 1, enter '0,1'.
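For anyone who wants to pin GPUs outside the setup.bat menu, here is a minimal Python sketch of a different but related technique: restricting the visible GPUs through the `CUDA_VISIBLE_DEVICES` environment variable, using the same 'all' or '0,1' input style as the prompt above. The helper name `env_for_gpus` is hypothetical and this is not how accelerate config stores the setting; it only illustrates the idea.

```python
import os


def env_for_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment restricted to the given GPU IDs.

    gpu_ids is either "all" or a comma-separated string such as "0,1".
    (Hypothetical helper, not part of any training tool.)
    """
    env = dict(os.environ if base_env is None else base_env)
    text = gpu_ids.strip().lower()
    if text == "all":
        # No restriction: let CUDA see every installed GPU.
        env.pop("CUDA_VISIBLE_DEVICES", None)
        return env
    ids = [part.strip() for part in text.split(",") if part.strip()]
    if not ids or not all(part.isdigit() for part in ids):
        raise ValueError(f"expected 'all' or IDs like '0,1', got {gpu_ids!r}")
    # CUDA only exposes the listed devices to the process.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(ids)
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching a training command, for example `env_for_gpus("1")` to use only the second GPU. Note that the numbering CUDA uses may not match the order shown by other tools, so it is worth checking which ID corresponds to which card.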

@littleyeson
Author

I found that using the AdamW8bit optimizer and a larger batch size (as large as possible without running out of GPU memory) raises the GPU load to about 70-80%.
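One way to pick the "larger batch size" is to keep doubling it until a trial run no longer fits in GPU memory. The sketch below assumes a caller-supplied function `fits(batch_size)` that returns True when a short trial run succeeds; both `fits` and `largest_fitting_batch_size` are hypothetical names for illustration, not part of any training tool.

```python
def largest_fitting_batch_size(fits, start=1, limit=64):
    """Double the batch size while fits(size) is True.

    Returns the last size that fit, or 0 if even `start` did not fit.
    `fits` is a hypothetical callback, e.g. one that runs a single
    training step and reports whether it ran out of GPU memory.
    """
    best = 0
    size = start
    while size <= limit and fits(size):
        best = size
        size *= 2
    return best
```

For example, if trial runs succeed up to a batch size of 6, the doubling search tries 1, 2, 4, and 8 and settles on 4. A finer search between 4 and 8 could squeeze out more, at the cost of extra trial runs.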

@bmaltais bmaltais closed this as completed May 1, 2024