Multiple GPU Lora training is not working. #838

Closed
leonary opened this issue Sep 28, 2023 · 1 comment

Comments


leonary commented Sep 28, 2023

I have successfully configured two A40 graphics cards for LoRA training. During training, both cards are utilized, but the training speed does not improve significantly: the time required is almost the same as with a single card, and the number of epochs increases from 1 to 2. Furthermore, the results achieved with two cards (the capability shown by the trained LoRA) are even worse than those obtained with a single card.
I would like to know whether it is possible to accelerate LoRA training using multiple cards, and if so, what I should do. Apart from setting the Accelerate config, are there any additional steps required?


kohya-ss commented Oct 1, 2023

In multi-GPU training, a single step processes the number of images multiplied by the GPU count, so each epoch finishes in proportionally fewer steps. It is therefore recommended to set --max_train_epochs so that the total amount of training matches the single-GPU run.

As for the LoRA result, I think it may have been overfitted by the multi-GPU training.
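
To put that arithmetic in concrete terms, here is a rough sketch (the dataset size and batch size below are hypothetical values chosen only for illustration, not taken from this issue):

```python
# Effective-batch arithmetic for multi-GPU training, as described above.
# dataset_images and per_device_batch_size are hypothetical example values.
dataset_images = 1000
per_device_batch_size = 4

for num_gpus in (1, 2):
    # Each GPU processes its own batch in every step, so a single step
    # covers per_device_batch_size * num_gpus images.
    images_per_step = per_device_batch_size * num_gpus
    steps_per_epoch = dataset_images // images_per_step
    print(f"{num_gpus} GPU(s): {images_per_step} images/step, {steps_per_epoch} steps/epoch")

# With 2 GPUs an epoch finishes in half the steps, so training for the same
# number of epochs performs half the optimizer updates of the single-GPU run.
# Raising --max_train_epochs accordingly restores the same total amount of training.
```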

leonary closed this as completed Oct 9, 2023