Not using GPU calculation with DoRA #160
Comments
I can confirm this is running on the GPU.
```
│ C:\Users\te\anaconda3\envs\kohya_ss\lib\site-packages\lycoris_lora-2.2.0.dev4-py3.10.egg\lyco │
```

I got this error when I tried this command:

```
accelerate launch --num_cpu_threads_per_process=2 C:/Users/te/ML/kohya_ss/sdxl_train_network.py --pretrained_model_name_or_path="E:\diffusion-models\Stable-diffusion\mixgirl.safetensors" --train_data_dir="C:\Users\te\ML\diffusion-benchmark\temp\pl9ia1o0" --resolution=1024,1024 --output_dir="E:/diffusion-models/Lora\test16" --logging_dir="C:/Users/te/ML/kohya_ss/logs" --network_alpha=1 --save_model_as=safetensors --network_module=lycoris.kohya --network_args "algo=lora" "dora_wd=True" --text_encoder_lr=0.0004 --unet_lr=0.0004 --network_dim=64 --output_name=woman_young_64rank --lr_scheduler_num_cycles=8 --no_half_vae --learning_rate=0.0004 --lr_scheduler=constant --train_batch_size=1 --max_train_steps=2000 --save_every_n_epochs=1 --mixed_precision=fp16 --save_precision=fp16 --optimizer_type=Adafactor --optimizer_args scale_parameter=False relative_step=False warmup_init=False --max_data_loader_n_workers=0 --bucket_reso_steps=64 --seed=1234 --gradient_checkpointing --full_fp16 --xformers --bucket_no_upscale --noise_offset=0.0 --lowram --cache_latents --cache_latents_to_disk
```

The command works when I remove the "dora_wd=True" option.
I guess this is caused by --lowvram.
I changed the code, and this is working for me.
This is not how it works.
I tried with --lowvram but I got the same issue.
No
I apologize for the typo. I proceeded without using the --lowvram option, but encountered an issue where the device type does not match. If I add "dora_wd=True", as in --network_module="lycoris.kohya" --network_args "algo=lora" "dora_wd=True", an error occurs. However, if I remove "dora_wd=True", the error does not occur.
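For context, this class of PyTorch error appears whenever an operation mixes a CPU tensor with a CUDA tensor. A minimal sketch (not from the thread, assuming a CUDA device is available; the tensor names just mirror the discussion):

```python
import torch

weight = torch.randn(4, 4, device="cuda")  # model weight moved to the GPU
dora_scale = torch.randn(1, 4)             # auxiliary tensor left on the CPU

# RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cuda:0 and cpu!
out = weight * dora_scale
```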
@SlZeroth Should be solved in 2.2.0.dev7.
Will close this issue on the weekend if no reply.
This issue was resolved in version 2.2.0.dev7, but the same error has reoccurred in versions released after 2.2.0.dev8.
Thx for the info.
@KohakuBlueleaf thank you for checking!
@KohakuBlueleaf thank you, I checked the latest version and it works fine!
Hi KohakuBlueleaf, the recent changes to the `apply_weight_decompose` function might have caused the "Expected all tensors to be on the same device" issue to resurface. The `dora_scale` in the return value of this function might be running on the CPU:

```python
def apply_weight_decompose(self, weight):
    weight_norm = (
        weight.transpose(0, 1)
        .reshape(weight.shape[1], -1)
        .norm(dim=1, keepdim=True)
        .reshape(weight.shape[1], *[1] * self.dora_norm_dims)
        .transpose(0, 1)
    )
    return weight * (self.dora_scale / weight_norm)
```

After making some test adjustments to the "lycoris/modules/locon.py" file, I was able to run it successfully, but I'm not sure these changes are entirely correct:

```python
def make_weight(self, device=None):
    ...  # unchanged code elided
    if self.wd and self.dora_scale.device != weight.device:
        # print("self.dora_scale.device:", self.dora_scale.device)  # -> cpu
        # print("weight.device:", weight.device)                    # -> cuda:0
        self.dora_scale = self.dora_scale.to(weight.device)
    return weight * self.scalar.to(device)
```

Could you please help verify this? Thank you.
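A common PyTorch pattern for avoiding this class of bug is to register auxiliary tensors as buffers (or parameters) so that `module.to(device)` moves them together with the weights, rather than reassigning them per call. A minimal sketch with a hypothetical stand-in class, not LyCORIS's actual code:

```python
import torch
import torch.nn as nn

class DoraLayer(nn.Module):
    """Hypothetical stand-in module, for illustration only."""

    def __init__(self, out_features: int, in_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        # Registered as a buffer (or nn.Parameter, if it should be trained),
        # so .to()/.cuda() on the module moves dora_scale along with weight.
        self.register_buffer("dora_scale", torch.ones(1, in_features))

layer = DoraLayer(3, 4)
if torch.cuda.is_available():
    layer = layer.to("cuda")
print(layer.weight.device, layer.dora_scale.device)  # devices always match
```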
```
--network_module="lycoris.kohya"
--network_args "algo=lora" "dora_wd=True"
```

Hello. I used LyCORIS to train a DoRA with the options above, but training is really slow and I got a warning in this code. I think `apply_weight_decompose` uses CPU calculation. How can I fix it?
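One way to check for tensors stranded on the CPU is to walk the module's parameters and buffers before training starts. A small diagnostic sketch; `find_cpu_tensors` is a hypothetical helper and `network` is just a placeholder for whatever module you load, not an actual LyCORIS name:

```python
import torch
import torch.nn as nn

def find_cpu_tensors(module: nn.Module) -> None:
    """Print every parameter and buffer that still lives on the CPU."""
    for name, t in list(module.named_parameters()) + list(module.named_buffers()):
        if t.device.type == "cpu":
            print(f"{name}: {t.device}")

# Example: find_cpu_tensors(network) after calling network.to("cuda")
```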