RuntimeError: Input type (c10::Half) and bias type (float) should be the same

First of all, congrats on the great work!

I hit this error partway through training on a T4 GPU in Colab:

```
***** Running training *****
  Num examples = 16
  Num batches each epoch = 16
  Num Epochs = 938
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 1
  Gradient Accumulation steps = 1
  Total optimization steps = 15000
Steps:  53% 8000/15000 [1:06:51<59:27,  1.96it/s, loss=0.215, lr=0.0001]
Fetching 12 files: 100% 12/12 [00:00<00:00, 9088.42it/s]
Steps:  53% 8000/15000 [1:06:54<59:27,  1.96it/s, loss=0.831, lr=0.0001]Traceback (most recent call last):
  File "train_lora_dreambooth.py", line 964, in <module>
    main(args)
  File "train_lora_dreambooth.py", line 864, in main
    model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unet_2d_condition.py", line 375, in forward
    sample = self.conv_in(sample)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
Steps:  53% 8000/15000 [1:06:55<58:33,  1.99it/s, loss=0.831, lr=0.0001]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_lora_dreambooth.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base', '--instance_data_dir=./data_example', '--output_dir=./output_example', '--instance_prompt=ghblx style', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=1e-4', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=15000', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_checkpointing']' returned non-zero exit status 1.
```

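Training works fine with fewer steps. In this run the crash happens at step 8000 of 15000, right after the `Fetching 12 files` line in the log, in the `unet` forward pass at `conv_in`. I'm not sure why this is happening.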
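For reference, here is a minimal sketch that reproduces the same kind of dtype mismatch outside the training script. It is not taken from `train_lora_dreambooth.py`. The layer sizes, the helper name `dtype_mismatch`, and the tensor shapes are assumptions made up for illustration. It only shows that a `Conv2d` with float32 parameters rejects a half-precision input when no autocast context is active, and that casting the input to the parameter dtype avoids the error. Whether that is what happens at step 8000 here is unconfirmed.

```python
import torch
import torch.nn as nn

# Stand-in for a layer like conv_in: parameters are float32 by default.
conv = nn.Conv2d(4, 8, kernel_size=3, padding=1)

# Stand-in for half-precision latents (shape is arbitrary).
x = torch.randn(1, 4, 16, 16).half()


def dtype_mismatch(module, tensor):
    """Hypothetical helper: names of parameters whose dtype differs from the input."""
    return [name for name, p in module.named_parameters() if p.dtype != tensor.dtype]


mismatched = dtype_mismatch(conv, x)
print("mismatched parameters:", mismatched)

# Without autocast, the forward pass refuses mixed dtypes.
try:
    conv(x)
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# Casting the input to the parameter dtype lets the call go through.
out = conv(x.to(conv.weight.dtype))
print(out.dtype, tuple(out.shape))
```

Printing the dtypes of `noisy_latents` and of the `unet.conv_in` parameters just before the failing call would show whether the same mismatch is present in the real run.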
