
Training error: RuntimeError: For non-complex input tensors, argument alpha must not be a complex number. #18

Open
hosea7456 opened this issue Nov 2, 2021 · 7 comments


@hosea7456

hosea7456 commented Nov 2, 2021

Hi, thanks for your great work!
When I tried to train a model, I got the following error:


Traceback (most recent call last):
  File "so_run.py", line 51, in <module>
    main()
  File "so_run.py", line 43, in main
    trainer.train()
  File "/home/CCM/trainer/source_only_trainer.py", line 58, in train
    self.optim.step()
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/optim/sgd.py", line 110, in step
    F.sgd(params_with_grad,
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/optim/functional.py", line 180, in sgd
    param.add_(d_p, alpha=-lr)
RuntimeError: For non-complex input tensors, argument alpha must not be a complex number.


How should I fix this? Thank you.
The config I used for training is:


note: 'train'

# configs of data

model: 'deeplab'
train: True
multigpu: False
fixbn: True
fix_seed: True

# Optimizers

learning_rate: 7.5e-5
num_steps: 5000
epochs: 2
weight_decay: 0.0005
momentum: 0.9
power: 0.9
round: 6

# Logging

print_freq: 1
save_freq: 2000
tensorboard: False
neptune: False
screen: True
val: False
val_freq: 300

# Dataset

source: 'gta5'
target: 'cityscapes'
worker: 0
batch_size: 2

# Transforms
input_src: 720
input_tgt: 720
crop_src: 600
crop_tgt: 600
mirror: True
scale_min: 0.5
scale_max: 1.5
rec: False

# Model hypers

init_weight: './pretrained/DeepLab_resnet_pretrained_init-f81d91e8.pth'
restore_from: None

snapshot: './Data/snapshot/'
result: './miou_result/'
log: './log/'
plabel: './plabel'
gta5: {
data_dir: '/home/data/datasets/GTA5/',
data_list: './dataset/list/gta5_list.txt',
input_size: [1280, 720]
}
synthia: {
data_dir: '/home/guangrui/data/synthia/',
data_list: './dataset/list/synthia_list.txt',
input_size: [1280, 760]
}
cityscapes: {
data_dir: '/home/data/datasets/Cityscapes',
data_list: './dataset/list/cityscapes_train.txt',
input_size: [1024, 512]
}
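
For reference, I load this YAML roughly as in the sketch below (my own sketch, assuming PyYAML and easydict; the repo's actual loader and the config path may differ):

# Sketch of loading the config above (assumes PyYAML + easydict;
# the path './config/so_config.yaml' is a guess).
import yaml
from easydict import EasyDict

with open('./config/so_config.yaml') as f:
    args = EasyDict(yaml.safe_load(f))

print(args.learning_rate, args.epochs, args.gta5.data_dir)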

@Solacex
Owner

Solacex commented Nov 4, 2021

Hello,

Thanks for your interest in our work!
I tried to locate the problem you posted but could not reproduce it. I suspect the error is caused by the newer version of PyTorch, so using pytorch==1.7.0 may help.

Hope this helps.

@hosea7456
Author

Hello,

Thanks for your interest in our work! I tried to locate the problem you posted but could not reproduce it. I suspect the error is caused by the newer version of PyTorch, so using pytorch==1.7.0 may help.

Hope this helps.

Hi, thanks for your advice. I tried pytorch==1.7.0; the previous error disappeared, but another one appeared:

Traceback (most recent call last):
  File "so_run.py", line 51, in <module>
    main()
  File "so_run.py", line 43, in main
    trainer.train()
  File "/home/CCM/trainer/source_only_trainer.py", line 58, in train
    self.optim.step()
  File "/home/anaconda3/envs/torch1.7/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/home/anaconda3/envs/torch1.7/lib/python3.8/site-packages/torch/optim/sgd.py", line 112, in step
    p.add_(d_p, alpha=-group['lr'])
RuntimeError: value cannot be converted to type float without overflow: (2.10957e-06,-6.85442e-07)

I have no idea how to fix this one at all.

@Solacex
Owner

Solacex commented Nov 17, 2021

Hello,
As far as I can tell, it may be because the number of training steps exceeds the max steps of the learning-rate schedule.
You can check that.
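
A quick sketch of the suspected failure mode (assuming a DeepLab-style poly schedule; the repo's exact code may differ):

# Once i_iter > max_iter, (1 - i_iter / max_iter) is negative, and a
# negative base raised to a fractional power is complex in Python.
def lr_poly(base_lr, i_iter, max_iter, power):
    return base_lr * ((1 - i_iter / max_iter) ** power)

print(lr_poly(7.5e-5, 4999, 5000, 0.9))  # small positive float: fine
print(lr_poly(7.5e-5, 5200, 5000, 0.9))  # a complex number
# Assigning such a complex value to optimizer.param_groups[i]['lr'] makes
# SGD's param.add_(d_p, alpha=-lr) fail exactly as in the tracebacks above.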

@Jo-wang

Jo-wang commented Feb 9, 2022

Same error here, and I've tried increasing num_steps in so_config.yaml, but it didn't work. Could you provide the parameters you used to train the source-only model?
Thank you!

@Jo-wang

Jo-wang commented Feb 14, 2022

Hi, I solved this a few days ago. The error is caused by the fixed max number of steps used when adjusting the learning rate. You can check whether that is the case for you.
Cheers,
zx

@Hyx098130

Hi, I solved this a few days ago. The error is caused by the fixed max number of steps used when adjusting the learning rate. You can check whether that is the case for you. Cheers, zx

I also ran into this problem recently; could you elaborate on how to solve it? Thank you very much.

@Jo-wang

Jo-wang commented May 11, 2023

Hi, I solved this a few days ago. The error is caused by the fixed max number of steps used when adjusting the learning rate. You can check whether that is the case for you. Cheers, zx

I also ran into this problem recently; could you elaborate on how to solve it? Thank you very much.

Hi there,
Sorry for the late reply. The issue comes from an incorrect max step when adjusting the learning rate during optimization. Here is my version:

def adjust_learning_rate(optimizer, i_iter, len_loader, args):
    # Use the true training length (epochs * batches per epoch) as the max
    # step, so (1 - i_iter / max_iter) inside lr_poly never goes negative.
    lr = lr_poly(args.learning_rate, i_iter, args.epochs * len_loader, args.power)
    optimizer.param_groups[0]['lr'] = lr
    if len(optimizer.param_groups) > 1:
        # A second param group (e.g. the classifier head) runs at 10x lr.
        optimizer.param_groups[1]['lr'] = lr * 10
    return lr
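
In case it helps, this is roughly how I call it from the training loop (a sketch; the optimizer/loader/args names are placeholders, so adapt them to your trainer):

# Hypothetical call site for adjust_learning_rate above.
len_loader = len(train_loader)                   # batches per epoch
for epoch in range(args.epochs):
    for batch_idx, batch in enumerate(train_loader):
        i_iter = epoch * len_loader + batch_idx  # global training step
        adjust_learning_rate(optimizer, i_iter, len_loader, args)
        # ... forward pass, loss.backward(), optimizer.step() ...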

Hope this could help.

Zx
