Error when loading the model in the third training step #21

Open
Zhang-Zhiwang opened this issue Feb 19, 2021 · 11 comments

Comments

@Zhang-Zhiwang

RuntimeError: Error(s) in loading state_dict for ResnetGenerator_depth:
Missing key(s) in state_dict: "modelfea.1.weight", "modelfea.2.weight", "modelfea.2.bias", "modelfea.2.running_mean", "modelfea.2.running_var", "modelfea.4.weight", "modelfea.5.weight", "modelfea.5.bias", "modelfea.5.running_mean", "modelfea.5.running_var", "modelfea.7.weight", "modelfea.8.weight", "modelfea.8.bias", "modelfea.8.running_mean", "modelfea.8.running_var", "modelfea.10.conv_block.1.weight", "modelfea.10.conv_block.2.weight", "modelfea.10.conv_block.2.bias", "modelfea.10.conv_block.2.running_mean", "modelfea.10.conv_block.2.running_var", "modelfea.10.conv_block.5.weight", "modelfea.10.conv_block.6.weight", "modelfea.10.conv_block.6.bias", "modelfea.10.conv_block.6.running_mean", "modelfea.10.conv_block.6.running_var", "modelfea.11.conv_block.1.weight", "modelfea.11.conv_block.2.weight", "modelfea.11.conv_block.2.bias", "modelfea.11.conv_block.2.running_mean", "modelfea.11.conv_block.2.running_var", "modelfea.11.conv_block.5.weight", "modelfea.11.conv_block.6.weight", "modelfea.11.conv_block.6.bias", "modelfea.11.conv_block.6.running_mean", "modelfea.11.conv_block.6.running_var", "modelfea.12.conv_block.1.weight", "modelfea.12.conv_block.2.weight", "modelfea.12.conv_block.2.bias", "modelfea.12.conv_block.2.running_mean", "modelfea.12.conv_block.2.running_var", "modelfea.12.conv_block.5.weight", "modelfea.12.conv_block.6.weight", "modelfea.12.conv_block.6.bias", "modelfea.12.conv_block.6.running_mean", "modelfea.12.conv_block.6.running_var", "modelfea.13.conv_block.1.weight", "modelfea.13.conv_block.2.weight", "modelfea.13.conv_block.2.bias", "modelfea.13.conv_block.2.running_mean", "modelfea.13.conv_block.2.running_var", "modelfea.13.conv_block.5.weight", "modelfea.13.conv_block.6.weight", "modelfea.13.conv_block.6.bias", "modelfea.13.conv_block.6.running_mean", "modelfea.13.conv_block.6.running_var", "modelfea.14.conv_block.1.weight", "modelfea.14.conv_block.2.weight", "modelfea.14.conv_block.2.bias", "modelfea.14.conv_block.2.running_mean", 
"modelfea.14.conv_block.2.running_var", "modelfea.14.conv_block.5.weight", "modelfea.14.conv_block.6.weight", "modelfea.14.conv_block.6.bias", "modelfea.14.conv_block.6.running_mean", "modelfea.14.conv_block.6.running_var", "modelfea.15.conv_block.1.weight", "modelfea.15.conv_block.2.weight", "modelfea.15.conv_block.2.bias", "modelfea.15.conv_block.2.running_mean", "modelfea.15.conv_block.2.running_var", "modelfea.15.conv_block.5.weight", "modelfea.15.conv_block.6.weight", "modelfea.15.conv_block.6.bias", "modelfea.15.conv_block.6.running_mean", "modelfea.15.conv_block.6.running_var", "modelfea.16.conv_block.1.weight", "modelfea.16.conv_block.2.weight", "modelfea.16.conv_block.2.bias", "modelfea.16.conv_block.2.running_mean", "modelfea.16.conv_block.2.running_var", "modelfea.16.conv_block.5.weight", "modelfea.16.conv_block.6.weight", "modelfea.16.conv_block.6.bias", "modelfea.16.conv_block.6.running_mean", "modelfea.16.conv_block.6.running_var", "modelfea.17.conv_block.1.weight", "modelfea.17.conv_block.2.weight", "modelfea.17.conv_block.2.bias", "modelfea.17.conv_block.2.running_mean", "modelfea.17.conv_block.2.running_var", "modelfea.17.conv_block.5.weight", "modelfea.17.conv_block.6.weight", "modelfea.17.conv_block.6.bias", "modelfea.17.conv_block.6.running_mean", "modelfea.17.conv_block.6.running_var", "modelfea.18.conv_block.1.weight", "modelfea.18.conv_block.2.weight", "modelfea.18.conv_block.2.bias", "modelfea.18.conv_block.2.running_mean", "modelfea.18.conv_block.2.running_var", "modelfea.18.conv_block.5.weight", "modelfea.18.conv_block.6.weight", "modelfea.18.conv_block.6.bias", "modelfea.18.conv_block.6.running_mean", "modelfea.18.conv_block.6.running_var", "modelfea.19.weight", "modelfea.20.weight", "modelfea.20.bias", "modelfea.20.running_mean", "modelfea.20.running_var", "modelfea.22.weight", "modelfea.23.weight", "modelfea.23.bias", "modelfea.23.running_mean", "modelfea.23.running_var", "SFT.condition_conv.0.weight", "SFT.condition_conv.0.bias", 
"SFT.condition_conv.2.weight", "SFT.condition_conv.2.bias", "SFT.condition_conv.4.weight", "SFT.condition_conv.4.bias", "SFT.scale_conv.0.weight", "SFT.scale_conv.0.bias", "SFT.scale_conv.2.weight", "SFT.scale_conv.2.bias", "SFT.sift_conv.0.weight", "SFT.sift_conv.0.bias", "SFT.sift_conv.2.weight", "SFT.sift_conv.2.bias", "model2.1.weight", "model2.1.bias".

@Zhang-Zhiwang
Author

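The second step is similar, but no keys are missing there.
I looked at fixes suggested online, which say to set strict to False when loading, but the result produced that way is basically no effect at all.
Has anyone else run into this problem, and how did you solve it?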
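One way to see why `strict=False` helps so little here is to look at which parts of the network the checkpoint fails to cover. The sketch below is a minimal, framework-free illustration: `summarize_missing` is a hypothetical helper, and the key lists are a short excerpt of the names in the error above plus an assumed `model.1.weight` entry standing in for a parameter the checkpoint does contain. With PyTorch, the two inputs would presumably come from `model.state_dict().keys()` and from the keys of the dictionary returned by `torch.load`.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_missing(model_keys: Iterable[str], ckpt_keys: Iterable[str]) -> Dict[str, int]:
    """Count, per top-level submodule, the parameters the checkpoint does not provide."""
    available = set(ckpt_keys)
    missing = [key for key in model_keys if key not in available]
    # The text before the first dot is the top-level submodule name.
    return dict(Counter(key.split(".", 1)[0] for key in missing))


# Excerpt of the key names from the error above; "model.1.weight" is an
# assumed example of a key that both the model and the checkpoint contain.
model_keys = [
    "model.1.weight",
    "modelfea.1.weight",
    "modelfea.2.weight",
    "SFT.scale_conv.0.weight",
    "model2.1.weight",
]
ckpt_keys = ["model.1.weight"]

summary = summarize_missing(model_keys, ckpt_keys)
print(summary)
```

When whole submodules such as `modelfea` and `SFT` appear in the summary, loading with `strict=False` leaves them at their initial values, which would be consistent with the weak results described above.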

@Zhang-Zhiwang
Author

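Problem solved. When training CycleGAN, netG_B is resnet_9blocks, but in SDehazing the author initializes from the pretrained model using the parameter which_model_netG_A, that is, resnet_9blocks_depth. Because the two models are different, keys go missing.
I don't know why the author uses netG_A for the initialization. In that case, shouldn't strict be set to False in the model-loading function? But I tried that, and R2S then has hardly any effect.
My fix was to change which_model_netG_A to which_model_netG_B in the define_G() function; the forward that follows has to be changed as well.
However, in the final joint-training step the author still uses resnet_9blocks_depth to load the resnet_9blocks netG_B. I don't know why it is done this way and hope someone knowledgeable can explain.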
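The mismatch described above can be caught before loading by comparing the checkpoint's key names against what each candidate generator expects. This is a minimal sketch under stated assumptions: the helper `matching_architectures` and the toy `model.1.*` keys for `resnet_9blocks` are illustrative and not taken from the repository; only the `modelfea`, `SFT` and `model2` names come from the error message.

```python
from typing import Dict, Iterable, List


def matching_architectures(
    ckpt_keys: Iterable[str],
    expected_keys_by_arch: Dict[str, Iterable[str]],
) -> List[str]:
    """Return the architecture names whose parameter names equal the checkpoint's."""
    ckpt = set(ckpt_keys)
    return [
        name
        for name, expected in expected_keys_by_arch.items()
        if set(expected) == ckpt
    ]


# Toy key sets (assumed for illustration): the depth variant has extra submodules.
plain_keys = ["model.1.weight", "model.1.bias"]
depth_keys = plain_keys + ["modelfea.1.weight", "SFT.scale_conv.0.weight", "model2.1.weight"]

candidates = {
    "resnet_9blocks": plain_keys,
    "resnet_9blocks_depth": depth_keys,
}

# A checkpoint saved from the plain generator only matches the plain architecture.
matches = matching_architectures(plain_keys, candidates)
print(matches)
```

A check like this only confirms that the names line up; it says nothing about whether the depth variant or the plain variant is the one the paper intends for this step.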

@xiaowei-chi

+1, I ran into the same problem.

@ghost

ghost commented Mar 5, 2021

netG_B

When training CycleGAN, did you run into G_A's loss failing to converge?

@Zhang-Zhiwang
Author

netG_B

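When training CycleGAN, did you run into G_A's loss failing to converge?

For both of my G networks the loss swings up and down. On top of that, the loss was lowest in the first few epochs and actually got larger afterwards.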

@ghost

ghost commented Mar 9, 2021 via email

@Zhang-Zhiwang
Author

This is what I get from training. Is this normal?

I don't know. I am trying to do deraining, but the images lose too much detail during the domain shift and come out hopelessly blurred. I'm tuning hyperparameters like crazy right now, and the dataset may need to change as well.

@Zhang-Zhiwang
Author

This is what I get from training. Is this normal?

Regarding the fluctuation of the loss, you can take a look at this article; GAN loss may have little to do with image quality.
https://www.jianshu.com/p/914052bec9bc?utm_campaign

@buptlj

buptlj commented May 12, 2021

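Problem solved. When training CycleGAN, netG_B is resnet_9blocks, but in SDehazing the author initializes from the pretrained model using the parameter which_model_netG_A, that is, resnet_9blocks_depth. Because the two models are different, keys go missing.
I don't know why the author uses netG_A for the initialization. In that case, shouldn't strict be set to False in the model-loading function? But I tried that, and R2S then has hardly any effect.
My fix was to change which_model_netG_A to which_model_netG_B in the define_G() function; the forward that follows has to be changed as well.
However, in the final joint-training step the author still uses resnet_9blocks_depth to load the resnet_9blocks netG_B. I don't know why it is done this way and hope someone knowledgeable can explain.

According to the architecture described in the paper, this is indeed the change to make. In the paper, going from R to S does not use depth information, so when training SDehazing the initialization should use which_model_netG_B, and forward has to be changed too; only then does it match the paper. What I don't know is whether the author's final results were trained as described in the paper or as written in the code. After making this change, can you reproduce the results in the paper?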

@vvvvvvvvvvvvvvvvvvvvvvv

This is what I get from training. Is this normal?

Regarding the fluctuation of the loss, you can take a look at this article; GAN loss may have little to do with image quality. https://www.jianshu.com/p/914052bec9bc?utm_campaign

Does your training run normally? Why does mine show
Traceback (most recent call last):
  File "train.py", line 5, in <module>
    from util.visualizer import Visualizer
  File "/content/drive/MyDrive/pytorch_test/DA_dahazing/util/visualizer.py", line 6, in <module>
    from . import html
  File "/content/drive/MyDrive/pytorch_test/DA_dahazing/util/html.py", line 1, in <module>
    import dominate
ModuleNotFoundError: No module named 'dominate'
Am I missing a step? Thanks.
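The traceback indicates that the `dominate` package, imported by `util/html.py`, is not installed in the environment. Below is a small sketch for checking this before launching training; `missing_modules` is a hypothetical helper, and the list of names to check is an assumption based only on the traceback.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "dominate" is the module named in the traceback.
not_found = missing_modules(["dominate"])
if not_found:
    print("Missing modules:", ", ".join(not_found))
    print("They can usually be installed with: pip install " + " ".join(not_found))
else:
    print("All required modules are importable.")
```

If `dominate` is reported as missing, installing it and rerunning `train.py` should get past this particular import error.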

@hello-trouble

image, according to the paper, the code here should select "without GAN Loss", right?
