no grad back propagate to EMAU.conv1? #14

Closed
zhaokegg opened this issue Sep 16, 2019 · 8 comments

@zhaokegg

Excuse me, I cannot find any grad flowing back to conv1. Is there a bug?

@XiaLiPKU
Owner

> Excuse me, I cannot find any grad flowing back to conv1. Is there a bug?

No bug here. It is due to the `with torch.no_grad():` block. If you comment that line out, conv1 will get gradients, but the performance may decrease a little. So far, I also don't know why conv1 works better without gradients.
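
A minimal sketch of why the graph is cut (this is not the repo's exact EMAU code; the layer sizes and names here are made up for illustration): everything between conv1 and the loss runs under `torch.no_grad()`, so backprop reaches the later layers but never conv1.

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(4, 4, 1)   # stand-in for EMAU.conv1
conv2 = nn.Conv2d(4, 4, 1)   # stand-in for the layers after the EM loop
x = torch.randn(2, 4, 8, 8)

feat = conv1(x)
with torch.no_grad():
    # stand-in for the EM attention iterations: the autograd graph is cut here
    attn = torch.softmax(feat.view(2, 4, -1), dim=-1).view_as(feat)

out = conv2(attn)            # backprop reaches conv2, but not conv1
out.sum().backward()

print(conv1.weight.grad)              # None: no grad flows back to conv1
print(conv2.weight.grad is not None)  # True: conv2 is still trained
```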

@zhaokegg
Copy link
Author

Can you provide a model to check? I think that the CONV1 may not be included in your final model. Without gradient from loss, It is pruned by pytorch.

@XiaLiPKU
Owner

> Can you provide a model to check? I think conv1 may not be included in your final model: without a gradient from the loss, it is pruned by PyTorch.

You can train a model without the `with torch.no_grad():` line and check.
I don't agree with you: without grad from the loss, PyTorch doesn't update conv1's parameters, but the layer still runs in the forward pass. It is not pruned.

@valencebond

> I don't agree with you: without grad from the loss, PyTorch doesn't update conv1's parameters, but the layer still runs in the forward pass. It is not pruned.

But I think that if its parameters are never updated, conv1 makes no sense. Will it work properly with only conv1's initial parameters? By the way, have you tried testing the model's performance after removing conv1?

@XiaLiPKU
Owner

> But I think that if its parameters are never updated, conv1 makes no sense. Will it work properly with only conv1's initial parameters? By the way, have you tried testing the model's performance after removing conv1?

Yes, I have. With the 'no_grad' setting, the only function of conv1 is to map the distribution of the input feature maps from R^+ to R.
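
A quick sketch of that mapping (channel counts and shapes are made up): the features entering EMAU are non-negative (R^+, e.g. post-ReLU), and a 1x1 conv with no activation behind it can output negative values again, i.e. it maps R^+ back to R.

```python
import torch
import torch.nn as nn

# Features after a ReLU live in R^+ (all entries >= 0)...
relu_feat = torch.relu(torch.randn(1, 4, 8, 8))
conv1 = nn.Conv2d(4, 4, 1)        # 1x1 conv with no activation after it

mapped = conv1(relu_feat)         # ...mapped back into R (signed values)
print(relu_feat.min().item() >= 0)   # True: inputs are non-negative
print(mapped.min().item())           # typically < 0: outputs are signed
```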

@valencebond
commented Sep 22, 2019

@XiaLiPKU thanks for your quick reply. So the performance is a bit worse? Can you provide the concrete value?

@XiaLiPKU
Owner

> @XiaLiPKU thanks for your quick reply. So the performance is a bit worse? Can you provide the concrete value?

I forget the concrete value. But from memory, deleting the 'with torch.no_grad():' decreases mIoU by around 0.5.
Moreover, without the conv1 layer, the minimum possible inner product is 0. Since there is an 'exp' operation inside the softmax, 0 becomes exp(0) = 1, so the corresponding z_nk cannot get close to 0. But with the conv1 layer, the minimum can be -inf, and the corresponding z_nk is then very close to 0. Obviously, the latter is what we want.
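
A small numeric check of this point (the two-basis scores below are made up): with non-negative features the worst score is 0, so the losing basis still keeps exp(0) = 1 in the softmax numerator; with signed features the score can go far below 0 and its responsibility collapses toward 0.

```python
import torch

# Inner products of one pixel with two bases, non-negative features:
# the smallest possible score is 0, so the losing basis still keeps
# a sizeable responsibility z_nk.
print(torch.softmax(torch.tensor([3.0, 0.0]), dim=0))    # ~[0.953, 0.047]

# Signed features (after conv1): scores can be strongly negative,
# so the losing basis's z_nk is pushed arbitrarily close to 0.
print(torch.softmax(torch.tensor([3.0, -10.0]), dim=0))  # ~[1.0, 2.3e-6]
```
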
I haven't done an ablation study of conv1. But as analysed above, removing conv1 should cause some drop.

@valencebond

@XiaLiPKU thanks for your detailed explanation! It is a good job!
