
When finetuning, previous networks are kept fixed? #31

Closed
Blcony opened this issue May 14, 2018 · 5 comments

Comments

@Blcony

Blcony commented May 14, 2018

Hi, Simon!
Thanks for your wonderful code!
I have some questions now, though; please give me some advice.
When fine-tuning, such as using the UnFlowC experiment for the first network and UnFlowCS for the second network when training UnFlowCSS, you said that only the final network is trained and any previous networks are kept fixed.
But I can't figure out how you keep the previous networks fixed in the code. Are there any settings in the code that make sure the previous networks stay fixed?
Please give me some clues, thank you very much!

@simonmeister
Owner

simonmeister commented May 14, 2018

These lines ensure that the previous network outputs are assumed to be static (and thus avoid feeding back gradients through these networks): https://github.com/simonmeister/UnFlow/blob/master/src/e2eflow/core/flownet.py#L51
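Roughly, the idea looks like this (a minimal sketch only, with hypothetical stage names, not the exact flownet.py code): the output of the earlier, frozen stage is wrapped in tf.stop_gradient before it is fed into the next stack, so backpropagation never reaches the earlier weights.

```python
import tensorflow as tf  # TF1-style API, as used in the repo

# previous_network / next_network are placeholders for the earlier and later
# FlowNet stacks; they are illustrative names, not functions from the repo.
prev_flow = previous_network(images)            # output of the frozen stage
prev_flow = tf.stop_gradient(prev_flow)         # treated as a constant input
refined_flow = next_network(images, prev_flow)  # only this stage gets gradients
```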

@Blcony
Author

Blcony commented May 14, 2018

Oh, I get it, thank you very much.
Sorry for my carelessness.
But in that case, if I don't want to fine-tune only the final network and instead also want to fine-tune some layers in the previous networks, I think things may be difficult.
Do you have any advice for that?
Thanks a lot.
Best wishes!

@simonmeister
Owner

No problem! If you only want to add a few layers, you would need to insert the stop gradients directly before these layers in the flownet code, but it shouldn't be too hard.

@Blcony
Author

Blcony commented May 14, 2018

But in fact I want to fine-tune the first layer in every network to adapt to my 1-channel input images.
In this case, using stop gradients directly doesn't seem very suitable.
I'm trying to find a way to solve this problem. Can you give me some advice on how to modify your code?
By the way, does it make sense to turn the 1-channel images into 3-channel images by copying the channel directly?
Sorry to bother you again, thank you very much.

@simonmeister
Owner

simonmeister commented May 15, 2018

Ah, so I am unsure whether what you want to do would work (from the learning side of things). Technically, it would be easy to filter out variables by name if you collect the gradients manually and then update the variables using apply_gradients.
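Something along these lines (a sketch only; `loss` is whatever loss the training pipeline already builds, and the `'conv1'` name filter is purely illustrative):

```python
import tensorflow as tf  # TF1-style API, as used in the repo

# Collect all gradients manually, keep only the variables you want to
# fine-tune (filtered by name), and apply just those.
optimizer = tf.train.AdamOptimizer(1e-5)
grads_and_vars = optimizer.compute_gradients(loss)  # 'loss' from the existing pipeline
finetune_gvs = [(g, v) for g, v in grads_and_vars
                if g is not None and 'conv1' in v.op.name]  # illustrative name filter
train_op = optimizer.apply_gradients(finetune_gvs)
```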

However, if you modify previous layers without adapting the later layers, I think you would run into problems with learning, as the later layers can't adapt to the changing inputs and can't make sense of your "improved" representations from the earlier layers. I would either re-train from scratch or fine-tune end-to-end, or you could maybe really just try copying the channels first.
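Copying the channels is just replicating the single channel three times along the channel axis so the image matches the 3-channel input the pretrained networks expect. For example (shapes are illustrative):

```python
import numpy as np

gray = np.random.rand(384, 512, 1).astype(np.float32)  # H x W x 1 grayscale image
rgb_like = np.repeat(gray, 3, axis=-1)                  # H x W x 3, channel copied
```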
