
When finetuning, previous networks are kept fixed? #31

Closed
Blcony opened this issue May 14, 2018 · 5 comments

Comments

@Blcony

Blcony commented May 14, 2018

Hi, Simon!
Thanks for your wonderful code!
I have some questions now, though; please give me some advice.
When fine-tuning, such as using the UnFlowC experiment for the first network and UnFlowCS for the second network when training UnFlowCSS, you said that only the final network is trained and any previous networks are kept fixed.
But I can't figure out how you keep the previous networks fixed in the code. Are there any settings in the code that make sure the previous networks stay fixed?
Please give me some clues, thank you very much!

@simonmeister
Owner

simonmeister commented May 14, 2018

These lines ensure that the previous network outputs are assumed to be static (and thus avoid feeding back gradients through these networks): https://github.com/simonmeister/UnFlow/blob/master/src/e2eflow/core/flownet.py#L51
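Roughly, the idea looks like this (a minimal sketch only, with hypothetical stage names, not the exact flownet.py code): the output of the earlier, frozen stage is wrapped in tf.stop_gradient before it is fed into the next stack, so backpropagation never reaches the earlier weights.

```python
import tensorflow as tf  # TF1-style API, as used in the repo

# previous_network / next_network are placeholders for the earlier and later
# FlowNet stacks; they are illustrative names, not functions from the repo.
prev_flow = previous_network(images)            # output of the frozen stage
prev_flow = tf.stop_gradient(prev_flow)         # treated as a constant input
refined_flow = next_network(images, prev_flow)  # only this stage gets gradients
```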

@Blcony
Author

Blcony commented May 14, 2018

Oh, I get it, thank you very much.
Sorry for my carelessness.
But in that case, if I don't want to fine-tune only the final network and instead also want to fine-tune some layers in the previous networks, I think things may be difficult.
Do you have any advice for that?
Thanks a lot.
Best wishes!

@simonmeister
Owner

No problem! If you only want to add a few layers, you would need to insert the stop gradients directly before these layers in the flownet code, but it shouldn't be too hard.

@Blcony
Author

Blcony commented May 14, 2018

But in fact I want to fine-tune the first layer in every network to adapt to my 1-channel input images.
In this case, using stop gradients directly doesn't seem very suitable.
I'm trying to find a way to solve this problem. Can you give me some advice on how to modify your code?
By the way, does it make sense to turn the 1-channel images into 3-channel images by copying the channel directly?
Sorry to bother you again, thank you very much.

@simonmeister
Owner

simonmeister commented May 15, 2018

Ah, so I am unsure whether what you want to do would work (from the learning side of things). Technically, it would be easy to filter out variables by name if you collect the gradients manually and then update the variables using apply_gradients.
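Something along these lines (a sketch only; `loss` is whatever loss the training pipeline already builds, and the `'conv1'` name filter is purely illustrative):

```python
import tensorflow as tf  # TF1-style API, as used in the repo

# Collect all gradients manually, keep only the variables you want to
# fine-tune (filtered by name), and apply just those.
optimizer = tf.train.AdamOptimizer(1e-5)
grads_and_vars = optimizer.compute_gradients(loss)  # 'loss' from the existing pipeline
finetune_gvs = [(g, v) for g, v in grads_and_vars
                if g is not None and 'conv1' in v.op.name]  # illustrative name filter
train_op = optimizer.apply_gradients(finetune_gvs)
```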

However, if you modify previous layers without adapting the later layers, I think you would run into problems with learning, as the later layers can't adapt to the changing inputs and can't make sense of your "improved" representations from the earlier layers. I would either re-train from scratch or fine-tune end-to-end, or you could maybe really just try copying the channels first.
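Copying the channels is just replicating the single channel three times along the channel axis so the image matches the 3-channel input the pretrained networks expect. For example (shapes are illustrative):

```python
import numpy as np

gray = np.random.rand(384, 512, 1).astype(np.float32)  # H x W x 1 grayscale image
rgb_like = np.repeat(gray, 3, axis=-1)                  # H x W x 3, channel copied
```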
