-
Notifications
You must be signed in to change notification settings - Fork 45.8k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Switch line orders in trainer so that restore_map is called after mov…
…ing average variables are created. Moving averages are now properly loaded during fine-tuning, instead of being recreated. PiperOrigin-RevId: 190496046
- Loading branch information
Showing
1 changed file
with
23 additions
and
23 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
93b8168
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
I want to point out that this switch changes the naming of the nodes. This breaks the compatibility of some pretrained models. See this:
#3922
93b8168
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
This has affected all models (including SSDlite_mobilenet and SSD_mobilenetv2) [Most recent models].
Also, could the model zoo include tensorflow version compatibility?
Currently, I believe it is 1.5+
93b8168
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this change did not change any naming.
The warning is complaining about missing RMSProp values, which is from optimizer instead of model. It doesn't matter at all because you don't really need to restore optimizer parameters.
This change is actually very helpful.
The training behavior should not be impacted, as the moving average has no impact on the training. But in eval, when moving average is turned on, it should pick up more quickly as the moving average no longer needs to start from 0.
This change makes the system aware of the missing vars, it doesn't change anything. The warning message doesn't affect your training either.
@varun19299 tf version compatibility already exists in the model zoo documents.
93b8168
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the quick reply.
Are you sure this hasn't affected the naming? The reason I am asking this is because:
WARNING:root:Variable [SecondStageFeatureExtractor/InceptionResnetV2/Repeat/block8_9/Conv2d_1x1/biases/Momentum]
was a warning that
/block8_9/Conv2d_1x1/biases/Momentum
was not found in the checkpoint, when in fact (by printing all the graph operations):/block8_9/Conv2d_1x1/biases/
was found in the checkpoint.Further, in January (4 commits prior), some users fixed this warning by re-initiating the
train-dir
directory. I am not sure if these two methods are connected. To note, this fix no longer works as of May. Could you please check if these two were connected?93b8168
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I'm sure it doesn't hurt anything. This Momentum var is used by this optimizer. /block8_9/Conv2d_1x1/biases should be in checkpoint because it's part of the net, while optimizer parameters doesn't need to.
Making this warning disappear doesn't mean fixing.
Moreover, we will release new interfaces and this trainer file will be deprecated soon.
93b8168
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this makes sense.
So the model zoo checkpoints do not include optimis(z)er parameters right?