
Conversation

@qlzh727 qlzh727 (Member) commented May 10, 2018

The final BN and ReLU layers are only needed for the v2 model, since v2 does preactivation in each block.
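For context, a minimal sketch of where the two variants place the final BN/ReLU (simplified, not the repo's actual code; the helper names are hypothetical):

```python
import tensorflow as tf

def resnet_v1_tail(inputs, training):
  # v1 blocks end with conv -> BN -> ReLU (post-activation), so the
  # output of the last block is already normalized and activated.
  # No extra BN/ReLU is needed before global pooling.
  return tf.reduce_mean(inputs, axis=[1, 2])  # global average pool

def resnet_v2_tail(inputs, training):
  # v2 blocks apply BN -> ReLU *before* each conv (preactivation), so
  # the raw output of the last block has not been normalized/activated.
  # A final BN + ReLU is therefore needed here, and only for v2.
  inputs = tf.layers.batch_normalization(inputs, training=training)
  inputs = tf.nn.relu(inputs)
  return tf.reduce_mean(inputs, axis=[1, 2])  # global average pool
```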

@qlzh727 qlzh727 requested review from karmel and robieta May 10, 2018 17:45
@qlzh727 qlzh727 requested a review from a team as a code owner May 10, 2018 17:45
@robieta robieta (Contributor) left a comment


LGTM

@karmel karmel (Contributor) commented May 10, 2018

How was this missed? Or, to put that another way, why didn't this have any effect in training, etc.? Does this fix the loss scale issue?

@robieta robieta (Contributor) commented May 10, 2018

Sadly this does not resolve the divergent fp16 v1 training.

@qlzh727 qlzh727 (Member, Author) commented May 10, 2018

@karmel, the extra ReLU shouldn't impact the correctness of the model. The extra BN does change the output numerically, scaling the final values down further.

The fp16 problem in v1 is caused by numeric overflow in the top few layers of the model; removing the final BN and ReLU won't fix it.
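To illustrate the overflow point (a small numeric example, not taken from the model code): fp16 has a maximum finite value of about 65504, so activations larger than that saturate to inf regardless of what happens at the end of the network.

```python
import numpy as np

# float16 overflows for magnitudes above ~65504; such values become inf,
# which then propagates through the rest of the forward pass.
x = np.float32(7e4)
print(np.float16(x))             # inf
print(np.finfo(np.float16).max)  # 65504.0
```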

@qlzh727 qlzh727 merged commit 89edd1c into tensorflow:master May 10, 2018
@qlzh727 qlzh727 deleted the resnet_v1_fix branch May 10, 2018 19:45
@HiKapok HiKapok commented May 11, 2018

Has the pretrained ResNet v1 model been replaced?

@qlzh727 qlzh727 (Member, Author) commented May 11, 2018

@HiKapok, the pretrained model has not been updated yet. Will do later this week.

omegafragger pushed a commit to omegafragger/models that referenced this pull request May 15, 2018