
Fix: Problem with keras ResNet50 CAM visualisation #53 #122

Merged: 6 commits merged into raghakot:master on Aug 31, 2018

Conversation

keisen (Collaborator) commented on Jul 18, 2018

Cause
In the modify_model_backprop method, a new ReLU function is assigned to the 'activation' attribute of every layer that has one, i.e. every Convolution layer as well as every Activation layer.
In Keras's ResNet50, the activation attribute of each Conv layer is set to the linear function, and the ReLU is applied by a separate Activation layer.
Replacing this linear function with ReLU appears to be the cause of the gradient loss.
In fact, if there is no separate Activation layer, this issue does not occur.
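
The following sketch (not part of this PR; it only assumes the stock keras.applications ResNet50 and needs no pretrained weights) illustrates that layer structure:

# Sketch: in Keras's ResNet50 every Conv2D keeps a linear 'activation'
# attribute, and the ReLU is applied by a separate Activation layer, so
# overwriting layer.activation with ReLU injects an extra ReLU in front of
# the BatchNormalization layers.
from keras.applications.resnet50 import ResNet50
from keras import activations
from keras.layers import Conv2D, Activation

model = ResNet50(weights=None)
for layer in model.layers[:10]:
    if isinstance(layer, Conv2D):
        print(layer.name, layer.activation is activations.linear)   # -> True
    elif isinstance(layer, Activation):
        print(layer.name, layer.activation is activations.relu)     # -> True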

Detail

Please refer to the comment below ( #122 (comment) ).

@raghakot,
Since this is a bug fix, I didn't add tests or documentation.
Instead, I added an example:

https://github.com/keisen/keras-vis/blob/bugfix/%2353/examples/resnet/attention.ipynb

( MODIFIED on 15 Aug 2018 )

keisen (Collaborator, Author) commented on Jul 20, 2018

@raghakot,
Sorry, I forgot to fix the test code.
I will fix it, so please give me some time.

keisen (Collaborator, Author) commented on Jul 21, 2018

Hi @raghakot,

I'm fixing the test code now and have a question regarding the following implementation.

# 2. Replace all possible activations with ReLU.
for i, layer in utils.reverse_enumerate(modified_model.layers):
    if hasattr(layer, 'activation'):
        layer.activation = tf.nn.relu
    if isinstance(layer, _ADVANCED_ACTIVATIONS):
        # NOTE: This code is brittle as it makes use of Keras internal serialization knowledge and might
        # break in the future.
        modified_layer = Activation('relu')
        modified_layer.inbound_nodes = layer.inbound_nodes
        modified_layer.name = layer.name
        modified_model.layers[i] = modified_layer

# 3. Save model with modifications.
modified_model.save(model_path)

As a test, I removed this implementation (the for loop), but GuidedBackprop still works fine (there is no problem with ResNet50 in Keras).

What is this implementation necessary for?
Is it important to use tf.nn.relu?

Thanks.

ADDED on 23 Jul 2018:

Regarding my questions above ("What is this implementation necessary for? Is it important to use tf.nn.relu?"): I think I understand now. It is there to handle activations other than ReLU, isn't it?

I think gradient_override_map should be used instead. I wrote an example in the comment below.

keisen closed this on Jul 23, 2018
keisen reopened this on Jul 23, 2018
keisen (Collaborator, Author) commented on Jul 23, 2018

I'm sorry, I closed and reopened this by mistake. ;-(

keisen (Collaborator, Author) commented on Jul 23, 2018

@raghakot, please review this PR and point out any problems.
Also, please let me know if you have a good idea for the test.

Fixes

The problem can be avoided by modifying the following implementation.

# 2. Replace all possible activations with ReLU.
for i, layer in utils.reverse_enumerate(modified_model.layers):
    if hasattr(layer, 'activation'):
        layer.activation = tf.nn.relu
    if isinstance(layer, _ADVANCED_ACTIVATIONS):
        # NOTE: This code is brittle as it makes use of Keras internal serialization knowledge and might
        # break in the future.
        modified_layer = Activation('relu')
        modified_layer.inbound_nodes = layer.inbound_nodes
        modified_layer.name = layer.name
        modified_model.layers[i] = modified_layer

# 3. Save model with modifications.
modified_model.save(model_path)

The problem is the condition on line 89 (the if hasattr(layer, 'activation'): check in the snippet above).
You can avoid the problem by excluding keras.activations.linear from that condition.
(That is, change line 89 to if hasattr(layer, 'activation') and layer.activation != keras.activations.linear: as sketched below.)
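
As a sketch only (this is not the patch actually merged here, which removes the loop entirely), the guarded loop would look roughly like this, with modified_model and utils taken from the snippet quoted above:

import keras
import tensorflow as tf

# 2. Replace all possible activations with ReLU, but skip layers whose
#    activation is linear (e.g. ResNet50's Conv2D layers) so that no extra
#    ReLU is injected in front of the BatchNormalization layers.
for i, layer in utils.reverse_enumerate(modified_model.layers):
    if hasattr(layer, 'activation') and layer.activation != keras.activations.linear:
        layer.activation = tf.nn.relu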

However, in this PR I deleted all of the above implementation.
If you want to support activations other than ReLU, you should register them through gradient_override_map instead of replacing them with ReLU.
(Ex: gradient_override_map({'Relu': backprop_modifier, 'Elu': backprop_modifier}); a rough sketch follows below.)
Adding such activations is not implemented in this PR.
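
A rough sketch of that idea using TensorFlow 1.x's Graph.gradient_override_map; the 'GuidedRelu' name and the gradient function are illustrative stand-ins for whatever the backprop modifier registers, not code from this PR:

import tensorflow as tf

# Illustrative guided-backprop style gradient: pass back only positive
# gradients for positive activations. Registered once per process.
@tf.RegisterGradient('GuidedRelu')
def _guided_relu_grad(op, grad):
    gate_grad = tf.cast(grad > 0., grad.dtype)
    gate_out = tf.cast(op.outputs[0] > 0., grad.dtype)
    return gate_grad * gate_out * grad

graph = tf.get_default_graph()
# One override map can cover several activation ops at once, instead of
# rewriting each layer's activation attribute. Mapping 'Elu' to the same
# gradient here is only for illustration; a dedicated 'GuidedElu' could be
# registered in the same way.
with graph.gradient_override_map({'Relu': 'GuidedRelu', 'Elu': 'GuidedRelu'}):
    pass  # rebuild or reload the model here so the overrides take effect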

Consultation about the test

I cannot create a test that reproduces the problem, because we have not identified its root cause.

If the activation function is not replaced, the problem goes away, so the cause appears to be that the ReLU function is executed twice in a row.

However, ReLU(ReLU(x)) is exactly the same as ReLU(x), so I don't think that by itself should have an adverse effect on the results; see the quick check below.
The Batch Normalization and Add (merge) layers also do not seem to cause this problem.
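
A quick NumPy check of that statement (independent of keras-vis):

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
assert np.array_equal(relu(relu(x)), relu(x))  # applying ReLU twice changes nothing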

Do you have any ideas for a better test?

Regards

keisen mentioned this pull request on Aug 15, 2018
keisen requested a review from raghakot on Aug 31, 2018
keisen (Collaborator, Author) commented on Aug 31, 2018

Hi @raghakot,

I've thought about this PR and decided to merge it.
The reasons are as follows:

  • When the backprop-modified model is used with ResNet50, this issue is critical.
  • This patch also resolves the build error on Travis CI.

If there is any problem with it, please revert this PR.

P.S.
If you have any ideas about testing this PR, please respond to #122 (comment) .

keisen merged commit 9821aea into raghakot:master on Aug 31, 2018
keisen deleted the bugfix/#53 branch on Oct 22, 2018