Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating the ResNet-* weights #15780

Closed
sayakpaul opened this issue Dec 12, 2021 · 16 comments
Closed

Updating the ResNet-* weights #15780

sayakpaul opened this issue Dec 12, 2021 · 16 comments
Assignees
Labels
type:feature The user is asking for a new feature.

Comments

@sayakpaul
Copy link
Contributor

If you open a GitHub issue, here is our policy:

It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead).
The form below must be filled out.

Here's why we have that policy:.

Keras developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.

System information.

TensorFlow version (you are using): 2.7.0
Are you willing to contribute it (Yes/No) : Currently no

Describe the feature and the current behavior/state.

ResNets are arguably one of the most influential architectures in deep learning. Today, they are used in different capacities. For example, sometimes they act as strong baselines, sometimes they are used as backbones. Since their inception, their performance on ImageNet-1k, in particular, has improved quite a lot. I think it's time the ResNets under tf.keras.applications were updated to facilitate these changes.

Will this change the current api? How?

ResNet-RS (https://arxiv.org/abs/2103.07579) introduces slight architectural changes to the vanilla ResNet architecture (https://arxiv.org/abs/1512.03385). So, yes, there will be changes to the current implementation of ResNets (among other things) we have under tf.keras.applications. We could call it tf.keras.applications.ResNet50RS, for example. Following summarizes the performance benefits that ResNet-RS introduces to the final ImageNet-1k performance (measured on the val set):

image

Source

Who will benefit from this feature?

Keras users that use ResNets from tf.keras.applications for building downstream applications.

Contributing

  • Do you want to contribute a PR? (yes/no): Currently no
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):
@sayakpaul sayakpaul added the type:feature The user is asking for a new feature. label Dec 12, 2021
@sachinprasadhs sachinprasadhs added the keras-team-review-pending Pending review by a Keras team member. label Dec 15, 2021
@LukeWood
Copy link
Contributor

Notes from triage:

If we don't have a contributor we probably will not have time for this. Also, before a contributor prepares a PR we'd like to see if @fchollet thinks this is a good idea.

@LukeWood LukeWood removed the keras-team-review-pending Pending review by a Keras team member. label Dec 16, 2021
@fchollet
Copy link
Member

fchollet commented Jan 6, 2022

This issue proposes the introduce of a family of tf.keras.applications.ResNet50RS models with new implementations and new checkpoints. This sounds good to me, but we won't have cycles to work on it, so it would have to be contributed by the community. Marking as "contributions welcome".

@fchollet fchollet removed their assignment Jan 6, 2022
@sayakpaul
Copy link
Contributor Author

@spatil6
Copy link
Contributor

spatil6 commented Jan 12, 2022

I can contribute on this, @sayakpaul Please assign it to me.

@sayakpaul
Copy link
Contributor Author

@sebastian-sz will be working on it. Also, I cannot assign tasks to anyone. Only Keras team members can do that.

@sebastian-sz
Copy link
Contributor

@sayakpaul I'm kind of confused whether we should work on it or wait for the implementation from keras-cv. There seems to be some works going on with it. Reproducible models trained from scratch in Keras would be better than the converted ones.

Also I did not fully understand @bhack 's commment on TF forum - Is this a green light or a red light for such PR?

@bhack
Copy link
Contributor

bhack commented Jan 12, 2022

@sebastian-sz I think that having reusable components it Is better then having an embedded trainabile model that It Is better then just the converted weights.

So I think that reusable components it is a little bit low level then all the other things.

But as we have not defined a public process to produce validated weights from the community using these new reusable components I think that currently this roadmap need to be exposed by the internal teams. E.g. when the reusable components will be ready who Is going to produce validated weights?

I think that, more in general, by a community point of view this is partially similar to the Tensorflow Model garden process.

See:
https://discuss.tensorflow.org/t/a-tensorflow-model-garden-contributor-report/6993

@sayakpaul
Copy link
Contributor Author

This issue is tagged with "Contributions welcome". So, I am not sure if that arises any confusion.

Just to clear some confusion, @sebastian-sz has already written the ResNet-RS model classes and then they were populated with the original weights. I think this does expose reusable components which are quite in the spirit of keras.applications.

Sebastian also provides code to validate the converted weights i.e., ImageNet-1k validation set scores.

Furthermore, EfficientNetV2 models were contributed similarly if I am not wrong. There is also a standing PR on RegNets and from the EfficientNetV2 PR, I believe the Keras team will validate the weights on their end as well.

@fchollet am I missing anything here?

@bhack
Copy link
Contributor

bhack commented Jan 12, 2022

IMHO when the reusable components are landing in Keras-cv it is not so clear to me what is the path of having reproducible training vs keras.application only availablity vs models garden.

Of course we could duplicate our contribution paths as we want but I don't think It will a clear and transparent approach for the community.

Probably it is still a little bit clear for me with TFHUB only models as generally they are not so end-users "manipulable".
But if I remember correctly we had also a discussion on this specific topic at:

https://discuss.tensorflow.org/t/modification-of-layers-in-hub-keraslayer/2508/8

@sayakpaul
Copy link
Contributor Author

sayakpaul commented Jan 12, 2022

IMHO when the reusable components are landing in Keras-cv it is not so clear to me what is the path of having reproducible training vs keras.application only availablity vs models garden.

I am also on the same boat there.

But that said, as for this PR, I do not think there's anything blocking @sebastian-sz to work on a ResNet-RS PR since it's already cleared by the Keras team as mentioned in #15780 (comment).

@bhack
Copy link
Contributor

bhack commented Jan 12, 2022

IMHO when the reusable components are landing in Keras-cv it is not so clear to me what is the path of having reproducible training vs keras.application only availablity vs models garden.

I am also on the same boat there.

But that said, as for this PR, I do not think there's anything blocking Sebastian to work on a ResNet-RS PR since it's already cleared by the Keras team as mentioned in #15780 (comment).

Ok but we need to consider also that we have already official regular weights in:

https://github.com/tensorflow/models/blob/master/official/vision/beta/MODEL_GARDEN.md#resnet-rs-models-trained-with-various-settings

@sayakpaul
Copy link
Contributor Author

Sure thing. But I think Keras users would love to avail the flexibilities they get while working with tf.keras.applications. I find the official code harder to work with than pure Keras.

@sebastian-sz
Copy link
Contributor

Ok but we need to consider also that we have already official regular weights in:

https://github.com/tensorflow/models/blob/master/official/vision/beta/MODEL_GARDEN.md#resnet-rs-models-trained-with-various-settings

@bhack there are two Issue with the mentioned repository:

  1. These weights don't work. I tried to use it when rewriting the models to Keras, also opened an Issue but it didn't attract much attention Cannot export Resnet-RS to Saved Model format. tensorflow/models#10234
  2. The models building blocks are composed as tf.keras.layers.Layer subclasses rather than functional blocks like keras.applications. Please, correct me if I'm wrong but this limits functionality. In keras.applications I can access all model layers as such
some_layer, another_layer = [
    backbone.get_layer(layer_name).output
    for layer_name in ["block_4_middle_conv", "block_3_first_conv"]
]

The above will work for functional models with flat structure but I'm afraid it might fail where blocks are implemented as tf.keras.layers.Layer subclasses.

@bhack
Copy link
Contributor

bhack commented Jan 12, 2022

I suppose this why there is a Keras-cv effort to have a more reusable impl of things that we have embedded in models garden.
But in the end if we don't have a plan to contribute/expose Keras-cv trained models It couldn't create any conflict or resoruce wasting.

@sebastian-sz
Copy link
Contributor

@sayakpaul Update from me: sorry I will not be able to submit this PR - I am very short on time recently.

If anyone wants to work on this (@spatil6 ) feel free to use my repository (I would appreciate mentioning me thought, if you do).

@spatil6
Copy link
Contributor

spatil6 commented Mar 28, 2022

PR is merged for this, Please close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:feature The user is asking for a new feature.
Projects
None yet
Development

No branches or pull requests

8 participants