Add ResNet, ResNetV2, and ResNeXt variants #26
Conversation
Force-pushed from 18fbdff to 1838a36
@fchollet, Due to the recent backend interfacing changes, especially …
@taehoonlee Any chance you could share the weights?
@daavoo, Sure. You can refer to: …
The code/style looks good to me! I'm not sure we need to add arbitrarily many new models just because we can, though -- we should focus on a small set of models that are significantly differentiated and that will deliver unique value for users. ResNetV2, for instance, seems like a good idea. ResNext101c64, maybe not so much.
keras_applications/resnet_common.py
Outdated
'c0ed64b8031c3730f411d2eb4eea35b5'),
'resnet152v2': ('a49b44d1979771252814e80f8ec446f9',
                'ed17cf2e0169df9d443503ef94b23b33'),
'resnext50c32': ('67a5b30d522ed92f75a1f16eef299d1a',
I'm skeptical that we need the c32 and c64 versions. Model choices that don't map to real value-add for users should be avoided. More choice is not always a good thing; it confuses users and bloats the API.
OK. I agree.
@@ -0,0 +1,209 @@
"""ResNet, ResNetV2, and ResNeXt models for Keras.
Presumably you could just merge this with resnet.py.
Integrated all the variants into resnet_common.py.
@fchollet, in the spirit of your remark about significant differentiation, would you say that adding a model trained on Open Images instead of ImageNet meets this threshold? I have been considering training such a model or porting the ResNet provided by Google. I suspect there is quite a lot of value to be extracted for transfer-learning tasks from Open Images-trained models, given that ImageNet is so full of animal categories.
@rohit-gupta, I think that reference models trained on Open Images would be worthwhile and significantly differentiated. Contributions welcome :) In my humble opinion, …
Force-pushed from c44782c to a2ae357
Force-pushed from 9da7b3e to 6292c70
Force-pushed from 6292c70 to f8fc2f0
Thank you for the CL. The new choice of models looks good to me.
# Returns
    Preprocessed array.
"""
return imagenet_utils.preprocess_input(x, mode='tf', **kwargs)
I feel like it is somewhat error-prone that ResNet50, ResNeXt50, and ResNet50V2 use 3 different preprocessing methods. Is there anything we could do about that?
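For context, a minimal numpy sketch of what the three `imagenet_utils.preprocess_input` modes roughly do (the model-to-mode mapping in the comments is my reading of this PR, not a documented guarantee):

```python
import numpy as np

def preprocess_sketch(x, mode):
    # Rough numpy equivalents of the imagenet_utils.preprocess_input modes.
    x = np.asarray(x, dtype='float32')
    if mode == 'caffe':  # ResNet50 (V1): RGB -> BGR, subtract ImageNet means
        return x[..., ::-1] - np.array([103.939, 116.779, 123.68])
    if mode == 'torch':  # ResNeXt: scale to [0, 1], then per-channel normalize
        x = x / 255.
        return (x - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
    if mode == 'tf':     # ResNet50V2: scale to [-1, 1]
        return x / 127.5 - 1.
    raise ValueError('Unknown mode: %s' % mode)
```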
@fchollet, I agree that several users may be confused. However, it is not easy to fix, because the weights originate from different researchers using different libraries. One possible way would be to insert the addition and division operations into the model definitions themselves, but that is not consistent with the existing model definition style.
I think that having the preprocessing in the model definition makes a lot of sense and is a good idea. Models in keras.applications cannot be used without the appropriate preprocessing, even when fine-tuning with weights that are not None.
I would be in favor of adding the preprocessing to the model when weights='imagenet', but that would break the API and change the existing behavior. Can we maybe add a new argument when creating the model, preprocessing='imagenet' (or None), as sketched below?
I think it adds a lot of unnecessary thinking for users trying to work with pre-trained models (the number of issues and StackOverflow questions about preprocessing when using keras.applications is staggering). It also comes from the functions all being called preprocess_input, so users who don't know the details of the preprocessing for the different models assume that the same preprocess_input will cover all of them.
There are also performance implications, since the preprocessing would then happen on the GPU. But as it is close to an element-wise operation in most cases, and GPUs are good at those, this should be fine.
Thoughts?
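To make that concrete, a hedged sketch of what the proposal could look like; the `preprocessing` keyword and the wrapper function are hypothetical, not an existing API:

```python
from keras import layers, models
from keras.applications.resnet50 import ResNet50

def resnet50_with_preprocessing(preprocessing='imagenet'):
    """Hypothetical wrapper folding 'caffe'-style preprocessing into the graph."""
    inputs = layers.Input(shape=(224, 224, 3))
    x = inputs
    if preprocessing == 'imagenet':
        # Same arithmetic as preprocess_input(x, mode='caffe'):
        # RGB -> BGR, then subtract the ImageNet channel means.
        x = layers.Lambda(
            lambda t: t[..., ::-1] - [103.939, 116.779, 123.68],
            name='imagenet_preprocessing')(x)
    base = ResNet50(weights='imagenet', input_tensor=x)
    return models.Model(inputs, base.output)
```

With something like this, model.predict could take raw [0, 255] images directly, and the mode differences between variants would be invisible to users.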
@gabrieldemarmiesse, thank you for the comment. The preprocessing='imagenet' idea looks good to me. However, it is not easy to change the style, because users who follow the current one (model.predict(preprocess_input(x))) would be confused if preprocessing were also included in the model definition. In that case, the current style would run the preprocessing twice and produce wrong outputs. I think the style change is a matter of choice.
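To show how badly double preprocessing distorts the input, a tiny numpy example using the 'tf'-mode arithmetic:

```python
import numpy as np

x = np.array([0., 127.5, 255.])  # raw pixel values
once = x / 127.5 - 1.            # correct:       [-1.,     0.,  1.   ]
twice = once / 127.5 - 1.        # applied twice: [-1.008, -1., -0.992]
```

The second application collapses the whole [0, 255] range into a sliver around -1, so the model receives near-constant input and silently produces garbage predictions.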
I see what you mean. It's true that it can be a source of confusion for users. I guess we can let @fchollet decide if the API needs a modification.
@gabrieldemarmiesse, yes, I agree with your basic principles. However, it would be ideal if there were a way to prevent users from running the preprocessing twice, rather than just giving a warning message. I don't see one, so frankly speaking, I think the current form is better from the perspective of users who are familiar with it.
> However, it would be ideal if there were a way to prevent users from running the preprocessing twice, rather than just giving a warning message.
I totally agree. If we used a global variable, it would work, but I don't really like global variables, and I'm not even sure it would behave well with multithreading/multiprocessing.
Generally speaking, I would really like a way to avoid silent preprocessing errors (they are VERY common among users); after that, which API we choose doesn't really matter to me.
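For the record, a minimal sketch of the global-variable idea, using weak references so entries vanish with their arrays; `guarded_preprocess_input` is a hypothetical helper, and the caveats above (threads, processes) still apply:

```python
import warnings
import weakref

# Global registry of arrays that have already been preprocessed.
_ALREADY_PREPROCESSED = weakref.WeakValueDictionary()

def guarded_preprocess_input(x, preprocess_fn):
    """Apply preprocess_fn at most once; warn on a repeated application."""
    if _ALREADY_PREPROCESSED.get(id(x)) is x:
        warnings.warn('This array appears to have been preprocessed already; skipping.')
        return x
    out = preprocess_fn(x)
    _ALREADY_PREPROCESSED[id(out)] = out
    return out
```

A copy of the array would defeat the identity check, so this is only a best-effort guard, not the forcing mechanism asked for above.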
> Generally speaking, I would really like a way to avoid silent preprocessing errors (they are VERY common among users); after that, which API we choose doesn't really matter to me.
Is it possible to give a warning in model.fit(), model.fit_generator(), and model.train_on_batch() based on the values of the input training data? I.e., give a warning if the image pixel values are not on the [0, 255] scale (could be something like max(X) <= 1), indicating that the data might already have been preprocessed?
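A minimal sketch of such a check; where exactly it would hook into fit() and the max(X) <= 1 heuristic are assumptions, not an existing Keras mechanism:

```python
import warnings
import numpy as np

def warn_if_maybe_preprocessed(x):
    """Heuristic: raw ImageNet-style images should span roughly [0, 255]."""
    x = np.asarray(x)
    if np.issubdtype(x.dtype, np.floating) and np.nanmax(x) <= 1.:
        warnings.warn('Max input value is <= 1; the data may already have '
                      'been preprocessed, but raw [0, 255] images were expected.')
```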
@rohit-gupta, I think that is overkill. Scanning every batch would result in performance degradation.
If the preprocessing is declared in the graph, a graph optimizer (say, XLA) can fold it into the first convolution, making the preprocessing time drop to zero. I may be wrong, but I think that's possible. Not that it would be significant compared to the execution time of the full graph, but speedups are always nice.
tests/applications_test.py
Outdated
@@ -16,12 +17,38 @@
from keras.applications import vgg16
from keras.applications import vgg19
from keras.applications import xception
from keras_applications import resnet
Should we simply apply this new testing style to all applications? It would be easier to have all or nothing.
@fchollet, The testing style has been revised.
Yes, I think that would be great and it would provide a lot of value to users, especially given that ImageNet is pretty biased.
Approval for addition of
ResNet101
ResNet152
ResNet101V2
ResNet50V2
ResNet152V2
ResNeXt50
ResNeXt101
Thanks a lot for the PR, and sorry for the delayed approval (lots of things going on lately...)
@fchollet, I always appreciate you giving me useful feedback and the opportunity to maintain the repository. Don't be sorry, and thanks a lot!
This PR adds ResNet, ResNetV2, and ResNeXt variants (9 models and 18 weight files). I would appreciate any feedback or style guidance, @fchollet.
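A quick usage sketch once the new models ship; the keras.applications.resnet_v2 import path assumes they are re-exported the same way as the existing applications:

```python
import numpy as np
from keras.applications.resnet_v2 import ResNet50V2, preprocess_input

model = ResNet50V2(weights='imagenet')
x = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype('float32')
preds = model.predict(preprocess_input(x))  # apply the matching preprocessing exactly once
```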