[QUESTION] Use of clone_model() #454

Closed
juliotorrest opened this issue Jul 22, 2021 · 2 comments

@juliotorrest

Hello!
I am going through the "Reusing Pretrained Layers" section of Chapter 11.

In cell 60, model_B_on_A is created from the pretrained model_A:

model_B_on_A = keras.models.Sequential(model_A.layers[:-1])

Then, in cell 61, the original model is cloned:

model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

Finally, in cell 62, the model is compiled.
My question is: shouldn't model_B_on_A be built from model_A_clone instead of model_A? I guess it makes no difference, but it is just not 100% clear to me.
Thanks in advance!

@ageron
Owner

ageron commented Oct 7, 2021

Hi @juliotorrest,
Thanks for your feedback. I understand your confusion, as the notebook on its own is not very clear. But the corresponding section in the book is clearer:

First, you need to load model A and create a new model based on that model's layers. Let's reuse all the layers except for the output layer:

model_A = keras.models.load_model("my_model_A.h5")
model_B_on_A = keras.models.Sequential(model_A.layers[:-1])
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))

Note that model_A and model_B_on_A now share some layers. When you train model_B_on_A, it will also affect model_A. If you want to avoid that, you need to clone model_A before you reuse its layers. To do this, you clone model A's architecture with clone_model(), then copy its weights:

model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

WARNING: keras.models.clone_model() only clones the architecture, not the weights. If you don't copy them manually using set_weights(), they will be initialized randomly when the cloned model is first used.
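
To make this concrete, here is a quick sanity check you could run in the notebook (just a sketch, assuming model_A, model_B_on_A and model_A_clone were built as above): the layer objects are literally shared between model_A and model_B_on_A, while the clone gets brand-new layers whose weight values only match model_A's after set_weights():

import numpy as np

# Sequential(model_A.layers[:-1]) reuses the very same layer objects,
# so training model_B_on_A would also update model_A's weights.
print(model_B_on_A.layers[0] is model_A.layers[0])   # True: shared layer object

# clone_model() builds fresh layer objects, independent of model_A...
print(model_A_clone.layers[0] is model_A.layers[0])  # False: independent copy

# ...and after set_weights(), the clone's values match model_A's exactly.
for w_orig, w_clone in zip(model_A.get_weights(), model_A_clone.get_weights()):
    assert np.allclose(w_orig, w_clone)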

I want to ensure that every code example in the book is actually present in the corresponding notebook. That is a bit hard to do when I discuss several options, since the notebook then contains a mix of (often mutually incompatible) options. That's what's happening here. If I used model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1]) in the notebook, then the code would not be in the same order as in the book, and it wouldn't be identical. But perhaps in this case it would actually be okay... I'll think about it...

Note that you only need to clone model A if you care about preserving its weights, and if you plan on fine-tuning that part of the final model. Plus, you could alternatively just reload model A later if you need it, using keras.models.load_model("my_model_A.h5").

Anyway, assuming you do want to clone model A, then here's what the code would look like:

model_A = keras.models.load_model("my_model_A.h5")

model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1]) # using the clone here
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))

[...]
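
Just for context, the section then goes on to freeze the reused layers and compile model_B_on_A, so the randomly initialized output layer can start learning without wrecking the pretrained weights (the exact loss and optimizer below are from memory, so double-check them against the book):

# Freeze the reused layers so only the new output layer trains at first.
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = False

# You must (re)compile after changing `trainable` for the change to take effect.
model_B_on_A.compile(loss="binary_crossentropy",
                     optimizer=keras.optimizers.SGD(learning_rate=1e-3),
                     metrics=["accuracy"])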

ageron closed this as completed in cf093c8 on Oct 7, 2021
@ageron
Owner

ageron commented Oct 7, 2021

I added a comment in the notebook to clarify this, and I decided to just create model_B_on_A a second time, so it uses the cloned model. Thanks again @juliotorrest !
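
In other words, the notebook now rebuilds model_B_on_A from the clone, roughly as follows (see commit cf093c8 for the exact cell):

model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1])  # built from the clone
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))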
