[QUESTION] Use of clone_model() #454

Closed
juliotorrest opened this issue Jul 22, 2021 · 2 comments

@juliotorrest

Hello!
I am going through the "Reusing Pretrained Layers" section of Chapter 11.

In cell 60, model_B_on_A is created from the pretrained model_A:

model_B_on_A = keras.models.Sequential(model_A.layers[:-1])

Then, in cell 61, the original model is cloned:

model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

Finally, in cell 62, the model is compiled.
My question is: shouldn't model_B_on_A be built from model_A_clone instead of model_A? I guess it makes no difference, but it is just not 100% clear to me.
Thanks in advance!

@ageron
Owner

ageron commented Oct 7, 2021

Hi @juliotorrest,
Thanks for your feedback. I understand your confusion, as the notebook on its own is not very clear. But the corresponding section in the book is clearer:

First, you need to load model A and create a new model based on that model's layers. Let's reuse all the layers except for the output layer:

model_A = keras.models.load_model("my_model_A.h5")
model_B_on_A = keras.models.Sequential(model_A.layers[:-1])
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))

Note that model_A and model_B_on_A now share some layers. When you train model_B_on_A, it will also affect model_A. If you want to avoid that, you need to clone model_A before you reuse its layers. To do this, you clone model A's architecture with clone_model(), then copy its weights:

model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

WARNING: keras.models.clone_model() only clones the architecture, not the weights. If you don't copy them manually using set_weights(), they will be initialized randomly when the cloned model is first used.
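
To make this concrete, here is a quick sanity check you could run in the notebook (just a sketch, assuming model_A, model_B_on_A and model_A_clone were built as above): the layer objects are literally shared between model_A and model_B_on_A, while the clone gets brand-new layers whose weight values only match model_A's after set_weights():

import numpy as np

# Sequential(model_A.layers[:-1]) reuses the very same layer objects,
# so training model_B_on_A would also update model_A's weights.
print(model_B_on_A.layers[0] is model_A.layers[0])   # True: shared layer object

# clone_model() builds fresh layer objects, independent of model_A...
print(model_A_clone.layers[0] is model_A.layers[0])  # False: independent copy

# ...and after set_weights(), the clone's values match model_A's exactly.
for w_orig, w_clone in zip(model_A.get_weights(), model_A_clone.get_weights()):
    assert np.allclose(w_orig, w_clone)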

I want to ensure that every code example in the book is actually present in the corresponding notebook. That is a bit hard to do when I discuss several options, since the notebook then contains a mix of (often mutually incompatible) options. That's what's happening here. If I used model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1]) in the notebook, then the code would not be in the same order as in the book, and it wouldn't be identical. But perhaps in this case it would actually be okay... I'll think about it...

Note that you only need to clone model A if you care about preserving its weights, and if you plan on fine-tuning that part of the final model. Plus, you could alternatively just reload model A later if you need it, using keras.models.load_model("my_model_A.h5").

Anyway, assuming you do want to clone model A, then here's what the code would look like:

model_A = keras.models.load_model("my_model_A.h5")

model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1]) # using the clone here
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))

[...]
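
Just for context, the section then goes on to freeze the reused layers and compile model_B_on_A, so the randomly initialized output layer can start learning without wrecking the pretrained weights (the exact loss and optimizer below are from memory, so double-check them against the book):

# Freeze the reused layers so only the new output layer trains at first.
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = False

# You must (re)compile after changing `trainable` for the change to take effect.
model_B_on_A.compile(loss="binary_crossentropy",
                     optimizer=keras.optimizers.SGD(learning_rate=1e-3),
                     metrics=["accuracy"])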

ageron closed this as completed in cf093c8 on Oct 7, 2021
@ageron
Owner

ageron commented Oct 7, 2021

I added a comment in the notebook to clarify this, and I decided to just create model_B_on_A a second time, so it uses the cloned model. Thanks again @juliotorrest !
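
In other words, the notebook now rebuilds model_B_on_A from the clone, roughly as follows (see commit cf093c8 for the exact cell):

model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1])  # built from the clone
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))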
