Finally, we will learn how to use pretrained models in Keras.

Using pretrained models is important because it often gives us a model with a "better start" than training from scratch.

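The typical pretraining dataset is ImageNet. The full dataset contains about 14M images; the pretrained weights distributed with Keras come from its ILSVRC subset, which has roughly 1.28M training images over 1k classes.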

![](imgs/imagenet_.jpg)

We will use 2 models, ResNet50 and ResNet101.

We will train ResNet50 on MNIST and showcase ResNet101 for feature extraction.

In [1]:
import keras
from keras.models import Sequential
from keras.layers import Dense, GlobalAveragePooling2D

from keras.applications.resnet import ResNet50, ResNet101, preprocess_input

import tensorflow as tf
import numpy as np




ResNet50:

In [2]:
cnn_pretrained = ResNet50(weights='imagenet', include_top=False, input_shape=(32, 32, 3))



MNIST images are 28 x 28 with a single channel; however, the pretrained model requires exactly 3 channels and a minimum size of 32 x 32. We will need to adapt the MNIST dataset to meet these requirements before using the pretrained model. More on that later.

The model:

In [3]:
cnn = Sequential()
cnn.add(cnn_pretrained)
cnn.add(GlobalAveragePooling2D())
cnn.add(Dense(10, activation='softmax'))

In [4]:
import tensorflow as tf
(mnist_train_images, mnist_train_labels), (mnist_validation_images, mnist_validation_labels) = tf.keras.datasets.mnist.load_data()

In [6]:
mnist_train_images.shape

(60000, 28, 28)

In [5]:
from torchvision import transforms as T
import torch

transform = T.Resize(size=(32, 32))

mnist_transformed = transform(torch.from_numpy(mnist_train_images))



In [6]:
mnist_expanded = torch.stack([mnist_transformed,mnist_transformed,mnist_transformed])
mnist_expanded = mnist_expanded.permute(1, 2, 3, 0)
mnist_expanded.shape

torch.Size([60000, 32, 32, 3])

In [7]:
mnist_expanded = mnist_expanded.numpy()
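
The same shape fix can also be sketched in NumPy alone. Note that this is a different technique from the resize used above: it zero-pads each image from 28 x 28 to 32 x 32 instead of interpolating, then copies the grayscale channel three times. The function name `pad_and_replicate` and the random stand-in data are assumptions made for illustration.

```python
import numpy as np

def pad_and_replicate(images, target=32):
    """Zero-pad (N, H, W) grayscale images to (N, target, target), then copy into 3 channels."""
    n, h, w = images.shape
    pad_h, pad_w = target - h, target - w
    top, left = pad_h // 2, pad_w // 2
    padded = np.pad(
        images,
        ((0, 0), (top, pad_h - top), (left, pad_w - left)),
        mode="constant",
    )
    # Add a trailing channel axis and repeat it 3 times: (N, target, target, 3)
    return np.repeat(padded[..., np.newaxis], 3, axis=-1)

# Random stand-in for a few MNIST digits (hypothetical data, not the real dataset)
fake_mnist = np.random.randint(0, 256, size=(8, 28, 28), dtype=np.uint8)
out = pad_and_replicate(fake_mnist)
print(out.shape)
```

Padding keeps the original pixels untouched, while resizing stretches the digit to fill the frame; which one works better for the pretrained network is something to check experimentally.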

**DIY**: fix the MNIST dataset and train this neural network for 1 or 2 epochs. Do it first with frozen weights (besides the last layer) and then with fine-tuning all of the weights.

To freeze the weights, you can do something like

```python
for layer in cnn_pretrained.layers:
    layer.trainable = False
cnn.add(cnn_pretrained) # when creating the cnn
```
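
One possible way to organize the two phases (frozen base first, then fine-tuning) is sketched below. To keep it fast, it uses a tiny hypothetical stand-in called `base` instead of the real ResNet50; the learning rate of `1e-5` for fine-tuning is also an assumption, chosen only because small values are commonly used when updating pretrained weights. The model is recompiled after each change to `trainable` so that the change is taken into account.

```python
import keras
from keras import layers

# Hypothetical stand-in for cnn_pretrained, small enough to build instantly
base = keras.Sequential(
    [keras.Input(shape=(32, 32, 3)), layers.Conv2D(4, 3, activation="relu")],
    name="base",
)

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),
])

# Phase 1: freeze the base so that only the Dense head is trained
base.trainable = False
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
frozen_count = len(model.trainable_weights)    # Dense kernel and bias only
# model.fit(images, labels, epochs=1) would go here

# Phase 2: unfreeze everything and recompile with a small learning rate
base.trainable = True
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
finetune_count = len(model.trainable_weights)  # Conv2D and Dense kernels and biases
# model.fit(images, labels, epochs=1) would go here

print(frozen_count, finetune_count)
```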

In [8]:
mnist_expanded = mnist_expanded / 255.0

In [9]:
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
cnn.fit(mnist_expanded, mnist_train_labels)

  52/1875 [..............................] - ETA: 30:00 - loss: 1.1060 - accuracy: 0.7326

KeyboardInterrupt: 

**Feature extraction**

The following cells show how to perform feature extraction. With our very small dataset, it will probably not work well...

In [16]:
img_size = (224, 224)

cnn_pretrained = ResNet101(weights='imagenet', include_top=False, input_shape=img_size + (3,))

In [15]:
(1,2,3)+(3,)

(1, 2, 3, 3)

In [17]:
dataset = keras.utils.image_dataset_from_directory(
    "oxford_pets",
    labels = "inferred",
    batch_size = 32,
    image_size = img_size,
    color_mode = "rgb",
    interpolation = "bilinear",
    crop_to_aspect_ratio = True
)

Found 332 files belonging to 2 classes.


In [18]:
dataset.class_names

['british_shorthair', 'maine_coon']

In [19]:
def preprocess_batch(batch_images, batch_labels):
    batch_images = preprocess_input(batch_images)  # ResNet preprocessing: RGB -> BGR, then subtract the ImageNet channel means
    return batch_images, batch_labels

dataset = dataset.map(preprocess_batch)
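
As a rough illustration of what `preprocess_input` does for ResNet, here is a NumPy sketch. Keras documents this preprocessing as converting RGB to BGR and zero-centering each channel with respect to ImageNet, without scaling. The function name `caffe_style_preprocess` and the constant `IMAGENET_BGR_MEAN` are hypothetical, and the mean values are the ones commonly quoted for this mode, so treat the block as an assumption-laden sketch rather than the library's exact code.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed values, commonly quoted for this mode)
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_batch):
    """Flip channels from RGB to BGR and subtract the channel means; no scaling to [0, 1]."""
    x = np.asarray(rgb_batch, dtype=np.float32)
    bgr = x[..., ::-1]
    return bgr - IMAGENET_BGR_MEAN

# A batch of two all-white 4 x 4 RGB images with pixel values in [0, 255]
batch = np.full((2, 4, 4, 3), 255.0, dtype=np.float32)
out = caffe_style_preprocess(batch)
print(out[0, 0, 0])
```

Note that this differs from the division by 255 applied to MNIST earlier: the ResNet weights were trained on inputs in this zero-centered 0-255 range, not on inputs in [0, 1].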

In [20]:
all_images = []
all_labels = []

# Iterate through the dataset to collect all images and labels
for images, labels in dataset:
    all_images.append(images.numpy())
    all_labels.append(labels.numpy())

all_images = np.concatenate(all_images, axis=0)
all_labels = np.concatenate(all_labels, axis=0)



In [21]:
all_images.shape

(332, 224, 224, 3)

In [22]:
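# Predict on the collected array so that the feature rows line up with all_labels
# (the dataset is shuffled by default, so iterating it again yields a different order)
predictions = cnn_pretrained.predict(all_images)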



In [23]:
predictions.shape

(332, 7, 7, 2048)

In [24]:
predictions = predictions.reshape(predictions.shape[0], -1)
predictions.shape

(332, 100352)
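
Flattening gives 7 x 7 x 2048 = 100352 features per image, which is a lot for 332 samples. A common alternative, sketched here with random stand-in data, is to average each feature map over its spatial positions, leaving 2048 features per image. This is the same operation that `GlobalAveragePooling2D` performed in the MNIST model; the array `feature_maps` is hypothetical.

```python
import numpy as np

# Random stand-in for the ResNet101 output of 5 images (hypothetical data)
feature_maps = np.random.rand(5, 7, 7, 2048).astype(np.float32)

# Option used above: flatten every spatial position into one long vector
flat = feature_maps.reshape(feature_maps.shape[0], -1)

# Alternative: average over the two spatial axes (height and width)
pooled = feature_maps.mean(axis=(1, 2))

print(flat.shape, pooled.shape)
```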

In [25]:
from sklearn.svm import SVC

svc = SVC()
svc.fit(predictions, all_labels)
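
The classifier above is fitted on every available image, so there is nothing left to measure how well it generalizes. A minimal sketch of a hold-out evaluation follows, using synthetic features in place of the extracted ones. The names `features`, `labels`, and `clf`, the 80/20 split, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class features standing in for the extracted ones (hypothetical data)
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
features = rng.normal(size=(100, 64)) + labels[:, None] * 2.0

# Keep 20% of the samples aside, preserving the class balance
X_train, X_val, y_train, y_val = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=0
)

clf = SVC()
clf.fit(X_train, y_train)

# Mean accuracy on samples the classifier has not seen during fitting
val_accuracy = clf.score(X_val, y_val)
print("Validation accuracy:", val_accuracy)
```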

In [26]:
def test_image(image_path):
    # Load and resize the image the same way as the training data
    img = keras.utils.load_img(
        image_path, target_size=img_size, keep_aspect_ratio=True
    )
    img = keras.utils.img_to_array(img)
    img = np.expand_dims(img, axis=0)  # add the batch dimension
    img = preprocess_input(img)

    # Extract features with the pretrained network, then classify them with the SVM
    features = cnn_pretrained.predict(img)
    features = features.reshape(features.shape[0], -1)
    print("Predicted class:", svc.predict(features)[0])

In [30]:
test_image("extra_dataset/golden-retriever.jpg")

Predicted class: 0


In [29]:
test_image("extra_dataset/Maine_Coon_203.jpg")

Predicted class: 1
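
The printed values are class indices. A small sketch of how they could be mapped back to names is shown below, assuming the class list printed earlier by `dataset.class_names`; the helper `label_name` is hypothetical.

```python
# Class names in the order reported by dataset.class_names above
class_names = ['british_shorthair', 'maine_coon']

def label_name(class_index, names=class_names):
    """Translate a predicted class index into its class name."""
    return names[int(class_index)]

print(label_name(0))
print(label_name(1))
```

Since the SVM was trained on only two classes, it can only ever answer 0 or 1. This is why the golden retriever image still receives one of the two cat labels.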
