#Deep learning frameworks

Deep learning frameworks let you build complex neural networks that run on optimized hardware such as GPUs and TPUs, using a high-level programming language like Python whose APIs call into a lower-level platform. This notebook provides examples of building and training a convolutional neural network in TensorFlow/Keras, PyTorch and JAX. Each framework provides a more or less simple API for defining neural networks, with different levels of abstraction and degrees of customization.

Currently, TensorFlow/Keras and PyTorch are the two most widely used frameworks in research and industry, with PyTorch gaining an edge in the research community. JAX is a little less mature, and you probably shouldn't use it in this course when you start to develop and train networks for assignments 2 and 3. It is a research project from Google that aims to provide even more customizability, with access to lower-level features for improving the efficiency of neural network training on specialized hardware.

#Keras

(from https://keras.io/about/):

Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result as fast as possible is key to doing good research.

Keras is:

    Simple -- but not simplistic. Keras reduces developer cognitive load to free you to focus on the parts of the problem that really matter.
    Flexible -- Keras adopts the principle of progressive disclosure of complexity: simple workflows should be quick and easy, while arbitrarily advanced workflows should be possible via a clear path that builds upon what you've already learned.
    Powerful -- Keras provides industry-strength performance and scalability: it is used by organizations and companies including NASA, YouTube, or Waymo.


TensorFlow 2 is an end-to-end, open-source machine learning platform. You can think of it as an infrastructure layer for differentiable programming. It combines four key abilities:

    Efficiently executing low-level tensor operations on CPU, GPU, or TPU.
    Computing the gradient of arbitrary differentiable expressions.
    Scaling computation to many devices, such as clusters of hundreds of GPUs.
    Exporting programs ("graphs") to external runtimes such as servers, browsers, mobile and embedded devices.

Keras is the high-level API of the TensorFlow platform: an approachable, highly-productive interface for solving machine learning problems, with a focus on modern deep learning. It provides essential abstractions and building blocks for developing and shipping machine learning solutions with high iteration velocity.

Keras empowers engineers and researchers to take full advantage of the scalability and cross-platform capabilities of the TensorFlow platform: you can run Keras on TPU or on large clusters of GPUs, and you can export your Keras models to run in the browser or on a mobile device.

The core data structures of Keras are layers and models. The simplest type of model is the Sequential model, a linear stack of layers. For more complex architectures, you should use the Keras functional API, which allows you to build arbitrary graphs of layers, or write models entirely from scratch via subclassing.

#The Sequential API

Here's an example of a VGG-style Convolutional Neural Network (CNN) implemented using the Sequential API in Keras:

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Define the VGG-style CNN model
model = Sequential()

# Block 1
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 2
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 3
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 4
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 5
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Flatten the output for fully connected layers
model.add(Flatten())

# Fully connected layers
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
model.add(Dense(1000, activation='softmax'))  # 1000 is the number of output classes (adjust as needed)

# Print a summary of the model architecture
model.summary()

This code defines a VGG-style CNN with five convolutional blocks and three fully connected layers.
You can adjust the input shape and the number of output classes as needed for your specific task.
Additionally, you can compile the model and train it on your dataset using appropriate data preprocessing and training code.
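
To get a feel for what model.summary() reports, the layer arithmetic can be reproduced by hand. The following is a minimal pure-Python sketch, not Keras code: the helper names (conv_params, dense_params, vgg_summary) are made up for illustration, and it assumes 3x3 convolutions with 'same' padding, 2x2 pooling with stride 2, and one bias per output unit, as in the model above.

```python
# Hypothetical helpers for illustration; they mirror the Sequential model above.
VGG_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]  # (filters, number of conv layers)

def conv_params(in_channels, out_channels, kernel=3):
    # kernel*kernel*in_channels weights per filter, plus one bias per filter
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(in_features, out_features):
    # one weight per input/output pair, plus one bias per output
    return (in_features + 1) * out_features

def vgg_summary(input_size=224, in_channels=3, num_classes=1000):
    size, channels, total = input_size, in_channels, 0
    for filters, num_convs in VGG_BLOCKS:
        for _ in range(num_convs):
            total += conv_params(channels, filters)  # 'same' padding keeps the spatial size
            channels = filters
        size //= 2  # 2x2 max pooling with stride 2 halves height and width
    features = size * size * channels  # what Flatten() hands to the first Dense layer
    for units in (4096, 4096, num_classes):
        total += dense_params(features, units)
        features = units
    return size, channels, total

final_size, final_channels, total_params = vgg_summary()
print(final_size, final_channels, total_params)
```

With the default arguments this gives a 7 x 7 x 512 feature map before Flatten and 138,357,544 parameters in total, most of them in the first Dense layer.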

#The Functional API

Here's the same VGG-style Convolutional Neural Network (CNN) model implemented
using the Functional API in Keras:

In [None]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

# Input layer
input_layer = Input(shape=(224, 224, 3))

# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_layer)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), strides=(2, 2))(x)

# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), strides=(2, 2))(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), strides=(2, 2))(x)

# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), strides=(2, 2))(x)

# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), strides=(2, 2))(x)

# Flatten the output for fully connected layers
x = Flatten()(x)

# Fully connected layers
x = Dense(4096, activation='relu')(x)
x = Dense(4096, activation='relu')(x)
output_layer = Dense(1000, activation='softmax')(x)  # 1000 is the number of output classes (adjust as needed)

# Create the model
model = Model(inputs=input_layer, outputs=output_layer)

# Print a summary of the model architecture
model.summary()

This code defines the same VGG-style CNN model using the Functional API in Keras.
As with the previous example, you can adjust the input shape and the number of output classes as needed for your specific task.
To use this model, you can compile it and train it on your dataset using appropriate data preprocessing and training code.

#Model Subclassing

You can also implement the VGG-style CNN using subclassing in Keras. Here's how you can do it:

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Layer, Conv2D, MaxPooling2D, Flatten, Dense

class VGGModel(tf.keras.Model):
    def __init__(self, num_classes=1000):
        super(VGGModel, self).__init__()

        # Block 1
        self.conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')
        self.conv1_2 = Conv2D(64, (3, 3), activation='relu', padding='same')
        self.maxpool1 = MaxPooling2D((2, 2), strides=(2, 2))

        # Block 2
        self.conv2_1 = Conv2D(128, (3, 3), activation='relu', padding='same')
        self.conv2_2 = Conv2D(128, (3, 3), activation='relu', padding='same')
        self.maxpool2 = MaxPooling2D((2, 2), strides=(2, 2))

        # Block 3
        self.conv3_1 = Conv2D(256, (3, 3), activation='relu', padding='same')
        self.conv3_2 = Conv2D(256, (3, 3), activation='relu', padding='same')
        self.conv3_3 = Conv2D(256, (3, 3), activation='relu', padding='same')
        self.maxpool3 = MaxPooling2D((2, 2), strides=(2, 2))

        # Block 4
        self.conv4_1 = Conv2D(512, (3, 3), activation='relu', padding='same')
        self.conv4_2 = Conv2D(512, (3, 3), activation='relu', padding='same')
        self.conv4_3 = Conv2D(512, (3, 3), activation='relu', padding='same')
        self.maxpool4 = MaxPooling2D((2, 2), strides=(2, 2))

        # Block 5
        self.conv5_1 = Conv2D(512, (3, 3), activation='relu', padding='same')
        self.conv5_2 = Conv2D(512, (3, 3), activation='relu', padding='same')
        self.conv5_3 = Conv2D(512, (3, 3), activation='relu', padding='same')
        self.maxpool5 = MaxPooling2D((2, 2), strides=(2, 2))

        # Fully connected layers
        self.flatten = Flatten()
        self.fc1 = Dense(4096, activation='relu')
        self.fc2 = Dense(4096, activation='relu')
        self.fc3 = Dense(num_classes, activation='softmax')

    def call(self, inputs):
        x = self.conv1_1(inputs)
        x = self.conv1_2(x)
        x = self.maxpool1(x)

        x = self.conv2_1(x)
        x = self.conv2_2(x)
        x = self.maxpool2(x)

        x = self.conv3_1(x)
        x = self.conv3_2(x)
        x = self.conv3_3(x)
        x = self.maxpool3(x)

        x = self.conv4_1(x)
        x = self.conv4_2(x)
        x = self.conv4_3(x)
        x = self.maxpool4(x)

        x = self.conv5_1(x)
        x = self.conv5_2(x)
        x = self.conv5_3(x)
        x = self.maxpool5(x)

        x = self.flatten(x)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)

        return x

# Create an instance of the VGGModel class
model = VGGModel(num_classes=1000)

# Build the model so that its weights are created, then print a summary
model.build((None, 224, 224, 3))
model.summary()


In this code, we create a custom subclass of tf.keras.Model called VGGModel and define the layers and operations in the __init__ method.
The call method specifies the forward pass of the model. You can adjust the number of output classes by passing the num_classes parameter
when creating an instance of the model. Finally, we build the model and print a summary of its architecture.
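
The call to model.build before model.summary() is needed because a subclassed model does not know its input shape until it is built or called for the first time, so no weights exist before that. The following toy class, written in plain Python and not part of Keras, sketches this deferred weight creation:

```python
class LazyDense:
    """Toy layer that, like a Keras layer, creates its weights once the input size is known."""

    def __init__(self, units):
        self.units = units
        self.kernel = None  # no weights yet: the input size is unknown

    def build(self, in_features):
        # Allocate an in_features x units weight matrix (zeros, for simplicity)
        self.kernel = [[0.0] * self.units for _ in range(in_features)]

    def __call__(self, x):
        if self.kernel is None:
            self.build(len(x))  # weights are created on first use
        return [sum(xi * row[j] for xi, row in zip(x, self.kernel))
                for j in range(self.units)]

    def count_params(self):
        if self.kernel is None:
            raise ValueError("layer has not been built yet")
        return len(self.kernel) * self.units

layer = LazyDense(4)
built_before = layer.kernel is not None
layer([1.0, 2.0, 3.0])  # first call triggers build with in_features=3
built_after = layer.kernel is not None
print(built_before, built_after, layer.count_params())
```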



#Functional v. Sequential

The Functional API makes it possible to build more complex architectures in which layers are not necessarily arranged sequentially one after another. Instead, the network can branch out and reconnect in different places, or have multiple inputs and outputs.


#Resnet example

Here is an example of defining a ResNet-style model using the Keras Functional API. The ResNet architecture is known for its deep structure with residual blocks. In this example, we define a simplified ResNet with just a few residual blocks for demonstration purposes:

In [None]:
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, ReLU, Add, GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model

def residual_block(x, filters, stride=1):
    shortcut = x

    # First convolution layer
    x = Conv2D(filters, (3, 3), strides=stride, padding='same')(x)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    # Second convolution layer
    x = Conv2D(filters, (3, 3), strides=1, padding='same')(x)
    x = BatchNormalization()(x)

    # Shortcut connection
    if stride != 1 or shortcut.shape[-1] != filters:
        shortcut = Conv2D(filters, (1, 1), strides=stride, padding='same')(shortcut)

    x = Add()([x, shortcut])
    x = ReLU()(x)

    return x

# Define input layer
input_shape = (224, 224, 3)
input_tensor = Input(shape=input_shape)

# Initial convolution layer
x = Conv2D(64, (7, 7), strides=2, padding='same')(input_tensor)
x = BatchNormalization()(x)
x = ReLU()(x)

# Residual blocks
num_blocks = 3  # Number of residual blocks
num_filters = 64  # Number of filters in each block

for _ in range(num_blocks):
    x = residual_block(x, num_filters)

# Global Average Pooling
x = GlobalAveragePooling2D()(x)

# Fully connected layer for classification
num_classes = 10  # Number of output classes
output_tensor = Dense(num_classes, activation='softmax')(x)

# Create the ResNet model
resnet_model = Model(inputs=input_tensor, outputs=output_tensor)

# Print model summary
resnet_model.summary()

In this example:

    We define a residual_block function that represents a single residual block in the ResNet architecture. It consists of two convolutional layers with batch normalization and a shortcut connection.

    We use this residual_block function to stack multiple residual blocks together.

    The model starts with an initial convolutional layer, followed by the stack of residual blocks.

    After the stack of residual blocks, we apply global average pooling to reduce the spatial dimensions.

    Finally, we add a fully connected layer for classification with the desired number of output classes.
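
The shape bookkeeping behind this model can be traced without running Keras. The sketch below is a hypothetical pure-Python helper (same_conv_size and trace_resnet are illustrative names, not Keras functions). It assumes 'same' padding, for which the output size is the input size divided by the stride, rounded up, and it records for each residual block whether the 1x1 projection shortcut would be created.

```python
import math

def same_conv_size(size, stride):
    # With padding='same', output size = ceil(input size / stride)
    return math.ceil(size / stride)

def trace_resnet(input_shape=(224, 224, 3), num_blocks=3, num_filters=64, num_classes=10):
    h, w, c = input_shape
    # Initial 7x7 convolution with stride 2 and 64 filters
    h, w, c = same_conv_size(h, 2), same_conv_size(w, 2), 64
    shapes = [(h, w, c)]
    projections = []
    for _ in range(num_blocks):
        stride = 1
        # Same test as in residual_block: project the shortcut only when shapes differ
        projections.append(stride != 1 or c != num_filters)
        h, w, c = same_conv_size(h, stride), same_conv_size(w, stride), num_filters
        shapes.append((h, w, c))
    pooled = (c,)            # global average pooling keeps one value per channel
    logits = (num_classes,)  # final Dense layer
    return shapes, projections, pooled, logits

shapes, projections, pooled, logits = trace_resnet()
print(shapes, projections, pooled, logits)
```

With the settings used above, every block keeps the shape (112, 112, 64), so the identity shortcut is used each time and the 1x1 convolution branch is never taken.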


#Siamese network

The Keras Functional API allows for more flexible model architectures compared to the Sequential API because it supports multiple inputs and outputs and enables you to create complex network structures. Here's an example of a model that cannot be easily written using the Sequential API: a Siamese Network for image similarity comparison.

A Siamese Network is used for tasks like face recognition or similarity-based retrieval. It learns to differentiate between pairs of input samples, often used in scenarios where you want to determine the similarity or dissimilarity between two inputs. Here's how you can define a Siamese Network using the Keras Functional API:

In [None]:
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense, Lambda, MaxPooling2D
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras import backend as K


# Define the base network (shared weights)
input_shape = (28, 28, 1)
input_left = Input(shape=input_shape)
input_right = Input(shape=input_shape)

shared_conv = Sequential([
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(),
    Conv2D(128, (3,3), activation='relu'),
    MaxPooling2D(),
    Flatten()
])

output_left = shared_conv(input_left)
output_right = shared_conv(input_right)

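# Define a custom function that computes the Euclidean (L2) distance
def euclidean_distance(vects):
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    # Clamp to epsilon so the square root (and its gradient) stays finite
    return K.sqrt(K.maximum(sum_square, K.epsilon()))

# The output holds one distance per pair, i.e. shape (batch_size, 1)
distance_layer = Lambda(euclidean_distance, output_shape=lambda shapes: (shapes[0][0], 1))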

# Connect the inputs and compute the distance
distance = distance_layer([output_left, output_right])

# Define the final model
siamese_model = Model(inputs=[input_left, input_right], outputs=distance)

# Print the model summary
siamese_model.summary()

In this Siamese Network:

    We define two separate input layers, input_left and input_right, which represent two images to be compared.

    We define a shared convolutional base, shared_conv, that processes both input images. This base is shared between the two branches of the network.

    We define a custom function, euclidean_distance, which calculates the Euclidean distance between the output vectors of the shared base for the two input images.

    We use the Lambda layer to apply the custom distance function to the output vectors of the shared base.

    The final model, siamese_model, takes two input images and outputs the Euclidean distance between their representations.

This Siamese Network architecture is an example of a model that requires the flexibility of the Functional API due to its multiple inputs and custom layer for computing the distance between the inputs. It's commonly used for various similarity-based tasks, including image similarity and face recognition.
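
Two numbers are worth checking by hand here: the length of the embedding produced by shared_conv and the distance itself. The following pure-Python sketch uses made-up helper names and assumes the Keras defaults used above, that is 'valid' padding for Conv2D and 2x2 pooling for MaxPooling2D. The epsilon value 1e-7 is the usual Keras default, but it is hard-coded here as an assumption.

```python
import math

def embedding_length(size=28, filters=(64, 128), kernel=3, pool=2):
    channels = None
    for f in filters:
        size = size - kernel + 1   # 'valid' convolution shrinks each side by kernel - 1
        size = size // pool        # pooling halves the size, rounding down
        channels = f
    return size * size * channels  # length of the vector produced by Flatten()

def euclidean_distance(x, y, eps=1e-7):
    sum_square = sum((a - b) ** 2 for a, b in zip(x, y))
    # Same clamp as in the Lambda layer: never take the square root of exactly zero
    return math.sqrt(max(sum_square, eps))

print(embedding_length())                          # length of each branch's output vector
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # a 3-4-5 triangle
print(euclidean_distance([1.0, 2.0], [1.0, 2.0]))  # identical inputs give sqrt(eps), not 0
```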

#PyTorch

PyTorch is an open source machine learning (ML) framework based on the Python programming language and the Torch library. Torch is an open source ML library for creating deep neural networks, written in the Lua scripting language. PyTorch is one of the preferred platforms for deep learning research, and it is built to speed up the path from research prototyping to deployment. Like Keras, PyTorch offers different APIs, such as sequential and functional (modular) styles, for constructing various kinds of neural networks.

#Sequential
You can implement a VGG-style CNN in PyTorch as well.
Here's an example of how you can do it:

In [None]:
import torch
import torch.nn as nn

class VGG(nn.Module):
    def __init__(self, num_classes=1000):
        super(VGG, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(256, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

# Create an instance of the VGG model
model = VGG(num_classes=1000)

# Print a summary of the model architecture
print(model)

This code defines a VGG model in PyTorch, similar to the Keras implementation.
You can adjust the number of output classes by passing the num_classes parameter when creating an instance of the model.
To use this model, you can load your data, define a loss function, and perform training as needed in PyTorch.
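
Note that the forward method returns raw scores (logits) with no softmax. That works with nn.CrossEntropyLoss, the loss used in the training section of this notebook, because that loss applies log-softmax internally. As a minimal sketch of the computation for a single sample, in plain Python rather than PyTorch (the function name is illustrative):

```python
import math

def cross_entropy_from_logits(logits, target):
    # Subtract the maximum for numerical stability (log-sum-exp trick)
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # Negative log-probability of the target class
    return log_sum_exp - logits[target]

uniform = cross_entropy_from_logits([0.0, 0.0, 0.0, 0.0], target=2)
confident = cross_entropy_from_logits([10.0, 0.0, 0.0], target=0)
print(uniform, confident)
```

Equal logits over four classes give a loss of ln 4, the loss of a uniform guess, while a large logit on the correct class drives the loss towards zero.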

#Functional

PyTorch also lets you build neural networks in a modular manner that resembles the Functional API in Keras.
You can use PyTorch's nn.Module class to create custom building blocks and then compose them in ordinary Python code to create your network.
Here's how you can implement a VGG-style CNN in this style:

In [None]:
import torch
import torch.nn as nn

class VGGBlock(nn.Module):
    def __init__(self, in_channels, out_channels, num_convs):
        super(VGGBlock, self).__init__()
        layers = []
        for _ in range(num_convs):
            layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
            layers.append(nn.ReLU(inplace=True))
            in_channels = out_channels
        layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

class VGG(nn.Module):
    def __init__(self, num_classes=1000):
        super(VGG, self).__init__()
        self.block1 = VGGBlock(3, 64, 2)
        self.block2 = VGGBlock(64, 128, 2)
        self.block3 = VGGBlock(128, 256, 3)
        self.block4 = VGGBlock(256, 512, 3)
        self.block5 = VGGBlock(512, 512, 3)

        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, num_classes)
        )

    def forward(self, x):
        x = self.block1(x)
        x = self.block2(x)
        x = self.block3(x)
        x = self.block4(x)
        x = self.block5(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x






# Create an instance of the VGG model
model = VGG(num_classes=1000)

# Print a summary of the model architecture
print(model)

In this code, we define a VGGBlock class that represents a single block in the VGG architecture. Then, we use this block to build the entire VGG network in the VGG class.
You can adjust the number of output classes by passing the num_classes parameter when creating an instance of the model.
This implementation follows a functional and modular approach similar to the Keras Functional API.
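
The line in_channels = out_channels inside VGGBlock is easy to overlook: only the first convolution of a block changes the number of channels. A small plain-Python sketch (the helper name is hypothetical) lists the (in_channels, out_channels) pair of each convolution that the five blocks would create:

```python
def conv_channel_pairs(in_channels, out_channels, num_convs):
    pairs = []
    for _ in range(num_convs):
        pairs.append((in_channels, out_channels))
        in_channels = out_channels  # later convolutions in the block keep the width
    return pairs

# Same block configuration as in the VGG class above
blocks = [(3, 64, 2), (64, 128, 2), (128, 256, 3), (256, 512, 3), (512, 512, 3)]
all_pairs = [conv_channel_pairs(*cfg) for cfg in blocks]
for pairs in all_pairs:
    print(pairs)
```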

#JAX

(from https://github.com/google/jax)

JAX is Autograd and XLA, brought together for high-performance machine learning research.

With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy functions. It can differentiate through loops, branches, recursion, and closures, and it can take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation) via grad as well as forward-mode differentiation, and the two can be composed arbitrarily to any order.

What’s new is that JAX uses XLA to compile and run your NumPy programs on GPUs and TPUs. Compilation happens under the hood by default, with library calls getting just-in-time compiled and executed. But JAX also lets you just-in-time compile your own Python functions into XLA-optimized kernels using a one-function API, jit. Compilation and automatic differentiation can be composed arbitrarily, so you can express sophisticated algorithms and get maximal performance without leaving Python. You can even program multiple GPUs or TPU cores at once using pmap, and differentiate through the whole thing.

Dig a little deeper, and you'll see that JAX is really an extensible system for composable function transformations. Both grad and jit are instances of such transformations. Others are vmap for automatic vectorization and pmap for single-program multiple-data (SPMD) parallel programming of multiple accelerators, with more to come.

This is a research project, not an official Google product. Expect bugs and sharp edges. Please help by trying it out, reporting bugs, and letting us know what you think!
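
To make the idea of automatic differentiation concrete, here is a toy forward-mode example using dual numbers in plain Python. It only illustrates the principle. It is not how JAX is implemented, and the Dual class and derivative helper are made up for this sketch. In JAX itself you would call jax.grad or jax.jvp instead.

```python
class Dual:
    """A value together with its derivative with respect to the input."""

    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def _lift(self, other):
        # Treat plain numbers as constants, whose derivative is zero
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def derivative(f, x):
    # Seed the input with derivative 1 and read the derivative off the output
    return f(Dual(x, 1.0)).deriv

def f(x):
    return x * x * x + 2 * x

print(derivative(f, 3.0))  # derivative of x^3 + 2x at x = 3, i.e. 3*9 + 2
```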



#VGG in JAX

In JAX, you can build a VGG-style CNN using Flax, a neural network library for JAX.
Here's an example of how you can implement such a network:

In [None]:
import jax
import jax.numpy as jnp
from flax import linen as nn

class VGGBlock(nn.Module):
    out_channels: int
    num_convs: int

    @nn.compact
    def __call__(self, x):
        for _ in range(self.num_convs):
            x = nn.Conv(self.out_channels, kernel_size=(3, 3), padding='SAME')(x)
            x = nn.relu(x)
        x = nn.max_pool(x, window_shape=(2, 2), strides=(2, 2))
        return x

class VGG(nn.Module):
    num_classes: int

    @nn.compact
    def __call__(self, x):
        x = VGGBlock(64, 2)(x)
        x = VGGBlock(128, 2)(x)
        x = VGGBlock(256, 3)(x)
        x = VGGBlock(512, 3)(x)
        x = VGGBlock(512, 3)(x)

        x = x.mean(axis=(1, 2))  # Global Average Pooling
        x = nn.Dense(self.num_classes)(x)
        return x

# Create an instance of the VGG model
rng = jax.random.PRNGKey(0)
input_shape = (1, 224, 224, 3)  # Batch size of 1, input shape
model = VGG(num_classes=1000)
params = model.init(rng, jnp.ones(input_shape, dtype=jnp.float32))

# Print the model (for a Flax module this lists only its attributes)
print(model)



VGG(
    # attributes
    num_classes = 1000
)


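In this code, we use the linen module from Flax to define the VGGBlock and VGG classes.
The @nn.compact decorator lets us declare the layers inline, inside the __call__ method that defines the forward pass of each block and of the entire network.
Unlike the Keras and PyTorch versions above, the classifier head here uses global average pooling followed by a single dense layer.
We create an instance of the VGG model and initialize its parameters with model.init, which takes a JAX PRNGKey and a dummy input of the right shape.
Finally, we print the model, which for a Flax module shows only its attributes rather than a layer-by-layer summary.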

You can adjust the num_classes parameter when creating an instance of the model to specify the number of output classes you need.
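
Flax keeps the parameters outside the model object: model.init returns them, and a forward pass is then written as model.apply(params, x). The toy sketch below mimics that pattern for a single dense layer in plain Python. The init and apply functions here are stand-ins written for illustration, not the Flax API, and the standard library's random.Random plays the role of the explicit PRNG key.

```python
import random

def init(seed, in_features, out_features):
    # All randomness comes from the explicit seed, so the same seed gives the same parameters
    rng = random.Random(seed)
    kernel = [[rng.gauss(0.0, 0.01) for _ in range(out_features)]
              for _ in range(in_features)]
    bias = [0.0] * out_features
    return {"kernel": kernel, "bias": bias}

def apply(params, x):
    # A pure function: the output depends only on params and x
    return [sum(xi * row[j] for xi, row in zip(x, params["kernel"])) + params["bias"][j]
            for j in range(len(params["bias"]))]

params_a = init(0, in_features=3, out_features=2)
params_b = init(0, in_features=3, out_features=2)
out = apply(params_a, [1.0, 2.0, 3.0])
print(out)
```

Because nothing is stored inside the model, the same parameters can be passed to gradient, compilation or parallelization transformations as ordinary function arguments.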

#Training models

In the next examples we will see how to train models for multiclass classification on the CIFAR-10 dataset (https://www.cs.toronto.edu/~kriz/cifar.html), using each of the frameworks presented above.

##Note:
These are only very basic examples. When the time comes to train your own models, you should definitely look up more information, e.g. on how to tune hyperparameters efficiently, how to load and transform data, and how to define more complex models, as well as how to save, load and use your model to perform inference on new data.

#Keras generators

Here's an example of how you can create a training loop using data generators and the model.fit function
in Keras for a Sequential model using the CIFAR-10 dataset:

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import SGD

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Preprocess the data
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0

train_labels = to_categorical(train_labels, 10)
test_labels = to_categorical(test_labels, 10)

# Create a Sequential model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer=SGD(learning_rate=0.01,momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Create data generators for data augmentation
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Fit the model using data generators
batch_size = 64
epochs = 10

history = model.fit(
    datagen.flow(train_images, train_labels, batch_size=batch_size),
    steps_per_epoch=len(train_images) // batch_size,
    epochs=epochs,
    validation_data=(test_images, test_labels),
    verbose=1
)

# Evaluate the model on test data
test_loss, test_accuracy = model.evaluate(test_images, test_labels, verbose=2)
print(f"Test accuracy: {test_accuracy*100:.2f}%")

In this code:

    We load the CIFAR-10 dataset and preprocess the data.
    We create a Sequential model with convolutional and fully connected layers.
    We compile the model with the SGD optimizer and categorical cross-entropy loss.
    We create an ImageDataGenerator for data augmentation.
    We use model.fit to train the model using the data generator.
    Finally, we evaluate the model on the test data and print the test accuracy.

You can adjust the batch size, number of epochs, and other hyperparameters as needed for your training.
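
Two small pieces of arithmetic in this cell are worth spelling out: what to_categorical does to a label, and how many batches steps_per_epoch amounts to. A plain-Python sketch with an illustrative helper name, assuming the 50,000 training images of CIFAR-10:

```python
def one_hot(label, num_classes=10):
    # Equivalent in spirit to to_categorical for a single integer label
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

num_train = 50000   # size of the CIFAR-10 training set
batch_size = 64
steps_per_epoch = num_train // batch_size
leftover = num_train - steps_per_epoch * batch_size  # samples beyond the last full batch

print(one_hot(3))
print(steps_per_epoch, leftover)
```

The categorical_crossentropy loss expects these one-hot vectors. With integer labels you would use sparse_categorical_crossentropy instead.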

#Torch dataloader

In PyTorch, you can create a similar training loop using data loaders and manual iterations over the dataset.
Here's how you can train a PyTorch model on the CIFAR-10 dataset:

In [None]:
from tqdm import tqdm
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Define a model similar to the Keras Sequential model above
class net(nn.Module):
    def __init__(self, num_classes=10):
        super(net, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            )
        self.classifier = nn.Sequential(
            nn.Linear(64*8*8, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

# Create an instance of the model
model = net(num_classes=10)

# Define data transformations: random augmentation is applied to training data only
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=120),
    transforms.RandomAffine(degrees=0,translate=[0,0.2],fill=0),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Test images are only converted and normalized, so evaluation is deterministic
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Create the datasets and data loaders
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=train_transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True, num_workers=4)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=test_transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False, num_workers=4)

# Define loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Training loop
epochs = 10
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(epochs):
    running_loss = 0.0
    for i, data in tqdm(enumerate(trainloader, 0)):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        if i % 100 == 99:  # Print every 100 mini-batches
            print(f"[Epoch {epoch + 1}, Batch {i + 1}] Loss: {running_loss / 100:.3f}")
            running_loss = 0.0

print("Training Finished")

# Test the model
correct = 0
total = 0

with torch.no_grad():
    for data in testloader:
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Accuracy on test data: {100 * correct / total:.2f}%")

In this code:

    We define a model, set up data transformations, and create data loaders for both the training and test datasets.

    We define the loss function (CrossEntropyLoss) and optimizer (SGD).

    The training loop iterates over the dataset, computes the loss, backpropagates to obtain gradients, and updates the model's weights with the optimizer.

    After training, we evaluate the model on the test dataset to calculate accuracy.

Make sure the required libraries are installed. A CUDA-compatible GPU is optional: the code falls back to the CPU when no GPU is available.
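
The evaluation step described above can also be wrapped in a reusable function. The following is a minimal sketch, not the notebook's own code: the names `evaluate` and `toy_model` and the random tensors standing in for CIFAR-10 are assumptions made so that the example runs on a CPU without downloading anything.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, device):
    """Return the classification accuracy (between 0 and 1) of `model` on `loader`."""
    was_training = model.training
    model.eval()  # dropout and batch-norm layers switch to inference behaviour
    correct, total = 0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            predicted = model(inputs).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    model.train(was_training)  # restore the mode the model was in before
    return correct / total


# Synthetic stand-in for CIFAR-10: 128 random "images" with random labels
torch.manual_seed(0)
images = torch.randn(128, 3, 32, 32)
labels = torch.randint(0, 10, (128,))
loader = DataLoader(TensorDataset(images, labels), batch_size=64)

toy_model = nn.Sequential(nn.Flatten(), nn.Dropout(0.5), nn.Linear(3 * 32 * 32, 10))
acc = evaluate(toy_model, loader, torch.device("cpu"))
print(f"Accuracy on synthetic data: {100 * acc:.2f}%")
```

Calling such a function once per epoch on a held-out set gives a validation curve alongside the training loss.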

#Training a JAX model using data generators

In JAX, you can create a training loop for a model using manual iteration and gradient updates. Here's how you can train a JAX model on the CIFAR-10 dataset:

In [None]:
import jax
import jax.numpy as jnp
from flax import linen as nn
import optax
import numpy as np
from jax import jit, random, value_and_grad, vmap, pmap
from jax.example_libraries import optimizers
from tensorflow.keras.datasets import cifar10  # CIFAR-10 loader used below
from tqdm import tqdm


# Define the model (similar to the previous Keras and Pytorch examples)


class net(nn.Module):
    num_classes: int

    @nn.compact
    def __call__(self, x):
        x = nn.Conv(32, kernel_size=(3, 3), padding='SAME')(x)
        x = nn.relu(x)
        x = nn.max_pool(x, window_shape=(2, 2), strides=(2, 2))
        x = nn.Conv(64, kernel_size=(3, 3), padding='SAME')(x)
        x = nn.relu(x)
        x = nn.max_pool(x, window_shape=(2, 2), strides=(2, 2))
        x = nn.Conv(64, kernel_size=(3, 3), padding='SAME')(x)
        x = nn.relu(x)
        x = x.reshape((x.shape[0], -1))
        x = nn.Dense(features=self.num_classes)(x)
        # Return raw logits: optax.softmax_cross_entropy applies log-softmax itself

        return x



# Create an instance of the model
rng = jax.random.PRNGKey(0)
input_shape = (64, 32, 32, 3)  # Batch size of 64, input shape
model = net(num_classes=10)
params = model.init(rng, jnp.ones(input_shape, dtype=jnp.float32))


# Load and preprocess data
(x_train, y_train), (x_valid, y_valid) = cifar10.load_data()
print(f"\nNumber of training samples: {len(x_train)} with samples shape: {x_train.shape[1:]}")
print(f"Number of validation samples: {len(x_valid)} with samples shape: {x_valid.shape[1:]}")


x_train_normalized = jnp.array(x_train / 255.)
x_valid_normalized = jnp.array(x_valid / 255.)

# One hot encoding applied to the labels. We have 10
# classes in the dataset, hence the depth of OHE would be 10
y_train_ohe = jnp.squeeze(jax.nn.one_hot(y_train, num_classes=10))
y_valid_ohe = jnp.squeeze(jax.nn.one_hot(y_valid, num_classes=10))

print(f"Training images shape:   {x_train_normalized.shape}  Labels shape: {y_train_ohe.shape}")
print(f"Validation images shape: {x_valid_normalized.shape}  Labels shape: {y_valid_ohe.shape}")

# Define functions for data augmentation
def rotate_90(img):
    """Rotates an image by 90 degrees."""
    return jnp.rot90(img, k=1, axes=(0, 1))


def identity(img):
    """Returns an image as it is."""
    return img


def flip_left_right(img):
    """Flips an image left/right direction."""
    return jnp.fliplr(img)


def flip_up_down(img):
    """Flips an image in up/down direction."""
    return jnp.flipud(img)


def random_rotate(img, rotate):
    """Randomly rotate an image by 90 degrees.

    Args:
        img: Array representing the image
        rotate: Boolean for rotating or not
    Returns:
        Rotated or an identity image
    """

    return jax.lax.cond(rotate, rotate_90, identity, img)


def random_horizontal_flip(img, flip):
    """Randomly flip an image horizontally.

    Args:
        img: Array representing the image
        flip: Boolean for flipping or not
    Returns:
        Flipped or an identity image
    """

    return jax.lax.cond(flip, flip_left_right, identity, img)


def random_vertical_flip(img, flip):
    """Randomly flip an image vertically.

    Args:
        img: Array representing the image
        flip: Boolean for flipping or not
    Returns:
        Flipped or an identity image
    """

    return jax.lax.cond(flip, flip_up_down, identity, img)

# All the above functions are written to work on a single example.
# We will use `vmap` to get a version of these functions that can
# operate on a batch of images. We will also add the `jit` transformation
# on top of it so that the whole pipeline can be compiled and executed faster
random_rotate_jitted = jit(vmap(random_rotate, in_axes=(0, 0)))
random_horizontal_flip_jitted = jit(vmap(random_horizontal_flip, in_axes=(0, 0)))
random_vertical_flip_jitted = jit(vmap(random_vertical_flip, in_axes=(0, 0)))


def augment_images(images, key):
    """Augment a batch of input images.

    Args:
        images: Batch of input images as a jax array
        key: Seed/Key for random functions for generating booleans
    Returns:
        Augmented images with the same shape as the input images
    """

    batch_size = len(images)

    # 1. Rotation (draw the random booleans from the fresh subkey, not from `key`)
    key, subkey = random.split(key)
    rotate = random.randint(subkey, shape=[batch_size], minval=0, maxval=2)
    augmented = random_rotate_jitted(images, rotate)

    # 2. Flip horizontally
    key, subkey = random.split(key)
    flip = random.randint(subkey, shape=[batch_size], minval=0, maxval=2)
    augmented = random_horizontal_flip_jitted(augmented, flip)

    # 3. Flip vertically
    key, subkey = random.split(key)
    flip = random.randint(subkey, shape=[batch_size], minval=0, maxval=2)
    augmented = random_vertical_flip_jitted(augmented, flip)

    return augmented


# Define the data generator
def data_generator(images, labels, batch_size=64, is_valid=False, key=None):
    """Generates batches of data from a given dataset.

    Args:
        images: Image data represented by a ndarray
        labels: One-hot encoded labels
        batch_size: Number of data points in a single batch
        is_valid: (Boolean) If validation data, then don't shuffle and
                    don't apply any augmentation
        key: PRNG key needed for augmentation
    Yields:
        Batches of images-labels pairs
    """

    # 1. Calculate the total number of batches
    num_batches = int(np.ceil(len(images) / batch_size))

    # 2. Get the indices and shuffle them
    indices = np.arange(len(images))

    if not is_valid:
        if key is None:
            raise ValueError("A PRNG key is required when `is_valid` is False (augmentation is applied)")
        np.random.shuffle(indices)

    for batch in range(num_batches):
        curr_idx = indices[batch * batch_size: (batch+1) * batch_size]
        batch_images = images[curr_idx]
        batch_labels = labels[curr_idx]

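        if not is_valid:
            # Use a fresh subkey per batch so that each batch is augmented differently
            key, subkey = random.split(key)
            batch_images = augment_images(batch_images, key=subkey)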
        yield batch_images, batch_labels




# Sanity Check: To make sure that the batches generated by the data
# generator are of correct size, we will just pull a batch of data and
# will check the shape of the images and the labels

sample_data_gen = data_generator(
    images=x_train_normalized,
    labels=y_train_ohe,
    batch_size=8,
    is_valid=False,
    key=random.PRNGKey(0)
)

sample_batch_images, sample_batch_labels = next(sample_data_gen)
print("Batch of images is of shape: ", sample_batch_images.shape)
print("Batch of labels is of shape: ", sample_batch_labels.shape)

# Clean up unnecessary objects
del sample_data_gen, sample_batch_images, sample_batch_labels


def calculate_accuracy(params, batch_data):
    """Implements accuracy metric.

    Args:
        params: Parameters of the network
        batch_data: A batch of data (images and labels)
    Returns:
        Accuracy for the current batch
    """
    inputs, targets = batch_data
    target_class = jnp.argmax(targets, axis=1)
    predicted_class = jnp.argmax(model.apply(params,inputs), axis=1)
    return jnp.mean(predicted_class == target_class)


# We will jit the train and test steps to make them more efficient
@jit
def train_step(step, opt_state, batch_data):
    """Implements train step.

    Args:
        step: Integer representing the step index
        opt_state: Current state of the optimizer
        batch_data: A batch of data (images and labels)
    Returns:
        Batch loss, batch accuracy, updated optimizer state
    """
    params = get_params(opt_state)
    batch_loss, batch_gradients = value_and_grad(loss_fn)(params, batch_data)
    batch_accuracy = calculate_accuracy(params, batch_data)
    return batch_loss, batch_accuracy, opt_update(step, batch_gradients, opt_state)


@jit
def test_step(opt_state, batch_data):
    """Implements test step.

    Args:
        opt_state: Current state of the optimizer
        batch_data: A batch of data (images and labels)
    Returns:
        Batch loss, batch accuracy
    """
    params = get_params(opt_state)
    batch_loss = loss_fn(params, batch_data)
    batch_accuracy = calculate_accuracy(params, batch_data)
    return batch_loss, batch_accuracy


# Define loss function

def loss_fn(params, batch_data):
    inputs, targets = batch_data
    logits = model.apply(params, inputs)
    loss = jnp.mean(optax.softmax_cross_entropy(logits=logits, labels=targets))
    return loss


LEARNING_RATE = 0.01

# Get the optimizer objects
opt_init, opt_update, get_params = optimizers.momentum(step_size=LEARNING_RATE,mass=0.9)

# Initialize the state of the optimizer using the parameters
opt_state = opt_init(params)

EPOCHS = 10
BATCH_SIZE = 64

# Initial rng key for the data generator
key = random.PRNGKey(0)

# Lists to record loss and accuracy for each epoch
training_loss = []
validation_loss = []
training_accuracy = []
validation_accuracy = []

# Training
for i in range(EPOCHS):
    num_train_batches = len(x_train) // BATCH_SIZE
    num_valid_batches = len(x_valid) // BATCH_SIZE

    # Lists to store loss and accuracy for each batch
    train_batch_loss, train_batch_acc = [], []
    valid_batch_loss, valid_batch_acc = [], []

    # Key to be passed to the data generator for augmenting
    # training dataset
    key, subkey = random.split(key)

    # Initialize data generators
    train_data_gen = data_generator(x_train_normalized,
                                y_train_ohe,
                                batch_size=BATCH_SIZE,
                                is_valid=False,
                                key=subkey
                               )

    valid_data_gen = data_generator(x_valid_normalized,
                               y_valid_ohe,
                               batch_size=BATCH_SIZE,
                               is_valid=True
                               )

    print(f"Epoch: {i+1:<3}", end=" ")

    # Training
    for step in tqdm(range(num_train_batches)):
        batch_data = next(train_data_gen)
        loss_value, acc, opt_state = train_step(step, opt_state, batch_data)
        train_batch_loss.append(loss_value)
        train_batch_acc.append(acc)

    # Evaluation on validation data
    for step in tqdm(range(num_valid_batches)):
        batch_data = next(valid_data_gen)
        loss_value, acc = test_step(opt_state, batch_data)
        valid_batch_loss.append(loss_value)
        valid_batch_acc.append(acc)

    # Loss for the current epoch
    epoch_train_loss = np.mean(train_batch_loss)
    epoch_valid_loss = np.mean(valid_batch_loss)

    # Accuracy for the current epoch
    epoch_train_acc = np.mean(train_batch_acc)
    epoch_valid_acc = np.mean(valid_batch_acc)

    training_loss.append(epoch_train_loss)
    training_accuracy.append(epoch_train_acc)
    validation_loss.append(epoch_valid_loss)
    validation_accuracy.append(epoch_valid_acc)

    print(f"loss: {epoch_train_loss:.3f}   acc: {epoch_train_acc:.3f}  valid_loss: {epoch_valid_loss:.3f}  valid_acc: {epoch_valid_acc:.3f}")

#Pretrained models

Using pretrained models can be a great help when training deep learning models such as convolutional neural networks, especially when your dataset is small compared to huge datasets such as ImageNet. Pretrained models can be used in two main ways. Either the model serves as a fixed feature extractor, i.e. you pass your data through the model once to extract features and then train a new model on those features, or you fine-tune some or all of the weights of the pretrained model on the new data.
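
The difference between the two approaches can be sketched in Keras as follows. The small `make_backbone` network is a hypothetical stand-in for a pretrained model (it has random weights and nothing is downloaded), and `build_classifier` is likewise an illustrative helper. Freezing the backbone with `trainable = False` is the on-the-fly variant of feature extraction: only the new classification layer is trained. Leaving the backbone trainable corresponds to fine-tuning.

```python
import tensorflow as tf
from tensorflow.keras import layers


def make_backbone():
    """Hypothetical stand-in for a pretrained network (random weights)."""
    return tf.keras.Sequential([
        layers.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, padding="same", activation="relu"),
        layers.GlobalAveragePooling2D(),
    ], name="backbone")


def build_classifier(freeze_backbone, num_classes=10):
    backbone = make_backbone()
    # Frozen backbone -> feature extractor; trainable backbone -> fine-tuning
    backbone.trainable = not freeze_backbone
    inputs = layers.Input(shape=(32, 32, 3))
    features = backbone(inputs)
    outputs = layers.Dense(num_classes, activation="softmax")(features)
    return tf.keras.Model(inputs, outputs)


feature_extractor = build_classifier(freeze_backbone=True)
fine_tuned = build_classifier(freeze_backbone=False)

# Count the weight tensors that an optimizer would update in each case
n_extractor = len(feature_extractor.trainable_weights)  # kernel and bias of the new Dense layer
n_fine_tuned = len(fine_tuned.trainable_weights)         # Conv2D and Dense kernels and biases
print("Trainable weight tensors (feature extractor):", n_extractor)
print("Trainable weight tensors (fine-tuning):", n_fine_tuned)
```

With a real pretrained network, the same `trainable` flag can also be set on individual layers to fine-tune only part of the model.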

#Pretrained models in Keras

To import a pretrained VGG model with ImageNet weights in Keras,
you can use the tf.keras.applications module, which provides several architectures together with their pretrained weights.
Here's how you can import a VGG model with ImageNet weights:

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense, ReLU, Dropout

num_classes = 10

# Import the VGG16 model with ImageNet weights
vgg_model = tf.keras.applications.VGG16(
    weights='imagenet',  # Use ImageNet pre-trained weights
    include_top=False,     # Exclude the fully connected layers (FC classification layer) when used on new data.
    input_shape=(224, 224, 3)  # Specify the input shape of your data
)

inputs = Input(shape=(224, 224, 3))  # `inputs` avoids shadowing Python's built-in input()
x = vgg_model(inputs)
x = Flatten()(x)
x = Dense(4096)(x)
x = ReLU()(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=inputs, outputs=out)

# Print a summary of the model architecture
model.summary()


#And in PyTorch
And here is how it is done in PyTorch using the torchvision library:

In [115]:
import torch
import torchvision.models as models
import torch.nn as nn

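# Import the VGG16 model with ImageNet weights
# (the `weights=` argument replaces the deprecated `pretrained=True` in torchvision >= 0.13)
vgg_model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
# Drop the original classifier; this keeps the conv features and the 7x7 adaptive average pooling
vgg_model = nn.Sequential(*list(vgg_model.children())[:-1])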

# Define a new fully connected layer with the number of classes in your new dataset
num_classes = 10  # Replace with the number of classes in your new dataset
classifier = nn.Sequential(
    nn.Linear(512 * 7 * 7, 4096),  # the truncated backbone outputs 512 x 7 x 7 features per image
    nn.ReLU(inplace=True),
    nn.Dropout(),
    nn.Linear(4096, num_classes)  # Output layer with the new number of classes
)

# Combine the pre-trained VGG model and the new classifier
model = nn.Sequential(
    vgg_model,
    nn.Flatten(),
    classifier
)

# Print a summary of the model architecture
print(model)

Sequential(
  (0): Sequential(
    (0): Sequential(
      (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (6): ReLU(inplace=True)
      (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (8): ReLU(inplace=True)
      (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (11): ReLU(inplace=True)
      (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (13): ReLU(inplace=True)
      (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (15): ReLU(inplace=True)
      (16): M
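
When sizing the first `nn.Linear` after `nn.Flatten()`, the number of input features depends on what the truncated backbone outputs. torchvision's VGG16 applies `nn.AdaptiveAvgPool2d((7, 7))` after its convolutional part, so the flattened size is 512 * 7 * 7 regardless of the image size. The sketch below checks this without loading the full network; the random tensor is an assumed stand-in for the convolutional output on a 32x32 image.

```python
import torch
import torch.nn as nn

# Stand-in for the output of VGG16's convolutional part on one 32x32 image:
# five 2x2 max-pools reduce 32x32 to 1x1, with 512 channels.
conv_features = torch.randn(1, 512, 1, 1)

# The adaptive pooling layer always produces a 7x7 spatial map
pooled = nn.AdaptiveAvgPool2d((7, 7))(conv_features)
flat = nn.Flatten()(pooled)

in_features = flat.shape[1]
head = nn.Linear(in_features, 4096)  # first layer of a new classifier
out = head(flat)

print("Pooled shape:", tuple(pooled.shape))
print("Flattened features:", in_features)
print("Classifier output shape:", tuple(out.shape))
```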

##Note on input shapes

Be aware that the models obtained from keras.applications and torchvision have been trained on images of a specific shape. The VGG network shown in this example, for instance, was trained on 224x224 crops of ImageNet images. Once the original FC classification layer (and any other layers that expect specific shapes) has been removed, the model is in principle agnostic to the input shape. However, because of the spatial downsampling that happens inside the network, there is a lower bound on the input shape.

(There is also an upper bound, but this is generally determined by the amount of memory available.)
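
As a rough sketch of where the lower bound comes from, the helper below (a hypothetical function, not part of Keras) tracks the spatial size through VGG16's five pooling stages. It assumes 'same'-padded convolutions, which keep the size unchanged, and 2x2 max-pooling with stride 2 and no padding, which halves the size and rounds down.

```python
def vgg16_final_feature_size(input_size, num_pools=5):
    """Spatial size of the feature map after `num_pools` 2x2 pooling stages."""
    size = input_size
    for _ in range(num_pools):
        size //= 2  # each pooling stage halves the size, rounding down
    return size


for side in (224, 128, 32, 28):
    print(side, "->", vgg16_final_feature_size(side))
# 224 -> 7
# 128 -> 4
# 32 -> 1
# 28 -> 0
```

A 28-pixel input shrinks to size 0 at the last pooling stage, which suggests that 32 (that is, 2**5) is the smallest input side the full network can handle.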

So, for instance, we can load the VGG network with different input shapes. But if the initial shape is too small, the operations on feature maps inside the network will fail at some point.

In [None]:
import tensorflow as tf

num_classes = 10

# Import the VGG16 model with ImageNet weights
vgg_model = tf.keras.applications.VGG16(
    weights='imagenet',  # Use ImageNet pre-trained weights
    include_top=False,     # Exclude the fully connected layers (FC classification layer) when used on new data.
    input_shape=(128, 128, 3)  # Specify the input shape of your data
)

# Print a summary of the model architecture. In this case, the spatial shape after the final pooling layer is 4x4.
vgg_model.summary()

Now, if we try with an even smaller input size, the call to the application function will fail.


In [None]:

import tensorflow as tf

num_classes = 10

# Import the VGG16 model with ImageNet weights
vgg_model = tf.keras.applications.VGG16(
    weights='imagenet',  # Use ImageNet pre-trained weights
    include_top=False,     # Exclude the fully connected layers (FC classification layer) when used on new data.
    input_shape=(28, 28, 3)  # Specify the input shape of your data
)

# Print a summary of the model architecture
vgg_model.summary()

In the above case, there are basically two options to fix the issue:

1. Resize the input to match the required shape
2. Create a model which is a shallower version of the original model with identical layers and layer names (but new shapes) and load the weights from the pretrained model into the new model. (No example will be shown here, but for Keras you can look at the .get_weights() and .set_weights() methods.)
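
Option 1 could look like the following sketch. It assumes a hypothetical batch of 28x28 RGB images (random values here) and uses `tf.image.resize` to upsample them to 32x32 before they are passed to the network.

```python
import numpy as np
import tensorflow as tf

# Hypothetical batch of four 28x28 RGB images with values in [0, 1]
small_images = np.random.rand(4, 28, 28, 3).astype("float32")

# tf.image.resize uses bilinear interpolation by default
resized = tf.image.resize(small_images, (32, 32))

print("Original shape:", small_images.shape)
print("Resized shape: ", tuple(resized.shape))
```

Upsampling adds no new information to the images; it only makes their shape compatible with the network.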

Luckily, based on the error message above, we see that the minimum required input shape is 32x32x3, which matches the dimensions of the CIFAR-10 dataset. This means that we can fine-tune the VGG model on this dataset.

Example below.

As you will see, now that we are beginning to work with very large models, training time increases significantly, especially when training on a CPU. If you go to "Edit --> Notebook settings", you can select a hardware accelerator for the notebook. Select GPU if available. Try to run the code below both with and without GPU acceleration to appreciate the difference.

(don't finish the training using CPU. It is going to take forever...)

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense, ReLU, Dropout

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import SGD

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Preprocess the data
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0

train_labels = to_categorical(train_labels, 10)
test_labels = to_categorical(test_labels, 10)

num_classes = 10

# Import the VGG16 model with ImageNet weights
vgg_model = tf.keras.applications.VGG16(
    weights='imagenet',  # Use ImageNet pre-trained weights
    include_top=False,     # Exclude the fully connected layers (FC classification layer) when used on new data.
    input_shape=(32, 32, 3)  # Specify the input shape of your data
)

inputs = Input(shape=(32, 32, 3))  # `inputs` avoids shadowing Python's built-in input()
x = vgg_model(inputs)
x = Flatten()(x)
x = Dense(4096)(x)
x = ReLU()(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=inputs, outputs=out)


# Print a summary of the model architecture
model.summary()


# Compile the model
model.compile(optimizer=SGD(learning_rate=0.0001,momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# We train the model with a smaller learning rate than in the previous example.
# This is generally considered good practice, so that the pretrained weights
# are not changed too much in the update stage.

# Create data generators for data augmentation
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Fit the model using data generators
batch_size = 32
epochs = 10

history = model.fit(
    datagen.flow(train_images, train_labels, batch_size=batch_size),
    steps_per_epoch=len(train_images) // batch_size,
    epochs=epochs,
    validation_data=(test_images, test_labels),
    verbose=1
)

# Evaluate the model on test data
test_loss, test_accuracy = model.evaluate(test_images, test_labels, verbose=2)
print(f"Test accuracy: {test_accuracy*100:.2f}%")


##"Subsampling" pretrained networks

When using pretrained networks, it can be tempting to just take the biggest and best model trained on ImageNet and apply it to your own data. But remember that with increasing depth, the learned filters become more and more specialized to the data they were originally trained on. Thus, if you are working with data that is very dissimilar to ImageNet, it may be a good idea to reuse only some of the weights from a pretrained model, since the filters in the shallow layers of a pretrained model tend to be more general.

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense, ReLU, Dropout, GlobalAveragePooling2D

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import SGD


# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Preprocess the data
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0

train_labels = to_categorical(train_labels, 10)
test_labels = to_categorical(test_labels, 10)

# Create data generators for data augmentation
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)


num_classes = 10


# Import the VGG16 model with ImageNet weights
vgg_weights = tf.keras.applications.VGG16(
    weights='imagenet',  # Use ImageNet pre-trained weights
    include_top=False,     # Exclude the fully connected layers (FC classification layer) when used on new data.
    input_shape=(32, 32, 3)  # Specify the input shape of your data
)

vgg_weights.summary()

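# Build a shallower sub-model that reuses the pretrained layers up to layer 10
# (`block3_pool` in the summary printed above)
vgg_sub_in = vgg_weights.input
vgg_sub_out = vgg_weights.layers[10].output
vgg_sub = Model(inputs=vgg_sub_in, outputs=vgg_sub_out)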


vgg_sub.summary()


inputs = Input(shape=(32, 32, 3))  # `inputs` avoids shadowing Python's built-in input()
x = vgg_sub(inputs)
# Global average pooling replaces the Flatten layer used in the model above to save parameters
x = GlobalAveragePooling2D()(x)
x = Dense(4096)(x)
x = ReLU()(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=inputs, outputs=out)

model.summary()

# Compile the model
model.compile(optimizer=SGD(learning_rate=0.0001,momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Fit the model using data generators
batch_size = 32
epochs = 10

history = model.fit(
    datagen.flow(train_images, train_labels, batch_size=batch_size),
    steps_per_epoch=len(train_images) // batch_size,
    epochs=epochs,
    validation_data=(test_images, test_labels),
    verbose=1
)

# Evaluate the model on test data
test_loss, test_accuracy = model.evaluate(test_images, test_labels, verbose=2)
print(f"Test accuracy: {test_accuracy*100:.2f}%")
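
The parameter saving mentioned in the code comment above can be checked with a short calculation. The sketch assumes that the sub-model ends at VGG16's third pooling layer, so that a 32x32 input gives a 4x4 feature map with 256 channels; `dense_params` is a hypothetical helper.

```python
def dense_params(in_features, units):
    """Number of parameters in a fully connected layer: weights plus biases."""
    return in_features * units + units


# Assumed output of the sub-model for a 32x32 input: 4x4 spatial map, 256 channels
height, width, channels = 4, 4, 256

flatten_params = dense_params(height * width * channels, 4096)  # Flatten -> Dense(4096)
gap_params = dense_params(channels, 4096)                       # GlobalAveragePooling2D -> Dense(4096)

print(f"Flatten + Dense(4096):                {flatten_params:,} parameters")
print(f"GlobalAveragePooling2D + Dense(4096): {gap_params:,} parameters")
# Flatten + Dense(4096):                16,781,312 parameters
# GlobalAveragePooling2D + Dense(4096): 1,052,672 parameters
```

Averaging over the 4x4 spatial positions reduces the size of the first dense layer by roughly a factor of 16.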




You can read much more about Keras and PyTorch online, and both frameworks have a lot of good tutorials on their respective websites:

- Keras: https://keras.io/
- PyTorch: https://pytorch.org/tutorials/
