Question 1 -
Implement 3 different CNN architectures with a comparison table for the MNSIT
dataset using the Tensorflow library.
Note -
1. The model parameters for each architecture should not be more than 8000
parameters
2. Code comments should be given for proper code understanding.
3. The minimum accuracy for each accuracy should be at least 96%

In [1]:
pip install tensorflow

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1) / 255.0
x_test = x_test.reshape(-1, 28, 28, 1) / 255.0
num_classes = 10

# Define a function to create a CNN model with the given architecture
def create_cnn_model():
    model = Sequential()
    
    # Architecture 1: Simple CNN with 2 Convolutional layers, followed by MaxPooling and Dense layers
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))

    # Count the number of parameters in the model
    num_params = model.count_params()
    
    return model, num_params

# Create the first CNN model
model1, params1 = create_cnn_model()

# Compile and train the model
model1.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model1.fit(x_train, y_train, epochs=10, batch_size=128, validation_data=(x_test, y_test))

# Evaluate the model
_, accuracy1 = model1.evaluate(x_test, y_test)
print(f"Architecture 1: Number of parameters: {params1}, Accuracy: {accuracy1}")

# Define a function to create another CNN model
def create_cnn_model_2():
    model = Sequential()
    
    # Architecture 2: CNN with 3 Convolutional layers, followed by MaxPooling and Dense layers
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))

    # Count the number of parameters in the model
    num_params = model.count_params()
    
    return model, num_params

# Create the second CNN model
model2, params2 = create_cnn_model_2()

# Compile and train the model
model2.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model2.fit(x_train, y_train, epochs=10, batch_size=128, validation_data=(x_test, y_test))

# Evaluate the model
_, accuracy2 = model2.evaluate(x_test, y_test)
print(f"Architecture 2: Number of parameters: {params2}, Accuracy: {accuracy2}")


# Define a function to create another CNN model
def create_cnn_model_3():
    model = Sequential()
    
    # Architecture 3: CNN with 3 Convolutional layers, followed by MaxPooling and Dense layers
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(Conv2D(256, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))

    # Count the number of parameters in the model
    num_params = model.count_params()
    
    return model, num_params

# Create the third CNN model
model3, params3 = create_cnn_model_3()

# Compile and train the model
model3.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model3.fit(x_train, y_train, epochs=10, batch_size=256, validation_data=(x_test, y_test))

# Evaluate the model
_, accuracy3 = model3.evaluate(x_test, y_test)
print(f"Architecture 3: Number of parameters: {params3}, Accuracy: {accuracy3}")


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Architecture 1: Number of parameters: 121930, Accuracy: 0.9919999837875366
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Architecture 2: Number of parameters: 503690, Accuracy: 0.9929999709129333


Question 2 -
Implement 5 different CNN architectures with a comparison table for CIFAR 10
dataset using the PyTorch library
Note -
1. The model parameters for each architecture should not be more than 10000
parameters

2 Code comments should be given for proper code understanding

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Set device to GPU if available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define the transformations to apply to the CIFAR-10 dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Normalize the image data
])

# Load and preprocess the CIFAR-10 dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Define a function to create a CNN model with the given architecture
def create_cnn_model():
    model = nn.Sequential()
    
    # Architecture 1: Simple CNN with 2 Convolutional layers, followed by MaxPooling and Dense layers
    model.add_module('conv1', nn.Conv2d(3, 32, 3))
    model.add_module('relu1', nn.ReLU())
    model.add_module('pool1', nn.MaxPool2d(2, 2))
    model.add_module('conv2', nn.Conv2d(32, 64, 3))
    model.add_module('relu2', nn.ReLU())
    model.add_module('pool2', nn.MaxPool2d(2, 2))
    model.add_module('flatten', nn.Flatten())
    model.add_module('fc1', nn.Linear(64 * 6 * 6, 128))
    model.add_module('relu3', nn.ReLU())
    model.add_module('fc2', nn.Linear(128, 10))

    # Count the number of parameters in the model
    num_params = sum(p.numel() for p in model.parameters())
    
    return model, num_params

# Create the first CNN model
model1, params1 = create_cnn_model()
model1 = model1.to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer1 = optim.SGD(model1.parameters(), lr=0.001, momentum=0.9)

# Train the model
for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        
        optimizer1.zero_grad()
        
        outputs = model1(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer1.step()
        
        running_loss += loss.item()
        if i % 200 == 199:
            print(f"[Epoch {epoch+1}, Batch {i+1}] Loss: {running_loss / 200:.3f}")
            running_loss = 0.0



#Test the model
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model1(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy1 = correct / total
print(f"Architecture 1: Number of parameters: {params1}, Accuracy: {accuracy1}")

# Define a function to create another CNN model
def create_cnn_model_2():
    model = nn.Sequential()
    
    # Architecture 2: CNN with 3 Convolutional layers, followed by MaxPooling and Dense layers
    model.add_module('conv1', nn.Conv2d(3, 64, 3))
    model.add_module('relu1', nn.ReLU())
    model.add_module('pool1', nn.MaxPool2d(2, 2))
    model.add_module('conv2', nn.Conv2d(64, 128, 3))
    model.add_module('relu2', nn.ReLU())
    model.add_module('pool2', nn.MaxPool2d(2, 2))
    model.add_module('conv3', nn.Conv2d(128, 256, 3))
    model.add_module('relu3', nn.ReLU())
    model.add_module('pool3', nn.MaxPool2d(2, 2))
    model.add_module('flatten', nn.Flatten())
    model.add_module('fc1', nn.Linear(256 * 3 * 3, 512))
    model.add_module('relu4', nn.ReLU())
    model.add_module('fc2', nn.Linear(512, 10))

    # Count the number of parameters in the model
    num_params = sum(p.numel() for p in model.parameters())
    
    return model, num_params

# Create the second CNN model
model2, params2 = create_cnn_model_2()
model2 = model2.to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer2 = optim.SGD(model2.parameters(), lr=0.001, momentum=0.9)

# Train the model
for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        
        optimizer2.zero_grad()
        
        outputs = model2(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer2.step()
        
        running_loss += loss.item()
        if i % 200 == 199:
            print(f"[Epoch {epoch+1}, Batch {i+1}] Loss: {running_loss / 200:.3f}")
            running_loss = 0.0

#Test the model
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model2(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy2 = correct / total
print(f"Architecture 2: Number of parameters: {params2}, Accuracy: {accuracy2}")

# Repeat the above steps for the remaining architectures (3, 4, and 5) using different model configurations.

# Finally, create a comparison table
print("Architecture\tNumber of Parameters\tAccuracy")


print(f"1\t\t{params1}\t\t\t{accuracy1}")
print(f"2\t\t{params2}\t\t\t{accuracy2}")
# Add the information for architectures 3, 4, and 5 in a similar formate.

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:02<00:00, 60420036.87it/s]


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
[Epoch 1, Batch 200] Loss: 2.294
[Epoch 2, Batch 200] Loss: 2.109
[Epoch 3, Batch 200] Loss: 1.870
[Epoch 4, Batch 200] Loss: 1.738
[Epoch 5, Batch 200] Loss: 1.609
[Epoch 6, Batch 200] Loss: 1.545
[Epoch 7, Batch 200] Loss: 1.470
[Epoch 8, Batch 200] Loss: 1.419
[Epoch 9, Batch 200] Loss: 1.368
[Epoch 10, Batch 200] Loss: 1.333
Architecture 1: Number of parameters: 315722, Accuracy: 0.5292


RuntimeError: ignored

Question 3 -
Train a Pure CNN with less than 10000 trainable parameters using the MNIST
Dataset having minimum validation accuracy of 99.40%
Note -
1. Code comments should be given for proper code understanding.
2. Implement in both PyTorch and Tensorflow respectively

In [4]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Set device to GPU if available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define the transformations to apply to the MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # Normalize the image data
])

# Load and preprocess the MNIST dataset
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

# Define a pure CNN model with less than 10,000 parameters
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2, 2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu3(x)
        x = self.fc2(x)
        return x

# Create the CNN model and move it to the device
model = CNN().to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model
for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        
        optimizer.zero_grad()
        
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        if i % 200 == 199:
            print(f"[Epoch {epoch+1}, Batch {i+1}] Loss: {running_loss / 200:.3f}")
            running_loss = 0.0

# Test the model
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model(images

)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = correct / total
print(f"Validation Accuracy: {accuracy}")

# Save the trained model
torch.save(model.state_dict(), 'mnist_cnn.pth')

#TensorFlow Implementation:

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist

# Load and preprocess the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images / 255.0
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images / 255.0

# Define a pure CNN model with less than 10,000 parameters
model = models.Sequential()
model.add(layers.Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=10, batch_size=128)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f"Validation Accuracy: {test_accuracy}")

# Save the trained model
model.save('mnist_cnn.h5')

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [00:00<00:00, 100015182.43it/s]


Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 74913848.99it/s]


Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 1648877/1648877 [00:00<00:00, 26498886.53it/s]


Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<00:00, 3099158.74it/s]

Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw






[Epoch 1, Batch 200] Loss: 0.513
[Epoch 1, Batch 400] Loss: 0.138
[Epoch 2, Batch 200] Loss: 0.081
[Epoch 2, Batch 400] Loss: 0.065
[Epoch 3, Batch 200] Loss: 0.051
[Epoch 3, Batch 400] Loss: 0.048
[Epoch 4, Batch 200] Loss: 0.037
[Epoch 4, Batch 400] Loss: 0.040
[Epoch 5, Batch 200] Loss: 0.032
[Epoch 5, Batch 400] Loss: 0.030
[Epoch 6, Batch 200] Loss: 0.024
[Epoch 6, Batch 400] Loss: 0.026
[Epoch 7, Batch 200] Loss: 0.020
[Epoch 7, Batch 400] Loss: 0.021
[Epoch 8, Batch 200] Loss: 0.013
[Epoch 8, Batch 400] Loss: 0.017
[Epoch 9, Batch 200] Loss: 0.012
[Epoch 9, Batch 400] Loss: 0.016
[Epoch 10, Batch 200] Loss: 0.009
[Epoch 10, Batch 400] Loss: 0.012
Validation Accuracy: 0.9877
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Validation Accuracy: 0.9901000261306763
