In [1]:
# 1. Explain the architecture of LeNet-5 and its significance in the field of deep learning.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

def build_lenet5():
    model = Sequential()

    # Layer 1: Convolutional Layer
    model.add(Conv2D(6, (5, 5), activation='tanh', input_shape=(32, 32, 1)))

    # Layer 2: Subsampling (Pooling) Layer
    model.add(AveragePooling2D(pool_size=(2, 2)))

    # Layer 3: Convolutional Layer
    model.add(Conv2D(16, (5, 5), activation='tanh'))

    # Layer 4: Subsampling (Pooling) Layer
    model.add(AveragePooling2D(pool_size=(2, 2)))

    # Layer 5: Fully Connected Layer
    model.add(Flatten())
    model.add(Dense(120, activation='tanh'))

    # Layer 6: Fully Connected Layer
    model.add(Dense(84, activation='tanh'))

    # Layer 7: Output Layer
    model.add(Dense(10, activation='softmax'))

    return model

# Build and compile the model
lenet5_model = build_lenet5()
lenet5_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Summary of the model architecture
lenet5_model.summary()




**LeNet-5 Architecture and Its Significance**

LeNet-5 is one of the earliest and most influential Convolutional Neural Network (CNN) architectures, proposed by Yann LeCun and colleagues in 1998. It was designed for handwritten digit recognition (as in the MNIST dataset) and laid the foundation for many modern deep learning architectures.

**Architecture Overview:**

LeNet-5 consists of the following layers:

1. Input Layer: 32x32 grayscale images.

2. Convolutional Layer 1 (C1): Applies 6 filters (size 5x5) with a stride of 1, producing 6 feature maps of size 28x28.

3. Subsampling (Pooling) Layer 1 (S2): A 2x2 average pooling layer that reduces each feature map to 14x14.

4. Convolutional Layer 2 (C3): Applies 16 filters (size 5x5), producing 16 feature maps of size 10x10.

5. Subsampling (Pooling) Layer 2 (S4): Another 2x2 average pooling layer that reduces each feature map to 5x5.

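6. Fully Connected Layer 1 (C5): A layer of 120 neurons, each connected to all 16 feature maps of size 5x5 (400 inputs). The original paper implements it as a 5x5 convolution with 1x1 outputs, which is equivalent to a dense layer here.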

7. Fully Connected Layer 2 (F6): A dense layer with 84 neurons.

8. Output Layer: 10 neurons for classification (digits 0-9).
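The feature-map sizes in the list above follow from the usual output-size rule for an unpadded window, `out = (in - kernel) // stride + 1`. The following is a minimal sketch of that arithmetic, assuming no padding anywhere; the helper name `out_size` is illustrative and not part of Keras.

```python
def out_size(in_size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling window."""
    return (in_size - kernel) // stride + 1

# (layer name, kernel size, stride) for the spatial layers of LeNet-5
layers = [
    ("C1", 5, 1),  # 5x5 convolution
    ("S2", 2, 2),  # 2x2 average pooling
    ("C3", 5, 1),  # 5x5 convolution
    ("S4", 2, 2),  # 2x2 average pooling
]

size = 32  # the input image is 32x32
sizes = {}
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    sizes[name] = size
    print(f"{name}: {size}x{size}")
```

The loop reproduces the 28, 14, 10, 5 progression given in the overview.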

**Significance:**

 - Early CNN Development: LeNet-5 demonstrated the effectiveness of convolutional layers for image recognition, especially for tasks like digit classification.

 - Pooling Layers: Showed how down-sampling (pooling) layers can reduce spatial resolution while retaining important features.

 - Foundation for Modern CNNs: Inspired many modern architectures like AlexNet, VGG, and ResNet.
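To make the down-sampling step concrete, here is a small pure-Python sketch of 2x2 average pooling with stride 2. It mirrors what a plain average-pooling layer computes on one feature map; the function name and the toy values are made up for illustration, and the original LeNet-5 subsampling additionally applies a trainable scale and bias, which this sketch omits.

```python
def average_pool_2x2(feature_map):
    """2x2 average pooling with stride 2 on a list-of-lists feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            # Sum the four values in the current 2x2 window
            window_sum = (feature_map[r][c] + feature_map[r][c + 1]
                          + feature_map[r + 1][c] + feature_map[r + 1][c + 1])
            pooled_row.append(window_sum / 4.0)
        pooled.append(pooled_row)
    return pooled

# Toy 4x4 feature map (illustrative values only)
fmap = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 0, 8, 8],
    [4, 4, 8, 8],
]
pooled = average_pool_2x2(fmap)
print(pooled)  # [[4.0, 2.0], [2.0, 8.0]]
```

Each output value summarizes a 2x2 region, so a 4x4 map becomes 2x2, just as 28x28 becomes 14x14 in S2.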

In [2]:
# 2. Describe the key components of LeNet-5 and their roles in the network.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

def build_lenet5():
    model = Sequential()

    # Layer 1: Convolutional Layer
    model.add(Conv2D(6, (5, 5), activation='tanh', input_shape=(32, 32, 1)))

    # Layer 2: Subsampling (Pooling) Layer
    model.add(AveragePooling2D(pool_size=(2, 2)))

    # Layer 3: Convolutional Layer
    model.add(Conv2D(16, (5, 5), activation='tanh'))

    # Layer 4: Subsampling (Pooling) Layer
    model.add(AveragePooling2D(pool_size=(2, 2)))

    # Layer 5: Fully Connected Layer
    model.add(Flatten())
    model.add(Dense(120, activation='tanh'))

    # Layer 6: Fully Connected Layer
    model.add(Dense(84, activation='tanh'))

    # Layer 7: Output Layer
    model.add(Dense(10, activation='softmax'))

    return model

# Build and compile the model
lenet5_model = build_lenet5()
lenet5_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Summary of the model architecture
lenet5_model.summary()


**Key Components of LeNet-5 and Their Roles**

1. Input Layer:

  - Role: Takes in input data, typically 32x32 grayscale images.

  - Function: This layer doesn't process the data but serves as the entry point for the image into the network.

2. Convolutional Layer 1 (C1):

  - Role: Applies 6 convolutional filters (size 5x5) to the input image to extract basic features like edges, textures, etc.

  - Function: Produces 6 feature maps of size 28x28. This is the first step in feature extraction.

3. Subsampling (Pooling) Layer 1 (S2):

  - Role: Reduces the dimensionality of the feature maps by applying average pooling.

  - Function: The pooling layer reduces the spatial dimensions of the feature maps (from 28x28 to 14x14), which helps to reduce computational complexity while retaining important features.

4. Convolutional Layer 2 (C3):

  - Role: Applies 16 convolutional filters (size 5x5) to the pooled feature maps from S2.

  - Function: Generates 16 feature maps of size 10x10, capturing more complex patterns such as shapes and textures.

5. Subsampling (Pooling) Layer 2 (S4):

  - Role: Another average pooling layer that reduces the dimensionality again (from 10x10 to 5x5).

  - Function: Further reduces the complexity while preserving the features that are important for classification.

6. Fully Connected Layer 1 (C5):

  - Role: A fully connected layer with 120 neurons that takes the 5x5x16 feature maps, flattened into a vector of 400 values, and maps them to 120 outputs.

  - Function: This layer learns higher-level features and makes connections between the extracted features for classification.

7. Fully Connected Layer 2 (F6):

  - Role: Another fully connected layer with 84 neurons.

  - Function: Refines the output from C5 and prepares it for the final output layer.

8. Output Layer:

  - Role: The final layer with 10 neurons, each representing a digit from 0 to 9.

  - Function: Outputs the predicted class based on the features learned by the network. Uses the softmax activation function to produce a probability distribution over the 10 classes.
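As a rough check on how small this network is, the trainable parameters of each component can be counted by hand. The sketch below assumes the Keras-style layout used in this notebook, where C3 connects to all 6 input maps and the pooling layers have no trainable parameters; the original paper connects C3 to subsets of the S2 maps, so its count differs. The helper names are illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Weights of a kernel x kernel convolution plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights of a dense layer plus one bias per output unit."""
    return (in_units + 1) * out_units

params = {
    "C1": conv_params(5, 1, 6),            # 5x5 filters over 1 input channel
    "C3": conv_params(5, 6, 16),           # 5x5 filters over 6 input maps
    "C5": dense_params(5 * 5 * 16, 120),   # 400 flattened inputs -> 120
    "F6": dense_params(120, 84),
    "Output": dense_params(84, 10),
}
total = sum(params.values())

for name, count in params.items():
    print(f"{name}: {count}")
print("Total:", total)
```

Under these assumptions most of the parameters sit in C5, the first fully connected layer, rather than in the convolutional layers.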

In [3]:
# 3. Discuss the limitations of LeNet-5 and how subsequent architectures like AlexNet addressed these limitations.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def build_alexnet():
    model = Sequential()

    # Convolutional Layer 1
    model.add(Conv2D(96, (11, 11), strides=4, activation='relu', input_shape=(224, 224, 3)))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Convolutional Layer 2
    # padding='same' keeps the spatial size; AlexNet pads conv layers 2-5
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Convolutional Layer 3
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Convolutional Layer 4
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Convolutional Layer 5
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Flatten the output and fully connected layers
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))  # 1000 classes for ImageNet

    return model

# Build and compile the model
alexnet_model = build_alexnet()
alexnet_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Summary of the model architecture
alexnet_model.summary()


**Limitations of LeNet-5 and How AlexNet Addressed Them**

1. Limited Capacity for Complex Data:

   - LeNet-5 was designed for simpler datasets like MNIST (28x28 grayscale images). As a result, its architecture struggles with more complex datasets like ImageNet that have high-resolution images (224x224) and require deeper networks.

   - Solution in AlexNet: AlexNet introduced a deeper architecture with more convolutional layers and the ability to handle high-resolution images. It significantly improved the network's capacity to learn complex features from larger, more diverse datasets.

2. Shallow Architecture:

   - LeNet-5 has only a few trainable layers (2 convolutional layers and 3 fully connected layers, counting the output layer). This shallow architecture limits its ability to capture hierarchical features in large datasets.

   - Solution in AlexNet: AlexNet deepened the network by introducing 5 convolutional layers and 3 fully connected layers, allowing the model to capture more abstract and complex features at various levels of granularity.

3. Computational Constraints:

   - LeNet-5 was designed for the limited hardware of the 1990s, which kept both the network and its training datasets small.

   - Solution in AlexNet: AlexNet was designed with GPU utilization in mind. It was trained on two GPUs, allowing it to handle larger datasets and train faster, overcoming the hardware constraints that LeNet-5 faced.

4. Limited Receptive Field:

   - LeNet-5 had a relatively small receptive field (input region considered by a convolutional filter), which constrained its ability to capture global features in larger images.

   - Solution in AlexNet: AlexNet used larger kernel sizes (11x11 in the first convolutional layer) to increase the receptive field and capture more global features early in the network.

5. Activation Function:

   - LeNet-5 used the tanh activation function, which saturates for large inputs and can cause vanishing gradients when training deep networks.

   - Solution in AlexNet: AlexNet used the ReLU (Rectified Linear Unit) activation function, which mitigates the vanishing gradient problem and speeds up training because its gradient does not shrink for positive inputs.

6. Overfitting:

   - LeNet-5 lacked regularization techniques for handling overfitting in deeper networks.

   - Solution in AlexNet: AlexNet used Dropout as a regularization method in the fully connected layers to prevent overfitting by randomly dropping units during training.
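The activation-function point (item 5) can be illustrated numerically. The sketch below compares the derivative of tanh with the derivative of ReLU and then multiplies each across several layers. It is a deliberate simplification: it assumes every layer sees the same pre-activation value and ignores the weights, so it only shows the trend, not real training behaviour.

```python
import math

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

for x in (0.5, 2.0, 5.0):
    print(f"x={x}: tanh'={tanh_grad(x):.6f}, relu'={relu_grad(x):.1f}")

# Simplified chain rule: the same local gradient repeated over `depth` layers
depth = 8
tanh_chain = tanh_grad(2.0) ** depth
relu_chain = relu_grad(2.0) ** depth
print(f"tanh chain over {depth} layers: {tanh_chain:.3e}")
print(f"relu chain over {depth} layers: {relu_chain:.3e}")
```

For positive inputs the ReLU factor stays at 1, while the tanh factor shrinks quickly once the unit saturates.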

In [4]:
# 4. Explain the architecture of AlexNet and its contributions to the advancement of deep learning.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def build_alexnet():
    model = Sequential()

    # Convolutional Layer 1
    model.add(Conv2D(96, (11, 11), strides=4, activation='relu', input_shape=(224, 224, 3)))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Convolutional Layer 2
    # padding='same' keeps the spatial size; AlexNet pads conv layers 2-5
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Convolutional Layer 3
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Convolutional Layer 4
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Convolutional Layer 5
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Flatten the output and fully connected layers
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))  # 1000 classes for ImageNet

    return model

# Build and compile the model
alexnet_model = build_alexnet()
alexnet_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Summary of the model architecture
alexnet_model.summary()


**Architecture of AlexNet and Its Contributions**

AlexNet Architecture: AlexNet is a deep convolutional neural network (CNN) that revolutionized deep learning by achieving exceptional performance on the ImageNet classification challenge in 2012. It consists of the following key layers:

1. Input Layer: Takes 224x224 RGB images (3 channels) as input.

2. Convolutional Layers:
  - Layer 1: Convolution with 96 filters of size 11x11, stride of 4, followed by a max-pooling layer.

  - Layer 2: Convolution with 256 filters of size 5x5, followed by a max-pooling layer.

  - Layer 3: Convolution with 384 filters of size 3x3.

  - Layer 4: Convolution with 384 filters of size 3x3.

  - Layer 5: Convolution with 256 filters of size 3x3, followed by a max-pooling layer.

3. Fully Connected Layers:
  
  - Layer 6: A dense layer with 4096 units and ReLU activation.

  - Layer 7: A dense layer with 4096 units and ReLU activation.

  - Layer 8: The output layer with 1000 units (one for each class in ImageNet) and softmax activation.
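The layer list does not state how the convolutions are padded, and that choice changes the size of the feature map that reaches the fully connected layers. The following sketch traces the spatial size for a 224x224 input under two assumptions: no padding on any layer, and 'same' padding on convolutional layers 2 to 5. The size rules follow the Keras conventions for `'valid'` and `'same'`; the helper names are illustrative.

```python
import math

def out_size(in_size, kernel, stride, padding):
    """Spatial output size under Keras-style 'valid' or 'same' padding."""
    if padding == "same":
        return math.ceil(in_size / stride)
    return (in_size - kernel) // stride + 1

def trace(in_size, conv_padding):
    """Trace the spatial size through the conv and pooling layers listed above.

    conv_padding is applied to conv layers 2-5; conv 1 and all pooling
    layers are assumed to be unpadded ('valid').
    """
    layers = [
        ("conv1", 11, 4, "valid"),
        ("pool1", 3, 2, "valid"),
        ("conv2", 5, 1, conv_padding),
        ("pool2", 3, 2, "valid"),
        ("conv3", 3, 1, conv_padding),
        ("conv4", 3, 1, conv_padding),
        ("conv5", 3, 1, conv_padding),
        ("pool5", 3, 2, "valid"),
    ]
    size = in_size
    sizes = []
    for name, kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes

valid_trace = trace(224, "valid")
same_trace = trace(224, "same")
print("valid:", valid_trace)
print("same: ", same_trace)
```

Without padding the final map collapses to 1x1 (256 values after flattening), while 'same' padding on layers 2 to 5 leaves a 5x5 map (6400 values) for the dense layers to work with.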

**Key Contributions to Deep Learning:**

1. Deep Architecture: AlexNet demonstrated that deeper architectures with more convolutional layers could learn more abstract and complex features, outperforming shallow networks.

2. ReLU Activation: AlexNet popularized the use of the ReLU activation function, which helped mitigate the vanishing gradient problem, allowing deeper networks to train effectively.

3. GPU Utilization: AlexNet leveraged GPU parallelization for training, reducing training time and enabling the training of deeper models on large datasets like ImageNet.

4. Dropout: Applied Dropout as a regularization technique in the fully connected layers to prevent overfitting by randomly disabling neurons during training, which helped popularize it.

5. Large Datasets: AlexNet showed the importance of large-scale labeled datasets for training deep networks, helping to popularize the use of such datasets.
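The dropout idea in item 4 can be sketched in a few lines. This is a minimal pure-Python illustration of "inverted" dropout, the variant in which surviving activations are scaled up during training so that nothing needs to change at inference time. The function name, the seed, and the activation values are assumptions made for the example.

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`,
    and scale the survivors by 1 / (1 - rate)."""
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [0.2, 1.5, 0.7, 3.0, 0.9, 2.4]  # illustrative values
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)
```

With `rate=0.5`, as in the models above, each unit is either zeroed or doubled on a given training step, so the network cannot rely on any single unit being present.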

In [9]:
# 5. Compare and contrast the architectures of LeNet-5 and AlexNet.
# Discuss their similarities, differences, and respective contributions to the field of deep learning.

# Example for LeNet-5 with MNIST dataset:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

def build_lenet5():
    model = Sequential()

    # Convolutional Layer 1
    # Note: this variant uses max pooling; the original LeNet-5 uses average pooling
    model.add(Conv2D(6, (5, 5), activation='tanh', input_shape=(32, 32, 1)))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    # Convolutional Layer 2
    model.add(Conv2D(16, (5, 5), activation='tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    # Flatten and fully connected layers
    model.add(Flatten())
    model.add(Dense(120, activation='tanh'))
    model.add(Dense(84, activation='tanh'))
    model.add(Dense(10, activation='softmax'))  # 10 classes for MNIST

    return model

# Load and preprocess MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Resize images to 32x32 and normalize
x_train = tf.image.resize(x_train[..., tf.newaxis], (32, 32)).numpy().astype('float32') / 255.0
x_test = tf.image.resize(x_test[..., tf.newaxis], (32, 32)).numpy().astype('float32') / 255.0

y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build and compile the model
lenet5_model = build_lenet5()
lenet5_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
lenet5_model.fit(x_train, y_train, epochs=5, batch_size=64, validation_data=(x_test, y_test))

# Evaluate the model
test_loss, test_accuracy = lenet5_model.evaluate(x_test, y_test)
print("Test accuracy:", test_accuracy)



Epoch 1/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 6s 4ms/step - accuracy: 0.8859 - loss: 0.4065 - val_accuracy: 0.9713 - val_loss: 0.0885
Epoch 2/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.9805 - loss: 0.0625 - val_accuracy: 0.9819 - val_loss: 0.0583
Epoch 3/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.9868 - loss: 0.0409 - val_accuracy: 0.9870 - val_loss: 0.0416
Epoch 4/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.9915 - loss: 0.0284 - val_accuracy: 0.9870 - val_loss: 0.0402
Epoch 5/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - accuracy: 0.9932 - loss: 0.0238 - val_accuracy: 0.9883 - val_loss: 0.0378
313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9870 - loss: 0.0441
Test accuracy: 0.9883000254631042


In [6]:
# Example for AlexNet with CIFAR-10 dataset:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

def build_alexnet():
    model = Sequential()

    # Convolutional Layer 1
    model.add(Conv2D(96, (11, 11), strides=1, activation='relu', input_shape=(32, 32, 3)))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Convolutional Layer 2
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Convolutional Layer 3
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Convolutional Layer 4
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Convolutional Layer 5
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

    # Flatten and Fully Connected Layers
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation='softmax'))  # 10 classes for CIFAR-10

    return model

# Load and preprocess CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Normalize and preprocess the dataset
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build and compile the AlexNet model
alexnet_model = build_alexnet()
alexnet_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
alexnet_model.fit(x_train, y_train, epochs=5, batch_size=64, validation_data=(x_test, y_test))

# Evaluate the model
test_loss, test_accuracy = alexnet_model.evaluate(x_test, y_test)
print("Test accuracy:", test_accuracy)

Epoch 1/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 22s 24ms/step - accuracy: 0.1009 - loss: 2.3054 - val_accuracy: 0.1000 - val_loss: 2.3026
Epoch 2/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 36s 20ms/step - accuracy: 0.1020 - loss: 2.3027 - val_accuracy: 0.1000 - val_loss: 2.3027
Epoch 3/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 20s 20ms/step - accuracy: 0.0974 - loss: 2.3027 - val_accuracy: 0.1000 - val_loss: 2.3026
Epoch 4/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 20s 19ms/step - accuracy: 0.0980 - loss: 2.3027 - val_accuracy: 0.1000 - val_loss: 2.3026
Epoch 5/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 19ms/step - accuracy: 0.0988 - loss: 2.3027 - val_accuracy: 0.1000 - val_loss: 2.3026
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.0988 - loss: 2.3026
Test accuracy: 0.10000000149011612
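
The loss of about 2.3026 and the accuracy of 0.10 in the log above are the values a 10-class classifier produces when it assigns equal probability to every class, which suggests that this run did not learn beyond chance level. The short check below computes the cross-entropy of such a uniform prediction; it is only an arithmetic illustration and does not diagnose why training stalled.

```python
import math

num_classes = 10
uniform = [1.0 / num_classes] * num_classes  # every class gets probability 0.1

def cross_entropy(probs, true_class):
    """Categorical cross-entropy for one example with a one-hot label."""
    return -math.log(probs[true_class])

# The result is the same whichever class is the true one
chance_loss = cross_entropy(uniform, true_class=3)
print(f"chance-level loss: {chance_loss:.4f}")
```

The result equals ln(10), which matches the logged loss to four decimal places.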


**Comparison of LeNet-5 and AlexNet**

LeNet-5 and AlexNet are both pioneering convolutional neural networks (CNNs) that contributed to the advancement of deep learning, particularly in the field of image recognition. Here’s a brief comparison:

**Similarities:**

1. Convolutional Layers: Both LeNet-5 and AlexNet use multiple convolutional layers to extract hierarchical features from the input image.

2. Pooling Layers: Both networks use pooling layers to downsample feature maps, reducing dimensionality and computational cost. LeNet-5 uses average pooling (subsampling), while AlexNet uses max-pooling.

3. Fully Connected Layers: After several convolutional and pooling layers, both networks use fully connected layers for classification.

**Differences:**

1. Depth of Architecture:

  - LeNet-5: Has 7 layers in total, including convolutional, pooling, and fully connected layers, designed for simpler datasets like MNIST.

  - AlexNet: Has 8 layers (5 convolutional and 3 fully connected layers) and is much deeper, designed to handle complex datasets like ImageNet with 1000 classes.

2. Input Size:

  - LeNet-5: Takes 32x32 grayscale images as input (28x28 MNIST digits are padded or resized to 32x32).

  - AlexNet: Takes 224x224 RGB images as input (ImageNet dataset).

3. Activation Function:

  - LeNet-5: Uses tanh activation functions in hidden layers.

  - AlexNet: Uses ReLU (Rectified Linear Unit) activation, which helps mitigate the vanishing gradient problem and accelerates training.

4. GPU Utilization:

  - LeNet-5: Was designed before the widespread use of GPUs, so it was primarily used for CPU-based training.

  - AlexNet: Was one of the first models to use GPU parallelization, which allowed for faster training on large datasets like ImageNet.

5. Dropout:

  - LeNet-5: Does not use dropout.

  - AlexNet: Uses dropout as a regularization technique to prevent overfitting, especially in fully connected layers.

6. Dataset:

  - LeNet-5: Primarily used on smaller, simpler datasets like MNIST (handwritten digits).

  - AlexNet: Used on large-scale, complex datasets like ImageNet, which contains millions of labeled images across 1000 classes.

**Contributions to Deep Learning:**

1. LeNet-5:

  - Combined convolutional layers, pooling layers, and fully connected layers into a single trainable network for digit recognition.

  - Set the foundation for CNNs in computer vision tasks.

2. AlexNet:

  - Showed the effectiveness of deeper networks for more complex datasets and tasks.

  - Popularized the use of ReLU activation and dropout, improving training efficiency and performance.

  - Demonstrated the importance of GPU-based training for large-scale datasets.

  - Achieved a breakthrough performance in the ImageNet challenge, sparking the modern deep learning revolution.