Skip to content

mohsenkhashei/cifar10

Repository files navigation

Improved CNN Training and Visualization

Objective

The objective of this document is to provide a detailed overview of the modifications made to the Convolutional Neural Network (CNN) training process for the CIFAR-10 dataset. The changes are aimed at enhancing model performance through data augmentation, learning rate scheduling, and other strategies.

DataSet

The dataset used in the provided code is the CIFAR-10 dataset. Here's some information about CIFAR-10:

  1. Overview:

    • CIFAR-10 stands for the Canadian Institute for Advanced Research - 10. It is a well-known benchmark dataset for image classification tasks in the field of machine learning and computer vision.
  2. Contents:

    • The CIFAR-10 dataset consists of a collection of 60,000 32x32 color images in 10 different classes, with each class containing 6,000 images.
  3. Classes:

    • The dataset is divided into 10 classes, each representing a specific category of objects. The classes are as follows:
      1. Airplane
      2. Automobile
      3. Bird
      4. Cat
      5. Deer
      6. Dog
      7. Frog
      8. Horse
      9. Ship
      10. Truck
  4. Image Size:

    • All images in the CIFAR-10 dataset are 32 pixels in height and 32 pixels in width. Each image is a color image with three channels (RGB).
  5. Training and Testing:

    • The dataset is split into two subsets: a training set and a test set.
    • The training set contains 50,000 images (5,000 images per class).
    • The test set contains 10,000 images (1,000 images per class).
  6. Purpose:

    • CIFAR-10 is commonly used for benchmarking image classification algorithms and models. It is particularly suitable for testing the performance of deep neural networks, including convolutional neural networks (CNNs).
  7. Challenges:

    • The small size of the images and the presence of multiple classes with similar visual features make CIFAR-10 a challenging dataset. Models need to learn subtle differences between classes to achieve high accuracy.
  8. Dataset Source:

    • The CIFAR-10 dataset is available for download from the CIFAR website. It is also accessible through various machine learning libraries, such as TensorFlow and PyTorch.
  9. Usage in Research and Education:

    • CIFAR-10 has been widely used in academic research, educational settings, and machine learning competitions. It serves as a standard dataset for testing and comparing the performance of image classification algorithms.
  10. Data Normalization:

    • In the code provided, the pixel values of the images are normalized to be between 0 and 1 by dividing them by 255.0. Normalization is a common preprocessing step in machine learning to ensure numerical stability during training.

1. Parameters Table

Overview

The following table summarizes the key parameters used in the CNN training process.

Parameter Value
Epochs 10
Batch Size 64
Validation Split 0.2
Optimizer Adam
Learning Rate 0.001
Loss Function Categorical Crossentropy
Metrics Accuracy

Interpretation

  • Epochs: Increased to 50 for more extensive training.
  • Batch Size: Remains at 64 for balanced training.
  • Validation Split: 20% of training data used for validation.
  • Optimizer: Adam optimizer with a learning rate of 0.001.
  • Loss Function: Categorical Crossentropy for multi-class classification.
  • Metrics: Accuracy used as the evaluation metric.

2. Model Architecture

Overview

The CNN architecture remains unchanged, featuring three convolutional layers followed by max-pooling, flattening, and fully connected layers.

Interpretation

  • Convolutional Layers: Extract features using 3x3 filters.
  • Max Pooling: Reduces spatial dimensions.
  • Flatten Layer: Converts 3D tensor to 1D for dense layers.
  • Dense Layers: Two dense layers with ReLU activation.
  • Output Layer: Dense layer with softmax activation for multi-class classification.

3. Training Loop

Overview

The training loop is modified to include data augmentation using the ImageDataGenerator. Additionally, a learning rate scheduler (ReduceLROnPlateau) is implemented for dynamic adjustment during training.

Changes Made

  • Data Augmentation: Applied rotations, shifts, zooms, flips to augment training dataset.
  • Learning Rate Scheduler: Adjusts learning rate dynamically based on validation loss.

4. Model Evaluation

Overview

Model performance is evaluated on the test set using accuracy as the primary metric.

Metrics

  • Test Accuracy: The accuracy achieved on the test set.

5. Visualizations

Overview

Various visualizations are included to provide insights into model training, performance, and predictions.

Included Visualizations

  1. Confusion Matrix: Provides a detailed breakdown of model predictions.
  2. Sample Predictions: Visualization of a few sample predictions.
  3. Learning Rate Plot: Displays the learning rate changes during training.
  4. Filter Visualization: Visualization of filters from the first convolutional layer.
  5. Scatter Plot: Training vs Validation Accuracy.
  6. Histogram: Distribution of Training and Validation Loss.
  7. Bubble Chart: Accuracy and Learning Rate Over Epochs.
  8. Area Chart: Accuracy Over Epochs with Validation Range.
  9. Spline Chart: Training and Validation Loss Over Epochs. Your documentation is quite comprehensive and covers the key aspects of the modified CNN training process along with visualizations. If you feel that it adequately communicates the changes made, the purpose of each modification, and the resulting visualizations, then it looks great for your presentation.

More Details

  1. Data Augmentation Details:

    • Types of data augmentation applied, such as rotations, shifts, zooms, and flips.
  2. Model Architecture Visualization:

    • CNN architecture in the form of a diagram illustrating the layers and connections within the model.
  3. Filter Visualization Details:

    • Helps in understanding what low-level features the model is learning.

Result

The modified CNN training process, along with visualizations, aims to improve model performance on the CIFAR-10 dataset. Experimentation with parameters and architecture can be further explored to optimize the model for specific use cases.

Visualization

CNN MODEL

CNN Architecture

Run #1

Alt text Alt text Alt text Alt text Alt text Alt text

Run #2

Alt text Alt text Alt text

Run #3

learning rate 0.001 -> 0.002

Alt text Alt text Alt text Alt text Alt text

Run #4

Alt text Alt text Alt text Alt text

Run #5

Alt text Alt text Alt text

RS Squarred

Alt text Alt text Alt text

Important Note:

Alt text Alt text Alt text

Requirements

installation below:

installation is different for various OS

  • # anaconda3

inside anaconda install following packages:

- build environment:

    - install python 
    - install tensorflow
    - matplotlib 
    - pydot
    - graphviz   
    - python-graphviz

using tensor board dashboard:

installing pip install tensorboard

# run this in commandline: 

tensorboard --logdir logs/fit`
  • TensorBoard provides a suite of visualization tools to understand, debug, and optimize the model training process.
  • tools like Graphviz to visualize the computational graph of your model architecture. TensorFlow and Keras provide utilities to export a model's graph in the DOT format, which can be visualized using Graphviz.