<a href="https://colab.research.google.com/github/Ashahet1/AI-Engineer-Roadmap-2024/blob/main/Chapter_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Implementing a Convolutional Neural Network

- **Convolutional Neural Network (CNN)**:
  - A neural network that uses **convolutions** to detect features in images, enhancing the ability to classify and recognize objects within them.

- **Convolution**:
  - A **mathematical filter** slid across an image: each pixel's neighborhood is multiplied element-wise by the filter's weights and summed, producing a new value that highlights features such as edges or textures.
  - Example: Applying vertical and horizontal filters for line detection.

- **Pooling**:
  - Reduces image size while retaining important features.
  - **Max Pooling**: Selects the maximum value in each pixel group (pool); a 2x2 pool halves each dimension (e.g., 512x512 to 256x256).
  - Other types include **Min Pooling** (smallest value) and **Average Pooling** (mean value).

- **`Conv2D` Layer**:
  - Adds convolutional layers to the model.
  - Parameters:
    - **Number of filters**: e.g., 64.
    - **Filter size**: e.g., `(3, 3)`.
    - **Activation function**: e.g., `relu`.
    - **Input shape**: `(height, width, channels)` (e.g., `28x28x1` for grayscale).

- **`MaxPooling2D` Layer**:
  - Performs max pooling, reducing dimensions while emphasizing key features.
  - Example: a `(2, 2)` pool size halves both height and width, so the output holds 75% fewer values.

- **Image Augmentation and Transfer Learning**:
  - Techniques to enhance model training:
    - **Augmentation**: Generates more data by transforming existing images (e.g., rotating, flipping).
    - **Transfer Learning**: Reuses features learned by a pre-trained model, so a new task needs less data and training time.
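
The convolution and max-pooling steps described above can be sketched in plain NumPy. This is an illustration only, not how Keras implements `Conv2D` or `MaxPooling2D`; the helper names `convolve2d` and `max_pool2d` and the 6x6 test image are assumptions made for the example.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Hypothetical 6x6 image: dark left half (0), bright right half (1)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Filter that responds to vertical edges (dark on the left, bright on the right)
vertical_filter = np.array([[-1, 0, 1],
                            [-1, 0, 1],
                            [-1, 0, 1]])

features = convolve2d(image, vertical_filter)  # shape (4, 4)
pooled = max_pool2d(features)                  # shape (2, 2)
print(features)
print(pooled)
```

The filter output is nonzero only in the columns where the dark and bright halves meet, and pooling keeps that response while shrinking the 4x4 feature map to 2x2.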



In [3]:
import tensorflow as tf

In [4]:
data = tf.keras.datasets.fashion_mnist

In [5]:
(training_images, training_labels), (test_images, test_labels) = data.load_data()



In [6]:
training_images = training_images.reshape(60000, 28, 28, 1)
training_images = training_images / 255.0
test_images = test_images.reshape(10000, 28, 28, 1)
test_images = test_images / 255.0

In [7]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
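
To see why the `Flatten` layer in this model receives 1,600 values, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults used above (no padding and stride 1 for `Conv2D`, non-overlapping windows for `MaxPooling2D`); the helper names are made up for the illustration.

```python
def conv_output(size, kernel=3):
    """Spatial size after an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_output(size, pool=2):
    """Spatial size after non-overlapping pooling (any remainder is dropped)."""
    return size // pool

size = 28
trace = [size]
for _ in range(2):  # two Conv2D + MaxPooling2D pairs
    size = conv_output(size)
    trace.append(size)
    size = pool_output(size)
    trace.append(size)

flattened = size * size * 64  # 64 filters in the last Conv2D layer
print(trace)      # [28, 26, 13, 11, 5]
print(flattened)  # 1600
```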



In [8]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

In [10]:
model.fit(training_images, training_labels, epochs=5)

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 89s 47ms/step - accuracy: 0.9422 - loss: 0.1545
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 89s 48ms/step - accuracy: 0.9497 - loss: 0.1322
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 144s 49ms/step - accuracy: 0.9565 - loss: 0.1153
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 139s 47ms/step - accuracy: 0.9638 - loss: 0.0981
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 142s 47ms/step - accuracy: 0.9681 - loss: 0.0845


<keras.src.callbacks.history.History at 0x7a2f07d82920>

In [11]:
model.evaluate(test_images, test_labels)

313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 14ms/step - accuracy: 0.9048 - loss: 0.3317


[0.3310088813304901, 0.9059000015258789]

In [12]:
classifications = model.predict(test_images)
print(classifications[0])  # class probabilities for the first test image
print(test_labels[0])      # true label for comparison

313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 13ms/step
[5.0208947e-13 2.2525632e-15 6.8157533e-12 1.1800760e-11 1.4490742e-13
 2.6254929e-08 1.3990652e-14 8.4780574e-08 1.2049288e-11 9.9999982e-01]
9
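
The ten numbers printed above are the softmax probabilities for each class. A minimal sketch of turning them into a predicted label follows; the probabilities are copied from the output above, and the class-name list (the standard Fashion MNIST label order) is added here for readability and is not part of the notebook.

```python
import numpy as np

# Softmax output for the first test image, copied from the output above
probabilities = np.array([
    5.0208947e-13, 2.2525632e-15, 6.8157533e-12, 1.1800760e-11, 1.4490742e-13,
    2.6254929e-08, 1.3990652e-14, 8.4780574e-08, 1.2049288e-11, 9.9999982e-01,
])

# Fashion MNIST class names in label order (0-9)
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# The predicted class is the index with the highest probability
predicted = int(np.argmax(probabilities))
print(predicted, class_names[predicted])  # 9 Ankle boot
```

The predicted index matches the true label `9` printed above.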


#### **1. Dataset and Preprocessing**
- **Horses or Humans Dataset**:
  - 1,000+ images (300x300 pixels), balanced between horses and humans.
  - Images vary in orientation, pose, and background.

- **Computer-Generated Imagery (CGI)**:
  - The dataset consists of CGI images, which are cheap to produce; features learned from them transfer to real-world photos.

- **`ImageDataGenerator`**:
  - Labels each image automatically from the subdirectory it is stored in.
  - Performs **rescaling** and **data augmentation**:
    - **Rotation**: Random rotations up to 40 degrees.
    - **Shifting**: Horizontal/vertical shifts up to 20%.
    - **Shearing**: Tilting images.
    - **Zooming**: Random zooming.
    - **Flipping**: Horizontal flips.
    - **Fill Mode**: Fills gaps from transformations using nearest neighbors.
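
The augmentation options listed above map onto `ImageDataGenerator` arguments roughly as follows. This is a sketch, not the configuration used in the cells below (which only rescale): the rotation and shift values come from the notes, while the `0.2` shear and zoom values are assumptions because the notes give no numbers for them.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenting_datagen = ImageDataGenerator(
    rescale=1 / 255,         # scale pixel values to [0, 1]
    rotation_range=40,       # random rotations up to 40 degrees
    width_shift_range=0.2,   # horizontal shifts up to 20% of the width
    height_shift_range=0.2,  # vertical shifts up to 20% of the height
    shear_range=0.2,         # assumed value: shear intensity
    zoom_range=0.2,          # assumed value: zoom in or out by up to 20%
    horizontal_flip=True,    # randomly mirror images left to right
    fill_mode='nearest',     # fill exposed pixels with the nearest neighbor
)
```

Such a generator would be passed to `flow_from_directory` in the same way as the rescale-only generator, and only for training data; validation images are normally left unaugmented.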

---

#### **2. Convolutional Neural Networks (CNNs)**
- **Conv2D Layer**:
  - Extracts features using **filters** (e.g., 3x3).
  - Works with color images (input shape: `300x300x3`).

- **MaxPooling2D**:
  - Reduces feature map size, keeping essential features.
  - Typical pool size: `(2, 2)`.

- **Model Output**:
  - **Sigmoid activation** for binary classification.
  - A single output neuron produces a value between 0 and 1; values near 0 indicate one class and values near 1 the other (horse vs. human).
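
A small sketch of how a sigmoid output becomes a class label. It assumes the class folders are named `horses` and `humans`, so that `flow_from_directory` (which assigns indices in alphanumeric order) maps horses to 0 and humans to 1; the `0.5` threshold and the helper names are also assumptions.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def to_label(probability, class_names=("horse", "human"), threshold=0.5):
    """Map the single output value to a class name."""
    return class_names[1] if probability > threshold else class_names[0]

# Hypothetical pre-activation values from the final Dense(1) layer
for raw in (-3.0, 0.2, 4.0):
    p = sigmoid(raw)
    print(f"{raw:+.1f} -> {p:.3f} -> {to_label(p)}")
```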

---

#### **3. Model Training and Performance**
- **Binary Cross Entropy Loss**:
  - Suitable for binary classification.
  
- **RMSprop Optimizer**:
  - Adapts the step size for each parameter during training; the notebook uses `learning_rate=0.001`.

- **Overfitting and Validation**:
  - High accuracy on training data but lower on validation indicates overfitting.
  - Validation accuracy (88%) vs. training accuracy (99%) in early experiments.
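
To make the loss concrete, binary cross entropy can be computed by hand. The sketch below is a NumPy illustration of the formula, not the Keras implementation; the `eps` clipping value and the sample labels and predictions are assumptions.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so log(0) never occurs
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    losses = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return float(-np.mean(losses))

labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]
confident_wrong = [0.1, 0.9, 0.2, 0.8]

print(binary_cross_entropy(labels, confident_right))  # small loss
print(binary_cross_entropy(labels, confident_wrong))  # much larger loss
```

The loss grows sharply when the model is confident and wrong, which is why a validation loss far above the training loss is a useful overfitting signal alongside accuracy.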

---

#### **4. Advanced Techniques**
- **Transfer Learning**:
  - Uses pretrained models (e.g., **InceptionV3**).
  - Freeze pretrained layers, add custom dense layers for binary classification.
  - Achieves >96% validation accuracy in fewer epochs.

- **Image Augmentation**:
  - Helps generalize the model by expanding training data through transformations.
  - Reduces overfitting by simulating varied real-world data.

- **Dropout Regularization**:
  - Randomly deactivates neurons during training (e.g., 20%) to avoid overspecialization.
  - Balances performance between training and validation sets.
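
One way the transfer-learning and dropout ideas above might be combined is sketched below. It is an assumption-laden outline rather than the notebook's method: the function name `build_transfer_model`, the global-average-pooling step, the 1,024-unit dense layer, and the reuse of the notebook's `300x300x3` input and `0.001` learning rate are all choices made for the illustration.

```python
import tensorflow as tf

def build_transfer_model(weights="imagenet", input_shape=(300, 300, 3)):
    """Frozen InceptionV3 base with a small trainable head for binary output."""
    base = tf.keras.applications.InceptionV3(
        input_shape=input_shape,
        include_top=False,   # drop the original 1000-class classifier
        weights=weights,     # "imagenet" downloads pretrained weights
    )
    base.trainable = False   # freeze every pretrained layer

    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(1024, activation="relu")(x)  # assumed size
    x = tf.keras.layers.Dropout(0.2)(x)  # deactivate 20% of units in training
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(base.input, outputs)
    model.compile(
        loss="binary_crossentropy",
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model()` downloads the ImageNet weights on first use; only the two new `Dense` layers are updated during training.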

---

#### **5. Multiclass Classification**
- **Softmax Activation**:
  - Used in the output layer for multiple classes.
  - Ensures predictions sum to 1.

- **Categorical Cross Entropy Loss**:
  - Suitable for multiclass problems (e.g., Rock-Paper-Scissors dataset).
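
Softmax and categorical cross entropy can be illustrated with a few lines of NumPy. The three raw scores and the class order (rock, paper, scissors) are hypothetical, and the code is a sketch of the formulas rather than the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def categorical_cross_entropy(one_hot, probs, eps=1e-7):
    """Negative log of the probability assigned to the true class."""
    return float(-np.sum(one_hot * np.log(np.clip(probs, eps, 1.0))))

logits = np.array([2.0, 1.0, 0.1])  # hypothetical scores: rock, paper, scissors
probs = softmax(logits)
target = np.array([1.0, 0.0, 0.0])  # one-hot label: the true class is rock

loss = categorical_cross_entropy(target, probs)
print(probs, probs.sum())
print(loss)
```

When labels are plain integers rather than one-hot vectors, Keras provides the sparse variant of this loss, `sparse_categorical_crossentropy`, which the Fashion MNIST model above uses.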

---

#### **6. Practical Testing**
- **Real-World Testing**:
  - Demonstrated with external horse and human images.
  - Highlights limitations: model misclassifies unseen poses due to dataset bias.

- **Model Evaluation**:
  - Performance improves with augmented training data.
  - Provides >90% accuracy on unseen validation data.


In [13]:
import urllib.request
import zipfile
import tensorflow as tf

In [14]:
url = "https://storage.googleapis.com/learning-datasets/horse-or-human.zip"
file_name = "horse-or-human.zip"
training_dir = 'horse-or-human/training/'
urllib.request.urlretrieve(url, file_name)
# Extract the archive; the context manager closes the file automatically
with zipfile.ZipFile(file_name, 'r') as zip_ref:
    zip_ref.extractall(training_dir)

In [15]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale all pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1/255)

train_generator = train_datagen.flow_from_directory(
    training_dir,
    target_size=(300, 300),
    class_mode='binary'
)

Found 1027 images belonging to 2 classes.


In [16]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

In [17]:
model.summary()

In [18]:
from tensorflow import keras
from tensorflow.keras.optimizers import RMSprop



In [19]:
model.compile(loss='binary_crossentropy', optimizer=RMSprop(learning_rate=0.001), metrics=['accuracy'])

In [20]:
history = model.fit(train_generator, epochs=5)

Epoch 1/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 119s 3s/step - accuracy: 0.6305 - loss: 1.1026
Epoch 2/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 107s 3s/step - accuracy: 0.8918 - loss: 0.2658
Epoch 3/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 144s 3s/step - accuracy: 0.9158 - loss: 0.1938
Epoch 4/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 139s 3s/step - accuracy: 0.9679 - loss: 0.1206
Epoch 5/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 150s 3s/step - accuracy: 0.9767 - loss: 0.0632


In [21]:
validation_url = "https://storage.googleapis.com/learning-datasets/validation-horse-or-human.zip"

validation_file_name = "validation-horse-or-human.zip"
validation_dir = 'horse-or-human/validation/'
urllib.request.urlretrieve(validation_url, validation_file_name)

# Extract the archive; the context manager closes the file automatically
with zipfile.ZipFile(validation_file_name, 'r') as zip_ref:
    zip_ref.extractall(validation_dir)

In [22]:
validation_datagen = ImageDataGenerator(rescale=1/255)

# Use the validation generator, not the training one, for validation data
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    target_size=(300, 300),
    class_mode='binary'
)

Found 256 images belonging to 2 classes.


In [24]:
history = model.fit(
  train_generator,
  epochs=5,
  validation_data=validation_generator
)

Epoch 1/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 115s 3s/step - accuracy: 0.9702 - loss: 0.1076 - val_accuracy: 0.7930 - val_loss: 3.1583
Epoch 2/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 115s 3s/step - accuracy: 0.9997 - loss: 0.0063 - val_accuracy: 0.9453 - val_loss: 0.1419
Epoch 3/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 143s 3s/step - accuracy: 0.9837 - loss: 0.0474 - val_accuracy: 0.8008 - val_loss: 2.6866
Epoch 4/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 123s 4s/step - accuracy: 0.9995 - loss: 0.0052 - val_accuracy: 0.7305 - val_loss: 3.5524
Epoch 5/5
33/33 ━━━━━━━━━━━━━━━━━━━━ 134s 3s/step - accuracy: 0.9782 - loss: 0.1102 - val_accuracy: 0.8008 - val_loss: 2.5427
