# Task Number 01

### - What are the advantages of convolutional layers over fully connected layers in image processing tasks? 

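Convolutional layers suit image processing because their filters slide over the image and respond to local spatial patterns such as edges and textures, whereas a fully connected layer flattens the input and treats every pixel as an independent feature, ignoring which pixels are neighbors. Because the same filter weights are reused at every position (weight sharing), a convolutional layer needs far fewer parameters, which reduces computational cost and helps limit overfitting. The same filter also detects a feature wherever it appears in the image. This makes convolutional layers more efficient and effective for tasks like object detection and classification.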
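
To make the weight-sharing argument concrete, here is a minimal sketch that counts trainable parameters for a 3x3 convolution with 32 filters and for a dense layer that would produce the same number of outputs from a 28x28 grayscale image. The helper names `conv2d_params` and `dense_params` are illustrative, not library functions.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter; independent of the image size."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """One weight per input-output pair plus one bias per unit."""
    return in_features * units + units


# 28x28x1 input, 3x3 kernel, no padding -> a 26x26 output map per filter
conv = conv2d_params(3, 3, 1, 32)
# A dense layer mapping the 784 input pixels to the same 26*26*32 outputs
dense = dense_params(28 * 28, 26 * 26 * 32)

print(f"conv:  {conv:,} parameters")
print(f"dense: {dense:,} parameters")
```

The convolution needs a few hundred parameters where the equivalent dense mapping needs millions, and the gap widens as the image grows because the convolution's count does not depend on image size.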

### - How does pooling help in reducing the computational complexity of a CNN?

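Pooling reduces computational complexity by downsampling feature maps: a 2x2 pooling window with stride 2 halves the height and width, so the next layer processes a quarter of the activations. This cuts the number of calculations and memory usage, speeding up training and inference.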
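
As a rough illustration, the sketch below counts the multiply-accumulate operations of a 3x3 convolution (32 input channels, 64 filters) with and without a 2x2 pooling step in front of it. The layer sizes mirror the MNIST model defined later in this notebook; the helper functions are hypothetical and assume unpadded, stride-1 convolutions.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pooled_size(size, pool=2):
    """Spatial size after non-overlapping pooling (floor division)."""
    return size // pool


def conv_macs(in_size, in_channels, filters, kernel=3):
    """Multiply-accumulate operations for one valid, stride-1 convolution."""
    out = conv_output_size(in_size, kernel)
    return out * out * filters * kernel * kernel * in_channels


feature_map = conv_output_size(28)  # 26x26 after the first 3x3 convolution
macs_without_pool = conv_macs(feature_map, 32, 64)
macs_with_pool = conv_macs(pooled_size(feature_map), 32, 64)

print(macs_without_pool, macs_with_pool)
print(f"reduction: {macs_without_pool / macs_with_pool:.1f}x")
```

Under these assumptions, pooling first cuts the work of the second convolution by a factor of roughly four to five.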

### - Compare different types of pooling layers (max pooling, average pooling). What are their respective advantages and disadvantages? 

Max pooling selects the highest value from each patch. It keeps the strongest activations and is robust to small shifts, but it discards every other value in the patch. Average pooling computes the mean of each patch. It retains information from all values and gives smoother feature maps, but it can dilute strong, localized features. Max pooling is often preferred in classification networks because it captures key features effectively.
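
The difference is easy to see on a small example. The following sketch applies both operations to a hypothetical 4x4 activation map using NumPy; `pool2x2` is an illustrative helper, not a library function, and it assumes even side lengths.

```python
import numpy as np


def pool2x2(feature_map, reducer):
    """Apply non-overlapping 2x2 pooling to a 2-D array with even sides."""
    h, w = feature_map.shape
    # Split the map into (h/2, w/2) blocks of shape 2x2, then reduce each block
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return reducer(blocks, axis=(1, 3))


# Hypothetical activations: one strong response (9) sits in the top-left patch
x = np.array([
    [9, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 4, 0, 0],
    [6, 8, 0, 5],
], dtype=np.float32)

max_pooled = pool2x2(x, np.max)
avg_pooled = pool2x2(x, np.mean)
print(max_pooled)
print(avg_pooled)
```

In the top-left patch, max pooling keeps the 9 unchanged while average pooling reduces it to 2.25, which shows how averaging can dilute an isolated strong feature.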

# Task Number 02

### Implementing CNN on MNIST Dataset

In [13]:
# Importing the necessary libraries
import numpy as np
import struct
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

In [14]:
def load_images(filename):
    # Read the IDX header (magic number, image count, rows, columns), then the pixel bytes
    with open(filename, 'rb') as f:
        magic, num, rows, cols = struct.unpack(">IIII", f.read(16))
        images = np.fromfile(f, dtype=np.uint8).reshape(num, rows, cols, 1)
    # Normalize pixel values to [0, 1] and return the array
    images = images.astype(np.float32) / 255.0
    return images


# Similarly, I loaded the labels as in my last assignment: the header holds the magic number and the item count
def load_labels(filename):
    with open(filename, 'rb') as f:
        magic, num = struct.unpack(">II", f.read(8))
        labels = np.fromfile(f, dtype=np.uint8)
    return labels
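
The loaders above read the magic number but never check it. As a hedged sketch of what a header check could look like, the example below parses IDX image data from bytes held in memory and validates the header first. It relies on the MNIST IDX convention that image files carry magic number 2051 and label files 2049; `parse_idx_images` and the tiny synthetic file are made up for illustration.

```python
import struct

import numpy as np

IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data with 3 dimensions


def parse_idx_images(raw):
    """Parse IDX3 image bytes into a float32 array of shape (num, rows, cols, 1)."""
    magic, num, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}, expected {IMAGE_MAGIC}")
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    if pixels.size != num * rows * cols:
        raise ValueError("pixel count does not match the header")
    return pixels.reshape(num, rows, cols, 1).astype(np.float32) / 255.0


# Tiny synthetic "file": two 2x2 images, built in memory for illustration only
header = struct.pack(">IIII", IMAGE_MAGIC, 2, 2, 2)
body = bytes([0, 255, 0, 255, 255, 0, 255, 0])
images = parse_idx_images(header + body)
print(images.shape, images.min(), images.max())
```

A check like this turns a wrong path (for example, passing the label file to the image loader) into a clear error instead of a confusing reshape failure.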


In [15]:
# Loading the images and applying one-hot encoding to the labels
train_images = load_images(r"C:\Users\HP\Downloads\archive\train-images.idx3-ubyte")
train_labels = load_labels(r"C:\Users\HP\Downloads\archive\train-labels.idx1-ubyte")
test_images = load_images(r"C:\Users\HP\Downloads\archive\t10k-images.idx3-ubyte")
test_labels = load_labels(r"C:\Users\HP\Downloads\archive\t10k-labels.idx1-ubyte")


train_labels = tf.keras.utils.to_categorical(train_labels, 10)
test_labels = tf.keras.utils.to_categorical(test_labels, 10)
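
For readers unfamiliar with one-hot encoding, here is a minimal NumPy sketch of the transformation applied to the labels above. The `one_hot` helper is illustrative only; the notebook itself uses `tf.keras.utils.to_categorical`.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


sample = one_hot([5, 0, 4], num_classes=10)
print(sample[0])              # the only 1.0 is at index 5
print(sample.argmax(axis=1))  # argmax recovers the original integer labels
```

An alternative design is to keep the integer labels and compile the model with the `sparse_categorical_crossentropy` loss, which avoids building the one-hot matrix.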


In [7]:
# Then I built the model
model = models.Sequential([
    # First convolution layer
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    # Second convolution layer
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    # Flatten, then pass through the dense layers
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])




In [9]:
# Callbacks: stop early when val_loss stops improving, and save the best model
callbacks = [
    EarlyStopping(monitor='val_loss', patience=3),
    ModelCheckpoint('best_model.keras', save_best_only=True)
]
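
The `patience=3` setting can be read as "stop once three epochs in a row fail to beat the best validation loss so far". The sketch below is a simplified approximation of that rule in plain Python, not Keras' actual implementation; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs fail to improve
    on the best validation loss seen so far.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation losses, chosen only to illustrate the rule
losses = [0.070, 0.050, 0.040, 0.041, 0.045, 0.043, 0.039]
print(early_stopping_epoch(losses, patience=3))
```

In this made-up sequence the best loss occurs at epoch 3, so training would stop after epoch 6 and never reach the better value at epoch 7; a larger patience trades extra training time for a chance to find such late improvements.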


In [10]:
# Training the model
history = model.fit(train_images, train_labels,
                    epochs=20,
                    batch_size=64,
                    validation_split=0.2,
                    callbacks=callbacks)


Epoch 1/20
750/750 ━━━━━━━━━━━━━━━━━━━━ 16s 19ms/step - accuracy: 0.8718 - loss: 0.4252 - val_accuracy: 0.9792 - val_loss: 0.0689
Epoch 2/20
750/750 ━━━━━━━━━━━━━━━━━━━━ 14s 18ms/step - accuracy: 0.9818 - loss: 0.0568 - val_accuracy: 0.9833 - val_loss: 0.0527
Epoch 3/20
750/750 ━━━━━━━━━━━━━━━━━━━━ 14s 18ms/step - accuracy: 0.9884 - loss: 0.0375 - val_accuracy: 0.9865 - val_loss: 0.0476
Epoch 4/20
750/750 ━━━━━━━━━━━━━━━━━━━━ 14s 18ms/step - accuracy: 0.9918 - loss: 0.0249 - val_accuracy: 0.9894 - val_loss: 0.0369
Epoch 5/20
750/750 ━━━━━━━━━━━━━━━━━━━━ 14s 18ms/step - accuracy: 0.9941 - loss: 0.0185 - val_accuracy: 0.9897 - val_loss: 0.0361
Epoch 6/20
750/750 ━━━━━━━━━━━━━━━━━━━━ 14s 19ms/step - accuracy: 0.9960 - loss: 0.0129 - val_accuracy: 0.9893 - val_loss: 0.0401
Epoch 7/20

In [12]:
# Loading the best model saved previously
model.load_weights('best_model.keras')


test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc:.4f}")


313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9855 - loss: 0.0395
Test accuracy: 0.9901


For the MNIST dataset, the CNN architecture includes two convolutional layers with 32 and 64 filters, respectively, each followed by a max-pooling layer, then a flatten layer, a 128-unit dense layer, and a 10-neuron softmax output layer for digit classification. The model is trained on images normalized to [0, 1], using the Adam optimizer and categorical crossentropy loss for up to 20 epochs with a batch size of 64, holding out 20% of the training data for validation. EarlyStopping and ModelCheckpoint callbacks are employed to limit overfitting and save the best model, which reaches a test accuracy of 0.9901. Challenges include managing overfitting and computational demands, which can be addressed with regularization and GPUs.
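
To check the architecture described above, the following sketch traces the feature-map size through the conv and pooling blocks and adds up the trainable parameters. `trace_cnn` is a hypothetical helper that assumes unpadded, stride-1 3x3 convolutions, each followed by non-overlapping 2x2 pooling, as in the MNIST model code.

```python
def trace_cnn(input_hw, in_channels, conv_filters, dense_units, kernel=3, pool=2):
    """Return (flatten_size, total_params) for conv->pool blocks followed by dense layers."""
    size, channels, total = input_hw, in_channels, 0
    for filters in conv_filters:
        total += kernel * kernel * channels * filters + filters  # conv weights + biases
        size = (size - kernel + 1) // pool                        # convolution, then pooling
        channels = filters
    flatten_size = size * size * channels
    features = flatten_size
    for units in dense_units:
        total += features * units + units                         # dense weights + biases
        features = units
    return flatten_size, total


flatten_size, total_params = trace_cnn(28, 1, conv_filters=[32, 64], dense_units=[128, 10])
print(flatten_size, total_params)
```

Most of the parameters sit in the first dense layer, so the size of the flattened feature map is the main lever on model size. The same helper can be called with `150, 3, [32, 64, 128], [512, 1]` to trace the second model in this notebook.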

### Implementing CNN on Cat-Dog Dataset

In [19]:
pip install tensorflow pillow numpy scikit-learn




In [23]:
import os
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split

In [24]:
def load_images_and_labels(folder, label):
    
    # Load every image in `folder`, convert it to 150x150 RGB, scale it to [0, 1],
    # and pair it with the given label
    images = []
    labels = []
    for filename in os.listdir(folder):
        img_path = os.path.join(folder, filename)
        try:
            with Image.open(img_path) as img:
                img = img.convert('RGB')
                img = img.resize((150, 150))
                img_array = np.array(img) / 255.0
                images.append(img_array)
                labels.append(label)
        except Exception as e:
            print(f"Could not process image {img_path}: {e}")
    
    return np.array(images), np.array(labels)


In [26]:
# Storing the path of the folder in a variable and then loading the images
cat_folder = r"C:\Users\HP\Downloads\archive (2)\PetImages\Cat"
cat_images, cat_labels = load_images_and_labels(cat_folder, label=1)  # Assume label 1 for cats


In [27]:
# The dataset did not contain a folder of dog images, so I created a dummy set of random-noise images

num_dummy_images = len(cat_images)
dummy_images = np.random.random((num_dummy_images, 150, 150, 3))
dummy_labels = np.zeros(num_dummy_images)

images = np.concatenate((cat_images, dummy_images))
labels = np.concatenate((cat_labels, dummy_labels))


In [28]:
train_images, test_images, train_labels, test_labels = train_test_split(images, labels, test_size=0.2, random_state=42)


In [29]:
from tensorflow.keras import layers, models

In [30]:
# Defining the CNN model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid')  # Output layer for binary classification
])




In [31]:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Callbacks
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    EarlyStopping(monitor='val_loss', patience=3),
    ModelCheckpoint('best_model.keras', save_best_only=True)
]

In [32]:
# Training the model
history = model.fit(
    train_images, 
    train_labels,
    epochs=20,
    batch_size=32,
    validation_data=(test_images, test_labels),
    callbacks=callbacks
)

Epoch 1/20
5/5 ━━━━━━━━━━━━━━━━━━━━ 15s 2s/step - accuracy: 0.5615 - loss: 1.4900 - val_accuracy: 1.0000 - val_loss: 0.0580
Epoch 2/20
5/5 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step - accuracy: 1.0000 - loss: 0.0337 - val_accuracy: 1.0000 - val_loss: 0.0020
Epoch 3/20
5/5 ━━━━━━━━━━━━━━━━━━━━ 8s 2s/step - accuracy: 1.0000 - loss: 9.4143e-04 - val_accuracy: 1.0000 - val_loss: 3.4447e-04
Epoch 4/20
5/5 ━━━━━━━━━━━━━━━━━━━━ 12s 3s/step - accuracy: 1.0000 - loss: 9.0352e-05 - val_accuracy: 1.0000 - val_loss: 1.5175e-04
Epoch 5/20
5/5 ━━━━━━━━━━━━━━━━━━━━ 14s 3s/step - accuracy: 1.0000 - loss: 8.1902e-06 - val_accuracy: 1.0000 - val_loss: 6.2588e-05
Epoch 6/20
5/5 ━━━━━━━━━━━━━━━━━━━━ 9s 2s/step - accuracy: 1.0000 - loss: 1.9491e-06 - val_accuracy: 1.0000 - val_loss: 2.0234e-05
Epoch 7/20

In [33]:

model.load_weights('best_model.keras')
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc:.4f}")

2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 88ms/step - accuracy: 1.0000 - loss: 4.2502e-07
Test accuracy: 1.0000


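For the Cat-vs-Dog dataset, the CNN architecture features three convolutional layers with 32, 64, and 128 filters, each followed by a max-pooling layer, then a flatten layer, a 512-unit dense layer, and a single sigmoid output for binary classification. Images are converted to RGB, resized to 150x150 pixels, and normalized to [0, 1]; no augmentation is applied in this run. The model uses the Adam optimizer and binary crossentropy loss, trained for up to 20 epochs with a batch size of 32, with the same EarlyStopping and ModelCheckpoint callbacks to limit overfitting and save the best model. Because the download contained no dog folder, the negative class here is random noise, so the 1.0000 test accuracy only shows that the network separates cat photos from noise; it says nothing about telling cats from dogs. The held-out split is also passed as validation data during training, so it is not an independent test set. With real dog images, the main challenges would be handling class imbalance and applying effective data augmentation, which can be addressed through balanced sampling and appropriate augmentation techniques.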
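
The training code above does not include an augmentation step. As a minimal sketch of what one could look like, the example below applies a random horizontal flip to a batch of images stored as a NumPy array of shape (N, H, W, C). The function name and the random stand-in batch are assumptions for illustration.

```python
import numpy as np


def random_horizontal_flip(batch, rng, probability=0.5):
    """Flip each image in an (N, H, W, C) batch left-right with the given probability."""
    flip_mask = rng.random(batch.shape[0]) < probability
    flipped = batch.copy()
    # Reverse the width axis only for the selected images
    flipped[flip_mask] = batch[flip_mask, :, ::-1, :]
    return flipped


rng = np.random.default_rng(0)
batch = rng.random((4, 150, 150, 3)).astype(np.float32)  # stand-in for real images
augmented = random_horizontal_flip(batch, rng)
print(augmented.shape)
```

In a Keras pipeline, a built-in preprocessing layer such as `layers.RandomFlip("horizontal")` placed at the start of the model would be a more idiomatic choice, since it is active only during training.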

### Comparisons and results of the simple CNN vs. ANN on the MNIST dataset.


Architecture: ANNs flatten the image into a vector and pass it through dense layers, while CNNs use convolutional and pooling layers to capture local image features.

Accuracy: ANNs typically achieve around 97-98% on MNIST, while CNNs often reach 98-99% or higher due to better feature detection; the CNN above reached 99.01% on the test set.

Training Time: ANNs are quicker to train per epoch; CNNs are slower but perform better on image tasks.

Overall: CNNs are usually more accurate and better at handling image data, while ANNs are simpler and faster but less effective for this task.