<a href="https://colab.research.google.com/github/raxstar5/CatsDogs/blob/main/CatsDogs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Importing the required modules and dependencies
TensorFlow and Keras will be used to build the model, the PIL library for image manipulation, and NumPy for data handling.

In [18]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from PIL import Image


**Image Manipulation**:
We create a function to load and preprocess the images. We'll resize the images to a common size and normalize the pixel values to be between 0 and 1.

In [19]:
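def preprocess_image(image_path):
    # Force 3 channels so grayscale or RGBA files still match the (150, 150, 3) model input
    image = Image.open(image_path).convert("RGB").resize((150, 150))
    image = np.array(image) / 255.0  # Normalize pixel values to [0, 1]
    return image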
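
As a quick check of the preprocessing step, the sketch below runs the same resize-and-scale logic on a synthetic grayscale image built in memory. The helper name `preprocess_rgb` and the test image are assumptions made for illustration; the point is that converting to RGB first gives a three-channel array with values in [0, 1].

```python
import io

import numpy as np
from PIL import Image


def preprocess_rgb(image_file, size=(150, 150)):
    """Hypothetical variant of preprocess_image that always returns 3 channels."""
    image = Image.open(image_file).convert("RGB").resize(size)
    return np.array(image) / 255.0


# Build a synthetic grayscale PNG in memory so no file on disk is needed.
buffer = io.BytesIO()
Image.new("L", (200, 100), color=128).save(buffer, format="PNG")
buffer.seek(0)

array = preprocess_rgb(buffer)
print(array.shape)               # (150, 150, 3)
print(array.min(), array.max())  # both values lie in [0, 1]
```

Without the `convert("RGB")` call, the same grayscale file would produce a two-dimensional array, which the model's input layer would reject.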

**Image Augmentation**: The task limits us to 15 images per class, so we use data augmentation to generate additional training data. We'll use Keras preprocessing layers (RandomFlip, RandomRotation, RandomZoom) to apply random transformations to our existing images.


In [20]:
data_augmentation = keras.Sequential(
    [
        # In TensorFlow 2.6 and later these layers live directly under keras.layers
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ]
)
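
To make the three transformations more concrete, here is a small NumPy sketch of what a random horizontal flip does, plus the arithmetic behind the rotation factor. It is a stand-in for illustration, not the Keras implementation: the function `random_horizontal_flip` and the toy image are assumptions. In Keras, `RandomRotation(0.1)` draws an angle from a range expressed as a fraction of a full turn, and `RandomZoom(0.1)` zooms in or out by up to 10%.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_horizontal_flip(image, rng):
    """Mirror an (height, width, channels) image left to right with probability 0.5."""
    if rng.random() < 0.5:
        return image[:, ::-1, :]
    return image


# A factor of 0.1 means up to 10% of a full turn in either direction.
max_rotation_degrees = 0.1 * 360

# Toy 2x3 "image" with 3 channels, so the flip is easy to inspect.
image = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
flipped = image[:, ::-1, :]

print(max_rotation_degrees)
print(np.array_equal(flipped[:, 0, :], image[:, -1, :]))  # True: columns are mirrored
augmented = random_horizontal_flip(image, rng)
print(augmented.shape)  # (2, 3, 3): augmentation never changes the shape
```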

**Model Building**:
We'll create a CNN model from scratch using a combination of convolutional, pooling, and dense layers. We can start with a simple architecture and later improve it based on the results.

In [21]:
model = keras.Sequential(
    [
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(128, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ]
)
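
It can help to work out by hand how large this network is. The sketch below follows the layer list above, assuming the Keras defaults of "valid" padding and stride 1 for `Conv2D`, and a 2x2 pooling window with stride 2. The helper names are made up for this illustration.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


size, channels = 150, 3
total_params = 0
for filters in (32, 64, 128):
    total_params += conv_params(channels, filters)
    size = conv_output_size(size) // 2  # Conv2D followed by 2x2 max pooling
    channels = filters

flattened = size * size * channels
total_params += flattened * 128 + 128  # Dense(128)
total_params += 128 * 1 + 1            # Dense(1)

print(size, flattened, total_params)  # 17 36992 4828481
```

Nearly all of the roughly 4.8 million parameters sit in the first dense layer, which is worth keeping in mind when training on only a few dozen images.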


We load the cat and dog images and store their paths in image_paths. The labels are assigned by position (the first 15 entries are treated as cats and the last 15 as dogs), so we sort the list to guarantee that every cat file comes before every dog file.

In [25]:
import os
image_directory = "/content"
image_files = os.listdir(image_directory)
image_files = [file for file in image_files if file.endswith((".jpg", ".jpeg", ".png"))]
image_paths = [os.path.join(image_directory, file) for file in image_files]
image_paths.sort()
print(image_paths)


['/content/cat1.jpg', '/content/cat10.jpg', '/content/cat11.jpg', '/content/cat12.jpg', '/content/cat13.jpg', '/content/cat14.jpg', '/content/cat15.jpg', '/content/cat2.jpg', '/content/cat3.jpg', '/content/cat4.jpg', '/content/cat5.jpg', '/content/cat6.jpg', '/content/cat7.jpg', '/content/cat8.jpg', '/content/cat9.jpg', '/content/dog1.jpg', '/content/dog10.jpg', '/content/dog11.jpg', '/content/dog12.jpg', '/content/dog13.jpg', '/content/dog14.jpg', '/content/dog15.jpg', '/content/dog2.jpg', '/content/dog3.jpg', '/content/dog4.jpg', '/content/dog5.jpg', '/content/dog6.jpg', '/content/dog7.jpg', '/content/dog8.jpg', '/content/dog9.jpg']
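
Relying on list position works here because the file names sort cleanly into two groups. A more defensive option, sketched below, is to read the label from each file name so the order no longer matters. The helper `label_from_path` is hypothetical and assumes every file is named `cat*` or `dog*`, as in the listing above.

```python
import os


def label_from_path(path):
    """Hypothetical helper: 0 for files named cat*, 1 for files named dog*."""
    name = os.path.basename(path).lower()
    if name.startswith("cat"):
        return 0
    if name.startswith("dog"):
        return 1
    raise ValueError(f"Cannot infer a label from {path!r}")


example_paths = ["/content/dog2.jpg", "/content/cat10.jpg", "/content/cat1.jpg"]
example_labels = [label_from_path(p) for p in example_paths]
print(example_labels)  # [1, 0, 0]
```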


**Training the model**:
We'll compile and train the model using the cat and dog images. We'll split the data into training and validation sets and use binary cross-entropy loss since it's a binary classification task.
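
For a single sample with true label y (0 for cat, 1 for dog) and predicted dog probability p, binary cross-entropy is -(y log p + (1 - y) log(1 - p)). The pure-Python sketch below evaluates it for a few made-up probabilities to show how the loss grows as the prediction moves away from the true label. Keras averages this quantity over the batch.

```python
import math


def binary_cross_entropy(y_true, p_dog):
    """Loss for one sample: y_true is 0 (cat) or 1 (dog), p_dog is the sigmoid output."""
    return -(y_true * math.log(p_dog) + (1 - y_true) * math.log(1 - p_dog))


# Illustrative probabilities for an image whose true label is "dog".
confident_correct = binary_cross_entropy(1, 0.9)
unsure = binary_cross_entropy(1, 0.5)
confident_wrong = binary_cross_entropy(1, 0.1)

print(confident_correct < unsure < confident_wrong)  # True
```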

In [26]:
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Load and preprocess images
images = [preprocess_image(path) for path in image_paths]
labels = [0] * 15 + [1] * 15

images = np.array(images)
labels = np.array(labels)

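# Generate two augmented copies of each image
# (training=True keeps the random layers active outside of model.fit)
augmented_images = []
for image in images:
    augmented_images.append(data_augmentation(image[np.newaxis, ...], training=True))
    augmented_images.append(data_augmentation(image[np.newaxis, ...], training=True))

# Create labels for augmented images. The two copies of each image sit next to
# each other, so each label is repeated in place; np.concatenate([labels, labels])
# would put the labels in a different order than the images.
augmented_labels = np.repeat(labels, 2)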

# Convert augmented images and labels to numpy arrays
augmented_images = np.concatenate(augmented_images, axis=0)

# Shuffle the data
num_samples = augmented_images.shape[0]
shuffle_indices = np.random.permutation(num_samples)

augmented_images = augmented_images[shuffle_indices]
augmented_labels = augmented_labels[shuffle_indices]

# Train-validation split
# (copies of the same source image can land on both sides, so validation accuracy is optimistic)
val_split = 0.2
num_val_samples = int(val_split * num_samples)

x_train = augmented_images[:-num_val_samples]
y_train = augmented_labels[:-num_val_samples]
x_val = augmented_images[-num_val_samples:]
y_val = augmented_labels[-num_val_samples:]

# Train the model
model.fit(x_train, y_train, batch_size=16, epochs=10, validation_data=(x_val, y_val))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f66ab0dd540>
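
Because the augmentation loop appends both copies of an image back to back, the label array has to follow the same interleaved order. The NumPy sketch below uses a toy four-image label array (an assumption for illustration) to show how `np.repeat` and `np.concatenate` differ in this situation.

```python
import numpy as np

labels = np.array([0, 0, 1, 1])  # toy stand-in: two cats, then two dogs

# Two copies per image, appended back to back, give this source-image order:
source_index = np.repeat(np.arange(len(labels)), 2)  # [0 0 1 1 2 2 3 3]
true_labels = labels[source_index]                   # [0 0 0 0 1 1 1 1]

repeated = np.repeat(labels, 2)                   # [0 0 0 0 1 1 1 1]
concatenated = np.concatenate([labels, labels])   # [0 0 1 1 0 0 1 1]

print(np.array_equal(repeated, true_labels))      # True
print(np.array_equal(concatenated, true_labels))  # False
```

With the real data (15 cats followed by 15 dogs), the concatenated form would assign the wrong class to half of the augmented images.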

**Improving the model**:
If the initial model doesn't perform well, we can try several techniques to improve it: increasing the model's capacity, using a pre-trained model (transfer learning), or tuning hyperparameters. Techniques such as class activation maps (CAM) do not raise accuracy by themselves, but they help interpret what the model has learned.

Here we use transfer learning with VGG16 pre-trained on ImageNet.

In [29]:
base_model = keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(150, 150, 3)
)

# Freeze the base model layers
base_model.trainable = False

# Add custom classification head
inputs = keras.Input(shape=(150, 150, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dense(256, activation="relu")(x)
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

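# This cell only builds and compiles the model; it still has to be trained.
# The data loading, augmentation, and model.fit steps above can be repeated for it.
# Note: the ImageNet weights expect inputs prepared with
# keras.applications.vgg16.preprocess_input (applied to 0-255 pixels),
# not the [0, 1] scaling used earlier.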
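
One detail that matters for transfer learning is input scaling. VGG16's ImageNet weights were trained on images in BGR channel order with per-channel means subtracted from 0-255 pixel values. In practice one would call `keras.applications.vgg16.preprocess_input`; the NumPy function below is only a hypothetical stand-in that shows the arithmetic, and the all-white test image is made up.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by Caffe-style preprocessing.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68])


def vgg_style_preprocess(rgb_batch):
    """Illustrative re-implementation: RGB -> BGR, then subtract per-channel means.

    Expects float pixel values in the 0-255 range, shape (batch, height, width, 3).
    """
    bgr = rgb_batch[..., ::-1]
    return bgr - IMAGENET_BGR_MEANS


batch = np.full((1, 150, 150, 3), 255.0)  # an all-white image
out = vgg_style_preprocess(batch)
print(out.shape)     # (1, 150, 150, 3)
print(out[0, 0, 0])  # 255 minus each channel mean
```

Feeding the frozen base images scaled to [0, 1] instead puts every pixel far from the range it saw during pre-training, which is likely to weaken the extracted features.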


In [27]:
from IPython.display import display
import ipywidgets as widgets
from google.colab import files
from PIL import Image  # IPython.display.Image is no longer imported, so the name is unambiguous

# Build the upload button and its click handler
def handle_upload():
    def _handle_upload(button):
        uploaded = files.upload()
        for filename in uploaded.keys():
            # Load and preprocess the uploaded image (same resizing and scaling as in training)
            image = Image.open(filename).convert("RGB")
            image = image.resize((150, 150))
            image = np.array(image) / 255.0
            image = image[np.newaxis, ...]  # add the batch dimension

            # The sigmoid output is the model's estimated probability of "Dog"
            dog_probability = float(model.predict(image)[0][0])
            result = "Dog" if dog_probability > 0.5 else "Cat"

            # Display the result (this is a probability, not an accuracy)
            print("Prediction: {}, P(dog): {:.2f}%".format(result, dog_probability * 100))

    # Create the upload button
    upload_button = widgets.Button(description="Upload Image")
    upload_button.on_click(_handle_upload)

    # Display the button
    display(upload_button)

# Show the button
handle_upload()


Button(description='Upload Image', style=ButtonStyle())

Saving cat11.jpg to cat11 (1).jpg
Prediction: Cat, P(dog): 30.23%


Saving dog6.jpg to dog6 (1).jpg
Prediction: Dog, P(dog): 65.75%


Saving dog15.jpg to dog15 (1).jpg
Prediction: Dog, P(dog): 67.79%


Saving download.jpg to download.jpg
Prediction: Dog, P(dog): 52.84%


Saving download (1).jpg to download (1).jpg
Prediction: Dog, P(dog): 73.49%


Saving cat4.jpg to cat4 (1).jpg
Prediction: Dog, P(dog): 63.02%


Saving cat11.jpg to cat11 (2).jpg
Prediction: Dog, P(dog): 51.95%


Saving dog9.jpg to dog9 (1).jpg
Prediction: Cat, P(dog): 45.41%
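
The percentage printed for each upload is the model's sigmoid output, that is, its estimated probability of "Dog". Accuracy is a different quantity: the fraction of predictions that match the true class. The sketch below computes it for the six recorded uploads whose file name reveals the class. Inferring the true label from the file name is an assumption, and the two `download` files are left out because their class is unknown.

```python
# (filename, sigmoid output) pairs copied from the recorded uploads above.
recorded = [
    ("cat11.jpg", 0.3023),
    ("dog6.jpg", 0.6575),
    ("dog15.jpg", 0.6779),
    ("cat4.jpg", 0.6302),
    ("cat11.jpg", 0.5195),
    ("dog9.jpg", 0.4541),
]


def true_label(filename):
    """Assumption: files named dog* are dogs (1); the others listed here are cats (0)."""
    return 1 if filename.startswith("dog") else 0


# A prediction counts as "dog" when the sigmoid output exceeds 0.5.
correct = sum(int(p_dog > 0.5) == true_label(name) for name, p_dog in recorded)
accuracy = correct / len(recorded)
print(f"{correct}/{len(recorded)} correct, accuracy = {accuracy:.0%}")  # 3/6 correct, accuracy = 50%
```

Six images, all of which also appear in the training folder, are far too few to judge the model; a held-out set of unseen images would give a more meaningful figure.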
