<a href="https://colab.research.google.com/github/belyakov23/intro-ml-projects-/blob/main/cats_dogs_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Cats vs Dogs Image Classification

This project uses the Kaggle "Cats and Dogs image classification" dataset to train a convolutional neural network that distinguishes between cat and dog images. The data consists of JPEG images organised into train and test folders with separate subfolders for cats and dogs.

I preprocess the images with Keras ImageDataGenerator: pixel values are rescaled to the range [0, 1], images are resized to 150×150, and 20% of the training folder is held out for validation, giving separate training, validation, and test generators. I then build a small CNN with three convolution–max pooling blocks followed by two dense layers and train it for 10 epochs with binary cross-entropy loss and the Adam optimizer.

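The final model reaches about 85% training accuracy, around 66% validation accuracy, and approximately 69% test accuracy. The gap of roughly 19 points between training and validation accuracy suggests that the network overfits, which is unsurprising with only 557 training images. The result is a basic but working classifier that beats the 50% chance level on a small, slightly noisy real-world dataset.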


In [2]:
import zipfile
import os

zip_path = "/content/cats and dogs.zip"  # must match the uploaded file name exactly (see the Colab file browser)

with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall("/content/cats_dogs_data")

os.listdir("/content/cats_dogs_data")


['test', 'train']

In [3]:
for root, dirs, files in os.walk("/content/cats_dogs_data"):
    print(root, "->", len(files), "files")


/content/cats_dogs_data -> 0 files
/content/cats_dogs_data/test -> 0 files
/content/cats_dogs_data/test/dogs -> 70 files
/content/cats_dogs_data/test/cats -> 70 files
/content/cats_dogs_data/train -> 0 files
/content/cats_dogs_data/train/dogs -> 278 files
/content/cats_dogs_data/train/cats -> 279 files


In [6]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models


In [7]:
train_dir = "/content/cats_dogs_data/train"
test_dir = "/content/cats_dogs_data/test"

train_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2
)

test_datagen = ImageDataGenerator(rescale=1./255)

img_size = (150, 150)
batch_size = 32

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="binary",
    subset="training"
)

val_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="binary",
    subset="validation"
)

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="binary",
    shuffle=False
)


Found 447 images belonging to 2 classes.
Found 110 images belonging to 2 classes.
Found 140 images belonging to 2 classes.
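
The generator counts above can be checked by hand. The sketch below assumes that `validation_split` is applied within each class folder and that the validation count is truncated to an integer (`int(0.2 * n)`); this is my reading of the Keras behaviour, not something the notebook prints, and the helper names are hypothetical. It also derives the number of batches per epoch for `batch_size = 32`.

```python
import math

# File counts per class, taken from the os.walk output above.
train_counts = {"cats": 279, "dogs": 278}
test_counts = {"cats": 70, "dogs": 70}

VALIDATION_SPLIT = 0.2
BATCH_SIZE = 32


def split_sizes(counts, validation_split):
    """Return (n_train, n_val), assuming a truncating split within each class."""
    n_val = sum(int(validation_split * n) for n in counts.values())
    n_train = sum(counts.values()) - n_val
    return n_train, n_val


def batches_per_epoch(n_images, batch_size):
    """Batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(n_images / batch_size)


n_train, n_val = split_sizes(train_counts, VALIDATION_SPLIT)
n_test = sum(test_counts.values())

print("train:", n_train, "images,", batches_per_epoch(n_train, BATCH_SIZE), "batches")
print("val:  ", n_val, "images,", batches_per_epoch(n_val, BATCH_SIZE), "batches")
print("test: ", n_test, "images,", batches_per_epoch(n_test, BATCH_SIZE), "batches")
```

Under these assumptions the sketch gives 447 training and 110 validation images, matching the `Found ...` lines, and 14 training batches and 5 test batches, matching the `14/14` and `5/5` progress counters in the training and evaluation logs.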


In [8]:
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

model.summary()
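
As a cross-check on the architecture, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used in the cell above (`padding="valid"` and stride 1 for the convolutions, 2×2 pooling with stride 2 that floors odd sizes); the helper functions are hypothetical and not part of Keras.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


size, channels = 150, 3
total_params = 0
for filters in (32, 64, 128):
    total_params += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after block with {filters} filters: {size}x{size}x{channels}")

flat_units = size * size * channels
total_params += dense_params(flat_units, 128) + dense_params(128, 1)

print("flattened units:", flat_units)
print("total parameters:", total_params)
```

Under these assumptions the flattened vector has 36,992 units and the network has 4,828,481 parameters, about 98% of them in the first dense layer. That is a lot of capacity for a few hundred training images, which fits the gap between training and validation accuracy reported in the introduction.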




In [9]:
history = model.fit(
    train_generator,
    epochs=10,
    validation_data=val_generator
)




Epoch 1/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 27s 2s/step - accuracy: 0.4977 - loss: 1.2798 - val_accuracy: 0.5000 - val_loss: 0.6917
Epoch 2/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 24s 2s/step - accuracy: 0.5001 - loss: 0.6934 - val_accuracy: 0.5000 - val_loss: 0.6906
Epoch 3/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 24s 2s/step - accuracy: 0.4883 - loss: 0.6920 - val_accuracy: 0.5000 - val_loss: 0.6839
Epoch 4/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 25s 2s/step - accuracy: 0.5108 - loss: 0.6903 - val_accuracy: 0.5000 - val_loss: 0.6924
Epoch 5/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 25s 2s/step - accuracy: 0.5047 - loss: 0.6858 - val_accuracy: 0.5273 - val_loss: 0.6882
Epoch 6/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 24s 2s/step - accuracy: 0.5556 - loss: 0.6751 - val_accuracy: 0.5909 - val_loss: 0.6692
Epoch 7/10
14/14 ━━━━━━━━━━

In [10]:
test_loss, test_acc = model.evaluate(test_generator)
print("Test accuracy:", test_acc)


[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 662ms/step - accuracy: 0.6572 - loss: 0.7704
Test accuracy: 0.6928571462631226
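
Because the test set has only 140 images, the reported accuracy corresponds to a whole number of correct predictions. The short sketch below recovers that count and adds a rough 95% interval using the normal approximation for a proportion; the interval is an illustration I am adding, not something computed in the notebook.

```python
import math

test_acc = 0.6928571462631226  # value printed by model.evaluate above
n_test = 140                   # images found in the test folder

n_correct = round(test_acc * n_test)
print("correct predictions:", n_correct, "of", n_test)

# Normal-approximation 95% interval for a proportion (an illustrative assumption).
p = n_correct / n_test
std_err = math.sqrt(p * (1 - p) / n_test)
low, high = p - 1.96 * std_err, p + 1.96 * std_err
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

This gives 97 correct out of 140, with an interval of roughly 62% to 77%. The result is clearly above chance, but the exact figure should not be over-interpreted on a test set this small.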


In [11]:
import numpy as np

test_generator.reset()
imgs, labels = next(test_generator)

preds = model.predict(imgs)
pred_classes = (preds.flatten() > 0.5).astype(int)

print("True labels:     ", labels[:10].astype(int))
print("Predicted labels:", pred_classes[:10])
print("Class indices:", test_generator.class_indices)


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 450ms/step
True labels:      [0 0 0 0 0 0 0 0 0 0]
Predicted labels: [1 0 0 0 1 0 1 0 0 0]
Class indices: {'cats': 0, 'dogs': 1}
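
The printed sample can be read as follows: with `shuffle=False`, the test generator yields files class by class in the order given by `class_indices`, so the first batch contains only cats (label 0). The sketch below repeats the 0.5 thresholding from the cell above in plain Python and scores the ten printed predictions. The probability values are made up for illustration (the notebook only prints the thresholded classes) and were chosen to reproduce the printed labels.

```python
def to_class(prob, threshold=0.5):
    """Map a sigmoid output to a class index: 1 ('dogs') if above the threshold, else 0 ('cats')."""
    return int(prob > threshold)


class_indices = {"cats": 0, "dogs": 1}  # as printed above
index_to_name = {v: k for k, v in class_indices.items()}

true_labels = [0] * 10  # the first ten test images are all cats

# Hypothetical sigmoid outputs, invented to match the printed predicted labels.
example_probs = [0.71, 0.20, 0.35, 0.10, 0.62, 0.44, 0.58, 0.05, 0.30, 0.41]

pred_classes = [to_class(p) for p in example_probs]
n_correct = sum(t == p for t, p in zip(true_labels, pred_classes))

print("Predicted labels:", pred_classes)
print("Predicted names: ", [index_to_name[c] for c in pred_classes])
print("Sample accuracy: ", n_correct / len(true_labels))
```

On this sample, 7 of the 10 cats are classified correctly, which is in line with the overall test accuracy of about 69%.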
