<a href="https://colab.research.google.com/github/belyakov23/intro-ml-projects-/blob/main/cats_dogs_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Cats vs Dogs Image Classification

This project uses the Kaggle "Cats and Dogs image classification" dataset to train a convolutional neural network that distinguishes between cat and dog images. The data consists of JPEG images organised into train and test folders with separate subfolders for cats and dogs.

I preprocess the images with Keras `ImageDataGenerator`, rescaling pixels and creating training, validation, and test generators. Then I build a small CNN with three convolution–max pooling blocks followed by dense layers and train it with binary cross-entropy loss and the Adam optimizer.
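As a reference point for reading the loss values in the training log, here is a minimal sketch of binary cross-entropy for a single example. The helper name `binary_cross_entropy` and the clipping constant `eps` are illustrative assumptions, not the Keras implementation. The point is that a model which outputs 0.5 for every image scores ln 2 ≈ 0.693, so a loss near that value means the network is still guessing.

```python
import math

def binary_cross_entropy(y_true: int, p: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one example (hypothetical helper, not Keras code)."""
    # Clip p away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# A constant prediction of 0.5 gives ln(2), whatever the true label is.
chance_loss = binary_cross_entropy(1, 0.5)
print(round(chance_loss, 4))  # 0.6931

# A confident prediction is cheap when right and expensive when wrong.
confident_right = binary_cross_entropy(1, 0.9)
confident_wrong = binary_cross_entropy(0, 0.9)
print(confident_right < chance_loss < confident_wrong)  # True
```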

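The final model reaches about 89% training accuracy and around 67% validation accuracy after 10 epochs, with test accuracy close to the validation score (about 66%). The gap of roughly 22 percentage points between training and validation accuracy indicates overfitting, which is unsurprising for a network trained on only 447 images. Even so, the result is a basic but functional image classifier on a real-world, slightly noisy cats-and-dogs dataset.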



In [10]:
!pip install -q kagglehub

import kagglehub
import os

# Download latest version of the dataset
path = kagglehub.dataset_download("samuelcortinhas/cats-and-dogs-image-classification")

print("Path to dataset files:", path)
print(os.listdir(path))


Using Colab cache for faster access to the 'cats-and-dogs-image-classification' dataset.
Path to dataset files: /kaggle/input/cats-and-dogs-image-classification
['test', 'train']


In [11]:
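# This directory is empty in this session (see the output below);
# the downloaded images live under `path` from the previous cell.
for root, dirs, files in os.walk("/content/cats_dogs_data"):
    print(root, "->", len(files), "files")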


/content/cats_dogs_data -> 0 files


In [12]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models


In [13]:
train_dir = os.path.join(path, "train")
test_dir = os.path.join(path, "test")

train_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2
)

test_datagen = ImageDataGenerator(rescale=1./255)

img_size = (150, 150)
batch_size = 32

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="binary",
    subset="training"
)

val_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="binary",
    subset="validation"
)

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="binary",
    shuffle=False
)


Found 447 images belonging to 2 classes.
Found 110 images belonging to 2 classes.
Found 140 images belonging to 2 classes.
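The image counts above determine how many batches each generator yields per pass. The sketch below does that arithmetic with the counts and batch size from this notebook; the helper name `steps_per_epoch` is an illustrative assumption. The results line up with the `14/14` and `5/5` step counters in the training and evaluation output.

```python
import math

batch_size = 32
# Image counts reported by flow_from_directory in this notebook.
n_train, n_val, n_test = 447, 110, 140

def steps_per_epoch(n_images: int, batch_size: int) -> int:
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(n_images / batch_size)

train_steps = steps_per_epoch(n_train, batch_size)
val_steps = steps_per_epoch(n_val, batch_size)
test_steps = steps_per_epoch(n_test, batch_size)
print(train_steps, val_steps, test_steps)  # 14 4 5

# Share of the train folder that ended up in the validation subset.
val_fraction = n_val / (n_train + n_val)
print(round(val_fraction, 3))  # 0.197
```

The validation share comes out slightly under the requested 0.2, which is consistent with the split being rounded down to whole images.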


In [14]:
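model = models.Sequential([
    # Declare the input shape explicitly: 150x150 RGB images.
    layers.Input(shape=(150, 150, 3)),
    # Three convolution + max-pooling blocks with increasing filter counts.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classifier head: one hidden dense layer, then a sigmoid unit for P(dog).
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])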

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

model.summary()
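Since the summary table is not shown here, the following sketch traces the feature-map sizes and parameter counts by hand. It assumes the Keras defaults for these layers (unpadded 3x3 convolutions with stride 1, and 2x2 pooling with stride 2); the helper names `conv_out` and `pool_out` are illustrative.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size: int, pool: int = 2) -> int:
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool

size, channels = 150, 3
total_params = 0
for filters in (32, 64, 128):
    # One 3x3 kernel per (input channel, filter) pair, plus one bias per filter.
    total_params += 3 * 3 * channels * filters + filters
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after block with {filters} filters: {size}x{size}x{channels}")

flat = size * size * channels          # input width of the first dense layer
dense_params = flat * 128 + 128        # Dense(128): weights plus biases
output_params = 128 * 1 + 1            # Dense(1): weights plus bias
total_params += dense_params + output_params

print("flattened features:", flat)         # 36992
print("total parameters:", total_params)   # 4828481
```

Nearly all of the roughly 4.8 million parameters sit in the first dense layer, which is a lot of capacity relative to a few hundred training images.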


In [15]:
history = model.fit(
    train_generator,
    epochs=10,
    validation_data=val_generator
)


Epoch 1/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 30s 2s/step - accuracy: 0.5036 - loss: 1.0948 - val_accuracy: 0.5000 - val_loss: 0.6930
Epoch 2/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 26s 2s/step - accuracy: 0.6006 - loss: 0.6926 - val_accuracy: 0.5000 - val_loss: 0.6938
Epoch 3/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 24s 2s/step - accuracy: 0.5232 - loss: 0.6925 - val_accuracy: 0.4909 - val_loss: 0.6933
Epoch 4/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 41s 2s/step - accuracy: 0.5411 - loss: 0.6863 - val_accuracy: 0.5182 - val_loss: 0.7032
Epoch 5/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 41s 2s/step - accuracy: 0.6053 - loss: 0.6500 - val_accuracy: 0.5727 - val_loss: 0.6869
Epoch 6/10
14/14 ━━━━━━━━━━━━━━━━━━━━ 25s 2s/step - accuracy: 0.5749 - loss: 0.6284 - val_accuracy: 0.6182 - val_loss: 0.6282
Epoch 7/10
14/14 ━━━━━━━━━━

In [16]:
test_loss, test_acc = model.evaluate(test_generator)
print("Test accuracy:", test_acc)


5/5 ━━━━━━━━━━━━━━━━━━━━ 2s 431ms/step - accuracy: 0.7023 - loss: 0.6707
Test accuracy: 0.6642857193946838
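To put the test score in context, the sketch below converts it back into a count of correct predictions and attaches a rough uncertainty. The interval uses a normal approximation to the binomial and assumes the 140 test images are independent samples; both are simplifying assumptions, so treat the range as indicative only.

```python
import math

n_test = 140                      # test images found by flow_from_directory
test_acc = 0.6642857193946838     # value printed by model.evaluate above

# Accuracy times the number of images gives the number of correct predictions.
correct = round(test_acc * n_test)
print(correct, "of", n_test, "correct")  # 93 of 140 correct

# Standard error of a proportion, then an approximate 95% interval.
std_err = math.sqrt(test_acc * (1 - test_acc) / n_test)
low, high = test_acc - 1.96 * std_err, test_acc + 1.96 * std_err
print(f"approximate 95% interval: {low:.2f} to {high:.2f}")
```

With a test set this small the interval spans roughly eight points either side, so differences of a few points between the validation and test scores are within noise.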


In [17]:
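import numpy as np

# Take the first test batch. With shuffle=False the generator yields files in
# class order, so this batch starts with images from the first class (cats).
test_generator.reset()
imgs, labels = next(test_generator)

# Threshold the sigmoid outputs at 0.5 to get class indices (0 = cats, 1 = dogs).
preds = model.predict(imgs)
pred_classes = (preds.flatten() > 0.5).astype(int)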

print("True labels:     ", labels[:10].astype(int))
print("Predicted labels:", pred_classes[:10])
print("Class indices:", test_generator.class_indices)


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 471ms/step
True labels:      [0 0 0 0 0 0 0 0 0 0]
Predicted labels: [1 0 0 0 0 0 0 0 0 1]
Class indices: {'cats': 0, 'dogs': 1}
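The printed arrays are easier to read once they are mapped back to class names and tallied. The sketch below does this for the ten labels shown above, copied in by hand; it is a small illustration, not an evaluation of the full test set. All ten true labels are cats, so it says nothing about how the model handles dog images.

```python
from collections import Counter

# Values copied from the output above.
true_labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pred_labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
class_indices = {"cats": 0, "dogs": 1}

# Invert the mapping so that indices can be shown as class names.
index_to_class = {index: name for name, index in class_indices.items()}

# Count (true class, predicted class) pairs: a tiny confusion table.
counts = Counter(
    (index_to_class[t], index_to_class[p])
    for t, p in zip(true_labels, pred_labels)
)
for (true_name, pred_name), n in sorted(counts.items()):
    print(f"true={true_name:5s} predicted={pred_name:5s} count={n}")

accuracy = sum(t == p for t, p in zip(true_labels, pred_labels)) / len(true_labels)
print("accuracy on these ten images:", accuracy)  # 0.8
```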
