# Cats vs Dogs Binary Classification - Model Training

This notebook demonstrates training a binary image classifier (Cat vs Dog) using TensorFlow/Keras with Weights & Biases (W&B) integration for experiment tracking.

## Overview

- **Model Type**: Convolutional Neural Network (CNN)
- **Task**: Binary Classification (Cat vs Dog)
- **Input Size**: 128x128 RGB images
- **Output**: Single sigmoid value (0=Cat, 1=Dog)
- **Dataset**: Cats and Dogs Light from Zenodo

## Model Architecture

```
Input (128, 128, 3)
    ↓
Conv2D(32) + MaxPool → (64, 64, 32)
    ↓
Conv2D(64) + MaxPool → (32, 32, 64)
    ↓
Conv2D(128) + MaxPool → (16, 16, 128)
    ↓
Flatten → Dense(128) → Dropout(0.3)
    ↓
Dense(1, sigmoid) → Binary Output
```
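The shapes in the diagram can be checked by hand. The following is a minimal sketch that traces the feature-map sizes and counts trainable parameters, assuming 3x3 kernels with "same" padding and 2x2 max pooling as used in `build_cnn` later in this notebook. The helper names `conv_params` and `dense_params` are illustrative and not part of Keras.

```python
# Trace feature-map shapes and parameter counts for the 3-block CNN.
# Assumes 3x3 kernels, "same" padding, and 2x2 max pooling.

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

height, width, channels = 128, 128, 3
total = 0
shapes = []
for filters in (32, 64, 128):
    total += conv_params(3, channels, filters)  # "same" padding keeps H and W
    # 2x2 pooling halves H and W; the conv sets the channel count
    height, width, channels = height // 2, width // 2, filters
    shapes.append((height, width, channels))

flat = height * width * channels   # size of the Flatten output
total += dense_params(flat, 128)   # Dense(128)
total += dense_params(128, 1)      # Dense(1, sigmoid)

print(shapes)  # [(64, 64, 32), (32, 32, 64), (16, 16, 128)]
print(flat)    # 32768
print(total)   # 4287809
```

Almost all of the parameters sit in the first dense layer, because it connects all 32,768 flattened features to 128 units.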

## Training Features

- **Data Augmentation**: Rotation, shifts, shear, zoom, horizontal flip
- **Normalization**: Pixel values scaled to [0, 1]
- **Optimizer**: Adam (lr=1e-3)
- **Loss**: Binary Cross-Entropy
- **Callbacks**: 
  - W&B Metrics Logger
  - ReduceLROnPlateau
  - EarlyStopping (patience=20)

## Export Format

The final model is exported as TensorFlow SavedModel format, compatible with TensorFlow Serving for production deployment.

---

In [2]:
# ----------------------------
# 0. Install required packages
# ----------------------------
#!pip install tensorflow wandb pillow --quiet

In [3]:
# ----------------------------
# 1. Import libraries
# ----------------------------
import json
import os
import shutil
import zipfile

import numpy as np
from PIL import Image
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import wandb
from wandb.integration.keras import WandbMetricsLogger, WandbModelCheckpoint, WandbCallback


In [4]:
# ----------------------------
# 2. Initialize W&B
# ----------------------------
wandb.login()
wandb.init(project="cats_vs_dogs", name="run_3")

[34m[1mwandb[0m: Currently logged in as: [33mthomas-leonharts[0m ([33mthomas-leonharts-fh-technikum-wien[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


## Initialize Weights & Biases

W&B provides experiment tracking, metric logging, and model versioning. Make sure you have a W&B account and API key ready.

In [5]:
# ----------------------------
# 3. Download ZIP to working directory
# ----------------------------
work_dir = os.getcwd()
url = "https://zenodo.org/record/5226945/files/cats_dogs_light.zip?download=1"

zip_path = tf.keras.utils.get_file(
    fname="cats_dogs_light.zip",
    origin=url,
    extract=False,
    cache_dir=work_dir,
    cache_subdir=""
)

print(f"ZIP downloaded to: {zip_path}")

# ----------------------------
# 4. Extract ZIP
# ----------------------------
extract_dir = "../data"
with zipfile.ZipFile(zip_path, "r") as zip_ref:
    zip_ref.extractall(extract_dir)
print(f"ZIP extracted to: {extract_dir}")

# copy content from extracted folder to extract_dir and remove extracted folder
extracted_folder = os.path.join(extract_dir, "cats_dogs_light")
for item in os.listdir(extracted_folder):
    s = os.path.join(extracted_folder, item)
    d = os.path.join(extract_dir, item)
    if os.path.isdir(s):
        shutil.copytree(s, d, dirs_exist_ok=True)
    else:
        shutil.copy2(s, d)
shutil.rmtree(extracted_folder)
print(f"Content moved from {extracted_folder} to {extract_dir} and original folder removed.")


# remove the zip file to save space
os.remove(zip_path)

Downloading data from https://zenodo.org/record/5226945/files/cats_dogs_light.zip?download=1
32608921/32608921 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
ZIP downloaded to: c:\Users\Besitzer\Documents\GitHub\wandb-model-serving\cats_vs_dogs\notebook\cats_dogs_light.zip
ZIP extracted to: ../data
Content moved from ../data\cats_dogs_light to ../data and original folder removed.


## Download and Prepare Dataset

The dataset is downloaded from Zenodo (cats_dogs_light.zip), extracted to `../data`, and organized into `train` and `test` directories, each with `cats` and `dogs` subfolders.

In [6]:
# ----------------------------
# 5. Within train and test folders create 'cats' and 'dogs' subfolders and move images accordingly
# ----------------------------

base_dirs = ['train', 'test']
categories = ['cats', 'dogs']

for base_dir in base_dirs:
    base_path = os.path.join(extract_dir, base_dir)
    for category in categories:
        category_path = os.path.join(base_path, category)
        os.makedirs(category_path, exist_ok=True)

    for filename in os.listdir(base_path):
        if filename.startswith('cat') and filename.endswith(('.jpg', '.jpeg', '.png')):
            shutil.move(os.path.join(base_path, filename), os.path.join(base_path, 'cats', filename))
        elif filename.startswith('dog') and filename.endswith(('.jpg', '.jpeg', '.png')):
            shutil.move(os.path.join(base_path, filename), os.path.join(base_path, 'dogs', filename))
print("Images moved to respective category folders.")

Images moved to respective category folders.


In [7]:
# ----------------------------
# 6. Image generators
# ----------------------------
img_size = (128, 128)
batch_size = 32

train_dir = os.path.join(extract_dir, "train")
test_dir = os.path.join(extract_dir, "test")

# Training generator: rescale to [0, 1] and apply random augmentation
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Validation generator: rescale only, no augmentation
val_datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)

train_gen = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode='binary'
)

val_gen = val_datagen.flow_from_directory(
    test_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode='binary'
)

Found 1000 images belonging to 2 classes.
Found 400 images belonging to 2 classes.


## Create Data Generators

Data generators handle:
- **Training augmentation**: Rotation, shifts, shear, zoom, and horizontal flips to improve generalization
- **Normalization**: Rescaling pixel values to [0, 1]
- **Batching**: Load images in batches of 32
- **Binary labels**: 0 for cats, 1 for dogs
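
The label mapping and batch counts follow from simple rules. The sketch below reproduces them in plain Python, assuming class labels are assigned in sorted order of the subfolder names and that a final partial batch counts as one step. The helper names are illustrative and not part of Keras.

```python
import math

def class_indices(subfolders):
    """Map class folder names to integer labels in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(subfolders))}

def steps_per_epoch(num_images, batch_size):
    """Number of batches per epoch, counting a final partial batch."""
    return math.ceil(num_images / batch_size)

print(class_indices(["dogs", "cats"]))  # {'cats': 0, 'dogs': 1}
print(steps_per_epoch(1000, 32))        # 32 training steps
print(steps_per_epoch(400, 32))         # 13 validation steps
```

With 1000 training images and a batch size of 32, each epoch runs 32 steps, and the last batch holds only 8 images.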

In [8]:
print(train_gen.class_indices)  # expected: {'cats': 0, 'dogs': 1}
print(val_gen.class_indices)    # expected: {'cats': 0, 'dogs': 1}

{'cats': 0, 'dogs': 1}
{'cats': 0, 'dogs': 1}


In [9]:
# ----------------------------
# 7. CNN Model
# ----------------------------
def build_cnn(input_shape=(128,128,3)):
    model = models.Sequential()

    # Block 1
    model.add(layers.Conv2D(32, (3,3), activation='relu', padding="same", input_shape=input_shape))
    model.add(layers.MaxPooling2D(2,2))

    # Block 2
    model.add(layers.Conv2D(64, (3,3), activation='relu', padding="same"))
    model.add(layers.MaxPooling2D(2,2))

    # Block 3
    model.add(layers.Conv2D(128, (3,3), activation='relu', padding="same"))
    model.add(layers.MaxPooling2D(2,2))

    # Dense head
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation='relu'))
    model.add(layers.Dropout(0.3))   # small dropout is enough
    model.add(layers.Dense(1, activation='sigmoid'))

    return model


model = build_cnn()
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.summary()

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


## Build CNN Model

A simple 3-block CNN architecture:
- **3 Convolutional Blocks**: Progressive feature extraction (32→64→128 filters)
- **Max Pooling**: Spatial dimension reduction
- **Dense Layer**: 128 units with ReLU activation
- **Dropout**: 0.3 to prevent overfitting
- **Output**: Single sigmoid unit for binary classification

In [12]:
# ----------------------------
# 8. Callbacks and Training
# ----------------------------

from wandb.integration.keras import WandbMetricsLogger, WandbModelCheckpoint
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
import os

# Directory to save model checkpoints within W&B run
model_dir = wandb.run.dir

callbacks = [
    WandbMetricsLogger(),  # logs metrics automatically
    ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=1e-6),
    # WandbModelCheckpoint(
    #     filepath=os.path.join(model_dir, 'cats_vs_dogs_epoch-{epoch:02d}.keras'),
    #     monitor='val_loss',
    #     verbose=1,
    #     save_best_only=False,  # change to True if you only want best model
    #     save_weights_only=False,
    #     mode='auto',
    #     save_freq='epoch'
    # ),
    EarlyStopping(monitor='val_accuracy', patience=20, restore_best_weights=True)
]

# Training with generators
history = model.fit(
    train_gen,
    validation_data=val_gen,
    epochs=200,
    callbacks=callbacks
)


  self._warn_if_super_not_called()


Epoch 1/200
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m22s[0m 645ms/step - accuracy: 0.5450 - loss: 0.7402 - val_accuracy: 0.5000 - val_loss: 0.6901 - learning_rate: 0.0010
Epoch 2/200
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 211ms/step - accuracy: 0.5500 - loss: 0.6863 - val_accuracy: 0.5000 - val_loss: 0.6924 - learning_rate: 0.0010
Epoch 3/200
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 210ms/step - accuracy: 0.5590 - loss: 0.6808 - val_accuracy: 0.4975 - val_loss: 0.6913 - learning_rate: 0.0010
Epoch 4/200
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 214ms/step - accuracy: 0.5600 - loss: 0.6871 - val_accuracy: 0.5125 - val_loss: 0.6814 - learning_rate: 0.0010
Epoch 5/200
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 224ms/step - accuracy: 0.5770 - loss: 0.6706 - val_accuracy: 0.5450 - val_loss: 0.6937 - learning_rate: 0.0010
Epoch 6/200
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m

## Training with Callbacks

Training configuration:
- **Epochs**: Up to 200 (early stopping usually ends training sooner and restores the best weights)
- **W&B Metrics Logger**: Automatically logs loss, accuracy, learning rate
- **ReduceLROnPlateau**: Reduces learning rate when validation loss plateaus
- **EarlyStopping**: Stops training if validation accuracy doesn't improve for 20 epochs

All metrics are tracked in W&B for visualization and comparison.
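
To make the interaction of the two callbacks concrete, here is a simplified plain-Python simulation using the settings from the cell above (factor 0.2, LR patience 5, minimum LR 1e-6, stopping patience 20). It is a sketch of the general rules only, not the Keras implementation; the function name and the `min_delta` default are assumptions for this example.

```python
def simulate_callbacks(val_losses, val_accs, lr=1e-3, factor=0.2,
                       lr_patience=5, min_lr=1e-6, stop_patience=20,
                       min_delta=1e-4):
    """Return (stop_epoch, lr_history) under simplified callback rules."""
    best_loss, lr_wait = float("inf"), 0
    best_acc, stop_wait = float("-inf"), 0
    lr_history = []
    for epoch, (loss, acc) in enumerate(zip(val_losses, val_accs), start=1):
        # ReduceLROnPlateau-style rule on validation loss
        if loss < best_loss - min_delta:
            best_loss, lr_wait = loss, 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr = max(lr * factor, min_lr)
                lr_wait = 0
        lr_history.append(lr)
        # EarlyStopping-style rule on validation accuracy
        if acc > best_acc:
            best_acc, stop_wait = acc, 0
        else:
            stop_wait += 1
            if stop_wait >= stop_patience:
                return epoch, lr_history
    return None, lr_history

# Hypothetical run that never improves after the first epoch
losses = [0.69] + [0.70] * 24
accs = [0.50] * 25
stop_epoch, lr_history = simulate_callbacks(losses, accs)
print(stop_epoch)  # 21
```

In this made-up run the learning rate is cut after epochs 6, 11, 16, and 21, and training stops at epoch 21, which is 20 epochs after the last accuracy improvement.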

In [None]:
# ----------------------------
# 9. Save final model
# ----------------------------
# Now export
saved_model_path = "../model/cats_dogs_model/1"
model.export(saved_model_path)
print(f"SavedModel exported to: {saved_model_path}")

INFO:tensorflow:Assets written to: ../model/1\assets


Saved artifact at '../model/1'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)
Captures:
  2088687413456: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088956427920: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088956426000: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088956428112: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088673218768: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088964540560: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088954406736: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088956424080: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088956415632: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2088964539024: TensorSpec(shape=(), dtype=tf.resource, name=None)
SavedModel exported to: ../model/1


## Export Model for Serving

The trained model is exported in TensorFlow SavedModel format, which is compatible with TensorFlow Serving.

**Export Path**: `../model/cats_dogs_model/1/`

This format includes:
- `saved_model.pb`: Serialized graph definition and serving signatures
- `variables/`: Trained weights, stored as checkpoint files
- `assets/`: Additional files (if any)
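
A quick sanity check of an export directory could look like the sketch below. It only tests for the two entries that a SavedModel is expected to contain; the function name is hypothetical, and the demonstration runs against a throwaway directory rather than the real export path.

```python
from pathlib import Path
import tempfile

def missing_savedmodel_entries(export_dir):
    """Return the expected SavedModel entries that are absent from export_dir."""
    root = Path(export_dir)
    expected = {
        "saved_model.pb": (root / "saved_model.pb").is_file(),
        "variables/": (root / "variables").is_dir(),
    }
    return [name for name, present in expected.items() if not present]

# Demonstration on a temporary directory that mimics the layout
with tempfile.TemporaryDirectory() as tmp:
    version_dir = Path(tmp) / "cats_dogs_model" / "1"
    (version_dir / "variables").mkdir(parents=True)
    before = missing_savedmodel_entries(version_dir)
    (version_dir / "saved_model.pb").touch()
    after = missing_savedmodel_entries(version_dir)

print(before)  # ['saved_model.pb']
print(after)   # []
```

On a real export, calling `missing_savedmodel_entries("../model/cats_dogs_model/1")` should return an empty list.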

### Deploying to Production

To serve this model:

1. Copy the exported model to the TensorFlow Serving models directory:
   ```bash
   cp -r ../model/cats_dogs_model/1 /path/to/models/animals/2/
   ```

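2. If the server is started with a `models.config` file that pins specific versions, add the new version there

3. TensorFlow Serving polls the model directory and, under its default version policy, loads and serves the highest-numbered version it finds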
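
Once the model is being served, a client can query it over the TensorFlow Serving REST API. The sketch below builds such a request and decodes a response using only the standard library. The host, the port 8501 (the usual REST port), and the model name `animals` (taken from the copy command above) are assumptions about the deployment, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def build_predict_request(batch, host="localhost", port=8501, model_name="animals"):
    """Build a POST request for the TensorFlow Serving REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": batch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def decode_predictions(response_json, threshold=0.5):
    """Turn sigmoid outputs into labels (0 = cat, 1 = dog)."""
    return ["dog" if row[0] >= threshold else "cat"
            for row in response_json["predictions"]]

# Stand-in for one preprocessed 128x128 RGB image with values in [0, 1]
fake_image = [[[0.0, 0.0, 0.0]] * 128 for _ in range(128)]
request = build_predict_request([fake_image])
print(request.full_url)

# Example response body in the row format returned for a (batch, 1) output
labels = decode_predictions({"predictions": [[0.83], [0.12]]})
print(labels)  # ['dog', 'cat']
```

To send the request against a running server, pass it to `urllib.request.urlopen(request)` and parse the response body with `json.loads`.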

### Model Specifications

- **Input**: `(batch_size, 128, 128, 3)` - RGB images normalized to [0, 1]
- **Output**: `(batch_size, 1)` - Sigmoid probability (0=Cat, 1=Dog)
- **Preprocessing Required**: 
  - Resize to 128x128
  - Convert to RGB
  - Normalize to [0, 1]
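
The preprocessing steps above can be sketched with Pillow and NumPy as follows. The function name `preprocess` is illustrative, and the resampling filter is left at Pillow's default, which may differ from the interpolation used by the Keras generators during training.

```python
import numpy as np
from PIL import Image

def preprocess(image, size=(128, 128)):
    """Convert a PIL image to a (1, 128, 128, 3) float32 batch in [0, 1]."""
    rgb = image.convert("RGB").resize(size)            # force 3 channels, then resize
    array = np.asarray(rgb, dtype=np.float32) / 255.0  # scale pixel values to [0, 1]
    return array[np.newaxis, ...]                      # add the batch dimension

# Demonstration on a synthetic grayscale image instead of a real photo
demo = Image.new("L", (300, 200), color=255)
batch = preprocess(demo)
print(batch.shape)  # (1, 128, 128, 3)
print(batch.dtype)  # float32
```

For a file on disk, `preprocess(Image.open(path))` would produce an input with the shape the exported model expects.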