# 🐱 Cat Breed Classification - Kaggle Training

**Optimized for Kaggle GPU Environment**

This notebook implements a complete training pipeline with:
- ✅ GlobalAveragePooling2D head (replaces Flatten)
- ✅ Two-stage training (feature extraction + fine-tuning)
- ✅ Comprehensive evaluation with confusion matrix
- ✅ Ready for Kaggle GPU (30 hours/week free)

---

## 📋 Before Running:

1. **Accelerator**: GPU P100 or T4 (Settings → Accelerator → GPU)
2. **Dataset**: Attach your cat-classification dataset
3. **Internet**: Enable if needed for packages

## 1️⃣ Setup & Imports

In [None]:
%%time
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import json
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

# TensorFlow
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.applications.resnet_v2 import preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2  # Added for L2 regularization
from tensorflow.keras.callbacks import (
    ModelCheckpoint, EarlyStopping, ReduceLROnPlateau, CSVLogger
)
from sklearn.metrics import classification_report, confusion_matrix

print(f"✓ TensorFlow version: {tf.__version__}")
print(f"✓ GPU Available: {len(tf.config.list_physical_devices('GPU'))} device(s)")

## 2️⃣ Configuration (Kaggle-optimized)

In [None]:
# ============================================================================
# KAGGLE PATHS - Update these based on your dataset
# ============================================================================

# Input data (Kaggle dataset location)
KAGGLE_INPUT = Path('/kaggle/input')

# IMPORTANT: Change this to your dataset name after uploading
DATASET_NAME = 'cat-classification-processed'  # Change to your dataset name!
DATA_ROOT = KAGGLE_INPUT / DATASET_NAME

# Data directories
TRAIN_DIR = DATA_ROOT / 'processed' / 'train'
VAL_DIR = DATA_ROOT / 'processed' / 'val'
TEST_DIR = DATA_ROOT / 'processed' / 'test'

# Output directory (Kaggle working directory)
OUTPUT_DIR = Path('/kaggle/working')
MODELS_DIR = OUTPUT_DIR / 'models'
PLOTS_DIR = OUTPUT_DIR / 'plots'
REPORTS_DIR = OUTPUT_DIR / 'reports'

# Create output directories
MODELS_DIR.mkdir(parents=True, exist_ok=True)
PLOTS_DIR.mkdir(parents=True, exist_ok=True)
REPORTS_DIR.mkdir(parents=True, exist_ok=True)

# ============================================================================
# MODEL PARAMETERS - OPTIMIZED TO REDUCE OVERFITTING
# ============================================================================

IMG_WIDTH, IMG_HEIGHT = 224, 224
IMG_SIZE = (IMG_HEIGHT, IMG_WIDTH)  # Keras target_size expects (height, width)
BATCH_SIZE = 32  # Adjust based on GPU memory

# Training parameters
EPOCHS_STAGE1 = 50  # Feature extraction
EPOCHS_STAGE2 = 30  # Fine-tuning
LEARNING_RATE_STAGE1 = 1e-4
LEARNING_RATE_STAGE2 = 1e-5

# Model architecture - REDUCED to prevent overfitting
DENSE_UNITS = 256  # Reduced from 512 to 256
DROPOUT_RATE = 0.7  # Increased from 0.5 to 0.7
UNFREEZE_LAYERS = 30  # Reduced from 50 to 30
L2_REG = 0.01  # L2 regularization strength

# Augmentation - STRONGER to reduce overfitting
AUGMENTATION_CONFIG = {
    'rotation_range': 40,  # Increased from 30
    'width_shift_range': 0.3,  # Increased from 0.2
    'height_shift_range': 0.3,  # Increased from 0.2
    'shear_range': 0.3,  # Increased from 0.2
    'zoom_range': 0.3,  # Increased from 0.2
    'horizontal_flip': True,
    'brightness_range': [0.7, 1.3],  # Added brightness augmentation
    'fill_mode': 'nearest'
}

# Random seed
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
tf.random.set_seed(RANDOM_SEED)

print("\n" + "="*80)
print("CONFIGURATION - ANTI-OVERFITTING OPTIMIZED")
print("="*80)
print(f"Data Root: {DATA_ROOT}")
print(f"Output Dir: {OUTPUT_DIR}")
print(f"Image Size: {IMG_SIZE}")
print(f"Batch Size: {BATCH_SIZE}")
print(f"Stage 1 Epochs: {EPOCHS_STAGE1} (LR: {LEARNING_RATE_STAGE1})")
print(f"Stage 2 Epochs: {EPOCHS_STAGE2} (LR: {LEARNING_RATE_STAGE2})")
print(f"\nAnti-Overfitting Settings:")
print(f"  Dense Units: {DENSE_UNITS} (reduced)")
print(f"  Dropout Rate: {DROPOUT_RATE} (increased)")
print(f"  Unfreeze Layers: {UNFREEZE_LAYERS} (reduced)")
print(f"  L2 Regularization: {L2_REG}")
print(f"  Stronger Augmentation: ✓")
print("="*80)

## 3️⃣ Verify Dataset

In [None]:
# Check if dataset exists
print("Checking dataset...\n")

if not DATA_ROOT.exists():
    print(f"❌ ERROR: Dataset not found at {DATA_ROOT}")
    print(f"\nAvailable datasets in /kaggle/input:")
    for path in KAGGLE_INPUT.iterdir():
        print(f"  - {path.name}")
    print(f"\n⚠️  Please update DATASET_NAME variable to match your dataset!")
    raise FileNotFoundError(f"Dataset not found: {DATA_ROOT}")

print(f"✓ Dataset found: {DATA_ROOT}")

# Check data directories
for dir_name, dir_path in [("Train", TRAIN_DIR), ("Val", VAL_DIR), ("Test", TEST_DIR)]:
    if dir_path.exists():
        num_breeds = len([d for d in dir_path.iterdir() if d.is_dir()])
        print(f"✓ {dir_name:5s}: {dir_path} ({num_breeds} breeds)")
    else:
        print(f"❌ {dir_name:5s}: NOT FOUND at {dir_path}")

# Determine number of classes
NUM_CLASSES = len([d for d in TRAIN_DIR.iterdir() if d.is_dir()])
print(f"\n✓ Total cat breeds: {NUM_CLASSES}")
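
The check above counts breed folders but not the images inside them. As a minimal sketch, assuming the usual one-folder-per-class layout, the hypothetical helper `count_images_per_class` below reports how many image files each class folder holds, which makes empty or very small classes easy to spot. The extension set and the demo folder names are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split directory."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTS
            )
    return counts

def build_demo_split():
    """Create a throwaway split folder with hypothetical breed names."""
    root = Path(tempfile.mkdtemp())
    for breed, n_images in [("abyssinian", 3), ("bengal", 1), ("empty_breed", 0)]:
        (root / breed).mkdir()
        for i in range(n_images):
            (root / breed / f"img_{i}.jpg").touch()
    return root

demo_counts = count_images_per_class(build_demo_split())
empty_classes = [name for name, n in demo_counts.items() if n == 0]
print(demo_counts)
print("Classes with no images:", empty_classes)
```

In the notebook itself the same helper could be called as `count_images_per_class(TRAIN_DIR)`.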

## 4️⃣ Data Generators

In [None]:
%%time

# Training generator with augmentation
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    **AUGMENTATION_CONFIG
)

# Val/Test generators without augmentation
val_test_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input
)

# Create generators
train_generator = train_datagen.flow_from_directory(
    str(TRAIN_DIR),
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=True,
    seed=RANDOM_SEED
)

validation_generator = val_test_datagen.flow_from_directory(
    str(VAL_DIR),
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=False
)

test_generator = val_test_datagen.flow_from_directory(
    str(TEST_DIR),
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=False
)

# Save class indices
class_indices = train_generator.class_indices
with open(MODELS_DIR / 'class_indices.json', 'w') as f:
    json.dump(class_indices, f, indent=4)

print(f"\n✓ Generators created:")
print(f"  Train: {train_generator.samples} images")
print(f"  Val:   {validation_generator.samples} images")
print(f"  Test:  {test_generator.samples} images")
print(f"  Classes: {NUM_CLASSES}")
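
Both generators rely on `preprocess_input` from `resnet_v2` instead of a `rescale` argument. To the best of our understanding, that function maps raw pixel values from [0, 255] to [-1, 1] via `x / 127.5 - 1`. The sketch below reproduces that arithmetic in plain Python (the function name is hypothetical) and shows why also rescaling by 1/255 would squeeze every pixel close to -1.

```python
def resnet_v2_style_scale(pixel_value):
    """Map a raw 0-255 pixel value to [-1, 1] using x / 127.5 - 1."""
    return pixel_value / 127.5 - 1.0

raw_pixels = [0.0, 127.5, 255.0]
scaled = [resnet_v2_style_scale(p) for p in raw_pixels]
print(scaled)  # [-1.0, 0.0, 1.0]

# Hypothetical mistake: rescaling to [0, 1] before preprocessing.
# Every value then lands in a narrow band just above -1.
double_scaled = [resnet_v2_style_scale(p / 255.0) for p in raw_pixels]
print(double_scaled)
```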

## 5️⃣ Build Model (GlobalAveragePooling2D)

In [None]:
%%time

def build_model(num_classes, trainable=False):
    """Build model with GlobalAveragePooling2D + strong regularization"""
    
    # Base model
    base_model = ResNet50V2(
        weights='imagenet',
        include_top=False,
        input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)  # Keras expects (height, width, channels)
    )
    
    base_model.trainable = trainable
    
    # Build model with multiple regularization techniques
    inputs = Input(shape=(IMG_HEIGHT, IMG_WIDTH, 3))  # (height, width, channels)
    x = base_model(inputs, training=False)  # keep BatchNorm layers in inference mode
    x = GlobalAveragePooling2D()(x)  # ✅ FIXED: GAP instead of Flatten
    
    # First dropout layer
    x = Dropout(0.5)(x)
    
    # Dense layer with L2 regularization
    x = Dense(DENSE_UNITS, activation='relu', 
              kernel_regularizer=l2(L2_REG))(x)
    
    # Second dropout layer (higher rate)
    x = Dropout(DROPOUT_RATE)(x)
    
    outputs = Dense(num_classes, activation='softmax')(x)
    
    model = Model(inputs, outputs)
    return model

# Build model
model = build_model(NUM_CLASSES, trainable=False)

print("\n" + "="*80)
print("MODEL SUMMARY - ANTI-OVERFITTING ARCHITECTURE")
print("="*80)
model.summary()

# Count parameters
total_params = model.count_params()
trainable_params = sum([tf.size(w).numpy() for w in model.trainable_weights])

print(f"\nTotal parameters: {total_params:,}")
print(f"Trainable: {trainable_params:,}")
print(f"Non-trainable: {total_params - trainable_params:,}")
print(f"\nRegularization applied:")
print(f"  - 2x Dropout layers (0.5 + {DROPOUT_RATE})")
print(f"  - L2 regularization ({L2_REG})")
print(f"  - Reduced Dense units ({DENSE_UNITS})")
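
To make the "GAP instead of Flatten" choice concrete, here is a small worked calculation of the first Dense layer's parameter count under each option. It assumes the ResNet50V2 backbone emits a 7x7x2048 feature map for 224x224 inputs; `dense_params` is a hypothetical helper, not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

feature_map = (7, 7, 2048)  # assumed backbone output for 224x224 input
dense_units = 256           # same value as DENSE_UNITS in the config cell

# GlobalAveragePooling2D keeps one value per channel.
gap_inputs = feature_map[2]
# Flatten keeps every spatial position of every channel.
flatten_inputs = feature_map[0] * feature_map[1] * feature_map[2]

gap_head = dense_params(gap_inputs, dense_units)
flatten_head = dense_params(flatten_inputs, dense_units)

print(f"GAP head:     {gap_head:,} parameters")
print(f"Flatten head: {flatten_head:,} parameters")
```

Under these assumptions the Flatten variant needs roughly 49 times more weights in that one layer, which is one plausible reason it overfits more easily on a small dataset.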

## 6️⃣ Compile Model - Stage 1

In [None]:
model.compile(
    optimizer=Adam(learning_rate=LEARNING_RATE_STAGE1),
    loss='categorical_crossentropy',
    metrics=['accuracy', tf.keras.metrics.TopKCategoricalAccuracy(k=5, name='top5_acc')]
)

print("✓ Model compiled for Stage 1")

## 7️⃣ Callbacks

In [None]:
def get_callbacks(stage="stage1"):
    """Get training callbacks"""
    callbacks = []
    
    # ModelCheckpoint
    checkpoint_path = MODELS_DIR / f'best_{stage}.keras'
    callbacks.append(ModelCheckpoint(
        filepath=str(checkpoint_path),
        monitor='val_accuracy',
        mode='max',
        save_best_only=True,
        verbose=1
    ))
    
    # EarlyStopping
    callbacks.append(EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True,
        verbose=1
    ))
    
    # ReduceLROnPlateau
    callbacks.append(ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.2,
        patience=5,
        min_lr=1e-7,
        verbose=1
    ))
    
    # CSVLogger
    callbacks.append(CSVLogger(
        str(REPORTS_DIR / f'training_{stage}.log'),
        separator=',',
        append=False
    ))
    
    return callbacks

callbacks_stage1 = get_callbacks("stage1")
print("✓ Callbacks configured")
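
The `ReduceLROnPlateau` settings above (`factor=0.2`, `min_lr=1e-7`) bound how far the learning rate can fall. The sketch below is a simplified model of that behaviour, not the Keras implementation: it lists the learning rates reached if every plateau triggers a reduction, starting from each stage's initial rate. `plateau_lr_schedule` is a hypothetical name.

```python
def plateau_lr_schedule(initial_lr, factor=0.2, min_lr=1e-7):
    """List the learning rates reached by successive plateau reductions."""
    schedule = [initial_lr]
    # Each reduction multiplies by `factor`, but never goes below `min_lr`.
    while schedule[-1] > min_lr:
        schedule.append(max(schedule[-1] * factor, min_lr))
    return schedule

stage1_schedule = plateau_lr_schedule(1e-4)  # LEARNING_RATE_STAGE1
stage2_schedule = plateau_lr_schedule(1e-5)  # LEARNING_RATE_STAGE2

print("Stage 1:", [f"{lr:.1e}" for lr in stage1_schedule])
print("Stage 2:", [f"{lr:.1e}" for lr in stage2_schedule])
```

Because `EarlyStopping` uses a patience of 10 and the learning-rate callback a patience of 5, a single uninterrupted stall will typically see about one reduction before training stops, so the floor is rarely reached in practice.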

## 8️⃣ STAGE 1: Feature Extraction (Fixed steps_per_epoch)

In [None]:
%%time

# Calculate steps correctly (ceiling division)
steps_per_epoch = int(np.ceil(train_generator.samples / BATCH_SIZE))
validation_steps = int(np.ceil(validation_generator.samples / BATCH_SIZE))

print("\n" + "="*80)
print("STAGE 1: FEATURE EXTRACTION (Base Frozen)")
print("="*80)
print(f"Steps per epoch: {steps_per_epoch}")
print(f"Validation steps: {validation_steps}")
print(f"Epochs: {EPOCHS_STAGE1}")
print(f"Learning rate: {LEARNING_RATE_STAGE1}\n")

# Train
history_stage1 = model.fit(
    train_generator,
    epochs=EPOCHS_STAGE1,
    steps_per_epoch=steps_per_epoch,  # ✅ FIXED
    validation_data=validation_generator,
    validation_steps=validation_steps,
    callbacks=callbacks_stage1,
    verbose=1
)

print("\n✓ Stage 1 complete!")
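
The ceiling division used for `steps_per_epoch` above matters whenever the sample count is not a multiple of the batch size. A short worked example with a hypothetical dataset of 1000 images:

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to cover every sample once."""
    return math.ceil(num_samples / batch_size)

num_samples, batch_size = 1000, 32  # hypothetical dataset size

floor_steps = num_samples // batch_size            # floor division
ceil_steps = steps_for(num_samples, batch_size)    # ceiling division
skipped = num_samples - floor_steps * batch_size   # images floor division leaves out

print(f"floor: {floor_steps} steps, {skipped} images not covered")
print(f"ceil:  {ceil_steps} steps, last batch holds {skipped} images")
```

When the sample count divides evenly (for example 960 images at batch size 32), both approaches give the same 30 steps.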

## 9️⃣ Plot Stage 1 Results

In [None]:
def plot_history(history, stage):
    """Plot training history"""
    if hasattr(history, 'history'):
        history = history.history
    
    fig, axes = plt.subplots(1, 3, figsize=(18, 5))
    
    # Loss
    axes[0].plot(history['loss'], label='Train')
    axes[0].plot(history['val_loss'], label='Val')
    axes[0].set_title(f'{stage.upper()} - Loss')
    axes[0].set_xlabel('Epoch')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # Accuracy
    axes[1].plot(history['accuracy'], label='Train')
    axes[1].plot(history['val_accuracy'], label='Val')
    axes[1].set_title(f'{stage.upper()} - Accuracy')
    axes[1].set_xlabel('Epoch')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    # Top-5
    if 'top5_acc' in history:
        axes[2].plot(history['top5_acc'], label='Train')
        axes[2].plot(history['val_top5_acc'], label='Val')
        axes[2].set_title(f'{stage.upper()} - Top-5 Accuracy')
        axes[2].set_xlabel('Epoch')
        axes[2].legend()
        axes[2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig(PLOTS_DIR / f'history_{stage}.png', dpi=150, bbox_inches='tight')
    plt.show()

plot_history(history_stage1, "stage1")

## 🔟 STAGE 2: Fine-tuning

In [None]:
print("\n" + "="*80)
print("STAGE 2: FINE-TUNING")
print("="*80)

# Unfreeze top layers
base_model = model.layers[1]
base_model.trainable = True

for layer in base_model.layers[:-UNFREEZE_LAYERS]:
    layer.trainable = False

trainable_params = sum([tf.size(w).numpy() for w in model.trainable_weights])
print(f"Trainable parameters: {trainable_params:,}")

# Recompile
model.compile(
    optimizer=Adam(learning_rate=LEARNING_RATE_STAGE2),
    loss='categorical_crossentropy',
    metrics=['accuracy', tf.keras.metrics.TopKCategoricalAccuracy(k=5, name='top5_acc')]
)

print(f"Learning rate: {LEARNING_RATE_STAGE2}\n")
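
One edge case in the `layers[:-UNFREEZE_LAYERS]` slice is worth knowing: with a value of 0, `[:-0]` is the same as `[:0]`, an empty slice, so nothing gets frozen and the whole base becomes trainable. The sketch below demonstrates this with stand-in layer objects (no TensorFlow needed) and shows a hypothetical helper, `freeze_all_but_last`, that avoids the surprise.

```python
from types import SimpleNamespace

def make_layers(n):
    """Stand-in layer objects, all trainable, like a freshly unfrozen base."""
    return [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(n)]

def freeze_all_but_last(layers, n_unfreeze):
    """Leave only the last n_unfreeze layers trainable (0 freezes everything)."""
    cutoff = max(len(layers) - n_unfreeze, 0)
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff

# Naive slicing with n_unfreeze = 0: the slice is empty, so nothing is frozen.
naive = make_layers(5)
for layer in naive[:-0]:
    layer.trainable = False
naive_trainable = sum(layer.trainable for layer in naive)

# Index-based version: 0 means no layer is trainable.
safe = make_layers(5)
freeze_all_but_last(safe, 0)
safe_trainable = sum(layer.trainable for layer in safe)

# Normal case: unfreeze the top 2 of 5 layers.
top2 = make_layers(5)
freeze_all_but_last(top2, 2)
top2_names = [layer.name for layer in top2 if layer.trainable]

print(naive_trainable, safe_trainable, top2_names)
```

Note also that the count includes every layer object in the base model, including ones without weights such as activations, so 30 layers corresponds to fewer than 30 weight-bearing layers.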

In [None]:
%%time

callbacks_stage2 = get_callbacks("stage2")

history_stage2 = model.fit(
    train_generator,
    epochs=EPOCHS_STAGE2,
    steps_per_epoch=steps_per_epoch,
    validation_data=validation_generator,
    validation_steps=validation_steps,
    callbacks=callbacks_stage2,
    verbose=1
)

print("\n✓ Stage 2 complete!")

In [None]:
plot_history(history_stage2, "stage2")

## 1️⃣1️⃣ Test Evaluation

In [None]:
%%time

print("\n" + "="*80)
print("TEST EVALUATION")
print("="*80)

test_generator.reset()
test_steps = int(np.ceil(test_generator.samples / BATCH_SIZE))

# Predictions
predictions = model.predict(test_generator, steps=test_steps, verbose=1)
predicted_classes = np.argmax(predictions, axis=1)
true_classes = test_generator.classes

# Metrics
from sklearn.metrics import accuracy_score, top_k_accuracy_score

test_acc = accuracy_score(true_classes, predicted_classes)
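# Pass labels explicitly so scoring still works if a class is missing from the test set
test_top5 = top_k_accuracy_score(
    true_classes, predictions, k=5, labels=np.arange(predictions.shape[1])
)

print(f"\n✓ Test Accuracy: {test_acc:.4f} ({test_acc*100:.2f}%)")
print(f"✓ Test Top-5 Accuracy: {test_top5:.4f} ({test_top5*100:.2f}%)")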
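
For readers unsure what the top-5 figure measures, the sketch below computes top-k accuracy by hand on a tiny made-up example: a prediction counts as correct when the true class is among the k highest-scoring classes. The function name and the scores are illustrative only; the notebook itself uses scikit-learn's `top_k_accuracy_score`.

```python
def top_k_accuracy(y_true, scores, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for label, row in zip(y_true, scores):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(y_true)

# Three samples, four classes (made-up probabilities).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: ranked first
    [0.5, 0.3, 0.1, 0.1],  # true class 1: ranked second
    [0.3, 0.2, 0.1, 0.4],  # true class 2: ranked last
]
y_true = [1, 1, 2]

top1 = top_k_accuracy(y_true, scores, k=1)
top2 = top_k_accuracy(y_true, scores, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```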

## 1️⃣2️⃣ Confusion Matrix

In [None]:
# Classification report
class_names = list(train_generator.class_indices.keys())
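report = classification_report(
    true_classes, predicted_classes,
    labels=list(range(len(class_names))),  # keeps target_names aligned if a class is absent from the test set
    target_names=class_names,
    zero_division=0
)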

with open(REPORTS_DIR / 'classification_report.txt', 'w') as f:
    f.write(report)

print("Classification Report (first 20 lines):")
print("\n".join(report.split('\n')[:20]))
print("...")
print(f"\n✓ Full report saved to {REPORTS_DIR / 'classification_report.txt'}")

In [None]:
# Confusion matrix
cm = confusion_matrix(true_classes, predicted_classes)

plt.figure(figsize=(20, 18))
sns.heatmap(cm, annot=False, fmt='d', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names)
plt.title('Confusion Matrix', fontsize=16, fontweight='bold', pad=20)
plt.ylabel('True Label', fontsize=12)
plt.xlabel('Predicted Label', fontsize=12)
plt.xticks(rotation=90)
plt.yticks(rotation=0)
plt.tight_layout()
plt.savefig(PLOTS_DIR / 'confusion_matrix.png', dpi=150, bbox_inches='tight')
plt.show()

print(f"✓ Confusion matrix saved")
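
With many breeds and `annot=False`, the heatmap can be hard to read cell by cell. As a possible complement, the sketch below lists the largest off-diagonal entries of a confusion matrix, that is, the breed pairs most often mixed up. `most_confused` is a hypothetical helper and the 3x3 matrix and breed names are made-up demo values; in the notebook, `cm.tolist()` and `class_names` could be passed instead.

```python
def most_confused(cm, class_names, top_n=3):
    """Return (count, true_class, predicted_class) for the largest off-diagonal cells."""
    pairs = [
        (cm[i][j], class_names[i], class_names[j])
        for i in range(len(cm))
        for j in range(len(cm))
        if i != j and cm[i][j] > 0
    ]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:top_n]

# Rows are true classes, columns are predicted classes (demo values).
demo_cm = [
    [8, 2, 0],
    [1, 7, 3],
    [0, 5, 5],
]
demo_names = ["bengal", "persian", "siamese"]

worst_pairs = most_confused(demo_cm, demo_names, top_n=2)
for count, true_name, pred_name in worst_pairs:
    print(f"{true_name} predicted as {pred_name}: {count} times")
```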

## 1️⃣3️⃣ Save Final Model & Summary

In [None]:
# Save final model
final_model_path = MODELS_DIR / 'cat_breed_classifier_final.keras'
model.save(str(final_model_path))
print(f"✓ Final model saved: {final_model_path}")

# Training summary
summary = {
    'timestamp': datetime.now().isoformat(),
    'model': 'ResNet50V2',
    'num_classes': NUM_CLASSES,
    'total_parameters': int(model.count_params()),
    'stage1': {
        'epochs': len(history_stage1.history['loss']),
        'best_val_acc': float(max(history_stage1.history['val_accuracy'])),
        'best_val_loss': float(min(history_stage1.history['val_loss']))
    },
    'stage2': {
        'epochs': len(history_stage2.history['loss']),
        'best_val_acc': float(max(history_stage2.history['val_accuracy'])),
        'best_val_loss': float(min(history_stage2.history['val_loss']))
    },
    'test': {
        'accuracy': float(test_acc),
        'top5_accuracy': float(test_top5)
    }
}

with open(REPORTS_DIR / 'training_summary.json', 'w') as f:
    json.dump(summary, f, indent=4)

print(f"✓ Summary saved: {REPORTS_DIR / 'training_summary.json'}")

print("\n" + "="*80)
print("TRAINING COMPLETE! 🎉")
print("="*80)
print(json.dumps(summary, indent=2))
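
The saved `class_indices.json` maps breed name to class index, while inference needs the reverse direction. A minimal sketch of loading and inverting that file follows; `load_index_to_class` is a hypothetical helper, and the demo file with three breeds is written to a temporary folder purely for illustration.

```python
import json
import tempfile
from pathlib import Path

def load_index_to_class(json_path):
    """Load a {class_name: index} JSON file and return {index: class_name}."""
    with open(json_path) as f:
        class_indices = json.load(f)
    return {index: name for name, index in class_indices.items()}

# Demo file standing in for models/class_indices.json (made-up breeds).
demo_path = Path(tempfile.mkdtemp()) / "class_indices.json"
demo_path.write_text(json.dumps({"bengal": 0, "persian": 1, "siamese": 2}))

index_to_class = load_index_to_class(demo_path)
predicted_index = 2  # e.g. the argmax of one prediction row
print(index_to_class[predicted_index])  # siamese
```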

## 1️⃣4️⃣ Download Instructions

**All outputs are in `/kaggle/working/`:**

```
models/
  ├── cat_breed_classifier_final.keras  ← Final model
  ├── best_stage1.keras                 ← Best from stage 1
  ├── best_stage2.keras                 ← Best from stage 2
  └── class_indices.json                ← Class mapping

plots/
  ├── history_stage1.png
  ├── history_stage2.png
  └── confusion_matrix.png

reports/
  ├── training_summary.json
  ├── classification_report.txt
  ├── training_stage1.log
  └── training_stage2.log
```

**To download:**
1. Click "Save Version" → "Save & Run All"
2. After completion: Output → Download output files