# Cats vs Dogs Model Training - Google Colab

This notebook trains the Cats vs Dogs classification model on Google Colab with GPU acceleration, then versions the resulting model with DVC.

## Prerequisites
- Google account
- Access to your GitHub repository
- DVC remote credentials (Backblaze B2 or S3)

## Steps:
1. Setup Colab environment and GPU
2. Clone your repository
3. Install dependencies
4. Configure DVC
5. Pull data from DVC
6. Train the model
7. Save model with DVC
8. Push to DVC remote and GitHub

## 1. Check GPU Availability

Make sure you've enabled GPU in Colab:
- Go to Runtime > Change runtime type
- Select GPU (T4, A100, or V100)

In [None]:
# Check if GPU is available
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
print("GPU available:", gpus)
# is_built_with_cuda() reports how TensorFlow was compiled, not whether a GPU is attached
print("Built with CUDA:", tf.test.is_built_with_cuda())

if gpus:
    print("\n✅ GPU is available! Training will be much faster.")
else:
    print("\n⚠️ No GPU detected. Please enable GPU in Runtime > Change runtime type")

## 2. Clone Your Repository

Clone your GitHub repository to access the training code.

In [None]:
# Clone the repository
import os

# Set your GitHub username and repo name
GITHUB_USERNAME = "bigalex95"  # Change this to your username
REPO_NAME = "are-you-a-cat-mlops-pipeline"
REPO_URL = f"https://github.com/{GITHUB_USERNAME}/{REPO_NAME}.git"

# Remove if already exists
if os.path.exists(REPO_NAME):
    !rm -rf {REPO_NAME}

# Clone the repository
!git clone {REPO_URL}

# Change to repository directory
%cd {REPO_NAME}

## 3. Install Dependencies

Install all required packages.

In [None]:
# Install required packages
!pip install -q tensorflow tensorflow-datasets
!pip install -q dvc boto3 s3fs
!pip install -q numpy pillow matplotlib seaborn scikit-learn

print("\n✅ All dependencies installed!")
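
Because `pip install -q` suppresses most output, it can be useful to confirm what actually ended up in the environment. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is an assumption made for illustration, and the package list simply mirrors some of the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names mirror the pip commands in this section.
for name, ver in installed_versions(["tensorflow", "dvc", "boto3", "s3fs", "numpy"]).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

A `NOT INSTALLED` entry suggests re-running the install cell without `-q` to see the underlying error.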

## 4. Configure DVC Remote

Set up your DVC remote credentials. You'll need your Backblaze B2 (or S3) credentials.

**Security Note:** Prefer Colab secrets for credentials. If you enter them manually instead, `getpass` hides the input, so the values are not stored in the notebook.

In [None]:
import os
from getpass import getpass

# Option 1: Use Colab secrets (recommended)
try:
    from google.colab import userdata
    AWS_ACCESS_KEY_ID = userdata.get('AWS_ACCESS_KEY_ID')
    AWS_SECRET_ACCESS_KEY = userdata.get('AWS_SECRET_ACCESS_KEY')
    print("✅ Using credentials from Colab secrets")
except Exception:  # secret missing, access not granted, or not running on Colab
    # Option 2: Enter credentials manually
    print("Enter your DVC remote credentials (Backblaze B2 or S3):")
    AWS_ACCESS_KEY_ID = getpass("Access Key ID: ")
    AWS_SECRET_ACCESS_KEY = getpass("Secret Access Key: ")

# Set environment variables for DVC
os.environ['AWS_ACCESS_KEY_ID'] = AWS_ACCESS_KEY_ID
os.environ['AWS_SECRET_ACCESS_KEY'] = AWS_SECRET_ACCESS_KEY

print("\n✅ DVC credentials configured!")
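
If a secret is blank, or Enter is pressed at the prompt without typing anything, the cell above still reports success and the problem only surfaces later when DVC tries to reach the remote. A small sketch of an early check follows; `missing_credentials` is a hypothetical helper, not part of DVC or Colab.

```python
import os


def missing_credentials(env, required=("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")):
    """Return the names of required variables that are unset or blank in env."""
    return [name for name in required if not (env.get(name) or "").strip()]


missing = missing_credentials(os.environ)
if missing:
    print("Missing or empty credentials:", ", ".join(missing))
else:
    print("Both credential variables are set.")
```

The check only confirms that the variables are non-empty; it cannot tell whether the remote will accept them.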

## 5. Pull Data from DVC

Download the processed training data from your DVC remote.

In [None]:
# Pull the processed data tracked by data/processed.dvc (the cloned repo is already a DVC project)
!dvc pull data/processed.dvc

# Verify data is downloaded
!ls -lh data/processed/

print("\n✅ Training data downloaded!")
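
The success message above prints even if `dvc pull` failed, so an explicit file check can catch problems before the loading step. This is a minimal sketch; the file names come from the loading cell in the next section, and `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

# File names taken from the data-loading cell of this notebook.
EXPECTED_FILES = [
    "train_images.npy", "train_labels.npy",
    "val_images.npy", "val_labels.npy",
    "test_images.npy", "test_labels.npy",
]


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_files("data/processed")
print("Missing files:", missing if missing else "none")
```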

## 6. Load and Verify Data

Load the preprocessed data and check its shape.

In [None]:
import numpy as np
import sys

# Add src to Python path
sys.path.append('src')

# Load processed data
print("Loading processed data...")
X_train = np.load('data/processed/train_images.npy')
y_train = np.load('data/processed/train_labels.npy')
X_val = np.load('data/processed/val_images.npy')
y_val = np.load('data/processed/val_labels.npy')
X_test = np.load('data/processed/test_images.npy')
y_test = np.load('data/processed/test_labels.npy')

print(f"\nData shapes:")
print(f"  Training:   {X_train.shape} images, {y_train.shape} labels")
print(f"  Validation: {X_val.shape} images, {y_val.shape} labels")
print(f"  Test:       {X_test.shape} images, {y_test.shape} labels")

print(f"\nClass distribution:")
print(f"  Training:   {np.sum(y_train == 0)} cats, {np.sum(y_train == 1)} dogs")
print(f"  Validation: {np.sum(y_val == 0)} cats, {np.sum(y_val == 1)} dogs")
print(f"  Test:       {np.sum(y_test == 0)} cats, {np.sum(y_test == 1)} dogs")

print("\n✅ Data loaded successfully!")
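
The raw counts above are easier to compare across splits as fractions, since a strongly imbalanced split would make plain accuracy misleading. A minimal sketch, assuming labels are integers with 0 for cats and 1 for dogs as in the cell above; `class_fractions` is a hypothetical helper.

```python
from collections import Counter


def class_fractions(labels):
    """Return {label: fraction of samples} for an iterable of integer labels."""
    counts = Counter(int(label) for label in labels)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}


# Toy labels for illustration; in the notebook, pass y_train instead
# (flatten it first if the labels are stored with shape (N, 1)).
print(class_fractions([0, 1, 1, 0, 1]))  # {0: 0.4, 1: 0.6}
```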

## 7. Visualize Sample Images

Let's look at some sample images to verify the data.

In [None]:
import matplotlib.pyplot as plt

# Visualize some training images
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
axes = axes.ravel()

for i in range(10):
    axes[i].imshow(X_train[i])
    label = "Dog" if y_train[i] == 1 else "Cat"
    axes[i].set_title(f"{label}")
    axes[i].axis('off')

plt.tight_layout()
plt.show()

print("Sample images displayed!")

## 8. Build and Compile Model

Create the CNN model for training.

In [None]:
from model_train import build_cnn_model, compile_model

# Build the model
print("Building model...")
model = build_cnn_model(
    input_shape=(150, 150, 3),
    num_classes=1
)

# Display model architecture
model.summary()

# Compile the model
print("\nCompiling model...")
model = compile_model(
    model,
    learning_rate=0.001,
    optimizer='adam',
    loss='binary_crossentropy'
)

print("\n✅ Model built and compiled!")

## 9. Set Up Training Callbacks

Configure callbacks for better training (early stopping, model checkpointing, etc.).

In [None]:
from model_train import create_callbacks

# Create callbacks
callbacks = create_callbacks(
    model_save_path='models/best_model_colab.keras',
    monitor='val_loss',
    patience=5
)

print("✅ Training callbacks configured!")
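
To make the `patience` argument concrete, here is a simplified, framework-free sketch of a patience-based stopping rule. It is an illustration only: what `create_callbacks` actually configures lives in the repository's `model_train` module, and real callbacks typically offer extra options (such as a minimum improvement threshold) that are ignored here.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (best_epoch, stopped_epoch) for a patience-based stopping rule.

    Epochs are 0-indexed. stopped_epoch is None if training runs to the end.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, None


# Toy validation losses, not real training output.
losses = [0.70, 0.55, 0.50, 0.52, 0.51, 0.53]
print(early_stopping_epoch(losses, patience=2))  # (2, 4)
```

With `patience=5`, as configured above, the same rule would tolerate five consecutive epochs without a new best `val_loss` before stopping.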

## 10. Train the Model

Now let's train the model! This may take 15-30 minutes depending on your GPU.

In [None]:
from model_train import train_model
import time

# Training configuration
EPOCHS = 20
BATCH_SIZE = 32

print(f"Starting training...")
print(f"  Epochs: {EPOCHS}")
print(f"  Batch size: {BATCH_SIZE}")
print(f"  Training samples: {len(X_train)}")
print(f"  Validation samples: {len(X_val)}")
print("\n" + "="*80)

start_time = time.time()

# Train the model
history = train_model(
    model,
    train_data=(X_train, y_train),
    val_data=(X_val, y_val),
    epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    callbacks=callbacks,
    verbose=1
)

training_time = time.time() - start_time
print("\n" + "="*80)
print(f"✅ Training completed in {training_time/60:.2f} minutes!")

## 11. Visualize Training History

Plot the training and validation metrics to see how the model performed.

In [None]:
import matplotlib.pyplot as plt

# Plot training history
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Plot accuracy
axes[0].plot(history.history['accuracy'], label='Training Accuracy')
axes[0].plot(history.history['val_accuracy'], label='Validation Accuracy')
axes[0].set_title('Model Accuracy')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Accuracy')
axes[0].legend()
axes[0].grid(True)

# Plot loss
axes[1].plot(history.history['loss'], label='Training Loss')
axes[1].plot(history.history['val_loss'], label='Validation Loss')
axes[1].set_title('Model Loss')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Loss')
axes[1].legend()
axes[1].grid(True)

plt.tight_layout()
plt.show()

# Print metrics from the last epoch that ran (early stopping may end training before EPOCHS)
print("\nLast-Epoch Training Metrics:")
print(f"  Training Accuracy: {history.history['accuracy'][-1]:.4f}")
print(f"  Validation Accuracy: {history.history['val_accuracy'][-1]:.4f}")
print(f"  Training Loss: {history.history['loss'][-1]:.4f}")
print(f"  Validation Loss: {history.history['val_loss'][-1]:.4f}")
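
When early stopping is active, the last epoch is not necessarily the best one, so it can help to report the epoch with the lowest validation loss as well. A minimal sketch operating on a plain dictionary shaped like `history.history`; `best_epoch_summary` is a hypothetical helper and the numbers are toy values.

```python
def best_epoch_summary(history_dict, monitor="val_loss"):
    """Return the 1-indexed epoch with the lowest monitored value and its metrics."""
    values = history_dict[monitor]
    best_index = min(range(len(values)), key=values.__getitem__)
    metrics = {name: series[best_index] for name, series in history_dict.items()}
    return best_index + 1, metrics


# Toy history for illustration; in the notebook, pass history.history.
toy_history = {
    "loss": [0.69, 0.60, 0.52, 0.45],
    "val_loss": [0.68, 0.61, 0.58, 0.63],
    "accuracy": [0.55, 0.66, 0.74, 0.79],
    "val_accuracy": [0.56, 0.65, 0.70, 0.68],
}
epoch, metrics = best_epoch_summary(toy_history)
print(epoch, metrics["val_accuracy"])  # 3 0.7
```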

## 12. Evaluate on Test Set

Test the model on unseen data.

In [None]:
# Evaluate on test set
print("Evaluating model on test set...")
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=1)

print(f"\n" + "="*80)
print(f"Test Results:")
print(f"  Test Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")
print(f"  Test Loss: {test_loss:.4f}")
print("="*80)
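
Accuracy alone does not show which class the errors fall on. The sketch below counts true and false positives and negatives at a 0.5 threshold, treating class 1 (dog) as positive; it assumes the model outputs one probability per image, and the helper names are hypothetical.

```python
def binary_confusion(y_true, probabilities, threshold=0.5):
    """Count TP, FP, TN, FN with class 1 (dog) as the positive class."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, probabilities):
        predicted = 1 if prob > threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def precision_recall(counts):
    """Precision and recall for the positive class; 0.0 when undefined."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy values for illustration; in the notebook, use y_test and
# model.predict(X_test).ravel().
counts = binary_confusion([1, 0, 1, 1, 0], [0.9, 0.6, 0.4, 0.8, 0.1])
print(counts)  # {'tp': 2, 'fp': 1, 'tn': 1, 'fn': 1}
print(precision_recall(counts))
```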

## 13. Save Final Model

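Save the trained model twice: a timestamped copy for reference, and `models/cats_vs_dogs_model.keras`, the file that the later steps track with DVC.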

In [None]:
from model_train import save_model
from datetime import datetime

# Create models directory
!mkdir -p models

# Save with timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
model_filename = f"models/cats_vs_dogs_model_colab_{timestamp}.keras"

# Also save as the main model
main_model_path = "models/cats_vs_dogs_model.keras"

save_model(model, model_filename)
save_model(model, main_model_path)

print(f"\n✅ Model saved to:")
print(f"  - {model_filename}")
print(f"  - {main_model_path}")

## 14. Add Model to DVC

Track the model with DVC for version control.

In [None]:
# Add model to DVC
!dvc add models/cats_vs_dogs_model.keras

print("\n✅ Model added to DVC!")

# Show what was created
!ls -lh models/*.dvc
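
The generated `.dvc` file is a small text file that records a content hash for the tracked model. As far as I know, for a single binary file this is an MD5 digest stored under an `md5` key, though the exact layout can vary between DVC versions. Under that assumption, the sketch below computes the digest by hand for comparison; `file_md5` is a hypothetical helper.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


model_path = "models/cats_vs_dogs_model.keras"
if os.path.exists(model_path):
    # Compare this value by eye with the hash recorded in the .dvc file.
    print(file_md5(model_path))
```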

## 15. Push Model to DVC Remote

Upload the model to your DVC remote storage.

In [None]:
# Push model to DVC remote
!dvc push models/cats_vs_dogs_model.keras.dvc

print("\n✅ Model pushed to DVC remote!")

## 16. Commit and Push to GitHub

Save the DVC metadata files to GitHub.

In [None]:
# Configure git (replace with your info)
!git config --global user.email "your.email@example.com"
!git config --global user.name "Your Name"

# Add DVC files
!git add models/cats_vs_dogs_model.keras.dvc models/.gitignore

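# Commit changes. Build the message in Python first: IPython does not apply
# format specs such as :.4f inside {} when expanding shell commands.
commit_message = f"Add trained model from Colab - Test Accuracy: {test_accuracy:.4f}"
!git commit -m "{commit_message}"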

print("\n✅ Changes committed!")
print("\nTo push to GitHub, you'll need to authenticate.")
print("Run the following command manually:")
print("  git push origin model-development")
print("\nOr authenticate with a personal access token.")

## 17. Download Model (Optional)

Download the trained model to your local machine if needed.

In [None]:
from google.colab import files

# Download the model file
print("Downloading model...")
files.download(main_model_path)

print("\n✅ Model downloaded to your local machine!")

## 18. Summary and Next Steps

### What we accomplished:
1. ✅ Set up Google Colab with GPU
2. ✅ Cloned your repository
3. ✅ Installed dependencies
4. ✅ Configured DVC remote
5. ✅ Pulled training data from DVC
6. ✅ Trained the CNN model
7. ✅ Evaluated on test set
8. ✅ Saved model with DVC
9. ✅ Pushed model to DVC remote

### Next Steps:
1. Push the changes to GitHub (you may need to do this locally with credentials)
2. Pull the model on your local machine: `dvc pull models/cats_vs_dogs_model.keras.dvc`
3. Test the model with your inference pipeline
4. Deploy the model using your Streamlit app

### Tips:
- Save this notebook to your Google Drive for future training sessions
- You can experiment with different hyperparameters (learning rate, batch size, etc.)
- Try data augmentation for better performance
- Monitor training in real-time with the progress bars
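
For the hyperparameter tip, one simple way to organize experiments is to enumerate every combination up front. A minimal sketch using `itertools.product`; the value ranges are illustrative assumptions, and each resulting config would still need to be passed to `compile_model` and `train_model` by hand.

```python
from itertools import product

# Illustrative values only; choose ranges that suit your own experiments.
LEARNING_RATES = [1e-3, 1e-4]
BATCH_SIZES = [32, 64]


def hyperparameter_grid(learning_rates, batch_sizes):
    """Return one config dict per (learning_rate, batch_size) combination."""
    return [
        {"learning_rate": lr, "batch_size": bs}
        for lr, bs in product(learning_rates, batch_sizes)
    ]


for config in hyperparameter_grid(LEARNING_RATES, BATCH_SIZES):
    print(config)
```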

## Optional: Test Model Predictions

Test the model on some sample images.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Test on random samples
num_samples = 10
random_indices = np.random.choice(len(X_test), num_samples, replace=False)

fig, axes = plt.subplots(2, 5, figsize=(15, 6))
axes = axes.ravel()

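# Predict all sampled images in one batch instead of one call per image
predictions = model.predict(X_test[random_indices], verbose=0).ravel()

for i, idx in enumerate(random_indices):
    # Get image and true label
    image = X_test[idx]
    true_label = "Dog" if y_test[idx] == 1 else "Cat"

    # Look up this image's predicted probability of "Dog"
    prediction = predictions[i]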
    predicted_label = "Dog" if prediction > 0.5 else "Cat"
    confidence = prediction if prediction > 0.5 else 1 - prediction
    
    # Display
    axes[i].imshow(image)
    color = 'green' if true_label == predicted_label else 'red'
    axes[i].set_title(f"True: {true_label}\nPred: {predicted_label} ({confidence:.2%})", color=color)
    axes[i].axis('off')

plt.tight_layout()
plt.show()

print("Green = Correct prediction, Red = Incorrect prediction")
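
The label and confidence logic in the loop above can be isolated into a small function, which makes it easier to reuse in an inference pipeline. This sketch assumes the model ends in a single sigmoid unit whose output is the probability of "Dog", which is consistent with `num_classes=1` and `binary_crossentropy` but is not confirmed by this notebook; `label_and_confidence` is a hypothetical helper.

```python
def label_and_confidence(probability, threshold=0.5):
    """Map a sigmoid output (probability of 'Dog') to a label and its confidence."""
    if probability > threshold:
        return "Dog", probability
    return "Cat", 1 - probability


# Toy probabilities for illustration.
for p in (0.9, 0.2):
    label, confidence = label_and_confidence(p)
    print(f"p={p}: {label} ({confidence:.2%})")
```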