# Notebook 7: Final CNN Challenge 🏆

**Course:** 21CSE558T - Deep Neural Network Architectures  
**Module 4:** CNNs - Practical Session  
**Type:** FINAL SUBMISSION ASSIGNMENT  
**Due:** [Instructor will specify]  
**Weight:** 25 points (major assignment)

---

## 🎯 Challenge Objective

Build the **best possible CNN** for Fashion-MNIST classification using all the techniques you have learned:

✅ Convolution fundamentals  
✅ CNN architecture design  
✅ Regularization (Dropout, BatchNorm)  
✅ Data augmentation  
✅ Hyperparameter tuning  

---

## 🏁 Target Performance

| Grade | Test Accuracy | Description |
|-------|--------------|-------------|
| ⭐⭐⭐⭐⭐ Excellent | ≥93% | Outstanding performance |
| ⭐⭐⭐⭐ Very Good | 91-92.9% | Strong CNN design |
| ⭐⭐⭐ Good | 89-90.9% | Solid understanding |
| ⭐⭐ Acceptable | 87-88.9% | Basic proficiency |
| ⭐ Needs Work | <87% | Requires improvement |

**Baseline to beat:** A simple CNN achieves ~88%. Your optimized model should do better!

---

## 📋 Assignment Requirements

### Mandatory Components:

1. ✅ **Architecture Design** - Justify your CNN structure
2. ✅ **Regularization** - Use at least 2 techniques
3. ✅ **Data Augmentation** - Apply appropriate transformations
4. ✅ **Training Strategy** - Show training curves
5. ✅ **Evaluation** - Test set performance + analysis
6. ✅ **Documentation** - Explain your choices

### Bonus Points (up to +5):

- 🌟 **+2 points:** Achieve ≥93% test accuracy
- 🌟 **+1 point:** Learning rate scheduling
- 🌟 **+1 point:** Ensemble of multiple models
- 🌟 **+1 point:** Detailed error analysis (confusion matrix, per-class metrics)
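
One common way to approach the ensemble bonus is soft voting: average the softmax outputs of several independently trained models and take the arg-max of the mean. The sketch below assumes each model's `predict` output is already available as a NumPy array of shape `(n_samples, n_classes)`. The function name `soft_vote` and the toy probabilities are illustrative and not part of the assignment template.

```python
import numpy as np

def soft_vote(prob_list):
    """Average class probabilities from several models and pick the top class."""
    stacked = np.stack(prob_list, axis=0)      # (n_models, n_samples, n_classes)
    mean_probs = stacked.mean(axis=0)          # (n_samples, n_classes)
    return mean_probs.argmax(axis=1), mean_probs

# Toy example: two "models", two samples, three classes (made-up numbers)
probs_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
probs_b = np.array([[0.4, 0.5, 0.1], [0.1, 0.2, 0.7]])
labels, mean_probs = soft_vote([probs_a, probs_b])
print(labels)
```

In the notebook this could be called with something like `soft_vote([m.predict(x_test, verbose=0) for m in models])`, where `models` is a hypothetical list of trained models.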

---

## 🚀 Setup

In [None]:
# Import all necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import (
    Conv2D, MaxPooling2D, AveragePooling2D, GlobalAveragePooling2D,
    Flatten, Dense, Dropout, BatchNormalization, Activation
)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from tensorflow.keras.regularizers import l2
from tensorflow.keras.utils import to_categorical
from sklearn.metrics import classification_report, confusion_matrix
import pandas as pd
import time
import warnings
warnings.filterwarnings('ignore')

print(f"✅ TensorFlow version: {tf.__version__}")
print(f"✅ GPU available: {len(tf.config.list_physical_devices('GPU')) > 0}")

# Set seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

print("\n🎯 Challenge Started! Good luck!")

---

## Part 1: Load and Explore Data

In [None]:
# Load Fashion-MNIST
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(f"Training samples: {x_train.shape[0]:,}")
print(f"Test samples: {x_test.shape[0]:,}")
print(f"Image shape: {x_train.shape[1:]}")
print(f"Number of classes: {len(class_names)}")

# Visualize class distribution
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Training set distribution
unique, counts = np.unique(y_train, return_counts=True)
axes[0].bar(unique, counts, color='steelblue', alpha=0.8)
axes[0].set_xticks(unique)
axes[0].set_xticklabels([class_names[i][:8] for i in unique], rotation=45, ha='right')
axes[0].set_xlabel('Class', fontsize=11)
axes[0].set_ylabel('Count', fontsize=11)
axes[0].set_title('Training Set Class Distribution', fontsize=13, fontweight='bold')
axes[0].grid(True, alpha=0.3, axis='y')

# Sample images
axes[1].axis('off')
sample_grid = np.zeros((28*2, 28*5))
for i in range(10):
    idx = np.where(y_train == i)[0][0]
    row = i // 5
    col = i % 5
    sample_grid[row*28:(row+1)*28, col*28:(col+1)*28] = x_train[idx]
axes[1].imshow(sample_grid, cmap='gray')
axes[1].set_title('Sample Images (One per Class)', fontsize=13, fontweight='bold')

plt.tight_layout()
plt.show()

print("\n💡 Dataset is balanced - each class has 6,000 training images")

---

## Part 2: Data Preprocessing

**TODO:** Implement your preprocessing strategy

In [None]:
# ==========================================
# YOUR CODE HERE: Preprocess the data
# ==========================================

# Reshape to add channel dimension
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32')
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32')

# Normalize pixel values to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode labels
y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)

# Split training data for validation
validation_split = 0.1
split_idx = int((1 - validation_split) * len(x_train))

x_train_split = x_train[:split_idx]
y_train_split = y_train_cat[:split_idx]
x_val_split = x_train[split_idx:]
y_val_split = y_train_cat[split_idx:]

print(f"✅ Preprocessing complete")
print(f"Training: {x_train_split.shape[0]:,} samples")
print(f"Validation: {x_val_split.shape[0]:,} samples")
print(f"Test: {x_test.shape[0]:,} samples")
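
The slice above takes the last 10% of the stored training order as validation data, which is fine as long as that order is well mixed. If you prefer not to rely on it, a seeded shuffle before splitting is one alternative. This is a minimal sketch: `shuffled_split` is a hypothetical helper, and the toy arrays stand in for `x_train` and `y_train_cat`.

```python
import numpy as np

def shuffled_split(x, y, val_fraction=0.1, seed=42):
    """Return (x_train, y_train, x_val, y_val) after a seeded random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(round(val_fraction * len(x)))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return x[train_idx], y[train_idx], x[val_idx], y[val_idx]

# Toy data: 20 samples whose label equals their single feature value
x_demo = np.arange(20).reshape(20, 1)
y_demo = np.arange(20)
x_tr, y_tr, x_va, y_va = shuffled_split(x_demo, y_demo, val_fraction=0.1)
print(x_tr.shape, x_va.shape)
```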

---

## Part 3: Data Augmentation Strategy

**TODO:** Design your data augmentation pipeline

**Questions to answer:**
1. Which augmentations are appropriate for Fashion-MNIST?
2. What are good parameter ranges?
3. Why did you choose these specific transformations?

In [None]:
# ==========================================
# YOUR CODE HERE: Configure data augmentation
# ==========================================

# Training data generator with augmentation
train_datagen = ImageDataGenerator(
    rotation_range=15,          # TODO: Adjust this
    width_shift_range=0.1,      # TODO: Adjust this
    height_shift_range=0.1,     # TODO: Adjust this
    zoom_range=0.1,             # TODO: Adjust this
    horizontal_flip=True,       # TODO: Keep or remove?
    fill_mode='nearest'
)

# Validation data generator (no augmentation!)
val_datagen = ImageDataGenerator()

# Create generators
train_generator = train_datagen.flow(x_train_split, y_train_split, batch_size=128)
val_generator = val_datagen.flow(x_val_split, y_val_split, batch_size=128, shuffle=False)  # fixed order for validation

print("✅ Data augmentation configured")

# Visualize augmented samples
sample_img = x_train[0:1]
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
axes = axes.ravel()

axes[0].imshow(sample_img[0].reshape(28, 28), cmap='gray')
axes[0].set_title('Original', fontsize=11, fontweight='bold')
axes[0].axis('off')

aug_iter = train_datagen.flow(sample_img, batch_size=1)
for i in range(1, 10):
    aug_img = next(aug_iter)[0]
    axes[i].imshow(aug_img.reshape(28, 28), cmap='gray')
    axes[i].set_title(f'Augmented {i}', fontsize=10)
    axes[i].axis('off')

plt.suptitle('Your Data Augmentation Examples', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

### 📝 Document Your Augmentation Choices

**✍️ Answer these questions in the cell below:**

### YOUR ANSWERS:

**1. Which augmentation techniques did you use and why?**

[Write your answer here]

**2. What parameter ranges did you choose? Why?**

[Write your answer here]

**3. Are there any augmentations you specifically avoided? Why?**

[Write your answer here]

---

## Part 4: Build Your CNN Architecture

**TODO:** Design your best CNN architecture

**Design considerations:**
- How many convolutional blocks?
- Filter sizes and counts?
- Pooling strategy?
- Regularization techniques?
- Dense layer configuration?

**Recommended pattern:**
```
Conv → BatchNorm → ReLU → Conv → BatchNorm → ReLU → MaxPool → Dropout
```
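
To decide how many such blocks fit, it helps to trace the feature-map size. With `padding='same'` the 3×3 convolutions keep the spatial size, and each 2×2 max-pool (stride 2, no padding) halves it with floor rounding. The helper below is a plain-Python sketch of that arithmetic; `trace_spatial_size` is a hypothetical name.

```python
def trace_spatial_size(input_size, num_blocks, pool=2):
    """Spatial size after each block of 'same' convs followed by a pool x pool max-pool."""
    sizes = [input_size]
    for _ in range(num_blocks):
        # 'same' convolutions leave the size unchanged; pooling floors the division
        sizes.append(sizes[-1] // pool)
    return sizes

print(trace_spatial_size(28, 3))  # [28, 14, 7, 3]
```

A fourth block would shrink the map to 1×1, so three blocks is a practical ceiling for 28×28 inputs unless some pooling layers are dropped.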

In [None]:
# ==========================================
# YOUR CODE HERE: Build your CNN model
# ==========================================

# Example architecture (you should improve this!)
model = Sequential([
    # Block 1
    Conv2D(64, (3, 3), padding='same', input_shape=(28, 28, 1)),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(64, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),
    
    # Block 2
    Conv2D(128, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(128, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),
    
    # Block 3 - Add more blocks if needed
    Conv2D(256, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(256, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),
    
    # Classifier
    GlobalAveragePooling2D(),
    Dense(512),
    BatchNormalization(),
    Activation('relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
], name='My_Optimized_CNN')

# Display architecture
model.summary()

print(f"\n📊 Total parameters: {model.count_params():,}")
print("\n💡 TIP: Try to keep parameters under 5 million for efficiency!")
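
As a sanity check on the `model.summary()` output, parameter counts can be worked out by hand. A `Conv2D` layer with bias has `(kh * kw * in_channels + 1) * filters` parameters, a `Dense` layer has `(inputs + 1) * units`, and `BatchNormalization` keeps four values per channel (two trainable, two moving statistics). The sketch below applies these formulas to the example architecture above; the helper names are illustrative.

```python
def conv_params(in_ch, filters, k=3):
    return (k * k * in_ch + 1) * filters   # weights + one bias per filter

def bn_params(channels):
    return 4 * channels                    # gamma, beta, moving mean, moving variance

def dense_params(inputs, units):
    return (inputs + 1) * units            # weights + one bias per unit

total = 0
in_ch = 1
for filters in (64, 128, 256):             # three blocks, two conv layers each
    for _ in range(2):
        total += conv_params(in_ch, filters) + bn_params(filters)
        in_ch = filters
# GlobalAveragePooling2D outputs one value per channel (256)
total += dense_params(256, 512) + bn_params(512) + dense_params(512, 10)
print(f"{total:,}")  # 1,286,602
```

The total should agree with what `model.count_params()` reports for the example model, which is well under the 5 million guideline.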

### 📝 Document Your Architecture Choices

### YOUR ANSWERS:

**1. Describe your CNN architecture. How many blocks? What's the filter progression?**

[Write your answer here]

**2. Which regularization techniques did you use and where?**

[Write your answer here]

**3. Why did you choose GlobalAveragePooling2D over Flatten (or the other way around)?**

[Write your answer here]

**4. What design principles guided your architecture?**

[Write your answer here]

---

## Part 5: Configure Training Strategy

**TODO:** Set up optimizer, callbacks, and training parameters

In [None]:
# ==========================================
# YOUR CODE HERE: Configure training
# ==========================================

# Compile model
model.compile(
    optimizer='adam',  # TODO: Try different optimizers or learning rates
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Callbacks
callbacks = [
    # Early stopping
    EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True,
        verbose=1
    ),
    
    # Learning rate reduction (BONUS POINT!)
    ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=5,
        min_lr=1e-7,
        verbose=1
    ),
    
    # Save best model
    ModelCheckpoint(
        'best_model.h5',
        monitor='val_accuracy',
        save_best_only=True,
        verbose=1
    )
]

print("✅ Training configuration complete")
print("\n📋 Callbacks configured:")
print("  • Early Stopping (patience=10)")
print("  • Learning Rate Reduction (patience=5)")
print("  • Model Checkpoint (save best)")
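
`ReduceLROnPlateau` multiplies the learning rate by `factor` each time it fires and never goes below `min_lr`. The sketch below works through that arithmetic in plain Python, assuming Adam's default starting rate of 0.001. The function names are hypothetical and no Keras objects are involved.

```python
def lr_after_reductions(initial_lr, factor, n, min_lr):
    """Learning rate after n plateau-triggered reductions, clipped at min_lr."""
    lr = initial_lr
    for _ in range(n):
        lr = max(lr * factor, min_lr)
    return lr

def reductions_to_floor(initial_lr, factor, min_lr):
    """Number of reductions needed before the rate reaches min_lr."""
    lr, n = initial_lr, 0
    while lr > min_lr:
        lr = max(lr * factor, min_lr)
        n += 1
    return n

print(lr_after_reductions(1e-3, 0.5, 3, 1e-7))   # about 1.25e-4
print(reductions_to_floor(1e-3, 0.5, 1e-7))      # 14
```

With `patience=5`, a reduction can fire at most once every five epochs, so a 50-epoch run should not get close to the `min_lr` floor.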

---

## Part 6: Train Your Model

**This is it! Train your best model!**

In [None]:
# ==========================================
# TRAIN YOUR MODEL
# ==========================================

print("🚀 Training started...\n")
print("=" * 70)

start_time = time.time()

history = model.fit(
    train_generator,
    steps_per_epoch=len(train_generator),  # every batch, including the last partial one
    epochs=50,  # Upper bound; EarlyStopping may stop training sooner
    validation_data=val_generator,
    validation_steps=len(val_generator),   # evaluate on the full validation split
    callbacks=callbacks,
    verbose=1
)

training_time = time.time() - start_time

print("\n" + "=" * 70)
print(f"✅ Training complete! Time: {training_time/60:.1f} minutes")
print("=" * 70)
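
The number of batches per epoch depends on how the division by the batch size is rounded. Floor division skips the final partial batch, while rounding up covers every sample. A quick check with this notebook's split sizes (54,000 training and 6,000 validation samples at batch size 128); `batches_per_epoch` is a hypothetical helper:

```python
import math

def batches_per_epoch(n_samples, batch_size, drop_remainder=False):
    """Batches needed per epoch; optionally ignore the final partial batch."""
    if drop_remainder:
        return n_samples // batch_size
    return math.ceil(n_samples / batch_size)

for name, n in [("train", 54_000), ("val", 6_000)]:
    floor_steps = batches_per_epoch(n, 128, drop_remainder=True)
    ceil_steps = batches_per_epoch(n, 128)
    skipped = n - floor_steps * 128
    print(f"{name}: floor={floor_steps}, ceil={ceil_steps}, skipped with floor={skipped}")
```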

---

## Part 7: Visualize Training History

In [None]:
# Plot training history
fig, axes = plt.subplots(1, 2, figsize=(16, 5))

# Accuracy
axes[0].plot(history.history['accuracy'], 'b-o', label='Training', linewidth=2, markersize=4)
axes[0].plot(history.history['val_accuracy'], 'r-o', label='Validation', linewidth=2, markersize=4)
axes[0].set_xlabel('Epoch', fontsize=12)
axes[0].set_ylabel('Accuracy', fontsize=12)
axes[0].set_title('Model Accuracy', fontsize=14, fontweight='bold')
axes[0].legend(fontsize=11)
axes[0].grid(True, alpha=0.3)

# Loss
axes[1].plot(history.history['loss'], 'b-o', label='Training', linewidth=2, markersize=4)
axes[1].plot(history.history['val_loss'], 'r-o', label='Validation', linewidth=2, markersize=4)
axes[1].set_xlabel('Epoch', fontsize=12)
axes[1].set_ylabel('Loss', fontsize=12)
axes[1].set_title('Model Loss', fontsize=14, fontweight='bold')
axes[1].legend(fontsize=11)
axes[1].grid(True, alpha=0.3)

plt.suptitle('Training Performance', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

# Training summary
final_train_acc = history.history['accuracy'][-1]
final_val_acc = history.history['val_accuracy'][-1]
best_val_acc = max(history.history['val_accuracy'])
epochs_trained = len(history.history['accuracy'])

print(f"\n📊 Training Summary:")
print(f"Epochs trained: {epochs_trained}")
print(f"Final training accuracy: {final_train_acc:.2%}")
print(f"Final validation accuracy: {final_val_acc:.2%}")
print(f"Best validation accuracy: {best_val_acc:.2%}")
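# Note: training accuracy is measured on augmented batches with dropout active,
# so this gap can understate overfitting (or even be negative)
print(f"Overfitting gap: {final_train_acc - final_val_acc:.2%}")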
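
Because `EarlyStopping` is configured with `restore_best_weights=True`, the weights kept after an early stop come from the epoch with the lowest `val_loss`, which need not be the last one. Below is a small sketch for reading the metrics at that epoch from a history dictionary. `best_epoch_metrics` is a hypothetical helper and the numbers are made up for illustration.

```python
def best_epoch_metrics(history_dict, monitor="val_loss"):
    """Return (1-based epoch, metrics at that epoch) for the lowest monitored value."""
    values = history_dict[monitor]
    best_idx = min(range(len(values)), key=values.__getitem__)
    metrics = {name: series[best_idx] for name, series in history_dict.items()}
    return best_idx + 1, metrics

# Made-up training history with the best val_loss at epoch 3
demo_history = {
    "accuracy":     [0.80, 0.86, 0.89, 0.90],
    "val_accuracy": [0.84, 0.88, 0.90, 0.89],
    "loss":         [0.55, 0.38, 0.30, 0.27],
    "val_loss":     [0.45, 0.33, 0.28, 0.31],
}
epoch, metrics = best_epoch_metrics(demo_history)
print(epoch, metrics["val_accuracy"])
```

In the notebook it could be called as `best_epoch_metrics(history.history)`.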

---

## Part 8: Evaluate on Test Set

**The moment of truth!**

In [None]:
# Evaluate on test set
print("🎯 Evaluating on test set...\n")

test_loss, test_acc = model.evaluate(x_test, y_test_cat, verbose=0)

print("=" * 70)
print("FINAL TEST RESULTS")
print("=" * 70)
print(f"Test Loss:     {test_loss:.4f}")
print(f"Test Accuracy: {test_acc:.2%}")
print("=" * 70)

# Grade based on accuracy
if test_acc >= 0.93:
    grade = "⭐⭐⭐⭐⭐ EXCELLENT"
    print(f"\n🏆 {grade}! Outstanding performance!")
elif test_acc >= 0.91:
    grade = "⭐⭐⭐⭐ VERY GOOD"
    print(f"\n👍 {grade}! Strong CNN design!")
elif test_acc >= 0.89:
    grade = "⭐⭐⭐ GOOD"
    print(f"\n✅ {grade}! Solid understanding!")
elif test_acc >= 0.87:
    grade = "⭐⭐ ACCEPTABLE"
    print(f"\n✓ {grade}. Basic proficiency shown.")
else:
    grade = "⭐ NEEDS WORK"
    print(f"\n⚠️ {grade}. Try experimenting more!")

print(f"\nYour Grade: {grade}")

---

## Part 9: Detailed Error Analysis (BONUS +1 point)

In [None]:
# Make predictions
predictions = model.predict(x_test, verbose=0)
predicted_classes = np.argmax(predictions, axis=1)

# Classification report
print("📋 Classification Report:\n")
print(classification_report(y_test, predicted_classes, target_names=class_names, digits=4))

# Per-class accuracy
per_class_acc = []
for i in range(10):
    mask = y_test == i
    acc = (predicted_classes[mask] == y_test[mask]).mean()
    per_class_acc.append(acc)

# Plot per-class accuracy
plt.figure(figsize=(12, 5))
bars = plt.bar(range(10), per_class_acc, color='steelblue', alpha=0.8)
plt.xticks(range(10), [class_names[i] for i in range(10)], rotation=45, ha='right')
plt.xlabel('Class', fontsize=12)
plt.ylabel('Accuracy', fontsize=12)
plt.title('Per-Class Accuracy', fontsize=14, fontweight='bold')
# Scale the y-axis to the data so a class that falls below 80% is not clipped
plt.ylim([max(0.0, min(per_class_acc) - 0.05), 1.03])
plt.grid(True, alpha=0.3, axis='y')

# Add value labels on bars
for bar, acc in zip(bars, per_class_acc):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.005,
             f'{acc:.1%}', ha='center', va='bottom', fontsize=10)

plt.tight_layout()
plt.show()

# Identify best and worst classes
best_class = np.argmax(per_class_acc)
worst_class = np.argmin(per_class_acc)

print(f"\n🎯 Best performing class: {class_names[best_class]} ({per_class_acc[best_class]:.2%})")
print(f"⚠️ Worst performing class: {class_names[worst_class]} ({per_class_acc[worst_class]:.2%})")

In [None]:
# Confusion Matrix
cm = confusion_matrix(y_test, predicted_classes)

plt.figure(figsize=(12, 10))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names,
            cbar_kws={'label': 'Count'})
plt.xlabel('Predicted Label', fontsize=13, fontweight='bold')
plt.ylabel('True Label', fontsize=13, fontweight='bold')
plt.title('Confusion Matrix', fontsize=16, fontweight='bold')
plt.xticks(rotation=45, ha='right')
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()

# Find most confused pairs
cm_no_diag = cm.copy()
np.fill_diagonal(cm_no_diag, 0)
most_confused_idx = np.unravel_index(cm_no_diag.argmax(), cm_no_diag.shape)
true_class = class_names[most_confused_idx[0]]
pred_class = class_names[most_confused_idx[1]]
confusion_count = cm_no_diag[most_confused_idx]

print(f"\n🔍 Most confused pair: {true_class} misclassified as {pred_class} ({confusion_count} times)")
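
The per-class numbers in `classification_report` can be read straight off the confusion matrix. With true labels on rows and predictions on columns (scikit-learn's convention), recall is the diagonal divided by the row sums and precision is the diagonal divided by the column sums. A minimal NumPy sketch with a made-up 3-class matrix; `precision_recall_from_cm` is a hypothetical helper:

```python
import numpy as np

def precision_recall_from_cm(cm):
    """Per-class precision and recall from a (true x predicted) confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    true_pos = np.diag(cm)
    recall = true_pos / cm.sum(axis=1)      # row sums = samples per true class
    precision = true_pos / cm.sum(axis=0)   # column sums = samples per predicted class
    return precision, recall

toy_cm = [[8, 2, 0],
          [1, 9, 0],
          [1, 1, 8]]
precision, recall = precision_recall_from_cm(toy_cm)
print(np.round(precision, 3), np.round(recall, 3))
```

The `per_class_acc` values computed earlier in this part are the same quantity as per-class recall.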

In [None]:
# Visualize correct and incorrect predictions
correct_idx = np.where(predicted_classes == y_test)[0]
incorrect_idx = np.where(predicted_classes != y_test)[0]

fig, axes = plt.subplots(2, 5, figsize=(16, 7))

# Correct predictions
for i in range(5):
    idx = correct_idx[i]
    axes[0, i].imshow(x_test[idx].reshape(28, 28), cmap='gray')
    axes[0, i].set_title(f'✅ True: {class_names[y_test[idx]]}\nPred: {class_names[predicted_classes[idx]]}\nConf: {predictions[idx][predicted_classes[idx]]:.2%}',
                        fontsize=9, color='green')
    axes[0, i].axis('off')

# Incorrect predictions
for i in range(5):
    idx = incorrect_idx[i]
    axes[1, i].imshow(x_test[idx].reshape(28, 28), cmap='gray')
    axes[1, i].set_title(f'❌ True: {class_names[y_test[idx]]}\nPred: {class_names[predicted_classes[idx]]}\nConf: {predictions[idx][predicted_classes[idx]]:.2%}',
                        fontsize=9, color='red')
    axes[1, i].axis('off')

plt.suptitle('Sample Predictions: Correct (Top) vs Incorrect (Bottom)', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

print(f"\nTotal correct: {len(correct_idx):,} ({len(correct_idx)/len(y_test):.2%})")
print(f"Total incorrect: {len(incorrect_idx):,} ({len(incorrect_idx)/len(y_test):.2%})")

### 📝 Error Analysis Discussion

### YOUR ANSWERS:

**1. Which class(es) did your model struggle with most? Why do you think this happened?**

[Write your answer here]

**2. Looking at the confusion matrix, which classes are most commonly confused with each other?**

[Write your answer here]

**3. What could you do to improve performance on the worst-performing classes?**

[Write your answer here]

---

## Part 10: Final Summary and Reflection

In [None]:
# Generate comprehensive summary
summary_data = {
    'Metric': [
        'Test Accuracy',
        'Test Loss',
        'Total Parameters',
        'Epochs Trained',
        'Training Time',
        'Best Validation Accuracy',
        'Overfitting Gap'
    ],
    'Value': [
        f"{test_acc:.2%}",
        f"{test_loss:.4f}",
        f"{model.count_params():,}",
        f"{epochs_trained}",
        f"{training_time/60:.1f} min",
        f"{best_val_acc:.2%}",
        f"{final_train_acc - final_val_acc:.2%}"
    ]
}

df_summary = pd.DataFrame(summary_data)

print("\n" + "="*70)
print("FINAL PROJECT SUMMARY")
print("="*70)
print(df_summary.to_string(index=False))
print("="*70)

print(f"\n🎯 Final Grade: {grade}")
# Rough estimate only: the base of 20 is a placeholder, not a rubric calculation
print(f"\n💡 Estimated Score: {20 + (2 if test_acc >= 0.93 else 0)} / 25 points")
if test_acc >= 0.93:
    print("   (Base: 20 points + Bonus: 2 points for ≥93% accuracy)")
else:
    print("   (Can earn up to +5 bonus points with additional features!)")

---

## 📝 Final Reflection (Required)

**Answer ALL questions below:**

### YOUR FINAL REFLECTION:

**1. What was your overall strategy for achieving high accuracy?**

[Write a detailed answer - at least 3-4 sentences]

**2. Which technique(s) had the biggest impact on performance? (Architecture, regularization, augmentation, etc.)**

[Write your answer]

**3. What challenges did you face? How did you overcome them?**

[Write your answer]

**4. If you had more time/resources, what would you try next to improve the model?**

[Write your answer]

**5. What are the 3 most important lessons you learned from this CNN project?**

1. [Lesson 1]
2. [Lesson 2]
3. [Lesson 3]

**6. How would you apply this knowledge to a real-world computer vision problem?**

[Write your answer]

---

## 📤 Submission Checklist

Before submitting, verify:

### Mandatory Requirements:
- [ ] All code cells run without errors
- [ ] Model achieves ≥87% test accuracy
- [ ] Data augmentation implemented and explained
- [ ] Architecture documented with justification
- [ ] Training curves displayed
- [ ] Test set evaluation complete
- [ ] All reflection questions answered

### Optional (for bonus points):
- [ ] Learning rate scheduling used (+1 point)
- [ ] Detailed error analysis with confusion matrix (+1 point)
- [ ] Test accuracy ≥93% (+2 points)
- [ ] Ensemble model attempted (+1 point)

---

## 🎯 Grading Rubric (25 points total)

| Component | Points | Criteria |
|-----------|--------|----------|
| **Architecture Design** | 5 | Well-designed CNN with proper justification |
| **Data Augmentation** | 4 | Appropriate augmentation with explanation |
| **Regularization** | 4 | Multiple techniques used effectively |
| **Test Accuracy** | 6 | ≥87%: 2 pts, ≥89%: 4 pts, ≥91%: 6 pts |
| **Documentation** | 4 | Clear explanations and reflections |
| **Code Quality** | 2 | Clean, well-commented code |
| **BONUS** | +5 | Extra features and high performance |
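
The accuracy row of the rubric can be expressed as a small lookup. This sketch assumes the thresholds are lower bounds, in line with the grade table at the top of the notebook, and that accuracy below 87% earns no accuracy points, which the rubric does not state explicitly. `accuracy_points` is a hypothetical helper, not the official grader.

```python
def accuracy_points(test_acc):
    """Rubric points for test accuracy (fraction in [0, 1]); thresholds assumed inclusive."""
    for threshold, points in ((0.91, 6), (0.89, 4), (0.87, 2)):
        if test_acc >= threshold:
            return points
    return 0  # assumption: below 87% earns no accuracy points

print(accuracy_points(0.905))  # 4
```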

---

## 🚀 Submission Instructions

1. **Save this notebook** with all outputs visible
2. **Rename** to: `YourName_CNN_Challenge.ipynb`
3. **Upload** to Google Classroom / Course Portal
4. **Include** the saved model file (`best_model.h5`) if required

---

## 🎉 Congratulations!

You've completed the CNN practical challenge! You've learned:

✅ 1D and 2D convolution fundamentals  
✅ CNN architecture design principles  
✅ Regularization techniques (Dropout, BatchNorm)  
✅ Data augmentation strategies  
✅ Training optimization and callbacks  
✅ Model evaluation and error analysis  

**These skills are fundamental to modern computer vision and deep learning!**

---

*⏱️ Expected time: 3-4 hours*  
*💪 Difficulty: Advanced*  
*🎓 Value: 25 points (major assignment)*  
*🏆 Challenge level: Production-ready CNN skills*