# Lab 7, Module 2: Visualizing CNN Feature Maps

**Estimated time:** 20 minutes

---

## **Opening: From Hand-Designed Filters to Learned Filters**

In **Module 1**, you applied hand-designed filters (Sobel, blur, sharpen) to images.

**Now the big question:** What filters does a real CNN actually learn?

In this module, you'll:
1. Load **MobileNetV2** (a production-quality image classifier)
2. Extract its convolutional layers at different depths
3. Visualize **feature maps** (what each layer detects)
4. See how layers build from simple → complex patterns

### **Connection to Lab 6**

- **Lab 6 (Saliency):** Showed **WHERE** the model looks (which pixels matter)
- **Lab 7 Module 2 (Feature Maps):** Shows **WHAT** the model extracts (which patterns are detected)

**Together:** You'll understand both **what features are extracted** and **which extracted features drive predictions**!

---

## 🧱 **What Is a Feature Map?**

**Feature map** = The output of a convolutional layer when applied to an image.

**Intuition:**
- Each convolutional layer has multiple filters (32, 64, 128, etc.)
- Each filter looks for a specific pattern
- When a filter finds its pattern, the feature map "lights up" at that location

**Example:**
- **Filter 1:** Vertical edge detector → Feature map shows where vertical edges are
- **Filter 2:** Horizontal edge detector → Feature map shows where horizontal edges are
- **Filter 3:** Diagonal texture → Feature map shows where that texture appears

**Key insight:** Each layer produces **many feature maps**, one for each filter!

---
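
To make the idea concrete, here is a minimal NumPy sketch (not part of the lab pipeline) that computes one feature map by sliding a Sobel-style vertical-edge kernel over a toy image. The helper `conv2d_valid` and the 6x6 image are made up for illustration; like CNN layers, it multiplies the kernel against each window without flipping it.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for r in range(out_h):
        for c in range(out_w):
            # Element-wise product of the kernel with the window under it
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Toy 6x6 image: dark left half (0), bright right half (1) -> one vertical edge
toy_image = np.zeros((6, 6), dtype=np.float32)
toy_image[:, 3:] = 1.0

# Sobel-style kernel that responds to a dark-to-bright change from left to right
vertical_edge = np.array([[-1, 0, 1],
                          [-2, 0, 2],
                          [-1, 0, 1]], dtype=np.float32)

feature_map = conv2d_valid(toy_image, vertical_edge)
print(feature_map)  # each row is [0, 4, 4, 0]: the map "lights up" only around the edge
```

A real convolutional layer does the same thing with many learned kernels at once, which is why its output has one channel per filter.

---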

## 🏗️ **About MobileNetV2**

We'll use **MobileNetV2**, a lightweight CNN:
- Pre-trained on **ImageNet** (1.4 million images, 1000 classes)
- Only **14 MB** (runs fast on CPU!)
- Used in production on phones, edge devices, embedded systems
- Recognizes: animals, vehicles, objects, food, etc.

**Architecture overview:**
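- 53 layers deep by the usual count (convolutional layers plus the final classifier); Keras reports a larger total because it counts BatchNorm, ReLU, padding, and add layers separately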
- Multiple convolutional blocks at different depths
- Early layers: Simple patterns (edges)
- Mid layers: Textures and shapes
- Deep layers: Object parts and complex patterns

**We'll visualize 3 layers:**
1. `block_1_expand_relu` (early, around conv layer 4)
2. `block_3_expand_relu` (middle, around conv layer 10)
3. `block_6_expand_relu` (deeper, around conv layer 19)

---

In [None]:
# Setup: import libraries (TensorFlow is preinstalled on Colab)
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
from tensorflow.keras import Model
from PIL import Image
from google.colab import files
import io

print(f"✅ TensorFlow version: {tf.__version__}")
print("✅ Libraries imported successfully!")

---

## 📥 **Load Pre-Trained MobileNetV2**

This will download the model (14 MB) the first time you run it.

---

In [None]:
# Load full MobileNetV2 model
print("Loading MobileNetV2 (this may take ~30 seconds the first time)...\n")
base_model = MobileNetV2(weights='imagenet', include_top=True)

print("✅ Model loaded successfully!")
print(f"\nModel details:")
print(f"  - Total layers: {len(base_model.layers)}")
print(f"  - Output classes: {base_model.output_shape[1]} (ImageNet categories)")
print(f"  - Input shape: {base_model.input_shape}")

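# List some layer names for reference.
# Match by layer type rather than by name: names such as 'Conv1' or
# 'block_1_expand' do not contain the lowercase substring 'conv'.
print(f"\nSample convolutional layer names:")
conv_types = (tf.keras.layers.Conv2D, tf.keras.layers.DepthwiseConv2D)
conv_layers = [layer.name for layer in base_model.layers if isinstance(layer, conv_types)]
for name in conv_layers[:10]:
    print(f"  - {name}")
print(f"  ... and {max(len(conv_layers) - 10, 0)} more")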

---

## 🔍 **Create Feature Extraction Model**

We'll create a new model that outputs **intermediate layer activations** (feature maps) instead of just the final prediction.

**Layers we'll examine:**
1. **`block_1_expand_relu`** (early) – Simple patterns like edges
2. **`block_3_expand_relu`** (middle) – Textures and small shapes
3. **`block_6_expand_relu`** (deep) – Complex patterns and object parts

---

In [None]:
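# Select layers to visualize.
# Keras's MobileNetV2 has no layers named 'block_N_conv1'. Each block starts with a
# 1x1 "expand" convolution, so we take the ReLU output that follows it.
layer_names = ['block_1_expand_relu', 'block_3_expand_relu', 'block_6_expand_relu']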

# Get layer outputs
layer_outputs = [base_model.get_layer(name).output for name in layer_names]

# Create feature extraction model
feature_model = Model(inputs=base_model.input, outputs=layer_outputs)

print("✅ Feature extraction model created!\n")
print("This model will output activations from 3 layers:")
for i, name in enumerate(layer_names, 1):
    layer = base_model.get_layer(name)
    print(f"{i}. {name}")
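    # layer.output.shape works in both older and newer Keras versions
    # (the layer.output_shape attribute is not available in Keras 3)
    print(f"   Shape: {layer.output.shape}")
    print(f"   # Filters: {layer.output.shape[-1]}\n")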

---

## 🖼️ **Load and Classify a Sample Image**

Let's start with a sample image to see what the model detects.

---

In [None]:
# Create a simple test image (geometric pattern)
# In practice, you'd load an actual photo

def create_test_image():
    """Create a simple test image with various patterns"""
    img = np.ones((224, 224, 3), dtype=np.uint8) * 200
    
    # Add vertical stripes (left side)
    img[:, 20:40, :] = [50, 50, 50]
    img[:, 50:70, :] = [50, 50, 50]
    
    # Add one horizontal stripe (top)
    img[20:40, :, :] = [100, 100, 100]
    
    # Add dark axis-aligned square (bottom right)
    img[140:180, 140:180, :] = [30, 30, 30]
    
    # Add circle (center-ish)
    y, x = np.ogrid[:224, :224]
    mask = (x - 112)**2 + (y - 112)**2 <= 30**2
    img[mask] = [150, 150, 150]
    
    return img

# Create test image
test_img = create_test_image()

# Display
plt.figure(figsize=(6, 6))
plt.imshow(test_img)
plt.title('Test Image: Geometric Patterns', fontsize=14, fontweight='bold')
plt.axis('off')
plt.show()

# Preprocess for MobileNetV2
img_array = test_img.astype('float32')
img_array = np.expand_dims(img_array, axis=0)
img_array = preprocess_input(img_array)

# Classify
predictions = base_model.predict(img_array, verbose=0)
decoded = decode_predictions(predictions, top=3)[0]

print("\nTop 3 predictions:")
for i, (imagenet_id, label, score) in enumerate(decoded, 1):
    print(f"{i}. {label}: {score*100:.2f}%")

print("\n(Note: Predictions may be unexpected for this synthetic image!)")

---
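
The preprocessing step in the cell above matters because MobileNetV2 expects inputs in the range [-1, 1], not [0, 255]. As a rough sketch of the arithmetic that `preprocess_input` performs for this model, the NumPy helper below (a hypothetical name, written only for illustration) applies the same rescaling.

```python
import numpy as np

def scale_to_minus1_plus1(pixels):
    """Map pixel values from [0, 255] to [-1, 1] (x / 127.5 - 1)."""
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# A few gray levels used in the synthetic test image, plus the extremes
sample = np.array([0, 30, 50, 100, 150, 200, 255])
scaled = scale_to_minus1_plus1(sample)
print(scaled)
```

Skipping this step, or applying it twice, usually does not raise an error; it just feeds the network values outside the range it was trained on, which tends to degrade predictions and activations.

---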

## 🔥 **Extract Feature Maps**

Now let's pass the image through our feature extraction model and get the activations from each layer!

---

In [None]:
# Get feature maps from all 3 layers
print("Extracting feature maps from 3 layers...\n")
activations = feature_model.predict(img_array, verbose=0)

print("✅ Feature maps extracted!\n")
for i, (name, activation) in enumerate(zip(layer_names, activations), 1):
    print(f"{i}. {name}")
    print(f"   Shape: {activation.shape}")
    print(f"   (batch, height, width, filters) = {activation.shape}\n")

---
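
The heights and widths printed above shrink with depth because MobileNetV2 downsamples with stride-2 convolutions. A small sketch of that arithmetic, assuming "same"-style padding so that each stride-2 stage halves the spatial size (rounding up); the helper name and the choice of three stages are illustrative only.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a convolution with 'same'-style padding."""
    return math.ceil(input_size / stride)

size = 224                      # MobileNetV2's default input height/width
sizes = [size]
for _ in range(3):              # three successive stride-2 stages
    size = same_padding_output_size(size, stride=2)
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28]
```

Each halving quarters the number of spatial positions, while the number of channels typically grows, so deeper layers trade spatial detail for a richer description of each location.

---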

## 🎨 **Visualize Layer 1 Feature Maps (Early Layer)**

**Expected:** Simple patterns like edges, gradients, and color transitions.

We'll visualize the first 16 filters in this layer.

---
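
Showing the first 16 channels is an arbitrary choice; some of them may be nearly blank for a given image. One possible alternative, sketched below with a hypothetical helper and a synthetic tensor standing in for real activations, is to rank channels by their mean activation and look at the strongest ones.

```python
import numpy as np

def rank_filters_by_mean_activation(activation, top_k=16):
    """Return indices of the `top_k` channels with the highest mean activation.

    `activation` has shape (height, width, filters), i.e. one layer's output
    with the batch dimension already removed.
    """
    mean_per_filter = activation.mean(axis=(0, 1))   # one score per channel
    order = np.argsort(mean_per_filter)[::-1]        # strongest first
    return order[:top_k]

# Synthetic stand-in for a real activation tensor
rng = np.random.default_rng(0)
fake_activation = rng.random((8, 8, 32)).astype(np.float32)
fake_activation[:, :, 5] += 10.0                     # make channel 5 clearly strongest

top = rank_filters_by_mean_activation(fake_activation, top_k=4)
print(top)
```

With the tensors from this lab, the call would look like `rank_filters_by_mean_activation(activations[0][0])`, and the returned indices could replace `range(16)` in the plotting loop.

---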

In [None]:
# Visualize first 16 feature maps from Layer 1
layer1_activation = activations[0][0]  # Remove batch dimension

fig, axes = plt.subplots(4, 4, figsize=(14, 14))
axes = axes.flatten()

for i in range(16):
    feature_map = layer1_activation[:, :, i]
    axes[i].imshow(feature_map, cmap='viridis')
    axes[i].set_title(f'Filter {i+1}', fontsize=10)
    axes[i].axis('off')

plt.tight_layout()
plt.suptitle(f'Layer 1 ({layer_names[0]}): Early Feature Maps\n(First 16 of {layer1_activation.shape[-1]} filters)', 
             fontsize=14, fontweight='bold', y=1.01)
plt.show()

print("\n" + "="*70)
print("WHAT TO OBSERVE IN LAYER 1:")
print("="*70)
print("\n- Each square shows one filter's response")
print("- Bright/yellow regions = filter detected its pattern")
print("- Dark/purple regions = filter didn't detect pattern\n")
print("Expected patterns at this early layer:")
print("  ✓ Vertical edges")
print("  ✓ Horizontal edges")
print("  ✓ Diagonal gradients")
print("  ✓ Color transitions")
print("  ✓ Simple contrasts\n")
print("Notice: Similar to the Sobel filters from Module 1!")
print("="*70)

---

## 📝 **Question Q10 (Prediction)**

### **Q10. Before viewing the feature maps for Layers 3 and 6, predict: What will Layer 3 (middle layer) detect? What about Layer 6 (deep layer)?**

*Hint: If Layer 1 detects edges, what might Layer 3 build from those edges? What might Layer 6 build from Layer 3's patterns?*

**Record your prediction in the Answer Sheet BEFORE continuing!**

---

## 🎨 **Visualize Layer 3 Feature Maps (Middle Layer)**

**Expected:** More complex patterns like corners, textures, and small shapes.

---

In [None]:
# Visualize first 16 feature maps from Layer 3
layer3_activation = activations[1][0]  # Remove batch dimension

fig, axes = plt.subplots(4, 4, figsize=(14, 14))
axes = axes.flatten()

for i in range(16):
    feature_map = layer3_activation[:, :, i]
    axes[i].imshow(feature_map, cmap='viridis')
    axes[i].set_title(f'Filter {i+1}', fontsize=10)
    axes[i].axis('off')

plt.tight_layout()
plt.suptitle(f'Layer 3 ({layer_names[1]}): Middle Feature Maps\n(First 16 filters)', 
             fontsize=14, fontweight='bold', y=1.01)
plt.show()

print("\n" + "="*70)
print("WHAT TO OBSERVE IN LAYER 3:")
print("="*70)
print("\nExpected patterns at this middle layer:")
print("  ✓ Corners and junctions")
print("  ✓ Texture patterns (stripes, dots)")
print("  ✓ Curves and arcs")
print("  ✓ Simple shapes (circles, rectangles)")
print("  ✓ Combinations of edges\n")
print("Notice: More abstract than Layer 1!")
print("Feature maps are smaller (downsampled by stride-2 convolutions)")
print("="*70)

---

## 🎨 **Visualize Layer 6 Feature Maps (Deep Layer)**

**Expected:** Complex patterns like object parts, contextual features.

---

In [None]:
# Visualize first 16 feature maps from Layer 6
layer6_activation = activations[2][0]  # Remove batch dimension

fig, axes = plt.subplots(4, 4, figsize=(14, 14))
axes = axes.flatten()

for i in range(16):
    feature_map = layer6_activation[:, :, i]
    axes[i].imshow(feature_map, cmap='viridis')
    axes[i].set_title(f'Filter {i+1}', fontsize=10)
    axes[i].axis('off')

plt.tight_layout()
plt.suptitle(f'Layer 6 ({layer_names[2]}): Deep Feature Maps\n(First 16 filters)', 
             fontsize=14, fontweight='bold', y=1.01)
plt.show()

print("\n" + "="*70)
print("WHAT TO OBSERVE IN LAYER 6:")
print("="*70)
print("\nExpected patterns at this deep layer:")
print("  ✓ Object parts (if real photo: wheels, windows, ears, etc.)")
print("  ✓ Complex shapes and structures")
print("  ✓ Contextual/semantic features")
print("  ✓ Combinations of mid-level patterns\n")
print("Notice: Much more abstract than Layers 1 and 3!")
print("Feature maps are even smaller (more stride-2 downsampling)")
print("Harder to interpret visually (very high-level representations)")
print("="*70)

---

## 📝 **Questions (Q11-Q15)**

Record your answers in the **Answer Sheet**.

---

### **Q11. Looking at Layer 1 feature maps, which filters activated strongly? What patterns did they detect?**

*Hint: Look for bright (yellow/white) regions. Which filters show strong activation? Do they correspond to edges, corners, or other patterns?*

**Record your answer in the Answer Sheet.**

---

### **Q12. Compare Layer 1 and Layer 6 feature maps. How are they different? What does this tell you about hierarchical learning?**

*Hint: Compare the patterns you see. Are Layer 1 patterns simple or complex? What about Layer 6? How does this support the idea of building from simple to complex features?*

**Record your answer in the Answer Sheet.**

---

### **Q13. Find a feature map in Layer 3 that activated strongly for one part of the image. What pattern was it detecting?**

*Hint: Look for filters that "light up" on specific features (circle, rectangle, stripes). What visual pattern caused that activation?*

**Record your answer in the Answer Sheet.**

---

### **Q14. Why do deeper layers show more abstract/complex patterns than early layers?**

*Hint: Each layer builds on the previous one. If Layer 1 detects edges, Layer 3 can combine edges into shapes. What can Layer 6 do with Layer 3's shapes?*

**Record your answer in the Answer Sheet.**

---

### **Q15. How does this connect to the saliency maps from Lab 6?**

*Hint: Saliency shows importance; feature maps show what's extracted. How do they work together to explain CNN predictions?*

**Record your answer in the Answer Sheet.**

---

## 📤 **Upload Your Own Image (Optional)**

Try feature map visualization with your own photo!

**Best results with:**
- Photos of animals (clear features like faces, fur)
- Buildings/architecture (lots of edges and shapes)
- Vehicles (distinctive parts like wheels, windows)

---

In [None]:
# Upload your own image
print("📤 Click 'Choose Files' and select an image...\n")

uploaded = files.upload()

if uploaded:
    # Load image
    filename = list(uploaded.keys())[0]
    img = Image.open(io.BytesIO(uploaded[filename]))
    img = img.convert('RGB').resize((224, 224))
    
    # Display
    plt.figure(figsize=(6, 6))
    plt.imshow(img)
    plt.title(f'Your Image: {filename}', fontsize=14, fontweight='bold')
    plt.axis('off')
    plt.show()
    
    # Preprocess
    img_array = np.array(img).astype('float32')
    img_array = np.expand_dims(img_array, axis=0)
    img_array = preprocess_input(img_array)
    
    # Classify
    predictions = base_model.predict(img_array, verbose=0)
    decoded = decode_predictions(predictions, top=3)[0]
    print("\nTop 3 predictions:")
    for i, (imagenet_id, label, score) in enumerate(decoded, 1):
        print(f"{i}. {label}: {score*100:.2f}%")
    
    # Extract features
    print("\nExtracting feature maps...\n")
    activations_user = feature_model.predict(img_array, verbose=0)
    
    # Visualize all 3 layers side by side
    fig, axes = plt.subplots(3, 6, figsize=(18, 10))
    
    for layer_idx in range(3):
        activation = activations_user[layer_idx][0]
        for i in range(6):
            axes[layer_idx, i].imshow(activation[:, :, i], cmap='viridis')
            axes[layer_idx, i].set_title(f'Filter {i+1}', fontsize=9)
            axes[layer_idx, i].axis('off')
        
        # Add layer label (use the layer name; the row index does not match the block number)
        axes[layer_idx, 0].text(-0.3, 0.5, layer_names[layer_idx], 
                               transform=axes[layer_idx, 0].transAxes,
                               fontsize=11, fontweight='bold',
                               rotation=90, va='center')
    
    plt.tight_layout()
    plt.suptitle(f'Feature Maps for Your Image: {filename}', 
                 fontsize=14, fontweight='bold', y=1.01)
    plt.show()
    
    print("\n✅ Feature maps generated for your image!")
    print("\nCompare the 3 layers:")
    print("  Layer 1 (top row): Simple edge patterns")
    print("  Layer 3 (middle row): Textures and shapes")
    print("  Layer 6 (bottom row): Complex, abstract patterns")
else:
    print("No image uploaded. Skipping this section.")

---

## 🔗 **Connection to Lab 6: Saliency vs. Feature Maps**

Let's connect what you learned in Lab 6 to what you're seeing today:

| Concept | Lab 6 (Saliency) | Lab 7 Module 2 (Feature Maps) |
|---------|------------------|-------------------------------|
| **Question** | Which pixels matter? | What patterns are detected? |
| **Method** | Gradients w.r.t. input | Forward pass through layers |
| **Visualization** | Heatmap overlay | Grid of activation maps |
| **Interpretation** | Importance of input regions | Patterns found by each filter |
| **Reveals** | WHERE the model looks | WHAT the model extracts |

### **Together, They Explain CNNs:**

1. **Feature Maps (Lab 7):** Convolution layers extract hierarchical features
   - Layer 1: Edges
   - Layer 3: Textures and shapes
   - Layer 6: Object parts

2. **Saliency (Lab 6):** Shows which extracted features matter for the final decision
   - "The dog's ears and snout are important"
   - "The background grass is not important"

**Full picture:**
```
Input Image
    ↓
Layer 1 extracts edges → Feature maps show edge locations → Some edges more important than others
    ↓
Layer 3 combines edges into shapes → Feature maps show shapes → Some shapes more important
    ↓
Layer 6 combines shapes into parts → Feature maps show parts → Some parts more important
    ↓
Final prediction: "Dog"
    ↑
Saliency map reveals: Dog's face drove this decision
```

---

## ✅ Module 2 Complete!

You now understand:
- **What feature maps are** (outputs of convolutional layers)
- **How to visualize them** (using intermediate layer outputs)
- **What different layers detect** (edges → textures → shapes → parts)
- **Hierarchical learning** (simple patterns → complex patterns)
- **Connection to saliency** (feature maps show WHAT, saliency shows WHERE)

**Key insight:**
> CNNs automatically learn hierarchical feature detectors, starting with simple edges and building up to complex object parts, all through gradient descent!

**Ready to understand WHY this hierarchy works?**

Move on to **Module 3: Hierarchical Feature Extraction**, where you'll learn the principles behind this powerful architecture!

---