# CMSC 178IP - Digital Image Processing
# Final Examination

**Student Name:** _______________________

**Student Number:** _______________________

**Date:** _______________________

---

## Exam Information

| Item | Details |
|------|---------|
| **Total Points** | 100 points |
| **Time Allocation** | 3-4 hours (self-paced) |
| **Deadline** | 1 week from release |
| **Format** | Jupyter Notebook + PDF export |

## Exam Structure

| Part | Topic | Points |
|------|-------|--------|
| Part 1 | Image Fundamentals | 20 |
| Part 2 | Image Processing & Filtering | 20 |
| Part 3 | Feature Extraction & Segmentation | 20 |
| Part 4 | Deep Learning for Computer Vision | 20 |
| Part 5 | Generative Models | 20 |
| Bonus | Integrated Application | 10 |

## Instructions

1. **Answer all written questions** in the designated markdown cells
2. **Complete all code cells** marked with `# TODO`
3. **Run all cells** before submission (outputs must be visible)
4. **Export to PDF** and submit both `.ipynb` and `.pdf`
5. You may use course materials, textbooks, and online resources
6. LLM usage (ChatGPT, Claude, etc.) is permitted; document what you learned in the LLM Usage Log at the end of the notebook
7. An **oral examination** (10-15 minutes) will follow to verify understanding

## Personalized Parameters

Use the last six digits of your **student number as a random seed**; the noise levels and latent dimension used in this exam are derived from it:

```python
MY_SEED = int("YOUR_STUDENT_NUMBER"[-6:])  # Last 6 digits
```
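Student numbers are often written with separators (for example a hyphen). In that case the last six characters may not all be digits, and `int(...)` would either fail or return a negative value that `np.random.seed` rejects. A minimal sketch of a more defensive conversion, assuming that dropping non-digit characters is acceptable; the helper name `seed_from_student_number` and the sample number are hypothetical:

```python
def seed_from_student_number(student_number: str) -> int:
    """Return the last six digits of a student number as an integer seed."""
    # Keep ASCII digits only, so separators such as '-' or spaces are ignored.
    digits = "".join(ch for ch in student_number if ch in "0123456789")
    if not digits:
        raise ValueError("student number must contain at least one digit")
    return int(digits[-6:])


# Hypothetical student number, for illustration only.
MY_SEED = seed_from_student_number("2019-12345")
print(MY_SEED)  # 912345
```

A number that is already six plain digits passes through unchanged, so the result matches the one-liner above in that case.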

---

## Setup and Imports

Run this cell first to import all required libraries.

In [None]:
# Standard imports
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

# Image processing
from skimage import data, color, filters, morphology, measure, exposure, transform
from skimage.util import random_noise
from skimage.feature import canny
from scipy import ndimage
from scipy.signal import convolve2d

# Utilities
import warnings
warnings.filterwarnings('ignore')

# Display settings
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 100

# ============================================================
# PERSONALIZED SEED - REPLACE WITH YOUR STUDENT NUMBER!
# ============================================================
MY_SEED = int("123456"[-6:])  # TODO: REPLACE "123456" WITH YOUR STUDENT NUMBER (last 6 digits are used)

np.random.seed(MY_SEED)

# Generate personalized parameters
MY_NOISE_VAR = 0.01 + (MY_SEED % 100) / 2000
MY_SP_AMOUNT = 0.03 + (MY_SEED % 50) / 1000
MY_LATENT_DIM = 16 + (MY_SEED % 32)

print("All imports successful!")
print("\nYour personalized parameters:")
print(f"   Noise variance: {MY_NOISE_VAR:.4f}")
print(f"   S&P amount: {MY_SP_AMOUNT:.4f}")
print(f"   Latent dimension: {MY_LATENT_DIM}")
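The formulas in the setup cell bound each personalized value to a fixed range, which can be used to sanity-check the printed output. A small sketch that recomputes them in plain Python; the function name `personalized_parameters` is introduced here only for illustration:

```python
def personalized_parameters(seed: int) -> dict:
    """Recompute the personalized values with the same formulas as the setup cell."""
    return {
        "noise_var": 0.01 + (seed % 100) / 2000,  # ranges from 0.0100 to 0.0595
        "sp_amount": 0.03 + (seed % 50) / 1000,   # ranges from 0.030 to 0.079
        "latent_dim": 16 + (seed % 32),           # ranges from 16 to 47
    }


params = personalized_parameters(123456)  # placeholder seed from the setup cell
print(params)
```

If the values printed by the notebook fall outside these ranges, the seed was most likely not set as intended.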

In [None]:
# Helper functions
def show_images(images, titles=None, cmap='gray', figsize=(15, 5)):
    """Display multiple images in a row."""
    n = len(images)
    fig, axes = plt.subplots(1, n, figsize=figsize)
    if n == 1:
        axes = [axes]
    for i, (img, ax) in enumerate(zip(images, axes)):
        if img.ndim == 2:
            ax.imshow(img, cmap=cmap)
        else:
            ax.imshow(img)
        if titles:
            ax.set_title(titles[i])
        ax.axis('off')
    plt.tight_layout()
    plt.show()

print("Helper functions loaded!")

---

# Part 1: Image Fundamentals (20 points)

This section covers image representation, quantization, sampling, and color spaces.

## 1.1 Image Representation and Quantization (6 points)

A grayscale image uses 8-bit quantization (256 gray levels).

**Q1.1a (2 pts):** If we reduce the quantization to 4 bits, how many gray levels would be available?

*Your answer:*


**Q1.1b (2 pts):** Describe TWO visible artifacts that would appear when reducing from 8-bit to 2-bit quantization.

*Your answer:*


**Q1.1c (2 pts):** Why does the human eye perceive these artifacts more strongly in smooth gradient regions than in textured regions?

*Your answer:*

## 1.2 Sampling and Aliasing (6 points)

**Q1.2a (2 pts):** State the Nyquist-Shannon sampling theorem in your own words.

*Your answer:*


**Q1.2b (2 pts):** A camera captures an image of a striped shirt. The stripes have a spatial frequency of 50 cycles per cm. If the camera samples at 80 samples per cm, will aliasing occur? Explain your reasoning.

*Your answer:*


**Q1.2c (2 pts):** Describe ONE practical method to prevent aliasing in digital cameras.

*Your answer:*

## 1.3 Color Spaces (8 points)

**Q1.3a (2 pts):** Why is YCbCr color space preferred over RGB for image/video compression?

*Your answer:*


**Q1.3b (2 pts):** JPEG compression commonly applies 4:2:0 chroma subsampling. Explain what this means and why it is perceptually acceptable.

*Your answer:*

In [None]:
# Practical: Color Space Exploration (4 pts)
# Load and explore color images

color_img = data.astronaut()

# TODO: Convert to different color spaces and display
gray_img = color.rgb2gray(color_img)
hsv_img = color.rgb2hsv(color_img)

fig, axes = plt.subplots(2, 3, figsize=(15, 10))

# Original RGB
axes[0, 0].imshow(color_img)
axes[0, 0].set_title('Original RGB')
axes[0, 0].axis('off')

# Grayscale
axes[0, 1].imshow(gray_img, cmap='gray')
axes[0, 1].set_title('Grayscale')
axes[0, 1].axis('off')

# HSV - Hue channel
axes[0, 2].imshow(hsv_img[:, :, 0], cmap='hsv')
axes[0, 2].set_title('Hue (H)')
axes[0, 2].axis('off')

# TODO: Display Saturation and Value channels
# axes[1, 0].imshow(hsv_img[:, :, 1], cmap='gray')
# axes[1, 0].set_title('Saturation (S)')

# axes[1, 1].imshow(hsv_img[:, :, 2], cmap='gray')
# axes[1, 1].set_title('Value (V)')

plt.tight_layout()
plt.show()

# Q: Why is HSV useful for color-based object detection?
# Your answer in the markdown cell below:

**Q1.3c (4 pts):** Based on your exploration above, why is HSV color space useful for image processing tasks like object detection based on color? Give a specific example.

*Your answer:*

---

# Part 2: Image Processing & Filtering (20 points)

This section covers convolution, frequency domain, histogram operations, and noise filtering.

## 2.1 Convolution and Correlation (6 points)

**Q2.1a (2 pts):** What is the fundamental difference between convolution and correlation? When does this difference matter?

*Your answer:*


**Q2.1b (2 pts):** Given a 3x3 kernel, explain why we need to "flip" it for convolution but not for correlation.

*Your answer:*


**Q2.1c (2 pts):** A separable 5x5 filter can be decomposed into two 1D filters. How many multiplications are saved by applying it to a 512x512 image as two 1D passes instead of as a single 2D 5x5 kernel? Show your calculation.

*Your answer:*

## 2.2 Frequency Domain Processing (6 points)

**Q2.2a (2 pts):** What types of image features correspond to LOW frequencies in the Fourier domain? What about HIGH frequencies?

*Your answer:*


**Q2.2b (2 pts):** A researcher applies an ideal low-pass filter (sharp cutoff) in the frequency domain. They observe "ringing" artifacts in the output image. Explain why this occurs.

*Your answer:*


**Q2.2c (2 pts):** What filter shape would reduce ringing while still removing high frequencies?

*Your answer:*

In [None]:
# Practical: Noise Filtering (8 pts)
# Different types of noise require different filtering approaches

original = data.camera() / 255.0

# Create noisy versions using YOUR personalized parameters
noisy_gaussian = random_noise(original, mode='gaussian', var=MY_NOISE_VAR)
noisy_sp = random_noise(original, mode='s&p', amount=MY_SP_AMOUNT)

print("Your personalized noise levels:")
print(f"   Gaussian variance: {MY_NOISE_VAR:.4f}")
print(f"   S&P amount: {MY_SP_AMOUNT:.4f}")

show_images([original, noisy_gaussian, noisy_sp], 
            ['Original', 'Gaussian Noise', 'Salt & Pepper Noise'])

# TODO: Apply appropriate filters for each noise type
# For Gaussian noise: try Gaussian blur
filtered_gaussian = None  # TODO: filters.gaussian(noisy_gaussian, sigma=?)

# For Salt & Pepper: try median filter
filtered_sp = None  # TODO: filters.median(noisy_sp, morphology.disk(?))

# TODO: Display filtered results and compare
# show_images([noisy_gaussian, filtered_gaussian, noisy_sp, filtered_sp],
#             ['Gaussian Noise', 'Filtered', 'S&P Noise', 'Filtered'])
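When comparing the filtered results, a numeric score can complement visual inspection. The sketch below computes the peak signal-to-noise ratio (PSNR) with NumPy, assuming images scaled to [0, 1] as in the cell above. The `psnr` helper is not part of the exam template; scikit-image provides an equivalent in `skimage.metrics.peak_signal_noise_ratio`.

```python
import numpy as np


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images
    return 10 * np.log10(data_range ** 2 / mse)


# Synthetic example: a flat gray image with a uniform error of 0.1.
clean = np.full((4, 4), 0.5)
degraded = clean + 0.1
print(f"{psnr(clean, degraded):.1f} dB")  # 20.0 dB
```

In the notebook this could be called as `psnr(original, filtered_gaussian)` once the TODO lines are filled in.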

**Q2.2d (4 pts):** Why does a median filter work better for salt-and-pepper noise than Gaussian blur? What would happen if you used Gaussian blur on S&P noise?

*Your answer:*


**Q2.2e (4 pts):** Why is a bilateral filter often preferred over Gaussian blur for denoising photographs of faces?

*Your answer:*

---

# Part 3: Feature Extraction & Segmentation (20 points)

This section covers edge detection, feature descriptors, and image segmentation.

## 3.1 Edge Detection (6 points)

**Q3.1a (3 pts):** The Canny edge detector uses non-maximum suppression. What is its purpose and how does it improve edge detection compared to simple thresholding?

*Your answer:*


**Q3.1b (3 pts):** Compare the Sobel operator and the Laplacian of Gaussian (LoG) for edge detection. When would you choose one over the other?

*Your answer:*

In [None]:
# Practical: Edge Detection (6 pts)

edge_image = data.camera() / 255.0

# TODO: Apply Sobel edge detection
sobel_h = filters.sobel_h(edge_image)  # Responds to horizontal edges (intensity change along rows / y)
sobel_v = filters.sobel_v(edge_image)  # Responds to vertical edges (intensity change along columns / x)
sobel_magnitude = None  # TODO: Calculate magnitude = sqrt(sobel_h^2 + sobel_v^2)

# TODO: Apply Canny edge detection with different sigma values
canny_sigma1 = None  # TODO: canny(edge_image, sigma=1)
canny_sigma3 = None  # TODO: canny(edge_image, sigma=3)

# Display results
# show_images([edge_image, sobel_magnitude, canny_sigma1, canny_sigma3],
#             ['Original', 'Sobel Magnitude', 'Canny σ=1', 'Canny σ=3'])

**Q3.1c (2 pts):** How does the sigma parameter in Canny affect the results? What happens with σ=1 vs σ=3?

*Your answer:*

## 3.2 Feature Descriptors (4 points)

**Q3.2a (2 pts):** SIFT (Scale-Invariant Feature Transform) is described as "invariant to scale and rotation." Explain how SIFT achieves scale invariance.

*Your answer:*


**Q3.2b (2 pts):** Why are feature descriptors like SIFT/ORB useful for image stitching (creating panoramas)? What could go wrong if the images have very different lighting conditions?

*Your answer:*

## 3.3 Segmentation (4 points)

**Q3.3a (2 pts):** Otsu's thresholding automatically selects a threshold value. What criterion does it optimize, and why might it fail on images with uneven illumination?

*Your answer:*


**Q3.3b (2 pts):** Compare region-based segmentation (e.g., region growing) with edge-based segmentation. Give one advantage and one disadvantage of each approach.

*Your answer:*

In [None]:
# Practical: Thresholding & Segmentation (6 pts)

coins = data.coins()

# TODO: Apply different thresholding methods

# 1. Manual threshold
threshold_manual = 100
binary_manual = coins > threshold_manual

# 2. Otsu's automatic threshold
threshold_otsu = None  # TODO: filters.threshold_otsu(coins)
binary_otsu = None  # TODO: coins > threshold_otsu

# 3. Adaptive (local) threshold
binary_adaptive = None  # TODO: coins > filters.threshold_local(coins, block_size=35)  (threshold_local returns a per-pixel threshold image, not a binary mask)

# Display results
# show_images([coins, binary_manual, binary_otsu, binary_adaptive],
#             ['Original', f'Manual (t={threshold_manual})', 
#              f'Otsu (t={threshold_otsu})', 'Adaptive'])

---

# Part 4: Deep Learning for Computer Vision (20 points)

This section covers CNN architectures, training, and object detection.

## 4.1 CNN Fundamentals (8 points)

**Q4.1a (3 pts):** A convolutional layer uses 32 filters of size 3x3 on an input with 3 channels (RGB). How many learnable parameters does this layer have (including biases)? Show your calculation.

*Your answer:*


**Q4.1b (3 pts):** Explain the purpose of pooling layers in CNNs. What is the trade-off between using max pooling vs. average pooling?

*Your answer:*


**Q4.1c (2 pts):** Why do Conv2D layers have far fewer parameters than Dense layers, even though they process the entire image?

*Your answer:*

In [None]:
# Practical: CNN Parameter Calculation (4 pts)
# Consider this simple CNN for CIFAR-10 (32x32x3 input, 10 classes)
# Assume Keras-style defaults for Conv2D (stride 1, no padding) unless you state otherwise

"""
Layer 1: Conv2D(filters=32, kernel_size=3x3, input_channels=3)
Layer 2: MaxPool2D(pool_size=2x2)
Layer 3: Conv2D(filters=64, kernel_size=3x3)
Layer 4: MaxPool2D(pool_size=2x2)
Layer 5: Flatten
Layer 6: Dense(128)
Layer 7: Dense(10)  # Output layer
"""

# TODO: Calculate output shape after each layer
print("=== Output Shapes ===")
print("Input:                32 × 32 × 3")
print("After Conv2D(32,3×3): ___ × ___ × ___")  # TODO
print("After MaxPool(2×2):   ___ × ___ × ___")  # TODO
print("After Conv2D(64,3×3): ___ × ___ × ___")  # TODO
print("After MaxPool(2×2):   ___ × ___ × ___")  # TODO
print("After Flatten:        ___")              # TODO

# TODO: Calculate parameters for each layer
print("\n=== Parameters ===")
print("Conv2D(32, 3×3, in=3):  (3 × 3 × 3 + 1) × 32 = ___")  # TODO
print("Conv2D(64, 3×3, in=32): (3 × 3 × ___ + 1) × 64 = ___")  # TODO
print("Dense(128):             (___ + 1) × 128 = ___")  # TODO
print("Dense(10):              (128 + 1) × 10 = ___")  # TODO
print("\nTotal parameters: ___")

## 4.2 Training Deep Networks (4 points)

**Q4.2a (2 pts):** A student trains a CNN on a small dataset of 500 images. The training accuracy reaches 99% but validation accuracy is only 60%. Diagnose the problem and suggest TWO concrete solutions.

*Your answer:*


**Q4.2b (2 pts):** Explain how transfer learning works and why it's especially valuable when you have limited training data.

*Your answer:*

## 4.3 Object Detection and Segmentation (4 points)

**Q4.3a (2 pts):** Compare two-stage detectors (like Faster R-CNN) with one-stage detectors (like YOLO). What is the main trade-off between them?

*Your answer:*


**Q4.3b (2 pts):** What is the difference between semantic segmentation and instance segmentation? Give an example scenario where instance segmentation would be necessary but semantic segmentation would be insufficient.

*Your answer:*

---

# Part 5: Generative Models (20 points)

This section covers autoencoders, VAEs, and GANs.

## 5.1 Autoencoders and VAEs (10 points)

**Q5.1a (3 pts):** A standard autoencoder can compress and reconstruct images, but it's not good for generating NEW images. Explain why.

*Your answer:*


**Q5.1b (3 pts):** How does a Variational Autoencoder (VAE) solve this problem? Specifically, explain the role of the KL divergence term in the VAE loss function.

*Your answer:*


**Q5.1c (2 pts):** In β-VAE, we use β > 1 to weight the KL term more heavily. What is the trade-off when increasing β?

*Your answer:*

In [None]:
# Practical: Autoencoder Architecture Design (2 pts)
# Design an autoencoder for MNIST (28x28 = 784 pixels)

print(f"Your personalized latent dimension: {MY_LATENT_DIM}")
print(f"Design your autoencoder to compress to {MY_LATENT_DIM} dimensions\n")

# TODO: Define your architecture and calculate parameters
"""
Encoder:
    Input(784) 
    → Dense(?) + ReLU
    → Dense(?) + ReLU  
    → Dense(MY_LATENT_DIM)

Decoder:
    Input(MY_LATENT_DIM)
    → Dense(?) + ReLU
    → Dense(?) + ReLU
    → Dense(784) + Sigmoid
"""

print("Describe your architecture and calculate total parameters:")
# Your calculation here

## 5.2 Generative Adversarial Networks (10 points)

**Q5.2a (3 pts):** Describe the "adversarial game" between the generator and discriminator in a GAN. What is each network trying to optimize?

*Your answer:*


**Q5.2b (3 pts):** "Mode collapse" is a common problem in GAN training. What is mode collapse and what causes it?

*Your answer:*

In [None]:
# Practical: GAN Training Dynamics Analysis (4 pts)
# Analyze these simulated training curves

np.random.seed(42)
epochs = 100

# Scenario A: Healthy training
d_loss_healthy = 0.7 - 0.2 * (1 - np.exp(-np.arange(epochs)/30)) + 0.05 * np.random.randn(epochs)
g_loss_healthy = 2.0 - 1.3 * (1 - np.exp(-np.arange(epochs)/40)) + 0.08 * np.random.randn(epochs)

# Scenario B: Mode collapse  
d_loss_collapse = np.concatenate([0.7 - 0.3 * np.arange(30)/30, np.ones(70) * 0.1 + 0.02 * np.random.randn(70)])
g_loss_collapse = np.concatenate([2.0 - 0.5 * np.arange(30)/30, np.ones(70) * 0.3 + 0.05 * np.random.randn(70)])

# Scenario C: Discriminator too strong
d_loss_strong_d = 0.7 * np.exp(-np.arange(epochs)/10) + 0.02 * np.random.randn(epochs)
g_loss_strong_d = 2.0 + 0.5 * np.log(1 + np.arange(epochs)/20) + 0.1 * np.random.randn(epochs)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

axes[0].plot(d_loss_healthy, 'b-', label='D Loss')
axes[0].plot(g_loss_healthy, 'r-', label='G Loss')
axes[0].set_title('Scenario A')
axes[0].legend()
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')

axes[1].plot(d_loss_collapse, 'b-', label='D Loss')
axes[1].plot(g_loss_collapse, 'r-', label='G Loss')
axes[1].set_title('Scenario B')
axes[1].legend()
axes[1].set_xlabel('Epoch')

axes[2].plot(d_loss_strong_d, 'b-', label='D Loss')
axes[2].plot(g_loss_strong_d, 'r-', label='G Loss')
axes[2].set_title('Scenario C')
axes[2].legend()
axes[2].set_xlabel('Epoch')

plt.tight_layout()
plt.show()

**Q5.2c (4 pts):** For each scenario (A, B, C), describe what is happening during training and whether it represents healthy or problematic training.

*Scenario A:*

*Scenario B:*

*Scenario C:*

## 5.3 Comparing Generative Models

Complete this comparison table:

| Aspect | Autoencoder | VAE | GAN |
|--------|-------------|-----|-----|
| Training stability | | | |
| Output quality (sharpness) | | | |
| Latent space interpolation | | | |
| Can generate new samples? | | | |

---

# Bonus: Integrated Application (10 points)

You are tasked with building an automated quality control system for a manufacturing line that produces printed circuit boards (PCBs). The system must detect defects such as missing components, misaligned parts, and solder bridges.

Design a complete image processing pipeline that addresses this problem. Your answer should include:

**a) Image acquisition considerations (lighting, camera setup) (2 pts)**

*Your answer:*


**b) Preprocessing steps to normalize images and reduce noise (2 pts)**

*Your answer:*


**c) The main detection approach (traditional CV, deep learning, or hybrid) with justification (4 pts)**

*Your answer:*


**d) How you would handle the challenge of limited defect samples for training (2 pts)**

*Your answer:*

---

# LLM Usage Log

Document all LLM interactions. Be honest - this helps you reflect on your learning.

| Question/Task | LLM Used | What I LEARNED |
|---------------|----------|----------------|
| Example: "How does KL divergence work" | ChatGPT | I learned it measures how different two distributions are |
| | | |
| | | |
| | | |

---

# Submission Checklist

- [ ] Student name and student number filled in at the top
- [ ] MY_SEED replaced with YOUR student number
- [ ] All code cells executed (outputs visible)
- [ ] All written questions answered
- [ ] All TODO items completed
- [ ] LLM usage documented
- [ ] Notebook exported as PDF
- [ ] Both `.ipynb` and `.pdf` ready for submission

**File naming:** `LastName_FirstName_Finals.ipynb` and `LastName_FirstName_Finals.pdf`
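Before submitting, a short check can confirm that both files exist under the expected names. This is a sketch assuming the files sit in the current working directory; `check_submission` and the sample name are hypothetical:

```python
from pathlib import Path


def check_submission(last_name: str, first_name: str, folder: str = ".") -> list:
    """Return the expected submission files that are missing from `folder`."""
    stem = f"{last_name}_{first_name}_Finals"
    expected = [Path(folder) / f"{stem}{ext}" for ext in (".ipynb", ".pdf")]
    return [str(path) for path in expected if not path.exists()]


missing = check_submission("DelaCruz", "Juan")  # placeholder name
print("Missing files:", missing if missing else "none")
```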

---

## UP Honor Code Statement

*"Honor and Excellence" (Karangalan at Kahusayan)*

By submitting this exam, I affirm that:
1. All answers represent my own understanding
2. I have documented all external help received
3. I can explain and defend any answer I wrote

**Student Signature:** ______________________ 

**Date:** __________