# **Representing Images as Matrices**

### **Introduction and Objectives**

In this lab, we will explore how images are represented numerically in a machine learning pipeline. By the end of this lab, you will:
- Understand how to load images as NumPy arrays.
- Learn how color channels (RGB) translate to 3D arrays.
- Perform simple manipulations (e.g., resizing, reshaping, normalizing).
- Prepare images as model inputs in a way that is acceptable to typical ML frameworks.


[![Watch the video](https://img.youtube.com/vi/06OHflWNCOE/0.jpg)](https://www.youtube.com/watch?v=06OHflWNCOE)


## 📚 **Imports and Libraries**

**Why these libraries?**
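- **OpenCV (cv2)**: A powerful library for reading, writing, and manipulating images and videos. We will use its functions to load images, resize them, convert them to grayscale, and apply a threshold.
- **NumPy (np)**: OpenCV returns every loaded image as a NumPy array. We also use NumPy directly in the optional normalization section.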

### 🔗 **References:**
- [OpenCV Documentation](https://docs.opencv.org/master/)


## **1️⃣ Load Grayscale and Apply Threshold**

In [1]:
!git clone https://github.com/AmmarMohanna/EECE490-690.git

Cloning into 'EECE490-690'...
remote: Enumerating objects: 212, done.
remote: Counting objects: 100% (212/212), done.
remote: Compressing objects: 100% (181/181), done.
remote: Total 212 (delta 51), reused 179 (delta 24), pack-reused 0 (from 0)
Receiving objects: 100% (212/212), 36.34 MiB | 22.81 MiB/s, done.
Resolving deltas: 100% (51/51), done.


In [2]:
import cv2

# We will demonstrate loading an image in grayscale mode, resizing it, and applying a threshold.
image = cv2.imread('/content/EECE490-690/Chapters/C2 - Preparing Data for Statistical Machine Learning Algorithms /dino.png', cv2.IMREAD_GRAYSCALE)  # Load as grayscale
image = cv2.resize(image, (15, 15))  # Resize to 15x15 for simplicity

# Apply a binary threshold
# Any pixel value greater than 100 becomes 1; values of 100 or less become 0.
_, binary_image = cv2.threshold(image, 100, 1, cv2.THRESH_BINARY)

# Display the resulting binary image array
binary_image


**Explanation:**

1. **`cv2.imread(path, cv2.IMREAD_GRAYSCALE)`** loads `dino.png` from the cloned repository in grayscale mode.  
2. **`cv2.resize(image, (15, 15))`** resizes the image to 15×15 pixels.  
3. **`cv2.threshold(image, 100, 1, cv2.THRESH_BINARY)`** applies a binary threshold:  
   - Threshold value = 100  
   - Values greater than 100 become 1  
   - Values of 100 or less become 0  

The output, `binary_image`, is a 2D array (matrix) of 0s and 1s. Each cell in the matrix corresponds to a pixel in the thresholded image.


## **2️⃣ Load and Display Grayscale**

In [3]:
# Load and display the grayscale image as a matrix
image = cv2.imread('/content/EECE490-690/Chapters/C2 - Preparing Data for Statistical Machine Learning Algorithms /dino.png', cv2.IMREAD_GRAYSCALE)
image = cv2.resize(image, (15, 15))

# Display the grayscale image array
image

**Explanation:**

In this cell, we:
- Reload the original image (`dino.png` from the cloned repository) in **grayscale**.
- Resize the image to a **15×15** pixel dimension.
- Simply display the **NumPy array** of grayscale pixel values (ranging from 0 to 255).

You can see the shape `(15, 15)` if you print `image.shape`. Every element corresponds to a single pixel's intensity.

**Example**:
```python
print("Image Shape:", image.shape)
print("A pixel example (row=0, col=0):", image[0, 0])
```


## **3️⃣ Load and Display Color Image**

In [4]:
# Load and display the color image as a matrix
image = cv2.imread('/content/EECE490-690/Chapters/C2 - Preparing Data for Statistical Machine Learning Algorithms /dino.png')  # Load the image in color
image = cv2.resize(image, (15, 15))     # Resize to 15x15

# Display the color image array
image

**Explanation:**

1. **`cv2.imread(path)`**, called without a flag, loads the image in color. OpenCV stores the channels in **BGR** (Blue, Green, Red) order. Unlike the grayscale image, the array now has 3 channels.  
2. **`cv2.resize(image, (15, 15))`** again resizes to 15×15.  
3. Displaying the array shows three values per pixel, and `image.shape` is **(15, 15, 3)**: **height = 15**, **width = 15**, and **3 color channels**.

For color images, each pixel is represented by three intensity values, one for each channel (B, G, R).
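
To make the extra dimension concrete, here is a minimal sketch with hand-built arrays. The names `gray` and `bgr` and their values are illustrative assumptions, not data from the lab image. It shows how a single pixel and a single channel are indexed in a color array.

```python
import numpy as np

# Hypothetical arrays: a 2x2 grayscale image and a 2x2 color image in BGR order.
gray = np.zeros((2, 2), dtype=np.uint8)
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[0, 0] = (255, 0, 0)   # top-left pixel: pure blue (B=255, G=0, R=0)
bgr[1, 1] = (0, 0, 255)   # bottom-right pixel: pure red

print("Grayscale:", gray.shape, "dimensions:", gray.ndim)
print("Color:    ", bgr.shape, "dimensions:", bgr.ndim)

# One pixel of a color image is a length-3 vector of channel intensities.
b, g, r = bgr[1, 1]
print("Bottom-right pixel (B, G, R):", b, g, r)

# One channel of a color image is itself a 2D array, like a grayscale image.
blue_channel = bgr[:, :, 0]
print("Blue channel shape:", blue_channel.shape)
```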


## **Key Takeaways**

- **Images as Arrays**: A grayscale image is a 2D array of shape (height, width), while a color image is a 3D array of shape (height, width, 3) with one slice per channel (e.g., BGR or RGB).
- **Resizing**: Reducing the height and width lowers memory use, and resizing every image to the same dimensions gives an ML model a fixed input size.
- **Thresholding**: Binary thresholding maps each pixel to one of two values (here 0 or 1). It is often used for segmentation or for reducing an image to a simpler form.
- **OpenCV Default**: By default, OpenCV reads images in the BGR channel ordering, while other libraries (like Matplotlib) might expect RGB.


## **Reflection Questions**

1. Why might we want to resize an image before passing it to a machine learning model?
2. What are some downstream computer vision tasks that benefit from binary thresholding?
3. How do the shapes of grayscale vs. color images differ numerically?


# **OPTIONAL: Basic Image Normalization**

When working with images in Machine Learning, we often normalize or scale pixel values. This step can help many algorithms and neural networks train more effectively.

For instance, **min-max normalization** maps an image's pixel values from their observed range [min, max] (which is [0, 255] for an 8-bit image that contains both pure black and pure white) to [0, 1], making the data more suitable for models that expect small input values or that are sensitive to input scale.

**Why Normalize?**
- It ensures consistent data ranges (e.g., [0, 1]) across different images.
- Some machine learning models (like neural networks) converge faster when inputs are normalized.
- Keeping inputs small prevents raw intensities (up to 255) from dominating computations that are sensitive to input scale.

Below is a quick demonstration of how to normalize a grayscale image to a [0, 1] range.


In [5]:
# OPTIONAL CODE EXAMPLE: Basic Image Normalization

import cv2
import numpy as np

# 1) Load the image in grayscale
image = cv2.imread('/content/EECE490-690/Chapters/C2 - Preparing Data for Statistical Machine Learning Algorithms /dino.png', cv2.IMREAD_GRAYSCALE)

# 2) Convert the image to float type for precise calculations
image_float = image.astype(np.float32)

# 3) Compute min and max pixel values
min_val = image_float.min()
max_val = image_float.max()

# 4) Perform min-max normalization to the range [0, 1]
normalized_image = (image_float - min_val) / (max_val - min_val)

print("Before normalization:")
print("  Data type:", image.dtype)
print("  Pixel range:", image.min(), "to", image.max())

print("\nAfter normalization:")
print("  Data type:", normalized_image.dtype)  # float32, from the cast in step 2
print("  Pixel range:", normalized_image.min(), "to", normalized_image.max())


Before normalization:
  Data type: uint8
  Pixel range: 0 to 255

After normalization:
  Data type: float32
  Pixel range: 0.0 to 1.0


**Explanation:**

1. We load the image in grayscale (`cv2.IMREAD_GRAYSCALE`).
2. We convert its data type to `float32` for precision (originally it is `uint8`).
3. We apply the standard formula for min-max normalization:

$$\text{normalized\_pixel} = \frac{\text{pixel} - \text{min\_val}}{\text{max\_val} - \text{min\_val}}$$

4. After normalization, the pixel range is [0.0, 1.0], provided the image is not a single constant value (otherwise `max_val - min_val` is zero and the division is undefined).
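
The formula can be worked through by hand on a few values. The sketch below wraps it in a helper, `min_max_normalize`, which is a hypothetical name introduced here for illustration and not an OpenCV or NumPy function. It adds one assumption of its own: a constant image is returned as all zeros rather than dividing by zero.

```python
import numpy as np

def min_max_normalize(image: np.ndarray) -> np.ndarray:
    """Scale pixel values to [0, 1]. Hypothetical helper, not a library function."""
    image_float = image.astype(np.float32)
    min_val = image_float.min()
    max_val = image_float.max()
    if max_val == min_val:
        # A constant image has no range to stretch; return zeros
        # instead of dividing by zero (a design choice of this sketch).
        return np.zeros_like(image_float)
    return (image_float - min_val) / (max_val - min_val)

# Made-up 2x2 patch with min = 50 and max = 250, so the range is 200.
patch = np.array([[50, 100],
                  [150, 250]], dtype=np.uint8)
normalized = min_max_normalize(patch)
print(normalized)  # (100 - 50) / 200 = 0.25, (150 - 50) / 200 = 0.5

# Edge case: every pixel has the same value.
flat = np.full((2, 2), 128, dtype=np.uint8)
print(min_max_normalize(flat))
```

Because this patch does not span the full 0 to 255 range, min-max normalization stretches it: the value 50 maps to 0.0 and 250 maps to 1.0. Dividing by 255 instead would keep the values relative to the full 8-bit range.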


### **TODO: Practice Normalizing an Image**

Try one or more of the following mini-tasks to solidify your understanding:

1. **Custom Normalization Range**  
   - Modify the code above so the image is scaled to the range `[0, 255]` again (effectively re-scaling after normalization).  
   - *Hint:* You might multiply the normalized image by 255 and convert it back to `uint8`.

2. **Color Image Normalization**  
   - Load the **color** image (BGR) instead of grayscale.  
   - Convert it to a float type and normalize each channel to [0, 1].  
   - Print or observe the pixel range for each channel.

3. **Compare Histograms** (Optional)  
   - Use `cv2.calcHist()` or any other method to compute the histogram of the original grayscale image versus the normalized one.  
   - See how normalization alters the distribution of pixel values.
   
Document your observations in a new Markdown cell. If you feel comfortable, show a quick visualization of the normalized image using `matplotlib` to confirm that the image still “looks” the same, just with different numerical values.


In [6]:
#TODO HERE

# 🔗 **Additional Resources**

- **OpenCV Documentation:** [https://docs.opencv.org/master/](https://docs.opencv.org/master/)
- **NumPy Documentation (array manipulations):** [https://numpy.org/doc/](https://numpy.org/doc/)
- **Stanford CS231n (Image Data):** [http://cs231n.github.io/python-numpy-tutorial/#images](http://cs231n.github.io/python-numpy-tutorial/#images)
- **OpenCV Basic Image Operations:** [https://docs.opencv.org/master/dc/d71/tutorial_py_emi.html](https://docs.opencv.org/master/dc/d71/tutorial_py_emi.html)


---

# 📚 **Further Suggested Readings**

- **Pillow (PIL) Documentation:** [https://pillow.readthedocs.io/en/stable/](https://pillow.readthedocs.io/en/stable/)  
  Useful for opening and processing images in Python without OpenCV.

- **scikit-image Documentation:** [https://scikit-image.org/docs/stable/](https://scikit-image.org/docs/stable/)  
  A Python library for advanced image processing tasks such as edge detection, filters, and morphological operations.

- **Image Preprocessing Tips:**  
  Articles and tutorials on techniques like normalization, augmentation, and how these impact model performance:
  1. [How to Configure Image Data Augmentation When Training Deep Learning Neural Networks (Machine Learning Mastery)](https://machinelearningmastery.com/how-to-configure-image-data-augmentation-when-training-deep-learning-neural-networks/)  
  2. [Keras Image Preprocessing Layers (Official Keras Docs)](https://keras.io/api/preprocessing/image/)  
  3. [Data Augmentation | TensorFlow Core](https://www.tensorflow.org/tutorials/images/data_augmentation)  
  4. [CS231n: Normalization and BatchNorm](http://cs231n.github.io/neural-networks-2/#batchnorm)

---

# 🔗 **Kaggle Resources and Example Notebooks**

- [**Image Processing Basics with OpenCV for Beginners**](https://www.kaggle.com/code/zeeshanlatif/image-processing-basics-with-opencv-for-beginners): A beginner-friendly notebook demonstrating fundamental OpenCV operations like reading images, color space conversions, and basic transformations.  
- [**Complete Guide to Image Processing with OpenCV**](https://www.kaggle.com/code/natigmamishov/complete-guide-to-image-processing-with-opencv): Covers advanced OpenCV techniques, including filtering, edge detection, and morphological transformations.  
- [**Learn OpenCV by Examples - with Python**](https://www.kaggle.com/code/bulentsiyah/learn-opencv-by-examples-with-python): Provides practical examples for tasks like image thresholding, contour detection, and histogram equalization using OpenCV and Python.  