# Computer Vision 2025 Assignment 1: Image filtering.

In this assignment you will research, implement and test some image filtering operations. Image filtering by convolution is a fundamental step in many computer vision tasks, and you will find it useful to have a firm grasp of how it works.
For example, later in the course we will come across Convolutional Neural Networks (CNNs), which are built from convolutional filters that are learned to solve specific tasks.

The main aims of the assignment are:

- to understand the basics of how images are stored and processed in memory;
- to gain exposure to several common image filters, and understand how they work;
- to get practical experience implementing convolutional image filters;
- to test your intuition about image filtering by running some experiments;
- to report your results in a clear and concise manner.

*This assignment relates to the following ACS CBOK areas: abstraction, design, hardware and software, data and information, HCI and programming.*

## General instructions

Follow the instructions in this Python notebook and the accompanying file *a1code.py* to answer each question. It's your responsibility to make sure your answer to each question is clearly labelled and easy to understand. Note that most questions require some combination of Python code, graphical output, and text analysing or describing your results. Although we will check your code as needed, marks will be assigned primarily based on the quality of your report rather than for the code itself! We are more interested in your understanding of the topic, but code clarity, logic, and commenting reflect the depth of your understanding as well!

Only a small amount of code is required to answer each question. We will make extensive use of the following Python libraries:

- [numpy](https://numpy.org) for mathematical functions
- [skimage](https://scikit-image.org) for image loading and processing
- [matplotlib](https://matplotlib.org/stable/index.html) for displaying graphical results
- [jupyter](https://jupyter.org) for Jupyter Notebooks

You should get familiar with the documentation for these libraries so that you can use them effectively.

# The Questions

To get started, below is some setup code to import the libraries we need. You should not need to edit it.

In [1]:
# Numpy is the main package for scientific computing with Python.
import numpy as np

#from skimage import io

# Imports all the methods we define in the file a1code.py
from a1code import *

# Matplotlib is a useful plotting library for python
import matplotlib.pyplot as plt
# This code is to make matplotlib figures appear inline in the
# notebook rather than in a new window.
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
%reload_ext autoreload

## Question 0: Numpy warm up! (Not assessed. This part is for you to understand the basics of numpy)

Before starting the assignment, make sure you have a working Python 3 installation, with up-to-date versions of the libraries mentioned above. If this is all new to you, I'd suggest downloading an all-in-one Python installation such as [Anaconda](https://www.anaconda.com/products/individual). Alternatively, you can use a Python package manager such as pip or conda to get the libraries you need. If you're struggling with this, please ask a question on the MyUni discussion forum.

For this assignment, you need some familiarity with numpy syntax. The numpy QuickStart should be enough to get you started:

https://numpy.org/doc/stable/user/quickstart.html

Here are a few warm up exercises to make sure you understand the basics. Answer them in the space below. Be sure to print the output of each question so we can see it!

1. Create a 1D numpy array Z with 12 elements. Fill with values 1 to 12.
2. Reshape Z into a 2D numpy array A with 3 rows and 4 columns.
3. Reshape Z into a 2D numpy array B with 4 rows and 3 columns.
4. Calculate the *matrix* product of A and B.
5. Calculate the *element wise* product of $A$ and $B^T$ (B transpose).


In [None]:
# 1. Create a 1D numpy array Z with 12 elements. Fill with values 1 to 12.
Z = np.arange(1, 13)
print("1. Z:", Z)

# 2. Reshape Z into a 2D numpy array A with 3 rows and 4 columns.
A = Z.reshape(3, 4)
print("\n2. A:\n", A)

# 3. Reshape Z into a 2D numpy array B with 4 rows and 3 columns.
B = Z.reshape(4, 3)
print("\n3. B:\n", B)

# 4. Calculate the *matrix* product of A and B.
matrix_product = np.dot(A, B)
print("\n4. Matrix product of A and B:\n", matrix_product)

# 5. Calculate the *element wise* product of A and B^T (B transpose).
element_wise_product = A * B.T
print("\n5. Element-wise product of A and B.T:\n", element_wise_product)

You need to be comfortable with numpy arrays because that is how we store images. Let's do that next!
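
As a small illustration of that storage layout, the sketch below builds a tiny synthetic RGB image by hand rather than loading a file. It assumes the common convention of rows first, then columns, then colour channel, which is also what skimage returns when it reads a colour image; the array name `tiny` is made up for this example.

```python
import numpy as np

# A tiny synthetic 2x3 RGB image: axis 0 is rows (height), axis 1 is
# columns (width), axis 2 is the colour channel (R, G, B).
tiny = np.zeros((2, 3, 3), dtype=np.uint8)
tiny[0, 0] = [255, 0, 0]    # top-left pixel is pure red
tiny[1, 2] = [0, 0, 255]    # bottom-right pixel is pure blue

height, width, channels = tiny.shape
print(height, width, channels)   # 2 3 3

red_channel = tiny[:, :, 0]      # 2D view holding only the red values
print(red_channel.tolist())      # [[255, 0, 0], [0, 0, 0]]
```

Slicing with `[:, :, 0]` returns a view, so writing into `red_channel` would also change `tiny`.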

## Question 1: Loading and displaying an image (10%)

Below is a function to display an image using the pyplot module in matplotlib. Implement the `load()` and `print_stats()` functions in a1code.py so that the following code loads the whipbird image, displays it and prints its height, width and number of channels.

In [2]:
def display(img, caption=''):
    # Show image using pyplot
    plt.figure()
    plt.imshow(img)
    plt.title(caption)
    plt.axis('off')
    plt.show()

In [None]:
image1 = load('whipbird.jpg')

display(image1, 'whipbird')

print_stats(image1)

Return to this question after reading through the rest of the assignment. Find **at least 2 more images** to use as test cases in this assignment for all the following questions and display them below. Use your print_stats() function to display their height, width and number of channels. Explain *why* you have chosen each image.

In [None]:
### Your code to load and display your images here
# Load your two new images. Make sure the files are in the same folder as the notebook.
try:
    image2 = load('image2.jpg') 
    display(image2, 'Test Image 2')
    print_stats(image2)
except FileNotFoundError:
    print("ACTION REQUIRED: Please add an image named 'image2.jpg' to your folder.")

try:
    image3 = load('image3.jpg')
    display(image3, 'Test Image 3')
    print_stats(image3)
except FileNotFoundError:
    print("ACTION REQUIRED: Please add an image named 'image3.jpg' to your folder.")

***Your explanation of images here***

What happens when a processed pixel value becomes < 0 or > 255,
and what effect does this have on later processing?
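
One way to explore this question is to compare 8-bit integer arithmetic with floating-point arithmetic followed by clipping. The sketch below is only an illustration with made-up pixel values; whether your own functions wrap, clip or rescale depends on the dtype your `load()` returns, which is not specified here.

```python
import numpy as np

pixels = np.array([10, 100, 200], dtype=np.uint8)
offset = np.array([100, 100, 100], dtype=np.uint8)

# uint8 arithmetic wraps around modulo 256 instead of saturating.
wrapped_sum = pixels + offset     # 200 + 100 = 300 -> 300 - 256 = 44
wrapped_diff = pixels - offset    # 10 - 100 = -90 -> -90 + 256 = 166

# Working in float and clipping afterwards keeps values in [0, 255].
as_float = pixels.astype(np.float64)
clipped_sum = np.clip(as_float + 100, 0, 255).astype(np.uint8)
clipped_diff = np.clip(as_float - 100, 0, 255).astype(np.uint8)

print(wrapped_sum.tolist())    # [110, 200, 44]
print(clipped_sum.tolist())    # [110, 200, 255]
print(wrapped_diff.tolist())   # [166, 0, 100]
print(clipped_diff.tolist())   # [0, 0, 100]
```

Wrapped values turn very bright pixels dark (and vice versa), while clipped values discard detail above or below the limits; in both cases later processing steps cannot recover the original value.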

***Your explanation of the question above here***

Apply the point process only to one channel of the image (red, green or blue). Display the resulting RGB image.

In [None]:
### Your code to apply point processing here
# Make a copy to avoid modifying the original image
image1_processed = image1.copy()

# Apply a contrast change with a factor of 2, but only to the RED channel (index 0)
image1_processed[:, :, 0] = change_contrast(image1_processed[:, :, 0], 2.0)

display(image1_processed, 'Contrast increased on Red channel only')

## Question 2: Image processing (20%)

Now that you have an image stored as a numpy array, let's try some operations on it.

1. Implement the `crop()` function in a1code.py. Use array slicing to crop the image.
2. Implement the `resize()` function in a1code.py.
3. Implement the `change_contrast()` function in a1code.py.
4. Implement the `greyscale()` function in a1code.py.
5. Implement the `binary()` function in a1code.py.

What do you observe when you change the threshold of the binary function?
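
The operations listed above can each be sketched in a few lines of numpy. The helpers below are hypothetical stand-ins, not the required a1code.py signatures: the argument order of `crop_sketch`, the nearest-neighbour strategy in `resize_nearest_sketch`, and the plain channel average in `greyscale_sketch` are all assumptions, and the image is assumed to hold floats in [0, 1]. Contrast change is left out because its exact definition is not given here.

```python
import numpy as np

def crop_sketch(img, row0, col0, height, width):
    """Return the height x width window whose top-left corner is (row0, col0)."""
    return img[row0:row0 + height, col0:col0 + width]

def resize_nearest_sketch(img, out_rows, out_cols):
    """Nearest-neighbour resize: each output pixel copies one input pixel."""
    in_rows, in_cols = img.shape[:2]
    # Map every output index back to a source index (integer floor).
    row_idx = np.arange(out_rows) * in_rows // out_rows
    col_idx = np.arange(out_cols) * in_cols // out_cols
    return img[row_idx[:, None], col_idx[None, :]]

def greyscale_sketch(rgb):
    """Average the three colour channels."""
    return rgb.mean(axis=2)

def binary_sketch(grey, threshold):
    """1.0 where the pixel is brighter than the threshold, else 0.0."""
    return (grey > threshold).astype(np.float64)

# A 4x4 RGB ramp with values in [0, 1]; all three channels are identical.
ramp = np.linspace(0.0, 1.0, 16).reshape(4, 4)
rgb = np.stack([ramp, ramp, ramp], axis=2)

window = crop_sketch(rgb, 1, 1, 2, 3)            # shape (2, 3, 3)
enlarged = resize_nearest_sketch(window, 4, 6)   # shape (4, 6, 3)
grey = greyscale_sketch(rgb)                     # shape (4, 4)

for t in (0.3, 0.7):
    white = int(binary_sketch(grey, t).sum())
    print(f"threshold {t}: {white} of {grey.size} pixels are white")
# threshold 0.3: 11 of 16 pixels are white
# threshold 0.7: 5 of 16 pixels are white
```

On this ramp, raising the threshold simply moves the black/white boundary towards the bright end; on a real photograph the same change decides which regions survive as white.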

Apply all these functions with different parameters on your own test images.

In [None]:
# This should crop the bird from the image; you will need to adjust the parameters for the correct crop size and location
crop_img = crop(image1, 278, 5, 508, 272)
display(crop_img)
print_stats(crop_img)

resize_img = resize(crop_img, 500, 600)
display(resize_img)
print_stats(resize_img)

contrast_img = change_contrast(image1, 0.5)
display(contrast_img)
print_stats(contrast_img)

contrast_img = change_contrast(image1, 1.5)
display(contrast_img)
print_stats(contrast_img)

grey_img = greyscale(image1)
display(grey_img)
print_stats(grey_img)

binary_img = binary(grey_img, 0.3)
display(binary_img)
print_stats(binary_img)

binary_img = binary(grey_img, 0.7)
display(binary_img)
print_stats(binary_img)

# Add your own tests here...


## Question 3: Image filtering (20%)

### 3.1(a) 2D cross-correlation

Using the definition of 2D cross-correlation from week 1, implement the cross-correlation operation in the function `xcorr2D()` in a1code.py.
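
For orientation, here is a minimal loop-based sketch of cross-correlation. It assumes the usual definition $G[i,j] = \sum_{u}\sum_{v} H[u,v]\,F[i+u,\,j+v]$ with an odd-sized kernel, zero padding, and an output the same size as the input; the week 1 definition and the behaviour expected by `test_xcorr2D()` may differ, and the name `xcorr2d_sketch` is made up.

```python
import numpy as np

def xcorr2d_sketch(img, kernel):
    """Same-size 2D cross-correlation with zero padding (odd-sized kernel)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)))
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            # Slide the kernel over the image without flipping it.
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

flat = np.ones((4, 4))
box = np.ones((3, 3)) / 9.0
out = xcorr2d_sketch(flat, box)
print(out[1, 1])   # interior pixel: all nine neighbours are inside the image
print(out[0, 0])   # corner pixel: only four neighbours are inside the image
```

The interior value stays at about 1.0 while the corner drops to 4/9, which shows how the choice of padding affects the image border.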


In [None]:
test_xcorr2D()
print("xcorr2D test passed!")

### 3.1(b) 2D convolution

Using the definition of 2D convolution from week 1, implement the convolution operation in the function `conv2D()` in a1code.py.
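
Convolution differs from cross-correlation only in that the kernel is flipped in both axes before it is slid over the image. The sketch below illustrates this with a unit impulse; the helper names, the zero padding and the same-size output are assumptions, not the interface required by `test_conv2D()`.

```python
import numpy as np

def xcorr2d_sketch(img, kernel):
    """Same-size cross-correlation with zero padding (odd-sized kernel)."""
    kh, kw = kernel.shape
    padded = np.pad(img.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def conv2d_sketch(img, kernel):
    """Convolution = cross-correlation with the kernel flipped in both axes."""
    return xcorr2d_sketch(img, kernel[::-1, ::-1])

impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
kernel = np.arange(1.0, 10.0).reshape(3, 3)

conv_patch = conv2d_sketch(impulse, kernel)[1:4, 1:4]
xcorr_patch = xcorr2d_sketch(impulse, kernel)[1:4, 1:4]

# Convolving a unit impulse reproduces the kernel itself ...
print(np.array_equal(conv_patch, kernel))               # True
# ... whereas cross-correlation returns it rotated by 180 degrees.
print(np.array_equal(xcorr_patch, kernel[::-1, ::-1]))  # True
```

For symmetric kernels such as a Gaussian the two operations give identical results; the difference only shows up for asymmetric kernels such as Sobel filters.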

In [None]:
test_conv2D()
print("conv2D test passed!")

### 3.1(c) RGB convolution

In the function `conv` in a1code.py, extend your function `conv2D` to work on RGB images, by applying the 2D convolution to each channel independently.
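
A possible shape for the per-channel extension is sketched below. The 2D routine is a compact stand-in with zero padding, and `conv_rgb_sketch` is a hypothetical name; letting 2D greyscale inputs pass straight through is a design choice assumed here, not something the assignment prescribes.

```python
import numpy as np

def conv2d_sketch(channel, kernel):
    """Same-size 2D convolution with zero padding (odd-sized kernel)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    padded = np.pad(channel.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(channel.shape, dtype=np.float64)
    for i in range(channel.shape[0]):
        for j in range(channel.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def conv_rgb_sketch(img, kernel):
    """Filter greyscale images directly and colour images channel by channel."""
    if img.ndim == 2:
        return conv2d_sketch(img, kernel)
    channels = [conv2d_sketch(img[:, :, c], kernel)
                for c in range(img.shape[2])]
    return np.stack(channels, axis=2)

rgb = np.random.default_rng(0).random((6, 5, 3))
identity = np.zeros((3, 3))
identity[1, 1] = 1.0               # kernel that leaves an image unchanged

same = conv_rgb_sketch(rgb, identity)
print(same.shape)                  # (6, 5, 3)
print(np.allclose(same, rgb))      # True
```

Filtering with the identity kernel is a quick sanity check that channels are neither mixed nor reordered.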

### 3.2 Gaussian filter convolution

Use the `gauss2D` function provided in a1code.py to create a Gaussian kernel, and apply it to your images with convolution. You will obtain marks for trying different tests and analysing the results, for example:

- try varying the image size, and the size and variance of the filter  
- subtract the filtered image from the original - this gives you an idea of what information is lost when filtering

What do you observe and why?
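
To build intuition before running the 2D experiments, the following 1D sketch blurs a single step edge and subtracts the result from the original. It does not use the provided `gauss2D`; `gaussian_1d_sketch` is a made-up helper, and the edge-replicating padding is an assumption chosen so that the signal ends do not dominate the result.

```python
import numpy as np

def gaussian_1d_sketch(size, sigma):
    """Sampled 1D Gaussian, normalised so that the weights sum to 1."""
    x = np.arange(size) - size // 2
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

signal = np.concatenate([np.zeros(10), np.ones(10)])   # a single step edge
kernel = gaussian_1d_sketch(size=7, sigma=1.5)

padded = np.pad(signal, 3, mode='edge')                # replicate end values
blurred = np.convolve(padded, kernel, mode='valid')    # same length as signal
residual = signal - blurred

print(np.round(residual, 2))
```

The residual is zero in the flat regions and non-zero only within three samples of the step, negative on the dark side and positive on the bright side. In 2D the same effect makes the difference image look like an edge map.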

### 3.3 Sobel filters

Define a horizontal and vertical Sobel edge filter kernel and test them on your images. You will obtain marks for testing them and displaying results in interesting ways, for example:

- apply them to an image at different scales
- considering how to display positive and negative gradients
- apply different combinations of horizontal and vertical filters as asked in the Assignment sheet.
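
Two of the ideas above can be sketched without any image at all. The 3x3 Sobel kernels can be written as an outer product of a smoothing vector and a differencing vector, and a signed response can be remapped so that zero lands on mid-grey. The helper `signed_to_display` is hypothetical, and scaling by the peak absolute value is just one of several reasonable display choices.

```python
import numpy as np

smooth = np.array([1.0, 2.0, 1.0])
diff = np.array([-1.0, 0.0, 1.0])
sobel_x = np.outer(smooth, diff)   # responds to left-to-right intensity changes
sobel_y = np.outer(diff, smooth)   # responds to top-to-bottom intensity changes

def signed_to_display(response):
    """Map a signed response to [0, 1] so that zero lands on mid-grey (0.5)."""
    peak = np.max(np.abs(response))
    if peak == 0:
        return np.full(response.shape, 0.5)
    return 0.5 + 0.5 * response / peak

gx = np.array([[-4.0, 0.0, 2.0]])      # made-up gradient values
shown = signed_to_display(gx)
print(sobel_x.tolist())   # [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]]
print(shown.tolist())     # [[0.0, 0.5, 0.75]]
```

With this mapping, negative gradients appear darker than mid-grey and positive gradients brighter, so the direction of each edge stays visible.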

In [None]:
# Your code to answer 3.2, 3.3 and display results here.
# 3.2 Gaussian filter convolution
print("--- Gaussian Filter ---")
gauss_kernel = gauss2D(size=9, sigma=2)
blurred_img = conv(grey_img, gauss_kernel)
display(blurred_img, 'Gaussian Blurred Image (sigma=2)')

# Subtract filtered from original to see what's lost (the edges)
lost_info = grey_img - blurred_img
display(lost_info, 'Information Lost During Blurring (Edges)')

# 3.3 Sobel filters
print("--- Sobel Edge Detection ---")
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
sobel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
edges_x = conv(grey_img, sobel_x)
edges_y = conv(grey_img, sobel_y)
display(edges_x + 0.5, 'Horizontal Gradients (Sobel X)')
display(edges_y + 0.5, 'Vertical Gradients (Sobel Y)')

# Calculate and display the gradient magnitude
grad_magnitude = np.sqrt(edges_x**2 + edges_y**2)
display(grad_magnitude, 'Gradient Magnitude (Edges)')

***Your comments/analysis of your results here...***

## Question 4: Image sampling and pyramids (25%)

### 4.1 Image Sampling

- Apply your `resize()` function to reduce an image (I) to 0.5\*height and 0.5\*width

- Repeat the above procedure, but apply a Gaussian blur filter to your original image before downsampling it. How does the result compare to your previous output, and to the original image? Why?
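
The effect of blurring before downsampling is easiest to see on an extreme 1D case: stripes that alternate every sample. This is only a toy illustration; the 3-tap binomial kernel stands in for a Gaussian, and taking every second sample stands in for a nearest-neighbour `resize()` at ratio 0.5, both of which are assumptions.

```python
import numpy as np

stripes = np.tile([0.0, 1.0], 8)          # 16 samples, alternating dark/bright
naive = stripes[::2]                      # keep every second sample

kernel = np.array([0.25, 0.5, 0.25])      # small binomial low-pass filter
padded = np.pad(stripes, 1, mode='reflect')
blurred = np.convolve(padded, kernel, mode='valid')
antialiased = blurred[::2]

print(naive.tolist())         # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(antialiased.tolist())   # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
```

Naive subsampling reports a completely dark signal even though half of the original samples were bright, which is aliasing. Blurring first removes the detail that the smaller grid cannot represent, so the downsampled result keeps the correct average brightness.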


### 4.2 Image Pyramids
- Create a Gaussian pyramid as described in week 2's lecture on an image.

- Apply a Gaussian kernel to an image I, and resize it with ratio 0.5, to get $I_1$. Repeat this step to get $I_2$, $I_3$ and $I_4$.

- Display these four images in a manner analogous to the example shown in the lectures.




In [None]:
# Your answers to question 4 here
# 4.1 Image Sampling
print("--- Image Sampling ---")
# 1. Reduce size directly
resized_native = resize(image1, int(image1.shape[0] * 0.5), int(image1.shape[1] * 0.5))
display(resized_native, 'Downsampled (Nearest Neighbor)')

# 2. Blur first, then reduce size
gauss_kernel_q4 = gauss2D(size=5, sigma=1.0)
blurred_for_resize = conv(image1, gauss_kernel_q4)
resized_antialiased = resize(blurred_for_resize, int(image1.shape[0] * 0.5), int(image1.shape[1] * 0.5))
display(resized_antialiased, 'Downsampled with Pre-blur (Anti-aliased)')

# 4.2 Image Pyramids
print("--- Gaussian Pyramid ---")
def create_gaussian_pyramid(image, num_levels=4):
    pyramid = [image]
    current_image = image
    for _ in range(num_levels - 1):
        gauss_kernel = gauss2D(size=5, sigma=1)
        blurred = conv(current_image, gauss_kernel)
        downsampled = resize(blurred, int(current_image.shape[0] * 0.5), int(current_image.shape[1] * 0.5))
        pyramid.append(downsampled)
        current_image = downsampled
    return pyramid

# Five levels: the original image plus the four reduced images I_1..I_4
pyramid = create_gaussian_pyramid(image1, num_levels=5)

# Display the pyramid
rows, cols, _ = image1.shape
composite_image = np.zeros((rows, cols + cols // 2, 3))
composite_image[:rows, :cols, :] = pyramid[0]
y_offset = 0
for i in range(1, len(pyramid)):
    h, w, _ = pyramid[i].shape
    composite_image[y_offset:y_offset + h, cols:cols + w, :] = pyramid[i]
    y_offset += h
display(composite_image, "Gaussian Pyramid")

***Your comments/analysis of your results here...***

## Question 5: Implement a blob detector (25%)

The image filtering lectures, particularly Lecture 2, have covered the details related to this question.

### 5.1 Apply and analyse a blob detector

- Create a Laplacian of Gaussian (LoG) filter in the function `LoG2D()` and visualize its response on your images. You can use the template function (and hints therein) for the task if you wish.

- Modify parameters of the LoG filters and apply them to an image of your choice. Show how these variations are manifested in the output.

- Repeat the experiment by rescaling the image with a combination of appropriate filters designed by you for this assignment. What correlations do you find when changing the scale or modifying the filters?

- How does the response of LoG filter change when you rotate the image by 90 degrees? You can write a function to rotate the image or use an externally rotated image for this task.
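
As a starting point, the sketch below samples the Laplacian of Gaussian from its closed form, which up to a constant factor is $\nabla^2 G(x,y) \propto \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\, e^{-(x^2+y^2)/(2\sigma^2)}$. The function name `log_kernel_sketch` is made up, the normalisation constant is dropped, and subtracting the mean so that the weights sum to zero is an assumption; the template `LoG2D()` in a1code.py may be structured differently.

```python
import numpy as np

def log_kernel_sketch(size, sigma):
    """Sampled Laplacian of Gaussian, shifted so the weights sum to zero."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    sq = x**2 + y**2
    gauss = np.exp(-sq / (2.0 * sigma**2))
    kernel = (sq - 2.0 * sigma**2) / sigma**4 * gauss
    return kernel - kernel.mean()   # flat regions then give zero response

k = log_kernel_sketch(size=15, sigma=2.0)

print(abs(k.sum()) < 1e-9)            # True: no response to constant regions
print(np.allclose(k, np.rot90(k)))    # True: unchanged by a 90-degree rotation
print(k[7, 7] < 0)                    # True: negative centre, positive surround
```

Because the kernel is unchanged by a 90-degree rotation, filtering a rotated image should give the rotated version of the original response, apart from any border effects. The kernel changes sign at radius $\sqrt{2}\,\sigma$, so larger values of `sigma` respond most strongly to larger blobs.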





In [None]:
# Your code to answer question 5 and display results here
# 5.1 Apply and analyse a blob detector
print("--- Laplacian of Gaussian (LoG) Blob Detector ---")
# Create LoG filters with different sigmas
log_filter_small = LoG2D(size=9, sigma=1.5)
log_filter_medium = LoG2D(size=15, sigma=3.0)
log_filter_large = LoG2D(size=25, sigma=6.0)

# Apply filters to the greyscale image
log_response_small = conv(grey_img, log_filter_small)
log_response_medium = conv(grey_img, log_filter_medium)
log_response_large = conv(grey_img, log_filter_large)

# Display the responses
display(np.abs(log_response_small), 'LoG Response for Small Blobs (sigma=1.5)')
display(np.abs(log_response_medium), 'LoG Response for Medium Blobs (sigma=3.0)')
display(np.abs(log_response_large), 'LoG Response for Large Blobs (sigma=6.0)')

# Test rotational invariance
print("--- Testing Rotational Invariance ---")
rotated_grey = np.rot90(grey_img)
rotated_response = conv(rotated_grey, log_filter_medium)
display(np.abs(rotated_response), 'Response on 90-degree Rotated Image')