
Visualizing the Data
--------------------
You have been given the red, green, and blue channels of an image that were taken separately using this technique. These files are named red.npy, green.npy, and blue.npy, respectively. Let's load these images and visualize them.

In [None]:
# Load libraries and convenience functions

from IPython import display
import matplotlib.pyplot as plt
import numpy as np

def load_image(filename):
    img = np.load(filename)
    img = img.astype("float32") / 255.
    return img

def gray2rgb(image):
    return np.repeat(np.expand_dims(image, 2), 3, axis=2)

def show_image(img):
    plt.imshow(img, interpolation='nearest')

In [None]:
# Download data
!wget https://www.cs.columbia.edu/~vondrick/class/coms4732/hw0/blue.npy
!wget https://www.cs.columbia.edu/~vondrick/class/coms4732/hw0/red.npy
!wget https://www.cs.columbia.edu/~vondrick/class/coms4732/hw0/green.npy

In [None]:
# Load the three channels as float32 arrays (load_image divides the raw values by 255)
images = [load_image('red.npy'),
          load_image('green.npy'),
          load_image('blue.npy')]

show_image(gray2rgb(np.concatenate(images, axis=1)))

The Problem
-----------
Because these images were taken separately, simply stacking them into a 3-channel matrix may leave the channels misaligned. Run the code below to visualize what happens if you combine the images without shifting any of the channels.

In [None]:
show_image(np.stack(images, axis=-1))

Your Task
---------

Your job is to write a function that takes these three images, and correctly aligns them. Since you have to process many of these images, you do not want to manually align them. Instead, your task is to write a program that automatically finds the alignment, then combines them together to produce the final image.

The easiest way to do this is to find the alignment for one pair of channels at a time. For example, you can figure out how to align the green channel with the red channel, and then the blue channel with the red channel. Once both are aligned to the same reference, you can combine all three.
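To make the idea of a shift concrete, here is a minimal sketch on synthetic data. The array names (`reference`, `misaligned`, `realigned`) are made up for illustration and are not part of the starter code. It assumes shifts are applied with `np.roll`, which moves pixels cyclically.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small random array stands in for the reference (e.g. red) channel.
reference = rng.random((8, 8)).astype("float32")

# Simulate a second channel that is displaced 2 rows down and 1 column left.
misaligned = np.roll(reference, (2, -1), axis=(0, 1))

# Applying the opposite offset brings it back onto the reference.
realigned = np.roll(misaligned, (-2, 1), axis=(0, 1))

print(np.array_equal(misaligned, reference))  # False
print(np.array_equal(realigned, reference))   # True
```

In the real task the offset is unknown, so the goal is to search for the (dy, dx) that plays the role of (-2, 1) here.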

We have given you code to get you started. You should fill in three areas:

1. **score_function(im1, im2)** should take in two images and return a floating-point score indicating how well the two images are aligned. The lower the score, the better they are aligned. There are many scoring functions you can experiment with. The simplest is the Euclidean distance between the two images.

2. **align_channels(chan1, chan2)** should take in two images and return a tuple (dy, dx) indicating how to shift chan2 so that it lines up with chan1. This function should call **score_function()** to perform this task. For simplicity, you can assume that the shift in each direction is between -30 and 30 pixels.

3. **combine_images()** should use the found alignments to correctly combine the images into a color image.
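As one example of a scoring function other than Euclidean distance, the sketch below uses normalized cross-correlation. The name `ncc_score` is hypothetical and not part of the starter code, and the result is negated so that it follows the same "lower is better" convention as **score_function()**. Treat it as one option to experiment with, not as the expected solution.

```python
import numpy as np

def ncc_score(im1, im2):
    """Negative normalized cross-correlation (lower means better aligned)."""
    # Remove each image's mean so a constant brightness offset does not matter.
    a = im1 - im1.mean()
    b = im2 - im2.mean()
    # Dividing by the norms removes the effect of overall contrast.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        # At least one image is constant, so correlation is undefined.
        return 0.0
    return float(-np.sum(a * b) / denom)

rng = np.random.default_rng(0)
x = rng.random((16, 16))

same = ncc_score(x, x)                          # perfectly aligned
brighter = ncc_score(x, 2 * x + 0.1)            # same content, different exposure
shifted = ncc_score(x, np.roll(x, 3, axis=0))   # misaligned copy
print(same, brighter, shifted)
```

Because the score ignores brightness offset and contrast, it may behave better than Euclidean distance when the three channels have noticeably different intensities.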

Submission
----------

You should export your completed notebook as a PDF and upload it to Courseworks. The completed notebook should show your code, as well as the final combined image you created.

In [None]:
# Store the height and width of the images
height, width = images[0].shape

# Pad each image with black by 30 pixels. You do not need to use this, but
# padding may make your implementation easier.
pad_size = 30
images_pad = [np.pad(x, pad_size, mode='constant') for x in images]

# Given two matrices, write a function that returns a number indicating how well they are
# aligned. The lower the number, the better they are aligned. There are a variety of scoring
# functions you can use. The simplest one is the Euclidean distance between the two matrices.
def score_function(im1, im2):
    return np.sqrt(np.sum((im1 - im2) ** 2))

# Given two matrices chan1 and chan2, return a tuple of how to shift chan2 into chan1. This
# function should search over many different shifts, and find the best shift that minimizes
# the scoring function defined above.
def align_channels(chan1, chan2):
    best_score = float('inf')
    best_offset = (0, 0)

    # Search over possible shifts
    for dy in range(-30, 31):
        for dx in range(-30, 31):
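            # Shift chan2 by (dy, dx). np.roll wraps around, so when the inputs are
            # zero-padded by at least 30 pixels, the pixels that wrap in are black.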
            shifted_chan2 = np.roll(chan2, (dy, dx), axis=(0, 1))

            # Calculate the score
            score = score_function(chan1, shifted_chan2)

            # Update the best score and offset
            if score < best_score:
                best_score = score
                best_offset = (dy, dx)

    return best_offset

# Use the best alignments to now combine the three images. You may use any of the variables
# above. Return an array that is (Height)x(Width)x3, which is a color image that you can visualize.
def combine_images():
    # Load the padded images
    red_pad = images_pad[0]
    green_pad = images_pad[1]
    blue_pad = images_pad[2]

    # Get the alignment offsets
    rg_dy, rg_dx = align_channels(red_pad, green_pad)
    rb_dy, rb_dx = align_channels(red_pad, blue_pad)

    # Shift the green and blue channels to align with the red channel
    green_aligned = np.roll(green_pad, (rg_dy, rg_dx), axis=(0, 1))
    blue_aligned = np.roll(blue_pad, (rb_dy, rb_dx), axis=(0, 1))

    # Crop the images to remove the padding
    red_cropped = red_pad[pad_size:-pad_size, pad_size:-pad_size]
    green_cropped = green_aligned[pad_size:-pad_size, pad_size:-pad_size]
    blue_cropped = blue_aligned[pad_size:-pad_size, pad_size:-pad_size]

    # Combine the channels into a single color image
    final_image = np.stack([red_cropped, green_cropped, blue_cropped], axis=-1)

    # Normalize the image to the range [0, 1] for visualization
    final_image = (final_image - final_image.min()) / (final_image.max() - final_image.min())

    return final_image

# images_pad (defined above) already holds the normalized, zero-padded channels,
# so it is used as-is; reloading the raw files here would discard the padding.

# Combine the images
final_image = combine_images()

# Visualize the final image
plt.figure(figsize=(10, 10))
plt.imshow(final_image)
plt.axis('off')
plt.show()
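
Before trusting the result on the real channels, it can help to check the search on data where the correct answer is known. The sketch below is self-contained: `euclidean_score` and `find_shift` are hypothetical stand-ins that mirror the brute-force search above, with a smaller search range (`max_shift=5`, an assumption chosen to keep the example fast).

```python
import numpy as np

def euclidean_score(im1, im2):
    # Euclidean distance between two equally sized arrays (lower is better).
    return np.sqrt(np.sum((im1 - im2) ** 2))

def find_shift(chan1, chan2, max_shift=5):
    # Try every (dy, dx) in the window and keep the one with the lowest score.
    best_score, best_offset = float("inf"), (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(chan2, (dy, dx), axis=(0, 1))
            score = euclidean_score(chan1, shifted)
            if score < best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset

rng = np.random.default_rng(0)

# Random content surrounded by a zero border, similar to the padded channels.
base = np.pad(rng.random((20, 20)), 5, mode="constant")

# Displace a copy by a known amount: 3 rows down, 2 columns left.
moved = np.roll(base, (3, -2), axis=(0, 1))

# The search should return the offset that undoes the displacement.
print(find_shift(base, moved))  # (-3, 2)
```

If the same check fails for a scoring function you are experimenting with, that is a sign the score or the shift direction needs another look.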

Acknowledgements
----------------

This homework is based on assignments from Subhransu Maji at University of Massachusetts, Amherst, Alyosha Efros at University of California, Berkeley, Jia Deng at University of Michigan, and Deva Ramanan at Carnegie Mellon University.