# 1 Camera Projection Matrix [30 pts]
Study the `Projection and Dolly Zoom` notebook (`dolly_zoom.py`, or on Colab at this [link](https://colab.research.google.com/drive/1LlOwsp5zCV-dIfZRlmWMJfbeArUTclQo)) and finish the following tasks.

(a) Write a function `rotY()` which takes an angle theta (in radians) and outputs the 3D rotation matrix for
a rotation by theta about the y-axis (right-hand rule). You may refer to this Wikipedia entry:
https://en.wikipedia.org/wiki/Rotation_matrix#Basic_rotations. After you are done, refer to the starter code to
generate and submit `cube.gif` of a cube rotating around itself. (5 pts)

In [1]:
from resources.hw1_starter import *

# Problem 1.(a)
def rotY(theta_radian):
    # Return the 3x3 rotation matrix for a rotation by theta_radian
    # (in radians) about the y-axis, following the right-hand rule.
    return np.array([
        [ np.cos(theta_radian), 0, np.sin(theta_radian)],
        [ 0,                    1, 0                   ],
        [-np.sin(theta_radian), 0, np.cos(theta_radian)]
    ])

generate_gif()



(b) Similarly, write a function `rotX()` which rotates about the x-axis. Let $\theta = \frac{\pi}{4}$ and consider the following
two transformations:

1. `rotX(theta)`, followed by `rotY(theta)`
2. `rotY(theta)`, followed by `rotX(theta)`

Using `renderCube()` in the same way, plot the resulting view of the cube for each of the two transformations.
Are 3D rotation matrices commutative? (5 pts)

In [None]:
# For the rest of the Problem 1 code, please see hw1_starter.py.
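
The commutativity question in part (b) can also be checked numerically. The sketch below is a minimal illustration, not the starter code: it defines a `rotX()` analogous to `rotY()` and assumes column-vector conventions, so that "A followed by B" corresponds to the product `B @ A`. How `renderCube()` consumes the matrix is not shown here, so the plotting step is omitted.

```python
import numpy as np

def rotX(theta_radian):
    """Rotation by theta_radian about the x-axis (right-hand rule)."""
    c, s = np.cos(theta_radian), np.sin(theta_radian)
    return np.array([[1, 0, 0],
                     [0, c, -s],
                     [0, s, c]])

def rotY(theta_radian):
    """Rotation by theta_radian about the y-axis (right-hand rule)."""
    c, s = np.cos(theta_radian), np.sin(theta_radian)
    return np.array([[c, 0, s],
                     [0, 1, 0],
                     [-s, 0, c]])

theta = np.pi / 4
# Assumed convention: the matrix applied first sits on the right.
R_x_then_y = rotY(theta) @ rotX(theta)
R_y_then_x = rotX(theta) @ rotY(theta)

# Both products are valid rotations, but they are not the same rotation.
print(np.allclose(R_x_then_y, R_y_then_x))  # → False
```

Because the two products differ, the two rendered cubes differ as well: rotations about different axes do not commute in general.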

(c) Combine rotX() and rotY(), choose an appropriate order and a pair of parameters so that renderCube() draws
a projection of the cube where one diagonal of the cube is projected to a single point, as shown
in Figure 1 (left). Report the order and parameters you choose. (10 pts)
> Hint: The diagonal of the cube rotates together with the cube. When it projects to a single point in
2D, it is horizontal and perpendicular to the camera plane in 3D. You can either make a mathematical
calculation or perform a numerical search.
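
The mathematical route mentioned in the hint can be sketched as follows. This is one possible choice under stated assumptions, not the only answer: the cube is assumed to be axis-aligned and centered at the origin, the diagonal is taken along (1, 1, 1), and the camera is assumed to look along the z-axis. The starter code may use a different setup, in which case signs or order can change.

```python
import numpy as np

def rotX(theta_radian):
    """Rotation by theta_radian about the x-axis (right-hand rule)."""
    c, s = np.cos(theta_radian), np.sin(theta_radian)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rotY(theta_radian):
    """Rotation by theta_radian about the y-axis (right-hand rule)."""
    c, s = np.cos(theta_radian), np.sin(theta_radian)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# Step 1: rotate about y by -45 degrees so the diagonal's x-component vanishes.
theta_y = -np.pi / 4
# Step 2: rotate about x by arctan(1/sqrt(2)) (about 35.26 degrees) so the
# y-component vanishes as well.
theta_x = np.arctan(1 / np.sqrt(2))

R = rotX(theta_x) @ rotY(theta_y)   # rotY is applied first, then rotX
diagonal = np.array([1.0, 1.0, 1.0])
rotated = R @ diagonal

# x and y are numerically zero, so the diagonal points straight along z.
print(np.round(rotated, 6))
```

Under these assumptions both endpoints of the diagonal share the same x and y, which is the condition for projecting to a single point.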

(d) Implement an orthographic camera, either by adding a branch to the function `projectLines()` or by writing
a new function modeled on it. Then plot the rotated cube from the previous part with this orthographic
camera. It should look like Figure 1 (right). (10 pts)
![](Fig/1.png "Figure 1: The diagonal of a cube projected to a single point")
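
To illustrate the difference that part (d) asks for, here is a minimal sketch contrasting the two camera models. The function names `project_perspective` and `project_orthographic`, the focal-length parameter `f`, and the `scale` parameter are hypothetical; the actual signature of `projectLines()` is not shown in this document, and points are assumed to be given in camera coordinates as an (N, 3) array.

```python
import numpy as np

def project_perspective(points_3d, f=1.0):
    """Pinhole projection: divide x and y by depth z, scaled by focal length f."""
    points_3d = np.asarray(points_3d, dtype=float)
    return f * points_3d[:, :2] / points_3d[:, 2:3]

def project_orthographic(points_3d, scale=1.0):
    """Orthographic projection: drop z, so depth has no effect on the image."""
    points_3d = np.asarray(points_3d, dtype=float)
    return scale * points_3d[:, :2]

# Two points with identical (x, y) but different depth.
points = np.array([[1.0, 2.0, 4.0],
                   [1.0, 2.0, 8.0]])

print(project_perspective(points, f=2.0))  # the farther point lands closer to the center
print(project_orthographic(points))        # both points land on the same pixel
```

With the orthographic model, any segment parallel to the viewing direction collapses to one point regardless of its distance from the camera.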

# 2 Prokudin-Gorskii: Color from grayscale photographs [50 pts]

In this part, you are tasked with implementing the dream of Russian photographer, Sergei Mikhailovich
Prokudin-Gorskii (1863-1944), via a project invented by Russian-American vision researcher, Alexei A.
Efros (1975-present). Sergei was a visionary who believed in a future with color photography (which we
now take for granted). During his lifetime, he traveled across the Russian Empire taking photographs
through custom color filters at the whim of the czar. To capture color for the photographers of the future,
he decided to take three separate black-and-white pictures of every scene, each with a red, green, or blue
color filter in front of the camera. His hope was that you, as a student in the future, would come along and
produce beautiful color images by combining his 3 separate, filtered images.

## Task 1: Combine (5 pts) 
We will provide you with a folder of Prokudin-Gorskii’s black-and-white
(grayscale) image composites (`prokudin-gorskii/` in the assignment zip). Each composite (alternatively,
triple-framed image or triptych) contains three grayscale photos preserved from the early 1900s. The
composite looks like a three-panel vertical comic strip, with the three grayscale photos
stacked vertically. From top to bottom, the panels were captured through a blue, a green, and a
red filter. Choose a single composite from this folder (your favorite) and write a program in Python that
takes the three grayscale panels and simply stacks them across the third color channel dimension to
produce a single, colored image. We expect this stacked photo to look wonky and unaligned; fixing this is
what you will do in Task 2. Make sure to save your images as RGB instead of BGR and include them in
your report.

**Specifically**: Write a function that loads a grayscale triple-framed image from `prokudin-gorskii/` with
something like `plt.imread()`, chops it vertically into thirds, then saves a color image with each third as the
correct color channel. Include the output color image in your report.

In [2]:
import os
import numpy as np
from PIL import Image

folder_path = './resources/prokudin-gorskii'
output_folder = os.path.join('./Homework_result/Problem_2', 'Task_1')
file_names = os.listdir(folder_path)

# Create the output folder, including any missing parent folders.
os.makedirs(output_folder, exist_ok=True)

for file_name in file_names:
    file_path = os.path.join(folder_path, file_name)
    
    if file_name.lower().endswith(('.jpg')):
        try:
            image = np.array(Image.open(file_path).convert('L'))
            
            print(f"Successful open：{file_name}")
            
            split_height = image.shape[0] // 3
            
            image  = image[ : split_height * 3, :]
            
            retVal = np.dstack((
                image[2 * split_height :                 , :],  # R
                image[split_height     : 2 * split_height, :],  # G
                image[0                :     split_height, :]   # B
            ))
            
            # Save the stacked channels as an RGB image.
            Image.fromarray(retVal).save(os.path.join('./Homework_result/Problem_2',"Task_1", file_name))
            
        except Exception as e:
            print(f"Cannot open {file_name}：{e}")
    else:
        print(f"Skip：{file_name}, not an image.")
        

Successful open：00125v.jpg
Successful open：00149v.jpg
Successful open：00153v.jpg
Successful open：00351v.jpg
Successful open：00398v.jpg
Successful open：01112v.jpg
Skip：Task_1, not an image.
Skip：Task_2, not an image.


## Task 2: Alignment (25 pts) 
As you will have noticed, the photos are misaligned due to inadvertent
jostling of the camera between each shot. Your second task is to fix this. You need to search over possible
pixel offsets in the range of [-15, 15] to find the best alignment for the different R, G, and B channels.
The simplest way is to keep one channel fixed, say R, and align the G and B channels to it by searching
over the offset range both horizontally and vertically. Pick the alignment that maximizes a similarity
metric (of your choice) between the channels. One such measure is the dot product, i.e., $R \cdot G$. Another is
normalized cross-correlation, which is simply the dot product between the L2-normalized R and G
vectors. After writing this function, run it on all of the images in `prokudin-gorskii/` and also on
`efros_tableau.jpg`, so Professor Efros can have his photo restored to color. Include these aligned images and the
offsets in your report. For full credit, your report needs to include properly aligned images, so find a
similarity metric that will accomplish this.

In [3]:
import os
import numpy as np
from PIL import Image

def ncc(image1, image2):
    # Zero-mean normalized cross-correlation: subtract each image's mean,
    # then take the dot product of the L2-normalized results.
    image1_mean = np.mean(image1)
    image2_mean = np.mean(image2)
    numerator = np.sum((image1 - image1_mean) * (image2 - image2_mean))
    denominator = np.sqrt(np.sum((image1 - image1_mean) ** 2) * np.sum((image2 - image2_mean) ** 2))
    return numerator / denominator if denominator != 0 else 0

def align_channel(ref_channel, target_channel, search_range):
    # Exhaustively search shifts in [-search_range, search_range] along both
    # axes and return the (dx, dy) that maximizes NCC against ref_channel.
    max_ncc = -1
    best_dx = 0
    best_dy = 0
    # Ignore a border of search_range pixels so that pixels wrapped around
    # by np.roll never enter the comparison.
    crop_margin = search_range
    ref_crop = ref_channel[crop_margin:-crop_margin, crop_margin:-crop_margin]
    for dx in range(-search_range, search_range + 1):
        for dy in range(-search_range, search_range + 1):
            shifted_channel = np.roll(target_channel, shift=(dy, dx), axis=(0, 1))
            shifted_crop = shifted_channel[crop_margin:-crop_margin, crop_margin:-crop_margin]
            current_ncc = ncc(ref_crop, shifted_crop)
            if current_ncc > max_ncc:
                max_ncc = current_ncc
                best_dx = dx
                best_dy = dy
    return best_dx, best_dy

folder_path = './resources/prokudin-gorskii'
output_folder = os.path.join('./Homework_result/Problem_2',"Task_2")

if not os.path.isdir(output_folder):
    os.mkdir(output_folder)

file_names = os.listdir(folder_path)

for file_name in file_names:
    file_path = os.path.join(folder_path, file_name)
    
    if file_name.lower().endswith('.jpg'):
        try:
            image = np.array(Image.open(file_path).convert('L'))
            print(f"Processing：{file_name}")
            
            height = image.shape[0] // 3
            image = image[:height*3, :] 
            B = image[0:height, :]
            G = image[height:2*height, :]
            R = image[2*height:3*height, :]
            
            search_range = 15 
            
            dx_g, dy_g = align_channel(R, G, search_range)
            print(f"Best offset for G channel: dx={dx_g}, dy={dy_g}")
            G_aligned = np.roll(G, shift=(dy_g, dx_g), axis=(0, 1))
            
            dx_b, dy_b = align_channel(R, B, search_range)
            print(f"Best offset for B channel: dx={dx_b}, dy={dy_b}")
            B_aligned = np.roll(B, shift=(dy_b, dx_b), axis=(0, 1))
            
            aligned_image = np.dstack((R, G_aligned, B_aligned))
            aligned_image = aligned_image.astype(np.uint8)
            
            output_path = os.path.join(output_folder, file_name)
            Image.fromarray(aligned_image).save(output_path)
            print(f"Saved aligned image to: {output_path}\n")
            
        except Exception as e:
            print(f"Cannot process {file_name}：{e}\n")
    else:
        print(f"Skip：{file_name}, not an image.\n")


Processing：00125v.jpg
Best offset for G channel: dx=1, dy=-4
Best offset for B channel: dx=-1, dy=-10
Saved aligned image to: ./Homework_result/Problem_2\Task_2\00125v.jpg

Processing：00149v.jpg
Best offset for G channel: dx=0, dy=-5
Best offset for B channel: dx=-1, dy=-9
Saved aligned image to: ./Homework_result/Problem_2\Task_2\00149v.jpg

Processing：00153v.jpg
Best offset for G channel: dx=-2, dy=-7
Best offset for B channel: dx=0, dy=-15
Saved aligned image to: ./Homework_result/Problem_2\Task_2\00153v.jpg

Processing：00351v.jpg
Best offset for G channel: dx=-1, dy=-9
Best offset for B channel: dx=-1, dy=-13
Saved aligned image to: ./Homework_result/Problem_2\Task_2\00351v.jpg

Processing：00398v.jpg
Best offset for G channel: dx=-1, dy=-6
Best offset for B channel: dx=-4, dy=-11
Saved aligned image to: ./Homework_result/Problem_2\Task_2\00398v.jpg

Processing：01112v.jpg
Best offset for G channel: dx=-2, dy=-5
Best offset for B channel: dx=-2, dy=-6
Saved aligned image to: ./Homewo

**Specifically**: Write a function to align the 3 channels of the image produced by Task 1. This function
should output the (x,y) offsets required for shifting two of the color channels with respect to the third. The
third channel then has a (0,0) offset. Save the newly aligned images from `prokudin-gorskii/` and
`efros_tableau.jpg` in your report, along with the offsets for each color channel. Report the similarity
metric you choose.

**Hint**: To offset the channels while keeping the same dimensions among them, you can use either np.roll()
to roll over boundaries, or np.pad() for padding.
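
The two options in the hint behave differently at the image border, which the following sketch illustrates. The helper names `shift_roll` and `shift_pad` are hypothetical, and zero fill is only one possible padding mode.

```python
import numpy as np

def shift_roll(channel, dy, dx):
    """Shift by (dy, dx); pixels pushed off one edge wrap around to the other."""
    return np.roll(channel, shift=(dy, dx), axis=(0, 1))

def shift_pad(channel, dy, dx):
    """Shift by (dy, dx); vacated pixels are filled with zeros instead of wrapping."""
    h, w = channel.shape
    padded = np.pad(channel, ((abs(dy), abs(dy)), (abs(dx), abs(dx))), mode='constant')
    y0 = abs(dy) - dy   # window start that realizes the requested row shift
    x0 = abs(dx) - dx   # window start that realizes the requested column shift
    return padded[y0:y0 + h, x0:x0 + w]

a = np.arange(1, 10).reshape(3, 3)
print(shift_roll(a, 1, 0))  # the last row wraps around to the top
print(shift_pad(a, 1, 0))   # the top row becomes zeros
```

With `np.roll()`, the wrapped pixels are unrelated to the scene, which is one reason to compare only a cropped interior when scoring an alignment.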

## Task 3: Pyramid (20 pts) 
For very large offsets (and high-resolution images), comparing all the alignments
for a broad range of displacements (e.g. [-30, 30]) can be computationally intensive. We will have
you implement a recursive version of your algorithm that starts by estimating an image’s alignment on a
low-resolution version of itself, before refining it on higher resolutions. To implement this, you will build
a two-level image pyramid. To do this, you must first scale the triple-frame images down by a factor of 2
(both the width and height should end up halved). Starting with your shrunk, coarse images, execute your
alignment from Task 2 over the following range of offsets: [-15, 15]. Choose the best alignment based on
your similarity metric and treat it as the new current alignment. Then in the full-resolution images, use
this new current alignment as a starting place to again run the alignment from Task 2 in a small range
[-15, 15]. Run this Pyramid task on the `seoul_tableau.jpg` and `vancouver_tableau.jpg` images. If your
course project goes well.

**Specifically**: Use cv2.resize() to shrink each image in the triptych. Use your code from Task 2 to align
them and get the intermediate offset. Shift the original images accordingly and align again at full
resolution. Report the intermediate offset (at the coarse resolution), the next offset at the full resolution,
and what the overall total offset was that includes both of these. Save the aligned images in color in your
report.
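
The bookkeeping between the two pyramid levels can be sketched as below. The helper `combine_offsets` is hypothetical; it assumes a downscaling factor of 2, so one pixel at the coarse level corresponds to two pixels at full resolution. The sample numbers are the `seoul_tableau.jpg` offsets reported in this notebook's Task 3 output.

```python
def combine_offsets(coarse, refinement, scale=2):
    """Total full-resolution (dx, dy) from a coarse offset and a fine refinement."""
    return (scale * coarse[0] + refinement[0],
            scale * coarse[1] + refinement[1])

# G channel: coarse (3, 0), fine refinement (0, 0)
print(combine_offsets((3, 0), (0, 0)))    # → (6, 0)
# B channel: coarse (1, -1), fine refinement (-1, 0)
print(combine_offsets((1, -1), (-1, 0)))  # → (1, -2)
```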

**Hint**: If you’re struggling, use a different color channel as your anchor!

**Report** You must submit a report that includes the offsets, color output images, and description required
above. Your description should be such that a reader could implement what you’ve done after reading your
report.

In [4]:
import os
import numpy as np
from PIL import Image
import cv2  # OpenCV, used for image resizing

def ncc(image1, image2):
    image1_mean = np.mean(image1)
    image2_mean = np.mean(image2)
    numerator = np.sum((image1 - image1_mean) * (image2 - image2_mean))
    denominator = np.sqrt(np.sum((image1 - image1_mean) ** 2) * np.sum((image2 - image2_mean) ** 2))
    return numerator / denominator if denominator != 0 else 0

def align_channel(ref_channel, target_channel, search_range, start_dx=0, start_dy=0):
    # Search shifts within search_range of (start_dx, start_dy) and return
    # the absolute (dx, dy) that maximizes NCC against ref_channel.
    max_ncc = -1
    best_dx = start_dx
    best_dy = start_dy
    # Note: the cropped border is only search_range pixels wide, so shifts
    # larger than that let pixels wrapped around by np.roll enter the crop.
    crop_margin = search_range
    ref_crop = ref_channel[crop_margin:-crop_margin, crop_margin:-crop_margin]
    
    for dx in range(start_dx - search_range, start_dx + search_range + 1):
        for dy in range(start_dy - search_range, start_dy + search_range + 1):
            shifted_channel = np.roll(target_channel, shift=(dy, dx), axis=(0, 1))
            shifted_crop = shifted_channel[crop_margin:-crop_margin, crop_margin:-crop_margin]
            current_ncc = ncc(ref_crop, shifted_crop)
            if current_ncc > max_ncc:
                max_ncc = current_ncc
                best_dx = dx
                best_dy = dy
    return best_dx, best_dy

folder_path = './resources/'
output_folder = os.path.join('./Homework_result/Problem_2',"Task_3")

if not os.path.isdir(output_folder):
    os.mkdir(output_folder)

file_names = [
    'seoul_tableau.jpg',
    'vancouver_tableau.jpg',
    'efros_tableau.jpg'
]

for file_name in file_names:
    file_path = os.path.join(folder_path, file_name)
    
    if file_name.lower().endswith('.jpg'):
        try:
            # Load the original image and convert it to grayscale
            image = np.array(Image.open(file_path).convert('L'))
            print(f"Processing：{file_name}")
            
            # Split the image into its three channel panels
            height = image.shape[0] // 3
            image = image[:height*3, :] 
            B = image[0:height, :]
            G = image[height:2*height, :]
            R = image[2*height:3*height, :]
            
            # Halve each channel's width and height with cv2.resize()
            R_small = cv2.resize(R, (R.shape[1] // 2, R.shape[0] // 2), interpolation=cv2.INTER_AREA)
            G_small = cv2.resize(G, (G.shape[1] // 2, G.shape[0] // 2), interpolation=cv2.INTER_AREA)
            B_small = cv2.resize(B, (B.shape[1] // 2, B.shape[0] // 2), interpolation=cv2.INTER_AREA)
            
            # Coarse alignment on the downsampled images
            search_range_coarse = 15
            print("Performing coarse alignment on downsampled images...")
            dx_g_coarse, dy_g_coarse = align_channel(R_small, G_small, search_range_coarse)
            dx_b_coarse, dy_b_coarse = align_channel(R_small, B_small, search_range_coarse)
            print(f"Coarse alignment offsets for G channel: dx={dx_g_coarse}, dy={dy_g_coarse}")
            print(f"Coarse alignment offsets for B channel: dx={dx_b_coarse}, dy={dy_b_coarse}")
            
            # Double the coarse offsets to get the starting offsets at full resolution
            dx_g_initial = dx_g_coarse * 2
            dy_g_initial = dy_g_coarse * 2
            dx_b_initial = dx_b_coarse * 2
            dy_b_initial = dy_b_coarse * 2
            
            # Fine alignment on the full-resolution images
            search_range_fine = 15
            print("Refining alignment on full-resolution images...")
            dx_g_fine, dy_g_fine = align_channel(R, G, search_range_fine, start_dx=dx_g_initial, start_dy=dy_g_initial)
            dx_b_fine, dy_b_fine = align_channel(R, B, search_range_fine, start_dx=dx_b_initial, start_dy=dy_b_initial)
            print(f"Fine alignment offsets for G channel: dx={dx_g_fine}, dy={dy_g_fine}")
            print(f"Fine alignment offsets for B channel: dx={dx_b_fine}, dy={dy_b_fine}")
            
            # Total offsets (the fine search already starts from the doubled coarse offsets)
            total_dx_g = dx_g_fine
            total_dy_g = dy_g_fine
            total_dx_b = dx_b_fine
            total_dy_b = dy_b_fine
            print(f"Total offsets for G channel: dx={total_dx_g}, dy={total_dy_g}")
            print(f"Total offsets for B channel: dx={total_dx_b}, dy={total_dy_b}")
            
            # Shift the channels into alignment
            G_aligned = np.roll(G, shift=(total_dy_g, total_dx_g), axis=(0, 1))
            B_aligned = np.roll(B, shift=(total_dy_b, total_dx_b), axis=(0, 1))
            
            # Stack the aligned channels
            aligned_image = np.dstack((R, G_aligned, B_aligned))
            aligned_image = aligned_image.astype(np.uint8)
            
            # Save the aligned image
            output_path = os.path.join(output_folder, file_name)
            Image.fromarray(aligned_image).save(output_path)
            print(f"Saved aligned image to: {output_path}\n")
            
            # Report the offsets
            print("Offset Summary:")
            print(f"G channel - Coarse Offset: dx={dx_g_coarse}, dy={dy_g_coarse}")
            print(f"G channel - Fine Offset Adjustment: dx={dx_g_fine - dx_g_initial}, dy={dy_g_fine - dy_g_initial}")
            print(f"G channel - Total Offset: dx={total_dx_g}, dy={total_dy_g}")
            print(f"B channel - Coarse Offset: dx={dx_b_coarse}, dy={dy_b_coarse}")
            print(f"B channel - Fine Offset Adjustment: dx={dx_b_fine - dx_b_initial}, dy={dy_b_fine - dy_b_initial}")
            print(f"B channel - Total Offset: dx={total_dx_b}, dy={total_dy_b}\n")
            
        except Exception as e:
            print(f"Cannot process {file_name}：{e}\n")
    else:
        print(f"Skip：{file_name}, not an image.\n")


Processing：seoul_tableau.jpg
Performing coarse alignment on downsampled images...
Coarse alignment offsets for G channel: dx=3, dy=0
Coarse alignment offsets for B channel: dx=1, dy=-1
Refining alignment on full-resolution images...
Fine alignment offsets for G channel: dx=6, dy=0
Fine alignment offsets for B channel: dx=1, dy=-2
Total offsets for G channel: dx=6, dy=0
Total offsets for B channel: dx=1, dy=-2
Saved aligned image to: ./Homework_result/Problem_2\Task_3\seoul_tableau.jpg

Offset Summary:
G channel - Coarse Offset: dx=3, dy=0
G channel - Fine Offset Adjustment: dx=0, dy=0
G channel - Total Offset: dx=6, dy=0
B channel - Coarse Offset: dx=1, dy=-1
B channel - Fine Offset Adjustment: dx=-1, dy=0
B channel - Total Offset: dx=1, dy=-2

Processing：vancouver_tableau.jpg
Performing coarse alignment on downsampled images...
Coarse alignment offsets for G channel: dx=11, dy=-1
Coarse alignment offsets for B channel: dx=6, dy=4
Refining alignment on full-resolution images...
Fine al

# 3 Color Spaces and Illuminance [20 pts]

The same color may look different under different lighting conditions. Images `indoor.png` and `outdoor.png` are two photos of
the same Rubik’s cube under different illuminances.
1. Load the images and plot their R, G, B channels separately as grayscale images using plt.imshow() (beware of
normalization). Then convert them into LAB color space using cv2.cvtColor() and plot the three channels again. Include the
plots in your report. (5 pts)
2. How do you know the illuminance change is better separated in LAB color space? (5 pts)
3. Choose two different lighting conditions and take two photos of a non-specular object. Try to make the same color look
as different as possible (a large distance on AB plane in LAB space). Below is an example of two photos of the same piece
of paper, taken in the basement and by the window respectively.


Submit the two images with names `im1.jpg` and `im2.jpg`, both cropped and scaled to 256×256. Under the same folder, also
submit a file `info.txt` that contains two lines. Line 1 contains four integers x1,y1,x2,y2; we will take a 32×32 patch
around the given coordinate on each image and compare colors. (You can use `plt.imshow()` and `plt.show()` to bring up a window
where you can read off pixel coordinates.) Line 2 is a description of the lighting conditions that you choose. An example of
this file is provided for you in the folder. (10 pts)
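
The patch comparison described above can be sketched as follows. The helpers `mean_patch` and `ab_distance` are hypothetical names, the patch is assumed to be centered on the given coordinate, and the image is assumed to be already converted to LAB. Note that OpenCV's 8-bit LAB representation offsets a and b by 128, which cancels out when taking differences.

```python
import numpy as np

def mean_patch(image, x, y, size=32):
    """Mean value per channel of a size-by-size patch centered at column x, row y."""
    half = size // 2
    patch = image[y - half:y + half, x - half:x + half]
    return patch.reshape(-1, image.shape[2]).mean(axis=0)

def ab_distance(lab1, lab2):
    """Euclidean distance between two LAB colors on the AB plane (L is ignored)."""
    lab1 = np.asarray(lab1, dtype=float)
    lab2 = np.asarray(lab2, dtype=float)
    return float(np.hypot(lab1[1] - lab2[1], lab1[2] - lab2[2]))

# Synthetic stand-ins for two 256x256 LAB images with uniform color.
im1_lab = np.full((256, 256, 3), (50, 131, 132), dtype=np.uint8)
im2_lab = np.full((256, 256, 3), (80, 128, 128), dtype=np.uint8)

m1 = mean_patch(im1_lab, 100, 120)
m2 = mean_patch(im2_lab, 100, 120)
print(ab_distance(m1, m2))  # → 5.0
```

A larger value means the two patches differ more in chromaticity, independent of how much their lightness differs.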

In [9]:
import matplotlib
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

# Plot the RGB and LAB channels and save them in a single figure
def plot_channels(image_rgb, image_lab, title_prefix, save_path):
    R_channel = image_rgb[:, :, 0]
    G_channel = image_rgb[:, :, 1]
    B_channel = image_rgb[:, :, 2]

    L_channel = image_lab[:, :, 0]
    A_channel = image_lab[:, :, 1]
    B_channel_lab = image_lab[:, :, 2]

    # Min-max normalize the LAB channels to 0..255
    L_channel_normalized = cv2.normalize(L_channel, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    A_channel_normalized = cv2.normalize(A_channel, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    B_channel_lab_normalized = cv2.normalize(B_channel_lab, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Plot
    plt.figure(figsize=(12, 6))
    
    # Plot the RGB channels
    plt.subplot(3, 3, 1)
    plt.imshow(R_channel, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - Red')
    plt.axis('off')

    plt.subplot(3, 3, 2)
    plt.imshow(G_channel, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - Green')
    plt.axis('off')

    plt.subplot(3, 3, 3)
    plt.imshow(B_channel, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - Blue')
    plt.axis('off')

    # Plot the LAB channels (raw values). For 8-bit input, OpenCV stores
    # L scaled to 0..255 and a, b offset by +128, so all three span 0..255.
    plt.subplot(3, 3, 4)
    plt.imshow(L_channel, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - L (Original)')
    plt.axis('off')

    plt.subplot(3, 3, 5)
    plt.imshow(A_channel, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - A (Original)')
    plt.axis('off')

    plt.subplot(3, 3, 6)
    plt.imshow(B_channel_lab, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - B (Original)')
    plt.axis('off')

    # Plot the LAB channels (min-max normalized)
    plt.subplot(3, 3, 7)
    plt.imshow(L_channel_normalized, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - L (Normalized)')
    plt.axis('off')

    plt.subplot(3, 3, 8)
    plt.imshow(A_channel_normalized, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - A (Normalized)')
    plt.axis('off')

    plt.subplot(3, 3, 9)
    plt.imshow(B_channel_lab_normalized, cmap='gray', vmin=0, vmax=255)
    plt.title(f'{title_prefix} - B (Normalized)')
    plt.axis('off')

    # Save the figure
    plt.savefig(save_path, dpi=300, bbox_inches='tight')
    plt.close()




# 1. Load the images and plot their R, G, B and LAB channels separately
# Load the indoor and outdoor images (make sure the files are in ./resources)
indoor_img = cv2.imread('./resources/indoor.png')
outdoor_img = cv2.imread('./resources/outdoor.png')

# Convert from BGR to RGB (OpenCV uses BGR by default)
indoor_img_rgb = cv2.cvtColor(indoor_img, cv2.COLOR_BGR2RGB)
outdoor_img_rgb = cv2.cvtColor(outdoor_img, cv2.COLOR_BGR2RGB)

# Convert from RGB to the LAB color space
indoor_lab = cv2.cvtColor(indoor_img_rgb, cv2.COLOR_RGB2LAB)
outdoor_lab = cv2.cvtColor(outdoor_img_rgb, cv2.COLOR_RGB2LAB)

# Plot the RGB and LAB channels of the indoor and outdoor images
plot_channels(indoor_img_rgb, indoor_lab, 'Task1_Indoor', './task1_indoor_channels.png')
plot_channels(outdoor_img_rgb, outdoor_lab, 'Task1_Outdoor', './task1_outdoor_channels.png')

# 2. Color-adjust the outdoor image to match the indoor image
# Choose the patch center coordinates
x1, y1 = 44, 44
x2, y2 = 44, 44

print(f"Indoor image center coordinate: x1={x1}, y1={y1}")
print(f"Outdoor image center coordinate: x2={x2}, y2={y2}")

# Extract a 20x20 patch around each coordinate (the slices span +/-10 pixels)
indoor_patch_lab = indoor_lab[y1-10:y1+10, x1-10:x1+10, :]
outdoor_patch_lab = outdoor_lab[y2-10:y2+10, x2-10:x2+10, :]

indoor_patch_rgb = indoor_img_rgb[y1-10:y1+10, x1-10:x1+10, :]
outdoor_patch_rgb = outdoor_img_rgb[y2-10:y2+10, x2-10:x2+10, :]

# Mean LAB value of each patch
mean_indoor_lab = np.mean(indoor_patch_lab.reshape(-1, 3), axis=0)
mean_outdoor_lab = np.mean(outdoor_patch_lab.reshape(-1, 3), axis=0)

# Mean RGB value of each patch
mean_indoor_rgb = np.mean(indoor_patch_rgb.reshape(-1, 3), axis=0)
mean_outdoor_rgb = np.mean(outdoor_patch_rgb.reshape(-1, 3), axis=0)

print(f"Indoor image patch average LAB value: {mean_indoor_lab}")
print(f"Outdoor image patch average LAB value: {mean_outdoor_lab}")

print(f"Indoor image patch average RGB value: {mean_indoor_rgb}")
print(f"Outdoor image patch average RGB value: {mean_outdoor_rgb}")

# Color adjustment in LAB: shift the a and b channels by the patch-mean difference
a_diff_lab = mean_indoor_lab[1] - mean_outdoor_lab[1]
b_diff_lab = mean_indoor_lab[2] - mean_outdoor_lab[2]

outdoor_lab_adjusted = outdoor_lab.astype(np.float32)
outdoor_lab_adjusted[:, :, 1] += a_diff_lab
outdoor_lab_adjusted[:, :, 2] += b_diff_lab
outdoor_lab_adjusted[:, :, 1] = np.clip(outdoor_lab_adjusted[:, :, 1], 0, 255)
outdoor_lab_adjusted[:, :, 2] = np.clip(outdoor_lab_adjusted[:, :, 2], 0, 255)
outdoor_lab_adjusted = outdoor_lab_adjusted.astype(np.uint8)
outdoor_rgb_adjusted_lab = cv2.cvtColor(outdoor_lab_adjusted, cv2.COLOR_LAB2RGB)

# Color adjustment in RGB: shift all channels by the patch-mean difference
rgb_diff = mean_indoor_rgb - mean_outdoor_rgb
outdoor_rgb_adjusted_rgb = np.clip(outdoor_img_rgb + rgb_diff, 0, 255).astype(np.uint8)

# Plot and save the images before and after adjustment
plt.figure(figsize=(15, 5))

plt.subplot(1, 5, 1)
plt.imshow(indoor_img_rgb)
plt.title('Original Indoor')
plt.axis('off')

plt.subplot(1, 5, 2)
plt.imshow(outdoor_img_rgb)
plt.title('Original Outdoor')
plt.axis('off')

plt.subplot(1, 5, 3)
plt.imshow(outdoor_rgb_adjusted_lab)
plt.title('LAB Adjusted Outdoor')
plt.axis('off')

plt.subplot(1, 5, 4)
plt.imshow(outdoor_rgb_adjusted_rgb)
plt.title('RGB Adjusted Outdoor')
plt.axis('off')

plt.subplot(1, 5, 5)
plt.imshow(indoor_img_rgb)
plt.title('Comparison Indoor')
plt.axis('off')

plt.tight_layout()
plt.savefig('./task2_outdoor_vs_indoor_adjusted_rgb_lab.png', dpi=300, bbox_inches='tight')
plt.close()

# 3. Repeat the same procedure on my own images
# Load my own images
im1 = cv2.imread('./resources/im1.png')
im2 = cv2.imread('./resources/im2.png')

# Convert from BGR to RGB
im1_rgb = cv2.cvtColor(im1, cv2.COLOR_BGR2RGB)
im2_rgb = cv2.cvtColor(im2, cv2.COLOR_BGR2RGB)

# Resize to 256x256
im1_resized = cv2.resize(im1_rgb, (256, 256), interpolation=cv2.INTER_AREA)
im2_resized = cv2.resize(im2_rgb, (256, 256), interpolation=cv2.INTER_AREA)

# Convert from RGB to the LAB color space
im1_lab = cv2.cvtColor(im1_resized, cv2.COLOR_RGB2LAB)
im2_lab = cv2.cvtColor(im2_resized, cv2.COLOR_RGB2LAB)

# Plot the RGB and LAB channels of my own images
plot_channels(im1_resized, im1_lab, 'Task3_im1', './task3_im1_channels.png')
plot_channels(im2_resized, im2_lab, 'Task3_im2', './task3_im2_channels.png')

# Chosen coordinates
x1, y1 = 101, 136
x2, y2 = 100, 134

print(f"im1.png coordinate: x1={x1}, y1={y1}")
print(f"im2.png coordinate: x2={x2}, y2={y2}")

# Extract a 32x32 patch around each coordinate
patch1_lab = im1_lab[y1-16:y1+16, x1-16:x1+16, :]
patch2_lab = im2_lab[y2-16:y2+16, x2-16:x2+16, :]

patch1_rgb = im1_resized[y1-16:y1+16, x1-16:x1+16, :]
patch2_rgb = im2_resized[y2-16:y2+16, x2-16:x2+16, :]

# Mean LAB value of each patch
mean_lab1 = np.mean(patch1_lab.reshape(-1, 3), axis=0)
mean_lab2 = np.mean(patch2_lab.reshape(-1, 3), axis=0)

# Mean RGB value of each patch
mean_rgb1 = np.mean(patch1_rgb.reshape(-1, 3), axis=0)
mean_rgb2 = np.mean(patch2_rgb.reshape(-1, 3), axis=0)

print(f"im1.png patch average LAB value: {mean_lab1}")
print(f"im2.png patch average LAB value: {mean_lab2}")

print(f"im1.png patch average RGB value: {mean_rgb1}")
print(f"im2.png patch average RGB value: {mean_rgb2}")

# Color adjustment in LAB: shift the a and b channels by the patch-mean difference
a_diff_lab = mean_lab1[1] - mean_lab2[1]
b_diff_lab = mean_lab1[2] - mean_lab2[2]

im2_lab_adjusted = im2_lab.astype(np.float32)
im2_lab_adjusted[:, :, 1] += a_diff_lab
im2_lab_adjusted[:, :, 2] += b_diff_lab
im2_lab_adjusted[:, :, 1] = np.clip(im2_lab_adjusted[:, :, 1], 0, 255)
im2_lab_adjusted[:, :, 2] = np.clip(im2_lab_adjusted[:, :, 2], 0, 255)
im2_lab_adjusted = im2_lab_adjusted.astype(np.uint8)
im2_rgb_adjusted_lab = cv2.cvtColor(im2_lab_adjusted, cv2.COLOR_LAB2RGB)

# Color adjustment in RGB: shift all channels by the patch-mean difference
rgb_diff = mean_rgb1 - mean_rgb2
im2_rgb_adjusted_rgb = np.clip(im2_resized + rgb_diff, 0, 255).astype(np.uint8)

# Plot and save the images before and after adjustment
plt.figure(figsize=(15, 5))

plt.subplot(1, 5, 1)
plt.imshow(im1_resized)
plt.title('Original im1')
plt.axis('off')

plt.subplot(1, 5, 2)
plt.imshow(im2_resized)
plt.title('Original im2')
plt.axis('off')

plt.subplot(1, 5, 3)
plt.imshow(im2_rgb_adjusted_lab)
plt.title('LAB Adjusted im2')
plt.axis('off')

plt.subplot(1, 5, 4)
plt.imshow(im2_rgb_adjusted_rgb)
plt.title('RGB Adjusted im2')
plt.axis('off')

plt.subplot(1, 5, 5)
plt.imshow(im1_resized)
plt.title('Comparison im1')
plt.axis('off')

plt.tight_layout()
plt.savefig('./task3_im2_vs_im1_adjusted_rgb_lab.png', dpi=300, bbox_inches='tight')
plt.close()


Indoor image center coordinate: x1=44, y1=44
Outdoor image center coordinate: x2=44, y2=44
Indoor image patch average LAB value: [131.925  127.5075 136.4925]
Outdoor image patch average LAB value: [233.96  124.65  125.275]
Indoor image patch average RGB value: [128.875  122.9175 108.8825]
Outdoor image patch average RGB value: [221.3725 233.4225 236.5225]
im1.png coordinate: x1=101, y1=136
im2.png coordinate: x2=100, y2=134
im1.png patch average LAB value: [192.50976562 142.7734375  154.63964844]
im2.png patch average LAB value: [210.28417969 128.54003906 119.0546875 ]
im1.png patch average RGB value: [228.0546875  174.93457031 137.98144531]
im2.png patch average RGB value: [197.87695312 206.10449219 222.44335938]


In [15]:
def plot_channels(image_rgb, image_lab, title_prefix, save_path, selected_coords=None):
    R_channel = image_rgb[:, :, 0]
    G_channel = image_rgb[:, :, 1]
    B_channel = image_rgb[:, :, 2]

    L_channel = image_lab[:, :, 0]
    A_channel = image_lab[:, :, 1]
    B_channel_lab = image_lab[:, :, 2]

    # Plot
    plt.figure(figsize=(12, 4))
    
    # Plot the RGB channels
    plt.subplot(2, 3, 1)
    plt.imshow(R_channel, cmap='gray')
    if selected_coords:
        plt.scatter([selected_coords[0]], [selected_coords[1]], color='red', s=50)  # mark the coordinate
    plt.title(f'{title_prefix} - Red')
    plt.axis('off')

    plt.subplot(2, 3, 2)
    plt.imshow(G_channel, cmap='gray')
    if selected_coords:
        plt.scatter([selected_coords[0]], [selected_coords[1]], color='red', s=50)
    plt.title(f'{title_prefix} - Green')
    plt.axis('off')

    plt.subplot(2, 3, 3)
    plt.imshow(B_channel, cmap='gray')
    if selected_coords:
        plt.scatter([selected_coords[0]], [selected_coords[1]], color='red', s=50)
    plt.title(f'{title_prefix} - Blue')
    plt.axis('off')

    # Plot the LAB channels
    plt.subplot(2, 3, 4)
    plt.imshow(L_channel, cmap='gray')
    if selected_coords:
        plt.scatter([selected_coords[0]], [selected_coords[1]], color='red', s=50)
    plt.title(f'{title_prefix} - L')
    plt.axis('off')

    plt.subplot(2, 3, 5)
    plt.imshow(A_channel, cmap='gray')
    if selected_coords:
        plt.scatter([selected_coords[0]], [selected_coords[1]], color='red', s=50)
    plt.title(f'{title_prefix} - A')
    plt.axis('off')

    plt.subplot(2, 3, 6)
    plt.imshow(B_channel_lab, cmap='gray')
    if selected_coords:
        plt.scatter([selected_coords[0]], [selected_coords[1]], color='red', s=50)
    plt.title(f'{title_prefix} - B')
    plt.axis('off')

    # Save the figure
    plt.savefig(save_path, dpi=300, bbox_inches='tight')
    plt.close()

# Revised process_images function that marks the chosen coordinates on the original images
def process_images(img1_path, img2_path, x1, y1, x2, y2, output_path):
    # Load the images and convert them to RGB and LAB
    img1 = cv2.imread(img1_path)
    img2 = cv2.imread(img2_path)

    img1_rgb = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
    img2_rgb = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)

    img1_lab = cv2.cvtColor(img1_rgb, cv2.COLOR_RGB2LAB)
    img2_lab = cv2.cvtColor(img2_rgb, cv2.COLOR_RGB2LAB)

    # Extract the patches
    patch_size = 10  # patch half-width in pixels
    img1_patch_lab = img1_lab[y1-patch_size:y1+patch_size, x1-patch_size:x1+patch_size, :]
    img2_patch_lab = img2_lab[y2-patch_size:y2+patch_size, x2-patch_size:x2+patch_size, :]

    img1_patch_rgb = img1_rgb[y1-patch_size:y1+patch_size, x1-patch_size:x1+patch_size, :]
    img2_patch_rgb = img2_rgb[y2-patch_size:y2+patch_size, x2-patch_size:x2+patch_size, :]

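    # Adjust color and brightness in LAB (adjust_lab is not defined in this
    # cell and must be defined before process_images is called)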
    img2_lab_adjusted = adjust_lab(img2_lab, img1_patch_lab)
    img2_rgb_adjusted_lab = cv2.cvtColor(img2_lab_adjusted, cv2.COLOR_LAB2RGB)

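    # Adjust color and brightness in RGB (adjust_rgb is not defined in this
    # cell and must be defined before process_images is called)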
    img2_rgb_adjusted_rgb = adjust_rgb(img2_rgb, img1_patch_rgb)

    # Plot and save the images before and after adjustment
    plt.figure(figsize=(15, 5))

    plt.subplot(1, 5, 1)
    plt.imshow(img1_rgb)
    plt.scatter([x1], [y1], color='red', s=50)  # mark the chosen coordinate
    plt.title('Original Image 1 (Marked)')
    plt.axis('off')

    plt.subplot(1, 5, 2)
    plt.imshow(img2_rgb)
    plt.scatter([x2], [y2], color='red', s=50)  # mark the chosen coordinate
    plt.title('Original Image 2 (Marked)')
    plt.axis('off')

    plt.subplot(1, 5, 3)
    plt.imshow(img2_rgb_adjusted_lab)
    plt.title('LAB Adjusted Image 2')
    plt.axis('off')

    plt.subplot(1, 5, 4)
    plt.imshow(img2_rgb_adjusted_rgb)
    plt.title('RGB Adjusted Image 2')
    plt.axis('off')

    plt.subplot(1, 5, 5)
    plt.imshow(img1_rgb)
    plt.title('Comparison Image 1')
    plt.axis('off')

    plt.tight_layout()
    plt.savefig(output_path, dpi=300, bbox_inches='tight')
    plt.close()

# Process the indoor and outdoor images, with coordinate markers
process_images(
    img1_path='./resources/indoor.png',
    img2_path='./resources/outdoor.png',
    x1=44, y1=44,
    x2=44, y2=44,
    output_path='./task2_outdoor_vs_indoor_adjusted_rgb_lab_marked.png'
)

# Process the custom images im1 and im2, with coordinate markers
process_images(
    img1_path='./resources/im1.png',
    img2_path='./resources/im2.png',
    x1=101, y1=136,
    x2=100, y2=134,
    output_path='./task3_im2_vs_im1_adjusted_rgb_lab_marked.png'
)
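
The cell above calls `adjust_lab()` and `adjust_rgb()`, whose definitions do not appear in this notebook. The sketch below is a hypothetical stand-in, not the author's implementation: it assumes each helper shifts every channel of the image so that the image's global mean matches the mean of the reference patch, which adjusts both color and brightness. Any such helpers would need to be defined before `process_images()` is called.

```python
import numpy as np

def match_mean(image, ref_patch):
    """Shift each channel so the image's global mean equals the patch mean.

    Hypothetical helper: one simple way to adjust color and brightness.
    """
    img = image.astype(np.float32)
    ref_mean = ref_patch.reshape(-1, 3).mean(axis=0)
    img_mean = img.reshape(-1, 3).mean(axis=0)
    shifted = img + (ref_mean - img_mean)
    # Clip back to the valid 8-bit range before converting.
    return np.clip(shifted, 0, 255).astype(np.uint8)

def adjust_lab(img_lab, ref_patch_lab):
    """Assumed behavior: mean matching on all three 8-bit LAB channels."""
    return match_mean(img_lab, ref_patch_lab)

def adjust_rgb(img_rgb, ref_patch_rgb):
    """Assumed behavior: mean matching on all three RGB channels."""
    return match_mean(img_rgb, ref_patch_rgb)

# Tiny synthetic check: a uniform image is pulled to the patch color.
image = np.full((4, 4, 3), 100, dtype=np.uint8)
patch = np.full((2, 2, 3), 120, dtype=np.uint8)
adjusted = adjust_rgb(image, patch)
print(adjusted[0, 0])
```

Matching the global mean is a coarse correction; the inline code in the earlier cell instead shifts by the difference between two patch means and leaves L untouched in the LAB case.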
