<img src="seminarlogo.png" class="center" style="width:100%;"> </img>

# <span style="color:navy"> **Seminar Project:** </span> <span style="color:green"> **Video Coding**</span>

<div style="background-color:#ebeff7; padding:1.25em; border-radius:1em; border: 1px solid black">

### <span style = "color: navy"> **General Procedure:** </span>
1. Make group of $2$ or $3$ students while coordinating with each other and select one group coordinator.
2. Select one the techniques (MDCT or Lapped Transform)
4. Send us the the selected task (technique) and the list of group members with their <b>a)</b> Matriculation Numbers <b>b)</b> email IDs, and highlighting the group coordinator. Make it in a **table** form.
5. The duration of the seminar project is from **11.06.2024** to **31.07.2024**, with a mid-term meeting with each group from *01.07.2024 to 06.07.2024* (Book an appointment at your convenience)
6. There will be a report at the end of project. The submission date for the report is **31.07.2024**
7. The report and results presentation will held from *01.08.2024 to 15.08.2024* (The specific date for each group will be selected after mutual consensus)


<span style="color:blue"> Note: Further Instructions and procedures may be updated from time to time</span>
    
</div>

## <span style = "color: navy"> **Introduction:** </span>
There are two seminar projects, with the aim to compress the given images (Only use the given image in the base repository). The projects are:
<br>
### **Project-1:** <span style = "color: green"> Image-Compression using MDCT </span>
### **Project-2:** <span style = "color: green"> Image-Compression using Lapped Transform </span>

The projects are team-based work with a group of **2** (minimum) to **3** (maximum) students. You (as a team) are required to select one project and implement the technique (either *MDCT* or *Lapped* Transform). <span style = "color: red">To make a balance in project selection, we can assign either Project-1 or Project-2 to any group.</span>

### <span style = "color: navy"> GET STARTED </span>
The basic building blocks (base algorithm and/or material) of the Projects are provided in the git-repository (https://github.com/Karanraj06/image-compression).
In this repository, the implementation of image compression is done by using simple **DCT** which you can take as your staring point and further develop or modify this base-repository according to the selected project i.e. techniques (*MDCT* or *Lapped* Transform) of selected project.

### <span style = "color: navy"> Project Tasks </span>
1. Take the base repository as base or reference
2. Modify the DCT part with your selected technique (either MDCT or Lapped Transform)
    - For the "MDCT" you need to find out the optimum window, try different windows
    - The "lapped transform" needs to be implemented by Pytorch's Conv2D. For that, the kernel size can be chosen like the MDCT, which is 16x16. In order to get good transform filters, it needs to be trained, and for that you need to use the images in the image coder repository.
5. Compare the results of compression of your modified version with the reference, which is DCT, by using Perceptual Similarity Metric and compression ratio. The repository for Perceptual Similarity Metric measurement is given here (https://github.com/richzhang/PerceptualSimilarity)
6. The expected calculations among others should include the results of "bits per pixel"

### <span style = "color: navy"> Some use full links </span>
1. https://github.com/TUIlmenauAMS/Python-Audio-Coder
2. https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

##### <span style="color:blue"> Further details will be discussed with the group coordinators and the they are responsible for group coordination</span>

<div style="background-color:#ebeff7; padding:1.25em; border-radius:1em; border: 1px solid black">

### <span style="color:navy; font-size: 1.25em"> **TASK 01:** DCT </span>
#### <span style="color:blue"> Below Enter Matriculation Numbers of each member </span>

</div>

In [1]:
MatriculationNumber = '64365, 64722, 66160';

In [2]:
!pip install PyWavelets
!pip install pillow
!apt install lpips
!pip install scikit-image

Defaulting to user installation because normal site-packages is not writeable

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Defaulting to user installation because normal site-packages is not writeable

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
[1;31mE: [0mCould not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)[0m
[1;31mE: [0mUnable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?[0m
Defaulting to user installation because normal site-packages is not writeable

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new rele

In [3]:
import os
import numpy as np
import cv2
import matplotlib.pyplot as plt
import lpips
import torch
import pandas as pd
import path
import skimage

In [4]:
from PIL import Image
from skimage import color
from skimage import io
from scipy.fftpack import dct, idct
from torchvision import transforms
from scipy.signal import convolve2d

In [5]:
# Access the paths
path_color = path.PATH_COLOR
path_gray = path.PATH_GRAY
path_color_dct = path.PATH_COLOR_DCT
path_gray_dct = path.PATH_GRAY_DCT

In [6]:
os.makedirs(path_color_dct, exist_ok=True)
os.makedirs(path_gray_dct, exist_ok=True)

In [7]:
def absolute_file_paths(directory):
    path = os.path.abspath(directory)
    return [entry.path for entry in os.scandir(path) if entry.is_file()]

In [8]:
all_color_images = absolute_file_paths(path_color)
all_gray_images = absolute_file_paths(path_gray)

In [9]:
def process_and_save_dct_images(image_paths, output_path):
    for image_path in image_paths:
        # Load image as grayscale
        im = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        f_image = np.float32(im)

        # Apply DCT
        dct_image = cv2.dct(f_image)

        # Amplify high-frequency components (optional, adjust as needed)
        dct_image *= 2  # Experiment with different values for quality adjustment

        # Apply inverse DCT
        idct_image = cv2.idct(dct_image)

        # Clip values to valid range and convert back to uint8
        idct_image_clipped = np.clip(idct_image, 0, 255).astype(np.uint8)

        # Save the compressed image
        filename = os.path.basename(image_path)
        to_save = os.path.join(output_path, filename)
        print(f"Saving: {to_save}")
        cv2.imwrite(to_save, idct_image_clipped)

In [10]:
process_and_save_dct_images(all_color_images, path_color_dct)

Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/7.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/3.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/4.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/6.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/9.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/5.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/2.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/10.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/1.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-dct/8.png


In [11]:
process_and_save_dct_images(all_gray_images, path_gray_dct)

Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/3.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/5.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/9.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/8.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/1.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/2.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/4.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/7.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/6.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-dct/10.tif


<div style="background-color:#ebeff7; padding:1.25em; border-radius:1em; border: 1px solid black">

### <span style="color:navy; font-size: 1.25em"> **TASK 02:** LAPPED TRANSFORM </span>
#### <span style="color:blue"> Below Enter Matriculation Numbers of each member </span>

</div>

In [12]:
MatriculationNumber = '64365, 64722';

In [13]:
# Access the paths
path_color = path.PATH_COLOR
path_gray = path.PATH_GRAY
path_color_lapped = path.PATH_COLOR_LAP
path_gray_lapped = path.PATH_GRAY_LAP

In [14]:
os.makedirs(path_color_lapped, exist_ok=True)
os.makedirs(path_gray_lapped, exist_ok=True)

In [15]:
def absolute_file_paths(directory):
    path = os.path.abspath(directory)
    return [entry.path for entry in os.scandir(path) if entry.is_file()]

In [16]:
all_color_images = absolute_file_paths(path_color)
all_gray_images = absolute_file_paths(path_gray)

In [17]:
# Define parameters
BLOCK_SIZE = 8
OVERLAP = 4
STEP_SIZE = BLOCK_SIZE - OVERLAP

In [18]:
# Function to perform DCT and IDCT with overlap
def lapped_transform(im_np, block_size=BLOCK_SIZE, overlap=OVERLAP):
    height, width = im_np.shape
    step_size = block_size - overlap

    # Padding the image
    pad_height = (step_size - height % step_size) % step_size
    pad_width = (step_size - width % step_size) % step_size
    im_np_padded = np.pad(im_np, ((0, pad_height), (0, pad_width)), mode='constant')

    # Initialize arrays for compressed and decompressed data
    compressed = np.zeros_like(im_np_padded, dtype=np.float32)
    count = np.zeros_like(im_np_padded, dtype=np.float32)

    # Process each block with overlap
    for i in range(0, im_np_padded.shape[0] - overlap, step_size):
        for j in range(0, im_np_padded.shape[1] - overlap, step_size):
            block = im_np_padded[i:i + block_size, j:j + block_size]

            # Apply DCT
            dct_block = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

            # (Optional) Quantization and compression can be applied here

            # Apply inverse DCT
            idct_block = idct(idct(dct_block, axis=0, norm='ortho'), axis=1, norm='ortho')

            # Overlap-add and average
            compressed[i:i + block_size, j:j + block_size] += idct_block
            count[i:i + block_size, j:j + block_size] += 1

    # Avoid division by zero
    count[count == 0] = 1
    compressed /= count

    # Clip values to valid range
    compressed_clipped = np.clip(compressed, 0, 255).astype(np.uint8)
    return compressed_clipped

# Function to process and save Lapped Transform compressed images
def process_and_save_lapped_transform_images(image_paths, output_path):
    for image_path in image_paths:
        # Load image as grayscale
        im = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        f_image = np.float32(im)

        # Apply Lapped Transform
        compressed_image = lapped_transform(f_image)

        # Save the compressed image
        filename = os.path.basename(image_path)
        to_save = os.path.join(output_path, filename)
        print(f"Saving: {to_save}")
        cv2.imwrite(to_save, compressed_image)

In [19]:
process_and_save_lapped_transform_images(all_color_images, path_color_lapped)

Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/7.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/3.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/4.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/6.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/9.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/5.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/2.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/10.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/1.png
Saving: /home/jupyter-64365/VC_Seminar_Project/color-lapped/8.png


In [20]:
process_and_save_lapped_transform_images(all_gray_images, path_gray_lapped)

Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/3.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/5.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/9.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/8.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/1.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/2.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/4.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/7.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/6.tif
Saving: /home/jupyter-64365/VC_Seminar_Project/gray-lapped/10.tif


<div style="background-color:#ebeff7; padding:1.25em; border-radius:1em; border: 1px solid black">

### <span style="color:navy; font-size: 1.25em"> **TASK 03:** COMPRESSION RATIO </span>
#### <span style="color:blue"> Below Enter Matriculation Numbers of each member </span>

</div>

In [21]:
MatriculationNumber = '64722, 66160';

In [22]:
path_color = path.PATH_COLOR
path_gray = path.PATH_GRAY
path_color_dct = path.PATH_COLOR_DCT
path_gray_dct = path.PATH_GRAY_DCT
path_color_lapped = path.PATH_COLOR_LAP
path_gray_lapped = path.PATH_GRAY_LAP

In [23]:
# Function to get file size
def get_file_size(file_path):
    return os.path.getsize(file_path) if os.path.isfile(file_path) else 0

In [24]:
# Function to calculate compression ratio for each image
def calculate_compression_ratios(original_path, compressed_dct_path, compressed_lbt_path):
    ratios = []
    for filename in os.listdir(original_path):
        original_image_path = os.path.join(original_path, filename)
        dct_image_path = os.path.join(compressed_dct_path, filename)
        lbt_image_path = os.path.join(compressed_lbt_path, filename)

        if not os.path.isfile(dct_image_path) or not os.path.isfile(lbt_image_path):
            continue

        # File sizes
        original_size = get_file_size(original_image_path)
        dct_size = get_file_size(dct_image_path)
        lbt_size = get_file_size(lbt_image_path)

 # Compression ratios
        if dct_size < original_size : 
            ratio_dct = (100 -((dct_size*100)/original_size)) 
        else:   
            ratio_dct = (100 -((dct_size*100)/original_size)) 
        if lbt_size < original_size: 
            ratio_lbt = (100 -((lbt_size*100)/original_size)) 
        else:   
            ratio_lbt = (100 -((lbt_size*100)/original_size))
        ratios.append({
            'Filename': filename,
            'Original Size (bytes)': original_size,
            'DCT Size (bytes)': dct_size,
            'LT Size (bytes)': lbt_size,
            'DCT Comp Ratio %': ratio_dct,
            'Lapped Comp Ratio %': ratio_lbt
        })

    return ratios

In [25]:
# Calculate compression ratios for color and gray images
color_ratios = calculate_compression_ratios(path_color, path_color_dct, path_color_lapped)
gray_ratios = calculate_compression_ratios(path_gray, path_gray_dct, path_gray_lapped)

# Create DataFrames to display results
df_color = pd.DataFrame(color_ratios)
df_gray = pd.DataFrame(gray_ratios)

print("Color Images Compression Ratios:")
print(df_color)
df_color.to_csv(path.PATH_COLOR_RATIO, index=False)

print("\nGray Images Compression Ratios:")
print(df_gray)
df_gray.to_csv(path.PATH_GRAY_RATIO, index=False)

Color Images Compression Ratios:
  Filename  Original Size (bytes)  DCT Size (bytes)  LT Size (bytes)  \
0    7.png                 566322            177153           216321   
1    3.png                 502888            179540           200815   
2    4.png                 637432            220141           234133   
3    6.png                 618959            181689           239507   
4    9.png                 582899            117524           222787   
5    5.png                 785610            291705           288386   
6    2.png                 617995            261397           221222   
7   10.png                 593463            174171           228518   
8    1.png                 736501            235530           288012   
9    8.png                 788470            241369           307807   

   DCT Comp Ratio %  Lapped Comp Ratio %  
0         68.718679            61.802473  
1         64.298214            60.067649  
2         65.464395            63.269337  
3 

<div style="background-color:#ebeff7; padding:1.25em; border-radius:1em; border: 1px solid black">

### <span style="color:navy; font-size: 1.25em"> **TASK 04:** LPIPS </span>
#### <span style="color:blue"> Below Enter Matriculation Numbers of each member </span>

</div>

In [26]:
MatriculationNumber = '64365, 66160';

In [27]:
path_color_dct = path.PATH_COLOR_DCT
path_gray_dct = path.PATH_GRAY_DCT

path_color_lapped = path.PATH_COLOR_LAP
path_gray_lapped = path.PATH_GRAY_LAP

In [28]:
# Initialize LPIPS model
lpips_model = lpips.LPIPS(net='alex')  # Options: alex, vgg, squeeze

Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]




Loading model from: /home/jupyter-64365/.local/lib/python3.10/site-packages/lpips/weights/v0.1/alex.pth


In [29]:
# Define image transformation
transform = transforms.Compose([
    transforms.Resize((256, 256)),  # Resize for uniformity
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

In [30]:
# Function to calculate LPIPS for a set of images
def calculate_lpips(original_path, compressed_path):
    total_lpips = 0.0
    image_count = 0
    data = []

    for filename in os.listdir(original_path):
        original_image_path = os.path.join(original_path, filename)
        compressed_image_path = os.path.join(compressed_path, filename)

        # Check if corresponding compressed image exists and skip directories
        if not os.path.isfile(original_image_path) or not os.path.isfile(compressed_image_path):
            continue

        # Load images
        original_image = Image.open(original_image_path).convert('RGB')
        compressed_image = Image.open(compressed_image_path).convert('RGB')

        # Transform images
        original_tensor = transform(original_image).unsqueeze(0)
        compressed_tensor = transform(compressed_image).unsqueeze(0)

        # Calculate LPIPS
        lpips_value = lpips_model(original_tensor, compressed_tensor)
        lpips_value_item = lpips_value.item()*100
        total_lpips += lpips_value_item
        image_count += 1

        #print(f"LPIPS for {filename}: {lpips_value.item()}")

        # Store result
        data.append({'File Name':filename, 'LPIPS Value (Percentage Of Diff%)':lpips_value_item})

    # Create DataFrame
    df = pd.DataFrame(data)
    return df

    # Calculate average LPIPS
    #avg_lpips = total_lpips / image_count if image_count > 0 else 0
    #return avg_lpips

In [31]:
# Calculate LPIPS for color images
lpips_color = calculate_lpips(path_color, path_color_dct)
#print(f"Average LPIPS for color images: {lpips_color}")
print("LPIPS Scores for Color Images DCT:")
print(lpips_color)
lpips_color.to_csv(path.PATH_COLOR_LPIPS, index=False)

# Calculate LPIPS for color images
lpips_color = calculate_lpips(path_color, path_color_lapped)
#print(f"Average LPIPS for color images: {lpips_color}")
print("LPIPS Scores for Color Images Lapped :")
print(lpips_color)
lpips_color.to_csv(path.PATH_GRAY_LPIPS, index=False)

LPIPS Scores for Color Images DCT:
  File Name  LPIPS Value (Percentage Of Diff%)
0     7.png                          52.786756
1     3.png                          53.720617
2     4.png                          59.807819
3     6.png                          46.721593
4     9.png                          44.856182
5     5.png                          43.165866
6     2.png                          75.108820
7    10.png                          50.046206
8     1.png                          43.696877
9     8.png                          36.130676
LPIPS Scores for Color Images Lapped :
  File Name  LPIPS Value (Percentage Of Diff%)
0     7.png                          34.117484
1     3.png                          36.033899
2     4.png                          42.898655
3     6.png                          24.276394
4     9.png                          19.903517
5     5.png                          28.031820
6     2.png                          58.564758
7    10.png                      

In [32]:
# Calculate LPIPS for gray images
lpips_gray = calculate_lpips(path_gray, path_gray_dct)
#print(f"Average LPIPS for gray images: {lpips_gray}")
print("LPIPS Scores for Gray Images DCT:")
print(lpips_gray)
lpips_gray.to_csv('/home/jupyter-64365/VC_Seminar_Project/gray_lpips_results_dct.csv', index=False)

# Calculate LPIPS for gray images
lpips_gray = calculate_lpips(path_gray, path_gray_lapped)
#print(f"Average LPIPS for gray images: {lpips_gray}")
print("LPIPS Scores for Gray Images Lapped:")
print(lpips_gray)
lpips_gray.to_csv('/home/jupyter-64365/VC_Seminar_Project/gray_lpips_results_Lapped.csv', index=False)

LPIPS Scores for Gray Images DCT:
  File Name  LPIPS Value (Percentage Of Diff%)
0     3.tif                          19.953492
1     5.tif                          28.491843
2     9.tif                           9.121278
3     8.tif                          10.662258
4     1.tif                          79.879838
5     2.tif                           6.394510
6     4.tif                          59.455323
7     7.tif                           0.000000
8     6.tif                          11.840390
9    10.tif                          22.912861
LPIPS Scores for Gray Images Lapped:
  File Name  LPIPS Value (Percentage Of Diff%)
0     3.tif                           0.007921
1     5.tif                           0.032098
2     9.tif                           1.900384
3     8.tif                           0.549822
4     1.tif                           3.133356
5     2.tif                           0.001509
6     4.tif                           0.010007
7     7.tif                         

### <span style="color:blue"> END Report here </span>