# Multimedia Processing Course - Part 5: Compression

Raw multimedia data is large: a single uncompressed 1920x1080 RGB frame takes about 6.2 MB. Compression makes it practical to store and transmit.

**Content:**
1.  **Level 1 (Theory)**: Types of Redundancy.
2.  **Level 2 (Lossless)**: Run Length Encoding (RLE).
3.  **Level 3 (Lossy)**: Discrete Cosine Transform (DCT) - The heart of JPEG.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2

# Helper function to display images
def show_image(img, title="Image", cmap_type='gray'):
    plt.imshow(img, cmap=cmap_type)
    plt.title(title)
    plt.axis('off')
    plt.show()

## Level 1: Redundancy
Data often repeats. 
-   **Spatial Redundancy**: Neighboring pixels are often similar.
-   **Temporal Redundancy**: Consecutive video frames are similar.
-   **Coding Redundancy**: A fixed-length code spends as many bits on frequent values as on rare ones; giving frequent values shorter codes saves space.
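
Coding redundancy can be made concrete by comparing a fixed-length code with the Shannon entropy of the data, which is a lower bound on the average bits per symbol for any code that treats symbols independently. The sketch below is illustrative only; the `message` string and the helper name `entropy_bits_per_symbol` are assumptions made for this example.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data):
    """Shannon entropy of the symbol distribution, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical message: 'A' is much more frequent than the other symbols
message = "AAAAAAAABBCD"

# A fixed-length code needs ceil(log2(number of distinct symbols)) bits per symbol
fixed_bits = math.ceil(math.log2(len(set(message))))
h = entropy_bits_per_symbol(message)

print(f"Fixed-length code: {fixed_bits} bits/symbol")
print(f"Entropy bound:     {h:.2f} bits/symbol")
```

The gap between the two numbers is the coding redundancy that a variable-length code (for example Huffman coding) tries to remove.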

## Level 2: Run Length Encoding (RLE)
If we have a sequence like `AAAAAAAAABBBCCCC`, we can store it as `9A3B4C`.
This is **Lossless** compression.
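
To see what lossless means in practice, here is a minimal sketch of the `9A3B4C` string notation with both an encoder and a decoder. The function names `rle_encode_str` and `rle_decode_str` are made up for this example, and the decoder assumes the symbols themselves are never digits (otherwise the notation would be ambiguous).

```python
import re
from itertools import groupby

def rle_encode_str(text):
    """Encode 'AAAB' as '3A1B' (run length followed by the symbol)."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(text))

def rle_decode_str(encoded):
    """Invert rle_encode_str; assumes the symbols are not digits."""
    pairs = re.findall(r"(\d+)(\D)", encoded)  # [('9', 'A'), ('3', 'B'), ...]
    return "".join(symbol * int(count) for count, symbol in pairs)

original = "AAAAAAAAABBBCCCC"
encoded = rle_encode_str(original)
decoded = rle_decode_str(encoded)

print(encoded)              # 9A3B4C
print(decoded == original)  # True: nothing was lost
```

Note that a run of length one still costs two characters (`1D`), so RLE can make data without long runs larger.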

In [None]:
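def rle_encode(data):
    """Return a list of (value, run_length) pairs for a string or 1-D sequence."""
    if len(data) == 0:
        return []  # Nothing to encode; also avoids an IndexError on data[0]
    encoding = []
    prev_char = data[0]
    count = 1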
    
    for char in data[1:]:
        if char == prev_char:
            count += 1
        else:
            encoding.append((prev_char, count))
            prev_char = char
            count = 1
    encoding.append((prev_char, count))
    return encoding

# Test RLE
raw_data = "AAAAABBBCCDAA"
encoded_data = rle_encode(raw_data)

print(f"Original: {raw_data}")
print(f"Encoded: {encoded_data}")

### RLE on Images
RLE works well on simple binary images with long runs of identical pixels, such as the scanned pages sent by fax machines.

In [None]:
# Create a simple binary image
binary_img = np.zeros((10, 10), dtype=int)
binary_img[2:8, 2:8] = 1 # A white square

print("Binary Image (Flattened partially):", binary_img.flatten()[:20])

# Encode the flattened image (tolist() yields plain Python ints, which print cleanly)
rle_img = rle_encode(binary_img.flatten().tolist())
print("RLE Encoded Image (First 5 chunks):", rle_img[:5])
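
To check that RLE on the image is lossless and to see how much shorter the encoding is, the following sketch encodes the same 10x10 square with NumPy and rebuilds it with `np.repeat`. The helper names `rle_encode_array` and `rle_decode_array` are assumptions made for this example, not part of the notebook above.

```python
import numpy as np

def rle_encode_array(flat):
    """Return (values, run_lengths) for a 1-D array."""
    flat = np.asarray(flat)
    if flat.size == 0:
        return flat, np.array([], dtype=int)
    # A new run starts at index 0 and wherever the value changes
    starts = np.concatenate(([0], np.flatnonzero(flat[1:] != flat[:-1]) + 1))
    lengths = np.diff(np.concatenate((starts, [flat.size])))
    return flat[starts], lengths

def rle_decode_array(values, lengths, shape):
    """Rebuild the image by repeating each value for its run length."""
    return np.repeat(values, lengths).reshape(shape)

img = np.zeros((10, 10), dtype=int)
img[2:8, 2:8] = 1  # the same white square as in the cell above

values, lengths = rle_encode_array(img.flatten())
restored = rle_decode_array(values, lengths, img.shape)

print("Pixels:", img.size, "Runs:", len(values))
print("Lossless round trip:", np.array_equal(img, restored))
```

The image shape has to be stored alongside the runs, because flattening discards it.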

## Level 3: Discrete Cosine Transform (DCT)
JPEG compression uses DCT to separate an image into parts of different importance.
1.  Divide image into 8x8 blocks.
2.  Apply DCT to each block.
3.  Keep low-frequency components (important visual info), discard high-frequency (details the eye can't see well).
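
The three steps above can be sketched with NumPy alone. This is a simplified illustration, not the real JPEG pipeline: it assumes the image height and width are multiples of 8, uses an orthonormal DCT-II matrix built by hand, and "discards" high frequencies by simply zeroing everything outside the top-left `keep` x `keep` corner. The names `dct_matrix` and `blockwise_dct_compress` and the test image are made up for this example.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that the 2-D DCT of X is C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # the DC row uses a different scale factor
    return c

def blockwise_dct_compress(image, keep=2, n=8):
    """Split into n x n blocks, DCT each block, keep only the low frequencies."""
    c = dct_matrix(n)
    h, w = image.shape                # assumed to be multiples of n
    mask = np.zeros((n, n))
    mask[:keep, :keep] = 1            # step 3: keep the top-left corner only
    out = np.empty((h, w), dtype=np.float64)
    for r in range(0, h, n):          # step 1: walk over the blocks
        for col in range(0, w, n):
            block = image[r:r + n, col:col + n].astype(np.float64)
            coeffs = c @ block @ c.T  # step 2: forward DCT
            out[r:r + n, col:col + n] = c.T @ (coeffs * mask) @ c  # inverse DCT
    return out

# Hypothetical 16x16 test image: a smooth horizontal ramp
image = np.tile(np.linspace(0, 255, 16), (16, 1))
approx = blockwise_dct_compress(image, keep=2)
exact = blockwise_dct_compress(image, keep=8)

print("Max error, 2x2 kept:", np.abs(image - approx).max())
print("Max error, 8x8 kept:", np.abs(image - exact).max())
```

Keeping all 8x8 coefficients reproduces the image up to floating-point rounding, which shows that the DCT itself loses nothing; the loss comes from what is thrown away afterwards.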

In [None]:
# Make a simple 8x8 block with a gradient
block = np.zeros((8, 8))
for i in range(8):
    block[i, :] = i * 32

show_image(block, "Original 8x8 Block")

# Apply DCT
# DCT requires float format
block_float = np.float32(block)
dct_block = cv2.dct(block_float)

print("DCT Coefficients (Top-Left corner has low frequencies):")
print(np.round(dct_block[:4, :4], 1)) # Printing just top-left 4x4

### Explanation
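The top-left value (the DC coefficient) is the largest. It is proportional to the average intensity of the block: with the orthonormal scaling that `cv2.dct` uses, it equals 8 times the mean of an 8x8 block.
Values towards the bottom-right correspond to high frequencies (noise and sharp details).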

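**Compression happens here:** High-frequency coefficients are usually small, so setting them to zero loses very little visual information. JPEG does this through **Quantization**: each coefficient is divided by a step size and rounded to an integer, which turns most small values into zeros.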
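
A minimal sketch of quantization follows. It assumes a single uniform step size for every coefficient, whereas JPEG uses a table with a separate step per frequency; the noisy gradient block, the step value, and the helper `dct_matrix` are likewise assumptions made for this example.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that the 2-D DCT of X is C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

c = dct_matrix()
rng = np.random.default_rng(0)
# Hypothetical data: a vertical gradient plus mild noise
block = np.arange(8).reshape(-1, 1) * 32.0 + rng.normal(0, 2, size=(8, 8))

step = 16.0                              # assumed uniform step size
coeffs = c @ block @ c.T                 # forward DCT
quantized = np.round(coeffs / step)      # rounding is the lossy step
restored = c.T @ (quantized * step) @ c  # dequantize, then inverse DCT

rms = np.sqrt(np.mean((block - restored) ** 2))
print("Non-zero coefficients after quantization:", np.count_nonzero(quantized), "of 64")
print("RMS reconstruction error:", rms)
```

A larger step zeroes more coefficients and increases the error. Because the transform is orthonormal, the RMS error cannot exceed half the step size.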

In [None]:
# Simple compression: keep only the top-left 2x2 coefficients
compressed_dct = np.zeros_like(dct_block)
compressed_dct[:2, :2] = dct_block[:2, :2]

# Inverse DCT to get image back
reconstructed_block = cv2.idct(compressed_dct)

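# Use the same gray scale (0-255) for both plots so they are directly comparable
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(block, cmap='gray', vmin=0, vmax=255)
plt.title("Original")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(reconstructed_block, cmap='gray', vmin=0, vmax=255)
plt.title("Reconstructed (Lossy)")
plt.axis('off')
plt.show()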

### Result
Even after discarding 60 of the 64 coefficients, the reconstructed block still resembles the original. Smooth content concentrates its energy in a few low-frequency coefficients, which is why JPEG is so efficient on it.
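
"Resembles" can be quantified. The sketch below, written with NumPy only, measures how much of the coefficient energy survives and how large the pixel error is when only the top-left `keep` x `keep` coefficients of the gradient block are retained. The hand-built DCT matrix `c` and the `results` list are assumptions made for this example.

```python
import numpy as np

# Orthonormal 8x8 DCT-II matrix, built inline
n = 8
k = np.arange(n).reshape(-1, 1)
x = np.arange(n).reshape(1, -1)
c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
c[0, :] = np.sqrt(1.0 / n)

# The gradient block: row i is filled with the value i * 32
block = np.tile(np.arange(n).reshape(-1, 1) * 32.0, (1, n))
coeffs = c @ block @ c.T

results = []
for keep in (1, 2, 4, 8):
    kept = np.zeros_like(coeffs)
    kept[:keep, :keep] = coeffs[:keep, :keep]          # retain low frequencies only
    restored = c.T @ kept @ c                          # inverse DCT
    energy = np.sum(kept ** 2) / np.sum(coeffs ** 2)   # fraction of energy retained
    rmse = np.sqrt(np.mean((block - restored) ** 2))   # pixel-domain error
    results.append((keep, energy, rmse))
    print(f"keep {keep}x{keep}: energy kept = {energy:.4f}, RMSE = {rmse:.2f}")
```

The error shrinks as more coefficients are kept and vanishes, up to rounding, when all 64 are used.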