### DETAILED EXPLANATION

This script generates blurred versions of each image in the dataset using three types of blur: Gaussian, motion, and box.

**Why Blur Images?**

Blurring is a common type of image distortion that can occur due to various factors:

- Camera motion during capture (Motion Blur)
- Out-of-focus regions due to depth of field (Gaussian Blur)
- Uniform blurring due to poor image processing or resizing (Box Blur)

These blurs simulate real-world distortions, making our analysis more realistic.

**Types of Blur Used**

1. Gaussian Blur:
   - This blur convolves the image with a Gaussian kernel, which results in a smooth, out-of-focus effect.
   - The kernel weights are largest at the center pixel and fall off with distance following a bell curve.
   - Sigma is sampled from the range 0.5 to 5.0 (as set in `PARAM_RANGES`), which gives visible but controlled blur.
   - Sigma is the standard deviation of the Gaussian distribution (higher = stronger blur).

2. Motion Blur:
   - Simulates the effect of movement, either from a moving object or a moving camera.
   - The blur has a clear direction (angle) and intensity (length).
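   - Length is sampled from 5 to 50 pixels (as set in `PARAM_RANGES`) to allow visible motion without making it too extreme.
   - Angle is sampled from 0 to 180 degrees. Because the line-shaped kernel is symmetric about its center, this range covers every orientation of movement.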

3. Box Blur:
   - Applies a simple averaging of pixel values in a square region around each pixel.
   - This blur is fast and commonly used for quick blur effects.
   - We chose odd kernel sizes from 3 to 15 because they provide a visible blur without being too extreme.
   - The kernel size is the side length, in pixels, of the square region used for averaging (higher = stronger blur).
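
To make the three kernels concrete, the following is a minimal NumPy-only sketch of how each one could be built. The function names and the choice to truncate the Gaussian at three standard deviations are assumptions made for illustration; the script itself relies on OpenCV's `cv2.GaussianBlur`, `cv2.filter2D`, and `cv2.blur`, which choose their own kernel sizes and border handling.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1-D Gaussian weights; a 2-D kernel is their outer product."""
    if radius is None:
        # Assumption for this sketch: truncate the bell curve at 3 sigma.
        radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    weights = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return weights / weights.sum()

def box_kernel(size):
    """Square kernel in which every pixel contributes equally."""
    return np.full((size, size), 1.0 / size ** 2)

def motion_kernel(length, angle_deg):
    """Line-shaped kernel through the center, oriented at angle_deg."""
    kernel = np.zeros((length, length))
    center = length // 2
    rad = np.radians(angle_deg)
    for i in range(length):
        x = int(round(center + (i - center) * np.cos(rad)))
        y = int(round(center + (i - center) * np.sin(rad)))
        if 0 <= x < length and 0 <= y < length:
            kernel[y, x] = 1.0
    return kernel / kernel.sum()

gauss = gaussian_kernel_1d(sigma=1.0)
gauss_2d = np.outer(gauss, gauss)       # separable 2-D Gaussian kernel
box = box_kernel(3)
motion = motion_kernel(length=5, angle_deg=0)
print(gauss_2d.shape, box.shape, motion.shape)
```

All three kernels are normalized to sum to 1, so the blurred image keeps roughly the same overall brightness as the original.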

**Why Save Parameters in a Parquet File?**

- Parquet is a columnar, compressed file format, so it is efficient in both read speed and storage size.
- It scales well to large datasets, keeping read and write operations fast.
- Parquet lets us store the blur parameters alongside the image identifiers, making each blurred image easy to trace back to the settings that produced it.
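
As an illustration of the idea, here is a minimal sketch of collecting one row of parameters per image and writing the table to Parquet. The column names, the example image identifiers, and the output path are hypothetical, and `PARAM_RANGES` is copied from the script so that the block runs on its own. Writing Parquet with pandas also requires an engine such as `pyarrow` or `fastparquet` to be installed.

```python
import random
import pandas as pd

# Copied from the script so this sketch is self-contained.
PARAM_RANGES = {
    'gaussian': {'sigma': (0.5, 5.0)},
    'motion': {'length': (5, 50), 'angle': (0, 180)},
    'box': {'kernel_size': (3, 15)},
}

def sample_blur_params(rng):
    """Draw one set of blur parameters (column names are hypothetical)."""
    k_min, k_max = PARAM_RANGES['box']['kernel_size']
    return {
        'gaussian_sigma': rng.uniform(*PARAM_RANGES['gaussian']['sigma']),
        'motion_length': rng.randint(*PARAM_RANGES['motion']['length']),
        'motion_angle': rng.uniform(*PARAM_RANGES['motion']['angle']),
        'box_kernel_size': rng.choice(range(k_min, k_max + 1, 2)),  # odd sizes only
    }

def build_param_table(image_ids, seed=42):
    """Return a DataFrame with one row of parameters per image identifier."""
    rng = random.Random(seed)  # seeded so the table is reproducible
    records = [{'key': image_id, **sample_blur_params(rng)} for image_id in image_ids]
    return pd.DataFrame.from_records(records)

def save_param_table(params_df, path):
    """Write the table to Parquet (needs pyarrow or fastparquet)."""
    params_df.to_parquet(path, index=False)

params_df = build_param_table(['img_0001', 'img_0002'])  # hypothetical identifiers
print(params_df)
# save_param_table(params_df, 'blur_params.parquet')  # hypothetical output path
```

Keeping the image identifier in a `key` column means the parameter table can later be joined back to the main dataset with `DataFrame.merge`.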

In [None]:
# Blur Generation for Image Deblurring Performance Analysis

import cv2
import numpy as np
import pandas as pd
import os
from pathlib import Path
import random
from tqdm import tqdm
import logging
import sys

# Add the project root to the path so the custom modules can be imported
sys.path.append("../..")
import utils.constants as const

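# Logging (create the log directory first, otherwise basicConfig cannot open the file)
Path('logs').mkdir(parents=True, exist_ok=True)
logging.basicConfig(
    filename='logs/3_blur_generation.log',
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)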

# Paths
ORIGINAL_IMAGES_DIR = const.ORIGINAL_DATASET_PATH / "00000"
BLURRED_DIR = const.BLURRED_DATASET_PATH
DATASET_FILE = const.MAIN_DATASET_PATH

BLURRED_DIR.mkdir(parents=True, exist_ok=True)

# Parameter ranges for each blur type
PARAM_RANGES = {
    'gaussian': {'sigma': (0.5, 5.0)},
    'motion': {'length': (5, 50), 'angle': (0, 180)},
    'box': {'kernel_size': (3, 15)}
}

# Blur functions
def apply_gaussian_blur(image, sigma):
    return cv2.GaussianBlur(image, (0, 0), sigmaX=sigma, sigmaY=sigma)

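def apply_motion_blur(image, length, angle):
    # Build a length x length kernel containing a line through the center
    kernel = np.zeros((length, length))
    center = length // 2
    rad = np.radians(angle)
    dx, dy = np.cos(rad), np.sin(rad)
    for i in range(length):
        # Round to the nearest pixel so the line stays centered
        x = int(round(center + (i - center) * dx))
        y = int(round(center + (i - center) * dy))
        if 0 <= x < length and 0 <= y < length:
            kernel[y, x] = 1
    # Normalize so the overall image brightness is preserved
    kernel /= kernel.sum()
    return cv2.filter2D(image, -1, kernel)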

def apply_box_blur(image, kernel_size):
    return cv2.blur(image, (kernel_size, kernel_size))

# Main function that generates the blurred images
def process_images(sample_size=None):
    df = pd.read_parquet(DATASET_FILE)
    df = df.dropna(subset=['key'])

    if sample_size:
        df = df.sample(n=sample_size, random_state=42)

    for _, row in tqdm(df.iterrows(), total=len(df), desc="Generating blurred images"):
        image_id = row['key']
        image_path = ORIGINAL_IMAGES_DIR / f"{image_id}.png"
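        if not image_path.exists():
            logging.warning(f"Image not found, skipping: {image_path}")
            continue

        image = cv2.imread(str(image_path))
        if image is None:
            logging.warning(f"Could not read image, skipping: {image_path}")
            continue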

        sigma = random.uniform(*PARAM_RANGES['gaussian']['sigma'])
        length = random.randint(*PARAM_RANGES['motion']['length'])
        angle = random.uniform(*PARAM_RANGES['motion']['angle'])
        k_min, k_max = PARAM_RANGES['box']['kernel_size']
        kernel_size = random.choice(range(k_min, k_max + 1, 2))  # odd sizes only

        gaussian_blurred = apply_gaussian_blur(image, sigma)
        motion_blurred = apply_motion_blur(image, length, angle)
        box_blurred = apply_box_blur(image, kernel_size)

        # Save the blurred images with a consistent naming scheme
        cv2.imwrite(str(BLURRED_DIR / f"{image_id}_gaussian.png"), gaussian_blurred)
        cv2.imwrite(str(BLURRED_DIR / f"{image_id}_motion.png"), motion_blurred)
        cv2.imwrite(str(BLURRED_DIR / f"{image_id}_box.png"), box_blurred)

print("Starting blur generation...\n")
process_images(sample_size=10)
print("\nBlur generation completed.")
