## Data Generation

This dataset is generated from portrait-orientation stock images taken from Pexels.com. A script applies a series of exposure gains to each image, from +1 EV to +5 EV of exposure compensation, using Adobe Camera Raw. The five exposure variants are then printed and photographed in a controlled studio environment with the camera metering at 0 EV. Each photographed variant is compared to the original using the Structural Similarity Index (SSIM) to measure the similarity between the backlit image and the original. The EV with the highest score is labeled as the correct exposure compensation for the image.
Since stock photography websites typically consist of high-quality images taken with a DSLR, half of the images in the dataset will be images taken with a phone.
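
The labeling step described above (score each variant against the original, keep the EV with the highest similarity) can be sketched as follows. This is only an illustration under stated assumptions: `global_ssim` and `label_exposure` are hypothetical helper names, the score is a simplified single-window SSIM computed over the whole image rather than the usual sliding-window SSIM (for example scikit-image's `structural_similarity`), and the "captures" are synthetic brightness-scaled copies standing in for the rephotographed prints.

```python
import numpy as np

def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM over the whole image (a simplification of windowed SSIM)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    # Standard SSIM stabilizing constants (K1 = 0.01, K2 = 0.03)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator

def label_exposure(original, captures):
    """Return the EV whose capture is most similar to the original, plus all scores."""
    scores = {ev: global_ssim(original, img) for ev, img in captures.items()}
    best_ev = max(scores, key=scores.get)
    return best_ev, scores

# Synthetic stand-in data (assumption): the +3 EV capture is constructed to
# match the original exactly; the others are darker or brighter copies.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
captures = {
    ev: np.clip(original.astype(np.float64) * 2.0 ** (ev - 3), 0, 255).astype(np.uint8)
    for ev in range(1, 6)
}

best_ev, scores = label_exposure(original, captures)
print(best_ev, {ev: round(s, 3) for ev, s in scores.items()})
```

With real data, `original` and each entry of `captures` would be grayscale arrays of the same shape, so the photographed prints would need to be aligned and resized to the original before scoring.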
### Dataset Photo Distribution:
Quality Photo Distribution
- 40% Professional Quality Photos
- 50% Phone Quality Photos
- 8% Old Photos Scans Taken with a Phone

Color Photo Distribution
- 90% Color
- 10% B&W

In [3]:
# Imports
import numpy as np
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
import seaborn as sns
import cv2

Import all the original photos and compute the blacks, shadows, midtones, highlights, and whites values of each image as percentiles of its grayscale pixel intensities.

In [4]:
# Function to compute percentiles of an image
def compute_percentiles(image):
    # Convert to grayscale if image is colored
    if len(image.shape) == 3:
        gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    else:
        gray_image = image

    # Compute the histogram
    hist, bins = np.histogram(gray_image.flatten(), 256, [0, 256])

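    # Compute the cumulative distribution function (CDF), rescaled to the
    # histogram's peak. Note: hist and cdf_normalized are not used by the
    # percentiles returned below.
    cdf = hist.cumsum()
    cdf_normalized = cdf * hist.max() / cdf.max()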

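    # Tonal landmarks: each value is the grayscale intensity at the given
    # percentile of the image's pixel distribution
    percentiles = {
        "blacks": np.percentile(gray_image, 5),  # 5th percentile
        "shadows": np.percentile(gray_image, 25),  # 25th percentile
        "midtones": np.percentile(gray_image, 50),  # 50th percentile (median)
        "highlights": np.percentile(gray_image, 75),  # 75th percentile
        "whites": np.percentile(gray_image, 95)  # 95th percentile
    }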
    
    return percentiles
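
To give a feel for what these five values capture, the sketch below applies the same percentile cut-offs to a synthetic grayscale ramp and to a brightened copy of it. It is an illustration only: `tonal_summary` is a hypothetical stand-in that mirrors the dictionary returned above (without the OpenCV grayscale conversion), and doubling the pixel values is a rough assumption for a +1 EV gain, not the Adobe Camera Raw adjustment.

```python
import numpy as np

def tonal_summary(gray_image):
    """Hypothetical stand-in mirroring the percentile dictionary above."""
    values = np.asarray(gray_image, dtype=np.float64)
    return {
        "blacks": np.percentile(values, 5),
        "shadows": np.percentile(values, 25),
        "midtones": np.percentile(values, 50),
        "highlights": np.percentile(values, 75),
        "whites": np.percentile(values, 95),
    }

# Synthetic ramp: every intensity from 0 to 255 appears exactly once
ramp = np.arange(256, dtype=np.uint8).reshape(1, 256)

# Rough +1 EV stand-in (assumption): double the values and clip to 8 bits
brightened = np.clip(ramp.astype(np.float64) * 2.0, 0, 255).astype(np.uint8)

base = tonal_summary(ramp)
bright = tonal_summary(brightened)

for name in base:
    print(f"{name:10s} base={base[name]:7.2f}  brightened={bright[name]:7.2f}")
```

Every landmark moves upward in the brightened copy, and the whites value saturates at the 8-bit ceiling, which is the kind of shift these features are meant to expose.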