# Detecting Midjourney Images via Feature Engineering & Classification

**Objectives**

- **Engineer Discriminative Features**

    Develop and extract features that effectively separate AI-generated images (Stable Diffusion/Midjourney) from authentic camera-captured photos using frequency, color, and texture analysis.

- **Build Reproducible Pipeline**

    Construct a robust feature extraction pipeline that generates a tabular dataset suitable for ML. Ensure consistency and reproducibility across experiments.

- **Train & Evaluate Classifiers**

    Implement classifier architectures such as Random Forest, XGBoost, SVM, or neural networks. Rigorously evaluate performance using standard classification metrics and confusion matrices.

- **Quantify Feature Relevance**

    Apply multiple interpretability techniques, such as Gini importance, permutation importance, and SHAP values, to understand which features drive classification decisions.
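
The training, evaluation, and feature-relevance objectives above can be sketched on a tabular feature matrix as follows. This is a minimal illustration under stated assumptions, not the notebook's actual pipeline: the data is synthetic (`X_demo`, `y_demo` are hypothetical stand-ins), the classifier is scikit-learn's `RandomForestClassifier`, and SHAP is left out because it needs an extra package.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a feature table: 4 features, only feature 0 carries signal.
rng = np.random.default_rng(0)
n_samples = 400
y_demo = rng.integers(0, 2, size=n_samples)
X_demo = rng.normal(size=(n_samples, 4))
X_demo[:, 0] += 3.0 * y_demo

# Stratified split keeps the class ratio the same in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Evaluation: confusion matrix (rows = true class, columns = predicted class)
# and a text report with per-class precision, recall, and F1.
y_pred = clf.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)

# Feature relevance: impurity-based (Gini) importance from the fitted forest,
# and permutation importance measured on the held-out split.
gini_importance = clf.feature_importances_
perm = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
```

Fixing `random_state` in the split, the model, and the permutation step is one simple way to keep such an experiment repeatable.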


**Dataset Structure**
```text
imagenet_midjourney/
|---- test/
      |---- ai/        [AI-generated images] (Stable Diffusion/Midjourney), label: 1 (fake)
      |---- nature/    [Natural camera images] (non-AI photographs), label: 0 (real)
```
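
Before loading, it can help to confirm that the folders on disk match the layout above. The following is a minimal sketch, assuming the `ai` and `nature` folder names from the tree; `count_images` and `IMAGE_EXTENSIONS` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the files actually present.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".webp"}

def count_images(root, categories=("ai", "nature")):
    """Return {category: number of image files} for each class folder under root."""
    counts = {}
    for category in categories:
        folder = Path(root) / category
        if not folder.is_dir():
            counts[category] = 0  # a missing folder is reported as zero images
            continue
        counts[category] = sum(
            1
            for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images("imagenet_midjourney/test")` would then return one count per class folder, which makes an unbalanced or empty class visible before any image is decoded.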

## Imports

In [13]:
import os
import cv2
import numpy as np

from skimage import io, color, img_as_ubyte
from tqdm import tqdm

from scipy.stats import linregress, gmean

DATASET_DIR = "imagenet_midjourney/test"

CATEGORIES = {
    "ai": 1,
    "nature": 0
}

IMG_SIZE = (256, 256)
COLOR_MODE = 'rgb'

## Image Loading & Preprocessing

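Load images from the dataset directories and normalize them for feature extraction: convert grayscale and RGBA inputs to the requested color mode, resize to a fixed size, and scale pixel values to floats in [0, 1]. Files that fail to load are reported and skipped.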

In [11]:
def load_and_preprocess_images(base_dir, categories, img_size=(256, 256), color_mode='rgb'):
    """Load images per class folder; return float32 arrays in [0, 1], labels, and paths."""
    valid_ext = ('.png', '.jpg', '.jpeg', '.bmp', '.tiff', '.webp')
    X, y, paths = [], [], []

    for category, label in categories.items():
        folder = os.path.join(base_dir, category)
        if not os.path.isdir(folder):
            print(f"Folder not found: {folder}")
            continue

        # Sort for a deterministic sample order across runs and platforms.
        for filename in tqdm(sorted(os.listdir(folder)), desc=f"Loading {category}"):
            if not filename.lower().endswith(valid_ext):
                continue
            file_path = os.path.join(folder, filename)

            try:
                image = io.imread(file_path)

                # Normalize the channel layout: drop alpha, then match the requested mode.
                if image.ndim == 3 and image.shape[2] == 4:
                    image = color.rgba2rgb(image)
                if color_mode == 'rgb' and image.ndim == 2:
                    image = color.gray2rgb(image)
                elif color_mode == 'gray' and image.ndim == 3:
                    image = color.rgb2gray(image)

                # cv2.resize expects (width, height); then scale to float32 in [0, 1].
                image = cv2.resize(img_as_ubyte(image), img_size)
                image = image.astype(np.float32) / 255.0

                X.append(image)
                y.append(label)
                paths.append(file_path)

            except Exception as e:
                print(f"Error processing {file_path}: {e}")
                continue

    X = np.array(X)
    y = np.array(y)
    return X, y, paths

In [12]:
X, y, image_paths = load_and_preprocess_images(DATASET_DIR, CATEGORIES, IMG_SIZE, COLOR_MODE)

print(f"Total images loaded: {len(X)}")
print(f"Example shape: {X[0].shape}")

Loading ai: 100%|██████████| 500/500 [00:17<00:00, 28.03it/s]
Loading nature: 100%|██████████| 500/500 [00:02<00:00, 196.48it/s]

Total images loaded: 1000
Example shape: (256, 256, 3)
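
Beyond the totals printed above, a quick check of class balance and value range can catch loading problems early. This is a minimal sketch on synthetic stand-in arrays; `summarize_dataset`, `X_demo`, and `y_demo` are hypothetical names, not part of the original notebook.

```python
import numpy as np

def summarize_dataset(X, y):
    """Return basic sanity statistics for an image array X (N, H, W, C) and labels y."""
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")
    labels, counts = np.unique(y, return_counts=True)
    return {
        "n_samples": int(len(X)),
        "image_shape": tuple(X.shape[1:]),
        "value_range": (float(X.min()), float(X.max())),
        "class_counts": {int(label): int(count) for label, count in zip(labels, counts)},
    }

# Synthetic stand-in for the loaded arrays (not the real dataset).
rng = np.random.default_rng(0)
X_demo = rng.random((6, 8, 8, 3), dtype=np.float32)
y_demo = np.array([1, 1, 1, 0, 0, 0])
summary = summarize_dataset(X_demo, y_demo)
```

With the notebook's own variables, `summarize_dataset(X, y)` should report a value range inside [0, 1] and one count per label.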


## Feature Computation

Extract per-image features across frequency, color, and texture domains. Generate comprehensive feature vectors for each sample.

### Feature Family I: Frequency & Spectrum Analysis (FFT)

In [14]:
def compute_fft_features(image):
    """Compute radially averaged power-spectrum features for an RGB image in [0, 1]."""
    # Round before casting: a plain astype(np.uint8) truncates values such as 254.99998.
    gray = cv2.cvtColor(np.round(image * 255).astype(np.uint8), cv2.COLOR_RGB2GRAY)
    gray = gray.astype(np.float32) / 255.0

    # 2D power spectrum with the zero-frequency (DC) term shifted to the center.
    fft2 = np.fft.fft2(gray)
    fshift = np.fft.fftshift(fft2)
    power_spectrum = np.abs(fshift) ** 2

    # Integer distance of every frequency bin from the spectrum center.
    rows, cols = gray.shape
    crow, ccol = rows // 2, cols // 2
    y, x = np.ogrid[:rows, :cols]
    radius = np.sqrt((x - ccol) ** 2 + (y - crow) ** 2).astype(np.int32)

    # Mean power per integer radius; drop bin 0, which holds only the DC term.
    radial_profile = np.bincount(radius.ravel(), power_spectrum.ravel()) / np.bincount(radius.ravel())
    radial_profile = radial_profile[1:]

    radial_power_spectrum_mean = np.mean(radial_profile)

    # Spectral slope: linear fit of log(power) against log(radial frequency).
    freqs = np.arange(1, len(radial_profile) + 1)
    log_freqs = np.log(freqs)
    log_power = np.log(radial_profile + 1e-8)
    slope, intercept, _, _, _ = linregress(log_freqs, log_power)
    spectral_slope = slope

    # Spectral flatness: geometric mean over arithmetic mean of the radial profile.
    spectral_flatness = gmean(radial_profile + 1e-8) / (np.mean(radial_profile) + 1e-8)

    # Share of the radial profile's energy in the top third of radial bins.
    cutoff = len(radial_profile) // 3
    high_freq_energy = np.sum(radial_profile[-cutoff:])
    total_energy = np.sum(radial_profile)
    high_freq_ratio = high_freq_energy / (total_energy + 1e-8)

    return {
        'radial_power_spectrum_mean': radial_power_spectrum_mean,
        'spectral_slope': spectral_slope,
        'spectral_flatness': spectral_flatness,
        'high_freq_ratio': high_freq_ratio
    }
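
To build intuition for the spectral slope, the radial profile can be compared on two synthetic inputs: white noise, whose spectrum is roughly flat, and a low-pass filtered copy, whose high frequencies are suppressed. This is a NumPy-only sketch under stated assumptions: `radial_power_profile` and `spectral_slope` are hypothetical helpers that mirror the steps above, and `np.polyfit` stands in for `linregress`.

```python
import numpy as np

def radial_power_profile(gray):
    """Radially averaged power spectrum of a 2D array, with the DC bin dropped."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    rows, cols = gray.shape
    yy, xx = np.ogrid[:rows, :cols]
    radius = np.sqrt((xx - cols // 2) ** 2 + (yy - rows // 2) ** 2).astype(np.int64)
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    return (sums / counts)[1:]

def spectral_slope(profile):
    """Slope of log(power) against log(radial frequency) from a degree-1 least-squares fit."""
    freqs = np.arange(1, len(profile) + 1)
    slope, _intercept = np.polyfit(np.log(freqs), np.log(profile + 1e-8), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.random((128, 128))

# 2x2 circular box filter: a crude low-pass that suppresses high frequencies.
smooth = (
    white
    + np.roll(white, 1, axis=0)
    + np.roll(white, 1, axis=1)
    + np.roll(white, (1, 1), axis=(0, 1))
) / 4.0

profile_white = radial_power_profile(white)
slope_white = spectral_slope(profile_white)
slope_smooth = spectral_slope(radial_power_profile(smooth))
```

The filtered image loses high-frequency power, so its fitted slope comes out more negative than that of the white noise. The same reasoning is why the slope is a candidate feature here: any systematic difference in how generated and camera images distribute power across frequencies would show up in it.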


## Tabular Dataset Creation

In [15]:
fft_feature_list = []

for img in tqdm(X, desc="Extracting features"):
    features = compute_fft_features(img)
    fft_feature_list.append(list(features.values()))

fft_features = np.array(fft_feature_list)

print(f"FFT features shape: {fft_features.shape}")

Extracting features: 100%|██████████| 1000/1000 [00:04<00:00, 222.67it/s]

FFT features shape: (1000, 4)
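
The feature array on its own has no column names, labels, or file paths. One way to turn it into a self-describing table is a pandas `DataFrame`. This is a minimal sketch on synthetic values; `build_feature_table` and `FEATURE_NAMES` are hypothetical names, and the column order is assumed to match the key order of the dictionary returned by `compute_fft_features`.

```python
import numpy as np
import pandas as pd

# Assumed to match the key order of the dict returned by compute_fft_features.
FEATURE_NAMES = [
    "radial_power_spectrum_mean",
    "spectral_slope",
    "spectral_flatness",
    "high_freq_ratio",
]

def build_feature_table(features, labels, paths, feature_names=FEATURE_NAMES):
    """Combine an (N, F) feature array, N labels, and N file paths into one DataFrame."""
    features = np.asarray(features)
    if features.shape != (len(labels), len(feature_names)):
        raise ValueError("features must have shape (n_samples, n_features)")
    table = pd.DataFrame(features, columns=list(feature_names))
    table.insert(0, "path", list(paths))  # keep the source file for traceability
    table["label"] = np.asarray(labels)   # 1 = AI-generated, 0 = real
    return table

# Synthetic stand-in values (not real features).
rng = np.random.default_rng(0)
demo_table = build_feature_table(
    rng.random((4, 4)), [1, 1, 0, 0], ["a.png", "b.png", "c.png", "d.png"]
)
```

With the notebook's variables this would be called as `build_feature_table(fft_features, y, image_paths)`, and the result could be saved with `DataFrame.to_csv` so later experiments reuse the same table.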



