# Notebook 3 — Error Level Analysis (ELA) & Document Forensics

**Error Level Analysis (ELA)** is a passive image forensics technique that highlights regions whose JPEG compression history differs from the rest of the image. It re-saves the image at a known JPEG quality and measures how much each pixel changes. Areas that have been digitally inserted or modified typically show a different error level from the surrounding unmodified regions.

## Outline
1. Theory: how ELA works
2. Generating ELA maps at different quality levels
3. Wavelet-based texture analysis
4. Edge detection for structural analysis
5. OCR text extraction

## References
- **Farid (2009)**: 'Image forgery detection' — IEEE Signal Processing Magazine
- **PMC 11323046 (2024)**: ELA + ResNet50 + CBAM, 96.21% accuracy on CASIA v2
- **IJARCCE 2025**: ELA + CNN achieving 96.21% on CASIA v2

In [None]:
import sys
sys.path.insert(0, '..')

from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

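# ELA is most informative on images with a JPEG compression history.
# The bundled sample (ship4.png) has a .png extension; swap in a JPEG document image where possible.
SAMPLE_IMG = Path('../app/static/assets/ship4.png')
if not SAMPLE_IMG.exists():
    # Fall back to the first copy-move test image, if one exists
    SAMPLE_IMG = next(Path('../Copy Move Forgery/CopyMoveDetection/test_images').glob('*.png'), None)
if SAMPLE_IMG is None:
    # Fail early with a clear message instead of crashing later in Image.open
    raise FileNotFoundError('No sample image found; set SAMPLE_IMG to an image path')
print('Using image:', SAMPLE_IMG)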

## 1. ELA Theory

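When a JPEG image is saved, the lossy compression introduces quantization errors. If the image is *authentic*, every region has been through the same compression history, so re-saving at a known quality level produces a **roughly uniform error pattern** across regions of similar texture (edges and fine detail naturally show higher error than flat areas).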

If part of the image was **spliced in** from another source (compressed independently), that region will show a *different* error level after re-compression — it will appear brighter (higher error) or darker (lower error) in the ELA map.

```
ELA(pixel) = |original_pixel − recompressed_pixel| × scale
```
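
The formula above can be sketched in a few lines with Pillow and NumPy. This is a minimal illustration, not the project's `generate_ela`, whose internals are not shown in this notebook; the name `ela_sketch` and the synthetic test data are assumptions made for the example. The second half pastes never-compressed pixels into an image that has already been through one JPEG save, which mimics a splice.

```python
import io

import numpy as np
from PIL import Image, ImageChops


def ela_sketch(img, quality=90, scale=15):
    """Hypothetical helper: |original - recompressed| * scale as a uint8 RGB array."""
    rgb = img.convert('RGB')
    buf = io.BytesIO()
    rgb.save(buf, format='JPEG', quality=quality)  # re-compress in memory
    buf.seek(0)
    recompressed = Image.open(buf).convert('RGB')
    diff = np.asarray(ImageChops.difference(rgb, recompressed), dtype=np.float32)
    return np.clip(diff * scale, 0, 255).astype(np.uint8)


# Synthetic demo (assumed data): a textured background that has had one JPEG save
rng = np.random.default_rng(0)
texture = rng.integers(96, 160, size=(96, 96, 3), dtype=np.uint8)
buf = io.BytesIO()
Image.fromarray(texture).save(buf, format='JPEG', quality=70)
buf.seek(0)
background = np.array(Image.open(buf).convert('RGB'))

# "Splice" a block of never-compressed pixels into the centre
spliced = background.copy()
spliced[32:64, 32:64] = rng.integers(96, 160, size=(32, 32, 3), dtype=np.uint8)

# scale=1 keeps the raw error levels so the two regions can be compared directly
ela_map = ela_sketch(Image.fromarray(spliced), quality=70, scale=1)
mask = np.zeros(ela_map.shape[:2], dtype=bool)
mask[32:64, 32:64] = True
patch_mean = ela_map[mask].mean()
background_mean = ela_map[~mask].mean()
print(f'patch: {patch_mean:.2f}  background: {background_mean:.2f}')
```

In this setup the pasted block should show a clearly higher mean error than the already-compressed background. On real documents the contrast is usually subtler and depends on the quality used for re-compression.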

## 2. ELA at Different Quality Levels

In [None]:
from src.analysis.ela import generate_ela, ela_score

orig = Image.open(SAMPLE_IMG).convert('RGB')
qualities = [70, 85, 95]

fig, axes = plt.subplots(1, len(qualities) + 1, figsize=(18, 5))
axes[0].imshow(orig)
axes[0].set_title('Original')
axes[0].axis('off')

for ax, q in zip(axes[1:], qualities):
    ela = generate_ela(orig, quality=q, scale=15)
    score = ela_score(ela)
    ax.imshow(ela)
    ax.set_title(f'ELA q={q}\nscore={score:.4f}')
    ax.axis('off')

plt.suptitle('Error Level Analysis at different JPEG qualities', fontsize=13)
plt.tight_layout()
plt.show()

## 3. Wavelet Decomposition

In [None]:
from src.analysis.wavelet import decompose

wavelets = ['haar', 'db4', 'sym4']
fig, axes = plt.subplots(1, len(wavelets) + 1, figsize=(18, 5))
axes[0].imshow(orig)
axes[0].set_title('Original')
axes[0].axis('off')

for ax, w in zip(axes[1:], wavelets):
    result = decompose(orig, wavelet=w, level=3)
    ax.imshow(result['heatmap'])
    ax.set_title(f'{w} (level 3)')
    ax.axis('off')

plt.suptitle('Wavelet Decomposition — High-Frequency Detail Heatmaps', fontsize=13)
plt.tight_layout()
plt.show()
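
How `decompose` builds its heatmap is not shown in this notebook. As a rough illustration of what a high-frequency detail heatmap can look like, the sketch below computes a single-level 2-D Haar transform in plain NumPy and combines the three detail bands into one magnitude map. The function name `haar_detail_heatmap` is hypothetical, and the project may well use a wavelet library, more levels, and a different way of combining bands.

```python
import numpy as np


def haar_detail_heatmap(gray):
    """Hypothetical helper: single-level orthonormal 2-D Haar transform.

    Returns (approximation, heatmap), each at half the input resolution.
    """
    x = np.asarray(gray, dtype=np.float64)
    # Crop to even dimensions so pixels pair up into 2x2 blocks
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    approx = (a + b + c + d) / 2         # low-pass band
    row_detail = (a + b - c - d) / 2     # top row vs bottom row
    col_detail = (a - b + c - d) / 2     # left column vs right column
    diag_detail = (a - b - c + d) / 2    # diagonal difference
    heatmap = np.sqrt(row_detail**2 + col_detail**2 + diag_detail**2)
    return approx, heatmap


# A flat image has no detail; a vertical step edge lights up one block column
flat = np.full((8, 8), 100.0)
flat_approx, flat_heat = haar_detail_heatmap(flat)

step = np.zeros((8, 8))
step[:, 3:] = 255.0
step_approx, step_heat = haar_detail_heatmap(step)
print(step_heat[0])
```

To try it on the notebook's image, something like `haar_detail_heatmap(np.asarray(orig.convert('L')))` should work, since `orig` is a PIL image.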

## 4. Edge Detection Comparison

In [None]:
from src.analysis.edge_detection import detect_all

edges = detect_all(orig)
detectors = list(edges.keys())

fig, axes = plt.subplots(1, len(detectors) + 1, figsize=(20, 5))
axes[0].imshow(orig)
axes[0].set_title('Original')
axes[0].axis('off')

for ax, name in zip(axes[1:], detectors):
    ax.imshow(edges[name], cmap='gray')
    ax.set_title(name.capitalize())
    ax.axis('off')

plt.suptitle('Edge Detection Comparison', fontsize=13)
plt.tight_layout()
plt.show()
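
Which detectors `detect_all` returns depends on the project's implementation, which is not shown here. For reference, a Sobel gradient magnitude, one of the most common edge detectors, can be sketched in plain NumPy as follows. The helper name `sobel_magnitude` is hypothetical, and whether `detect_all` includes a Sobel variant is an assumption.

```python
import numpy as np


def sobel_magnitude(gray):
    """Hypothetical helper: Sobel gradient magnitude with edge-replicated borders."""
    g = np.pad(np.asarray(gray, dtype=np.float64), 1, mode='edge')
    # The eight neighbours of every pixel, as shifted views of the padded array
    tl, tc, tr = g[:-2, :-2], g[:-2, 1:-1], g[:-2, 2:]
    ml, mr = g[1:-1, :-2], g[1:-1, 2:]
    bl, bc, br = g[2:, :-2], g[2:, 1:-1], g[2:, 2:]
    gx = (tr + 2 * mr + br) - (tl + 2 * ml + bl)  # change from left to right
    gy = (bl + 2 * bc + br) - (tl + 2 * tc + tr)  # change from top to bottom
    return np.hypot(gx, gy)


# A vertical step edge: the response is concentrated on the two columns at the step
step = np.zeros((8, 8))
step[:, 4:] = 1.0
mag = sobel_magnitude(step)
print(mag[0])
```

The output has the same shape as the input, so it can be shown with `ax.imshow(mag, cmap='gray')` alongside the other edge maps.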

## 5. OCR Text Extraction

In [None]:
from src.analysis.ocr import extract_text

result = extract_text(SAMPLE_IMG, handwritten=False)
print('Engine:', result['engine'])
print('Avg confidence:', f"{result['avg_confidence']:.1%}")
print('\nExtracted text:')
print(result['full_text'])