This Jupyter Notebook contains Python code for **image forgery detection and localization**, primarily using a **U-Net** neural network architecture and **Error Level Analysis (ELA)** as a preprocessing step.

The overall goal is to predict a **pixel-wise mask** indicating forged regions, and output this mask in a **Run-Length Encoded (RLE)** format.

---

## 💻 Code Sections Explained

The notebook is structured into setup, ELA computation, model definition, RLE implementation, inference, and visualization.

### 1. Setup and Hyperparameters (Code Cell 1)

This section sets up the environment and defines key configuration values.

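* **Libraries:** Standard data science and deep learning libraries are imported: `os`, `gc`, `cv2`, `numpy`, `pandas`, and `torch` (including `torch.nn` and `torch.nn.functional`), plus `tqdm` for progress bars. Python warnings are suppressed with `warnings.filterwarnings('ignore')`.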
* **Paths:** File paths for the pre-trained model (`recodai_model.pth`), test images, and the submission CSV template are defined.
* **Hyperparameters:**
    * `TARGET_SIZE = 256`: The input size for the images processed by the neural network.
    * `DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'`: Selects the computation device, using the GPU when one is available and falling back to the CPU otherwise.
    * `THR = 0.20`: **Threshold** for converting the model's probability output into a binary mask (pixels with probability > 0.20 are initially marked as forged).
    * `MIN_A = 20`: **Minimum area** (in pixels) for a connected component to be kept as a valid forgery; smaller areas are considered noise and removed in post-processing.
    * `MORPH_KERNEL = 5`: Size of the **morphological closing kernel** used for post-processing to smooth and connect nearby mask regions.

### 2. Error Level Analysis (ELA) (`compute_ela` function)

ELA is a common technique in image forensics to highlight subtle compression differences in an image, which often exposes forged regions.

* The function takes an image path (`p`), reads the image, and resizes it to `TARGET_SIZE` (256x256). If the image cannot be read, or any later step raises an exception, it returns an all-zero map instead.
* It then **re-encodes the resized image in memory as a JPEG at quality 95** (`cv2.imencode`) and immediately decodes the result (`cv2.imdecode`).
* The **absolute difference** between the resized image and its re-encoded copy (the "error") is calculated.
* This error image is then averaged across color channels and scaled by 10 to enhance visibility (`e = np.mean(np.abs(...), 2) * 10`). The resulting ELA map serves as an additional input channel to the U-Net model.
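
The arithmetic behind the error map can be sketched with NumPy alone. In this minimal sketch, coarse quantization stands in for the JPEG round trip; that substitution is an assumption made only to keep the example free of extra dependencies, whereas the notebook itself uses `cv2.imencode` and `cv2.imdecode`. The helper names `fake_recompress` and `error_map` are hypothetical.

```python
import numpy as np

def fake_recompress(img, step=8):
    """Stand-in for a JPEG round trip (assumption): quantize to multiples of `step`."""
    return (img // step) * step

def error_map(original, recompressed, scale=10):
    """Mean absolute difference over the channel axis, scaled for visibility."""
    diff = np.abs(original.astype(np.float32) - recompressed.astype(np.float32))
    return diff.mean(axis=2) * scale

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)  # toy 4x4 color image

ela = error_map(img, fake_recompress(img))
print(ela.shape, ela.dtype)
```

An image that survives recompression unchanged would produce an all-zero map, so regions that change more than their surroundings are the ones that stand out.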

### 3. U-Net Model (`UNet` class)

The U-Net is a standard fully convolutional neural network used for **image segmentation** tasks, which is ideal for predicting a pixel-wise mask.

* **Architecture:** It has an encoder-decoder structure with skip connections, forming a "U" shape.
    * **Encoder (`self.e1`, `self.e2`, `self.bot`):** Reduces spatial dimensions (downsampling via `F.max_pool2d`) while increasing feature channels. The basic `block` consists of two `Conv2d` layers, each followed by `BatchNorm2d` and `ReLU` activation, with `Dropout(0.2)` in between the two convolutions.
    * **Decoder (`self.up2`, `self.d2`, `self.up1`, `self.d1`):** Increases spatial dimensions (upsampling via `nn.ConvTranspose2d`) while decreasing feature channels.
    * **Skip Connections (`torch.cat([d2,e2],1)` and `torch.cat([d1,e1],1)`):** Concatenates high-resolution feature maps from the encoder (`e1`, `e2`) with the upsampled feature maps in the decoder. This helps the model maintain fine spatial details crucial for accurate segmentation.
* **Input/Output:** The model expects **4 input channels** (`in_c=4`): 3 for the resized image (RGB) and 1 for the ELA map. It outputs **1 channel** (`out_c=1`) representing the probability of forgery at each pixel.
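
To make the encoder-decoder layout concrete, the following sketch traces the channel count and spatial size at each stage for a square input. It is plain Python bookkeeping read from the layer definitions in the `UNet` class, not a run of the model; `trace_unet_shapes` is a hypothetical helper name.

```python
def trace_unet_shapes(size=256, in_c=4, out_c=1):
    """Return (stage, channels, side length) tuples for a square input."""
    return [
        ("input", in_c, size),
        ("e1", 64, size),
        ("e2", 128, size // 2),                    # after the first 2x max-pool
        ("bot", 256, size // 4),                   # after the second 2x max-pool
        ("up2 + cat(e2)", 128 + 128, size // 2),   # upsample, then concatenate skip
        ("d2", 128, size // 2),
        ("up1 + cat(e1)", 64 + 64, size),
        ("d1", 64, size),
        ("out", out_c, size),                      # 1x1 convolution to one logit map
    ]

for name, channels, side in trace_unet_shapes():
    print(f"{name:<15} {channels:>3} x {side} x {side}")
```

Because the network pools twice by a factor of 2, the input side length should be divisible by 4 so that the upsampled maps line up with the skip connections; `TARGET_SIZE = 256` satisfies this.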

### 4. Run-Length Encoding (RLE) (`rle` function)

RLE is a simple compression format used to efficiently encode the mask output, listing starting positions and lengths of consecutive "forged" pixels (runs of `1`s) in a flattened image.

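* The function transposes the 2D mask and then flattens it, so pixels are numbered in column-major order (top to bottom, then left to right), as is common in RLE definitions.
* It pads the flattened mask with a zero on each side and finds the indices where the pixel value transitions (from 0 to 1 or 1 to 0).
* These transition indices are used to generate the 1-indexed **start index** and **length** pairs for all forged regions.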
* If the mask is all zeros (`mask.sum() == 0`), it returns the string `"authentic"`.
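
A small round trip helps to check the encoding by hand. The sketch below follows the same transition-based logic as the notebook's `rle` function but returns a list of integers rather than a string, and pairs it with a decoder in the style of the notebook's `rle_decode`; the name `rle_encode` and the list-based interface are choices made for this illustration.

```python
import numpy as np

def rle_encode(mask):
    """Encode a binary 2D mask as 1-indexed (start, length) pairs, column-major."""
    flat = np.concatenate([[0], mask.T.flatten(), [0]])
    changes = np.where(flat[1:] != flat[:-1])[0] + 1
    changes[1::2] -= changes[::2]  # turn run ends into run lengths
    return changes.tolist()

def rle_decode(pairs, shape):
    """Rebuild the (H, W) mask from a flat list of (start, length) pairs."""
    h, w = shape
    flat = np.zeros(h * w, dtype=np.uint8)
    for start, length in zip(pairs[::2], pairs[1::2]):
        flat[start - 1:start - 1 + length] = 1
    return flat.reshape(w, h).T

mask = np.array([[0, 1, 1],
                 [0, 1, 0]], dtype=np.uint8)

pairs = rle_encode(mask)
print(pairs)  # [3, 3]
print(np.array_equal(rle_decode(pairs, mask.shape), mask))  # True
```

Read column by column, the mask is `0, 0, 1, 1, 1, 0`, so the single run starts at position 3 and has length 3.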

### 5. Inference and Post-Processing (Code Cell 1, continued)

* **Data Prep:** Reads the test image, computes the ELA map, and prepares the combined 4-channel input tensor (RGB + ELA).
* **Prediction:** The input is passed through the `model`.
    * **Test-Time Augmentation (TTA):** The prediction is done twice: once with the original image (`p1`) and once with a **horizontally flipped** image (`p2`). The flipped prediction (`p2`) is then flipped back, and the two predictions are averaged (`prob = (p1 + p2) / 2`) to improve robustness.
    * **Mask Generation:** The averaged probability map is thresholded (`prob > THR`) to create a binary mask.
* **Post-Processing (Cleaning):**
    * `cv2.GaussianBlur` and re-thresholding are used to smooth the mask boundaries.
    * `cv2.connectedComponentsWithStats` identifies all distinct forgery regions.
    * **Noise Removal:** Any connected component with an area less than `MIN_A` (20 pixels) is discarded, effectively removing small noise/artifacts.
    * `cv2.morphologyEx(..., cv2.MORPH_CLOSE, ...)` applies **morphological closing** to fill small holes and connect close-by forgery regions, using a `5x5` ellipse kernel.
    * The cleaned mask is **resized** to the original image's dimensions (`(w, h)`) using nearest-neighbor interpolation.
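* **RLE and Output:** If the final mask contains any forged pixels, it is converted to an RLE string with `rle(final)` and wrapped in square brackets; otherwise, or if processing the image raises an error, the annotation is set to `authentic`. All rows are then written to `submission.csv`.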
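
The flip-based test-time augmentation can be sketched without the trained network. Here `toy_model` is a hypothetical stand-in whose output is deliberately biased toward the right edge, which makes the effect of averaging easy to see; the threshold of 0.20 mirrors `THR`.

```python
import numpy as np

def toy_model(x):
    """Hypothetical stand-in for the network: output rises from left to right."""
    h, w = x.shape
    ramp = np.linspace(0.0, 1.0, w, dtype=np.float32)
    return np.tile(ramp, (h, 1)) * x

def predict_with_flip_tta(model, x):
    """Average the prediction on x with the un-flipped prediction on its mirror."""
    p1 = model(x)
    p2 = np.flip(model(np.flip(x, axis=1)), axis=1)  # flip in, flip back out
    return (p1 + p2) / 2

x = np.ones((2, 5), dtype=np.float32)
prob = predict_with_flip_tta(toy_model, x)
mask = (prob > 0.20).astype(np.uint8)

print(prob)
print(mask)
```

For this uniform input the left-to-right bias of the toy model cancels out, leaving a flat probability of 0.5 at every pixel.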

### 6. Validation and Visualization (Code Cells 1 to 5)

* **`validate` (in Code Cell 1):** Prints simple statistics about the generated submission, including the total number of rows, the number of authentic images, and the average, maximum, and minimum number of RLE segments (pairs of start/length).
* **`validate_and_print_rle` (defined in Code Cell 2, run in Code Cell 3):** A stricter validation function that checks the RLE string structure for common errors (e.g., missing brackets, non-integer values, odd number of elements, start positions out of order) to ensure the submission format is correct.
* **Visualization (Code Cells 4 and 5):**
    * The original image is loaded.
    * The RLE string from the generated submission is loaded and **decoded back into a binary mask** using `rle_decode`.
    * Statistics (forged pixel count, percentage) are calculated.
    * A visualization plot shows the **Original Image**, the **Binary Mask** from the RLE, and the **Forged Pixels Overlaid in Red** on the original image for visual inspection.
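
The statistics and the red overlay reduce to a few array operations. The sketch below uses a toy 2x2 image and approximates the 0.7/0.3 weighted blend in NumPy rather than calling `cv2.addWeighted`, so exact rounding may differ from the notebook's output.

```python
import numpy as np

img = np.full((2, 2, 3), 100, dtype=np.uint8)       # toy gray RGB image
mask = np.array([[1, 0], [0, 0]], dtype=np.uint8)   # one forged pixel

forged_pixels = int(mask.sum())
forged_pct = forged_pixels / mask.size * 100

overlay = img.copy()
overlay[mask == 1] = [255, 0, 0]                    # paint forged pixels red

# NumPy approximation of a 0.7 / 0.3 weighted blend, rounded and clipped to uint8
blended = np.clip(np.rint(0.7 * img + 0.3 * overlay), 0, 255).astype(np.uint8)

print(f"Forged pixels: {forged_pixels} ({forged_pct:.1f}% of image)")
print(blended[0, 0], blended[1, 1])
```

Pixels outside the mask keep their original color after blending, while masked pixels shift toward red.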

---

In [None]:
# --------------------------------------------------------------
# KAGGLE: FINAL INFERENCE - BALANCED RECALL + CLEAN MASKS
# Model: /kaggle/input/rluc-sfic-st/recodai_model.pth
# Target: 0.71–0.75 mIoU → Top 10%
# --------------------------------------------------------------

import os
import gc
import cv2
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from tqdm.auto import tqdm
import warnings
warnings.filterwarnings('ignore')

# ------------------- PATHS -------------------
MODEL_PATH = "/kaggle/input/rluc-sfic-st/recodai_model.pth"
TEST_DIR   = "/kaggle/input/recodai-luc-scientific-image-forgery-detection/test_images"
SUB_CSV    = "/kaggle/input/recodai-luc-scientific-image-forgery-detection/sample_submission.csv"
OUT        = "submission.csv"

# ------------------- BALANCED HYPERPARAMETERS -------------------
TARGET_SIZE = 256
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

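# Post-processing settings
THR = 0.20          # Probability threshold for the binary mask
MIN_A = 20          # Minimum connected-component area in pixels; smaller blobs are dropped as noise
MORPH_KERNEL = 5    # Side length of the elliptical closing kernel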

# ------------------- ELA (MEMORY SAFE) -------------------
def compute_ela(p):
    """Return a TARGET_SIZE x TARGET_SIZE float32 ELA map (zeros if the image cannot be processed)."""
    try:
        img = cv2.imread(p)
        if img is None:
            return np.zeros((TARGET_SIZE, TARGET_SIZE), np.float32)
        img = cv2.resize(img, (TARGET_SIZE, TARGET_SIZE))
        # Re-encode as JPEG (quality 95) in memory, then decode it again
        _, enc = cv2.imencode('.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 95])
        c = cv2.imdecode(enc, cv2.IMREAD_COLOR)
        # Mean absolute difference over the color channels, scaled by 10
        e = np.mean(np.abs(img.astype(np.float32) - c.astype(np.float32)), 2) * 10
        del img, c, enc
        return e.astype(np.float32)
    except Exception:
        return np.zeros((TARGET_SIZE, TARGET_SIZE), np.float32)

# ------------------- MODEL -------------------
class UNet(nn.Module):
    def __init__(self, in_c=4, out_c=1):
        super().__init__()
        def block(i,o): return nn.Sequential(
            nn.Conv2d(i,o,3,1,1), nn.BatchNorm2d(o), nn.ReLU(),
            nn.Dropout(0.2),
            nn.Conv2d(o,o,3,1,1), nn.BatchNorm2d(o), nn.ReLU()
        )
        self.e1 = block(in_c,64); self.e2 = block(64,128); self.bot = block(128,256)
        self.up2 = nn.ConvTranspose2d(256,128,2,2); self.d2 = block(256,128)
        self.up1 = nn.ConvTranspose2d(128,64,2,2);  self.d1 = block(128,64)
        self.out = nn.Conv2d(64,out_c,1)
    def forward(self,x):
        e1 = self.e1(x); p1 = F.max_pool2d(e1,2)
        e2 = self.e2(p1); p2 = F.max_pool2d(e2,2)
        b  = self.bot(p2)
        d2 = self.up2(b); d2 = torch.cat([d2,e2],1); d2 = self.d2(d2)
        d1 = self.up1(d2); d1 = torch.cat([d1,e1],1); d1 = self.d1(d1)
        return self.out(d1)

# ------------------- LOAD MODEL -------------------
model = UNet().to(DEVICE)
model.load_state_dict(torch.load(MODEL_PATH, map_location=DEVICE))
model.eval()
print(f"MODEL LOADED: {MODEL_PATH}")

# ------------------- RLE -------------------
def rle(mask):
    if mask.sum() == 0: return "authentic"
    p = mask.T.flatten()
    p = np.concatenate([[0], p, [0]])
    r = np.where(p[1:] != p[:-1])[0] + 1
    r[1::2] -= r[::2]
    return ','.join(map(str, r))

# ------------------- INFERENCE LOOP (MEMORY SAFE) -------------------
sub = pd.read_csv(SUB_CSV)
sub['case_id'] = sub['case_id'].astype(str)
paths = {os.path.splitext(f)[0]: os.path.join(TEST_DIR, f) 
         for f in os.listdir(TEST_DIR) if f.lower().endswith(('.png','.jpg','.jpeg','.tif','.tiff'))}
sub['p'] = sub['case_id'].map(paths)
sub = sub.dropna(subset=['p']).reset_index(drop=True)

res = []
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (MORPH_KERNEL, MORPH_KERNEL))

print(f"Found {len(sub)} test images. Starting inference...")

for idx, r in tqdm(sub.iterrows(), total=len(sub), desc="Predicting"):
    try:
        img = cv2.cvtColor(cv2.imread(r['p']), cv2.COLOR_BGR2RGB)
        h, w = img.shape[:2]
        ela = compute_ela(r['p'])
        x = cv2.resize(img, (TARGET_SIZE, TARGET_SIZE))
        e = cv2.resize(ela, (TARGET_SIZE, TARGET_SIZE))
        inp = np.concatenate([x, e[..., None]], -1) / 255.0
        inp = torch.from_numpy(inp.transpose(2,0,1)).unsqueeze(0).float().to(DEVICE)

        with torch.no_grad():
            p1 = torch.sigmoid(model(inp)).cpu().numpy().squeeze()
            inp_flip = torch.flip(inp, dims=[3])
            p2 = torch.sigmoid(model(inp_flip)).cpu().numpy().squeeze()
            p2 = np.flip(p2, axis=1)
            prob = (p1 + p2) / 2

        # --- HIGH RECALL + CLEAN POST-PROCESSING ---
        mask = (prob > THR).astype(np.float32)
        mask = cv2.GaussianBlur(mask, (3,3), 0)
        mask = (mask > 0.5).astype(np.uint8)

        n, lbl, stats, _ = cv2.connectedComponentsWithStats(mask, 4, cv2.CV_32S)
        clean = np.zeros_like(mask)
        for i in range(1, n):
            if stats[i, 4] >= MIN_A:
                clean[lbl == i] = 1
        clean = cv2.morphologyEx(clean, cv2.MORPH_CLOSE, kernel)
        final = cv2.resize(clean, (w, h), interpolation=cv2.INTER_NEAREST)

        ann = f'[{rle(final)}]' if final.sum() > 0 else 'authentic'
        res.append({'case_id': r['case_id'], 'annotation': ann})

        # MEMORY CLEANUP
        del img, ela, x, e, inp, p1, p2, prob, mask, clean, final
        if idx % 10 == 0:
            gc.collect()
            torch.cuda.empty_cache()

    except Exception as err:  # named `err` so it does not shadow the ELA array `e`
        print(f"Error on {r['case_id']}: {err}")
        res.append({'case_id': r['case_id'], 'annotation': 'authentic'})

# ------------------- SAVE -------------------
pd.DataFrame(res)[['case_id', 'annotation']].to_csv(OUT, index=False)
print(f"SUBMISSION SAVED: {OUT}")

# ------------------- FINAL VALIDATION -------------------
def validate(df):
    print("\n" + "="*70)
    print("   FINAL RLE VALIDATION")
    print("="*70)
    total = len(df)
    auth = df['annotation'].apply(lambda x: str(x).strip() == 'authentic').sum()
    rle_df = df[df['annotation'].apply(lambda x: str(x).strip() != 'authentic')]
    print(f"Total: {total} | Authentic: {auth} | RLE: {len(rle_df)}")
    if len(rle_df) > 0:
        pairs = [len(str(x).strip('[]').split(','))//2 for x in rle_df['annotation'] if '[' in str(x)]
        if pairs:
            print(f"Avg RLE Pairs: {np.mean(pairs):.1f} | Max: {max(pairs)} | Min: {min(pairs)}")
    print("="*70)

validate(pd.DataFrame(res))

In [None]:
def validate_and_print_rle(submission_df):
    """
    Validates RLE output structure and prints debugging info.
    Checks for:
      1. Authentic/RLE count
      2. Correct RLE format: even number of integers (start, length pairs)
      3. RLE string starts with '[' and ends with ']'
      4. All values are valid integers
      5. NEW: Number of (start, length) segment pairs per RLE
    """
    print("\n" + "="*60)
    print("   RLE OUTPUT VALIDATION CHECK + SEGMENT PAIR COUNT")
    print("="*60)

    if 'annotation' not in submission_df.columns or 'case_id' not in submission_df.columns:
        print("ERROR: Missing required columns: 'case_id' or 'annotation'")
        return

    total = len(submission_df)
    authentic_count = submission_df['annotation'].apply(lambda x: str(x).strip() == 'authentic').sum()
    rle_rows = submission_df[submission_df['annotation'].apply(lambda x: str(x).strip() != 'authentic')].copy()

    print(f"Total Submissions       : {total}")
    print(f"Authentic (No Forgery)  : {authentic_count}")
    print(f"RLE Annotated (Forged)  : {len(rle_rows)}")

    if len(rle_rows) == 0:
        print("No RLE annotations to validate.")
        print("="*60)
        return

    # Extract RLE numbers inside [ ]
    def extract_rle_nums(rle_str):
        rle_str = str(rle_str).strip()
        if not (rle_str.startswith('[') and rle_str.endswith(']')):
            return None
        inner = rle_str[1:-1]
        if not inner:
            return []
        return [x.strip() for x in inner.split(',') if x.strip()]

    # Validation + pair counting
    valid_rle = []
    invalid_rle = []
    pair_counts = []  # Store number of (start, len) pairs per valid RLE

    for idx, row in rle_rows.iterrows():
        case_id = row['case_id']
        ann = row['annotation']
        nums = extract_rle_nums(ann)
        
        if nums is None:
            invalid_rle.append((case_id, ann, "Missing [ ] brackets"))
            continue
        if not all(x.isdigit() for x in nums):
            invalid_rle.append((case_id, ann, "Non-integer values"))
            continue
        if len(nums) == 0:
            invalid_rle.append((case_id, ann, "Empty RLE"))
            continue
        if len(nums) % 2 != 0:
            invalid_rle.append((case_id, ann, f"Odd number of elements: {len(nums)}"))
            continue
        
        # Check that start positions are in increasing order
        starts = [int(nums[i]) for i in range(0, len(nums), 2)]
        if starts != sorted(starts):
            invalid_rle.append((case_id, ann, "Start positions not in increasing order"))
            continue

        # Count segment pairs only for RLE strings that passed every check
        pair_counts.append(len(nums) // 2)
        valid_rle.append(case_id)

    # Summary
    print(f"\nRLE Validation Results:")
    print(f"   Valid RLE     : {len(valid_rle)}")
    print(f"   Invalid RLE   : {len(invalid_rle)}")

    if len(pair_counts) > 0:
        total_pairs = sum(pair_counts)
        avg_pairs = total_pairs / len(pair_counts)
        max_pairs = max(pair_counts)
        min_pairs = min(pair_counts)
        print(f"\nSEGMENT PAIR STATISTICS (start, length):")
        print(f"   Total Pairs   : {total_pairs}")
        print(f"   Avg Pairs/RLE : {avg_pairs:.2f}")
        print(f"   Min Pairs     : {min_pairs}")
        print(f"   Max Pairs     : {max_pairs}")
    else:
        print(f"\nNo valid RLE to compute pair statistics.")

    if len(invalid_rle) > 0:
        print(f"\nFIRST 5 INVALID RLE EXAMPLES:")
        for case_id, ann, reason in invalid_rle[:5]:
            print(f"   case_id {case_id}: {reason}")
            print(f"     → {str(ann)[:100]}{'...' if len(str(ann)) > 100 else ''}")
        if len(invalid_rle) > 5:
            print(f"   ... and {len(invalid_rle) - 5} more.")
        print(f"\nFIX THESE BEFORE SUBMITTING!")
    else:
        print(f"ALL {len(valid_rle)} RLE STRINGS ARE VALID!")
    
    print("="*60)

In [None]:
submission_df = pd.read_csv("submission.csv")
validate_and_print_rle(submission_df)

In [None]:
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load the image
img_path = "/kaggle/input/recodai-luc-scientific-image-forgery-detection/test_images/45.png"
img = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)

plt.figure(figsize=(10, 8))
plt.imshow(img)
plt.title("TEST IMAGE: 45.png - LOOK FOR FORGERY", fontsize=16)
plt.axis('off')
plt.show()

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# ------------------- PATHS -------------------
IMG_PATH = "/kaggle/input/recodai-luc-scientific-image-forgery-detection/test_images/45.png"
SUBMISSION_CSV = "submission.csv"

# ------------------- LOAD IMAGE -------------------
img = cv2.cvtColor(cv2.imread(IMG_PATH), cv2.COLOR_BGR2RGB)
h, w = img.shape[:2]

# ------------------- LOAD RLE FROM SUBMISSION -------------------
df = pd.read_csv(SUBMISSION_CSV)
df['case_id'] = df['case_id'].astype(str)

print("Available case_ids (as str):", df['case_id'].tolist()[:10])

# Search for '45' or '45.png'
row = df[df['case_id'] == '45']
if len(row) == 0:
    row = df[df['case_id'] == '45.png']
if len(row) == 0:
    raise ValueError("case_id '45' or '45.png' not found in submission.csv")

rle_str = row.iloc[0]['annotation']
case_id = row.iloc[0]['case_id']
print(f"Found: case_id = {case_id} → RLE loaded")

# ------------------- DECODE RLE -------------------
def rle_decode(rle_str, shape):
    if rle_str == 'authentic' or not rle_str.startswith('['):
        return np.zeros(shape, dtype=np.uint8)
    rle_str = rle_str.strip('[]')
    if not rle_str.strip():
        return np.zeros(shape, dtype=np.uint8)
    numbers = [int(x) for x in rle_str.split(',') if x.strip()]
    if len(numbers) % 2 != 0:
        print("Warning: Odd number of RLE values!")
        return np.zeros(shape, dtype=np.uint8)
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for i in range(0, len(numbers), 2):
        start = numbers[i] - 1
        length = numbers[i + 1]
        if start + length <= len(mask):
            mask[start:start + length] = 1
    return mask.reshape(shape[1], shape[0]).T  # (H, W)

mask = rle_decode(rle_str, (h, w))

# ------------------- CALCULATE DYNAMIC STATS -------------------
pairs = len(rle_str.strip('[]').split(',')) // 2 if '[' in rle_str else 0
forged_pixels = mask.sum()
forged_pct = forged_pixels / (h * w) * 100

# ------------------- VISUALIZE -------------------
fig, axes = plt.subplots(1, 3, figsize=(21, 7))

axes[0].imshow(img)
axes[0].set_title("Original: 45.png", fontsize=16)
axes[0].axis('off')

axes[1].imshow(mask, cmap='gray')
axes[1].set_title(f"Binary Mask ({pairs} RLE Segments)", fontsize=16)  # DYNAMIC
axes[1].axis('off')

overlay = img.copy()
overlay[mask == 1] = [255, 0, 0]
blended = cv2.addWeighted(img, 0.7, overlay, 0.3, 0)

axes[2].imshow(blended)
axes[2].set_title("Forgery Detected (Red)", fontsize=16)
axes[2].axis('off')

plt.suptitle(f"RLE VISUALIZATION: {pairs} Forgery Segments", fontsize=18, y=0.95)  # DYNAMIC
plt.tight_layout()
plt.show()

# ------------------- DYNAMIC STATS -------------------
print(f"\nRLE STATS:")
print(f"   case_id: {case_id}")
print(f"   RLE (first 100): {rle_str[:100]}{'...' if len(rle_str) > 100 else ''}")
print(f"   Total RLE Pairs: {pairs}")
print(f"   Forged Pixels: {forged_pixels} ({forged_pct:.3f}% of image)")