<h1 style="text-align:center; color:#3C91E6; font-size:2.2em; margin-bottom:0;">🧠 Recod.ai/LUC - Scientific Image Forgery Detection</h1>
<h2 style="text-align:center; color:#555; font-weight:400; margin-top:5px;">A Comprehensive Guide and Educational Notebook</h2>

<hr style="border:1px solid #ccc; margin:20px 0;">

<section>
  <h2 style="color:#3C91E6;">1. Competition Overview</h2>
  <p>
    This notebook serves as a <strong>comprehensive guide</strong> for the 
    <em>"Recod.ai/LUC - Scientific Image Forgery Detection"</em> competition.
    The primary goal is to develop a robust model capable of 
    <strong>detecting and segmenting copy-move forgeries</strong> 
    in biomedical research images at the pixel level.
  </p>

  <div style="background-color:#f8f9fa; border-left:4px solid #3C91E6; padding:10px 20px; margin:15px 0;">
    <h3 style="color:#3C91E6;">📘 In this notebook, you will find:</h3>
    <ul>
      <li>A complete <strong>Exploratory Data Analysis (EDA)</strong> to understand dataset structure and challenges.</li>
      <li>A review of <strong>effective model architectures</strong> from recent literature, focused on segmentation and forgery detection.</li>
      <li>A detailed, step-by-step <strong>implementation pipeline</strong>, from preprocessing to model training and inference.</li>
      <li>A <strong>submission-ready system</strong> aligned with competition requirements.</li>
    </ul>
  </div>

  <p>
    The goal is not just to build a model, but to create an <strong>educational resource</strong> 
    that explains the <em>why</em> and <em>how</em>, making advanced image forensics 
    methods understandable for all participants.
  </p>
</section>

<hr style="border:1px solid #ccc; margin:30px 0;">

<section>
  <h2 style="color:#3C91E6;">2. Scientific Context</h2>
  <p>
    The <strong>integrity of scientific imagery</strong> is fundamental to credible research. 
    Images in publications should represent genuine data, yet manipulation persists. 
    A common and damaging practice is <strong>copy-move forgery</strong>, 
    where regions of an image are duplicated to fabricate or reinforce results.
  </p>

  <p>
    The consequences are far-reaching:
  </p>
  <ul>
    <li>It <strong>misleads researchers</strong>, wasting valuable time and funding.</li>
    <li>It <strong>undermines public trust</strong> in scientific institutions.</li>
    <li>It can even <strong>endanger lives</strong> when flawed clinical data informs real-world decisions.</li>
  </ul>

  <p>
    This competition uses a <strong>realistic benchmark dataset</strong> derived from 
    hundreds of confirmed forgeries across over 2,000 retracted scientific papers, 
    making it one of the most detailed datasets in the field of image forensics.
  </p>
</section>

<hr style="border:1px solid #ccc; margin:30px 0;">

<section>
  <h2 style="color:#3C91E6;">3. Link to Real-World Retracted Research</h2>
  <p>
    Retractions are not abstract; they represent a growing crisis in scientific publishing. 
    The following table summarizes insights from studies on retracted research, 
    emphasizing the urgency of tools like those this competition aims to build.
  </p>

  <table style="width:100%; border-collapse:collapse; margin:15px 0; font-size:0.95em;">
    <thead style="background-color:#3C91E6; color:#fff;">
      <tr>
        <th style="padding:10px; text-align:left;">Aspect</th>
        <th style="padding:10px; text-align:left;">Findings from Recent Research</th>
        <th style="padding:10px; text-align:left;">Relevance to this Competition</th>
      </tr>
    </thead>
    <tbody>
      <tr style="background-color:#f8f9fa;">
        <td style="padding:10px;">Retraction Scale &amp; Impact</td>
        <td style="padding:10px;">Thousands of retracted papers have been analyzed using ML. One study examined 764 retracted AI papers.</td>
        <td style="padding:10px;">Provides a rich foundation of real-world fraudulent cases for training detection models.</td>
      </tr>
      <tr>
        <td style="padding:10px;">Persistence of Flawed Work</td>
        <td style="padding:10px;">96% of citations to a retracted clinical trial failed to note its retraction.</td>
        <td style="padding:10px;">Highlights the importance of detecting manipulation <em>before</em> publication.</td>
      </tr>
      <tr style="background-color:#f8f9fa;">
        <td style="padding:10px;">Common Reasons for Retraction</td>
        <td style="padding:10px;">In AI and biomedical fields, causes include data falsification and image manipulation.</td>
        <td style="padding:10px;">Copy-move forgery detection targets one of the root causes directly.</td>
      </tr>
      <tr>
        <td style="padding:10px;">AI's Role in the Problem</td>
        <td style="padding:10px;">AI systems sometimes reuse content from retracted papers without warning.</td>
        <td style="padding:10px;">Shows the need for reliable, ethical AI systems that safeguard research integrity.</td>
      </tr>
    </tbody>
  </table>

  <div style="background-color:#eaf5ff; border-left:4px solid #3C91E6; padding:15px 20px; margin-top:20px;">
    <p style="font-size:1.05em; color:#333;">
      These studies confirm that the issue this competition tackles is <strong>real, urgent, and impactful</strong>. 
      By contributing here, you help develop technologies that can 
      <strong>protect the integrity of scientific discovery</strong>.
    </p>
  </div>
</section>

<hr style="border:1px solid #ccc; margin:30px 0;">

<section style="text-align:center;">
  <h2 style="color:#3C91E6;">🚀 Let's Keep Science Honest, One Pixel at a Time</h2>
  
</section>


# Data Deep Dive (EDA)

In [None]:

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
import cv2
from collections import Counter
import warnings
warnings.filterwarnings('ignore')

# Set up plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Paths
BASE_PATH = "/kaggle/input/recodai-luc-scientific-image-forgery-detection"
TRAIN_AUTHENTIC_PATH = os.path.join(BASE_PATH, "train_images/authentic")
TRAIN_FORGED_PATH = os.path.join(BASE_PATH, "train_images/forged")
TRAIN_MASKS_PATH = os.path.join(BASE_PATH, "train_masks")
TEST_IMAGES_PATH = os.path.join(BASE_PATH, "test_images")

print("🔍 Exploring Dataset Structure...")
print(f"Base path: {BASE_PATH}")
print(f"Authentic images: {TRAIN_AUTHENTIC_PATH}")
print(f"Forged images: {TRAIN_FORGED_PATH}")
print(f"Masks path: {TRAIN_MASKS_PATH}")
print(f"Test images: {TEST_IMAGES_PATH}")

# Check what files exist
def explore_directory_structure():
    """Explore and count files in each directory"""
    print("\n" + "="*50)
    print("📁 DIRECTORY STRUCTURE ANALYSIS")
    print("="*50)
    
    directories = {
        'Authentic Train': TRAIN_AUTHENTIC_PATH,
        'Forged Train': TRAIN_FORGED_PATH,
        'Train Masks': TRAIN_MASKS_PATH,
        'Test Images': TEST_IMAGES_PATH
    }
    
    file_counts = {}
    file_extensions = {}
    
    for dir_name, dir_path in directories.items():
        if os.path.exists(dir_path):
            files = os.listdir(dir_path)
            file_counts[dir_name] = len(files)
            
            # Count file extensions
            extensions = Counter([os.path.splitext(f)[1] for f in files])
            file_extensions[dir_name] = extensions
            
            print(f"{dir_name}: {len(files)} files")
            print(f"  Extensions: {dict(extensions)}")
        else:
            print(f"{dir_name}: Directory not found!")
            file_counts[dir_name] = 0
    
    return file_counts, file_extensions

file_counts, file_extensions = explore_directory_structure()
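
Counting files per directory does not show whether every forged image actually has a mask. The sketch below is one way to check that, assuming masks are named after the forged image's stem with a `.npy` extension (the convention the visualization code in this notebook relies on). The helper name `find_unpaired_forged` is hypothetical, and the demo runs on a throwaway directory rather than the competition data.

```python
import os
import tempfile

def find_unpaired_forged(forged_dir, masks_dir):
    """Return forged image filenames that have no '<stem>.npy' file in masks_dir."""
    mask_stems = {
        os.path.splitext(f)[0] for f in os.listdir(masks_dir) if f.endswith(".npy")
    }
    return sorted(
        f for f in os.listdir(forged_dir)
        if os.path.splitext(f)[0] not in mask_stems
    )

# Self-contained demo on a temporary directory tree (not the competition data)
with tempfile.TemporaryDirectory() as root:
    forged_dir = os.path.join(root, "forged")
    masks_dir = os.path.join(root, "masks")
    os.makedirs(forged_dir)
    os.makedirs(masks_dir)
    for name in ("a.png", "b.png"):
        open(os.path.join(forged_dir, name), "wb").close()
    open(os.path.join(masks_dir, "a.npy"), "wb").close()
    demo_unpaired = find_unpaired_forged(forged_dir, masks_dir)

print(demo_unpaired)  # prints ['b.png']
```

On the real dataset, the same call would be `find_unpaired_forged(TRAIN_FORGED_PATH, TRAIN_MASKS_PATH)`; an empty list means every forged image has a matching mask file.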

In [None]:
# Create summary dataframe
summary_df = pd.DataFrame({
    'Dataset': list(file_counts.keys()),
    'Count': list(file_counts.values())
})

print("\n" + "="*50)
print("📊 DATASET SUMMARY")
print("="*50)
print(summary_df)

# Visualize dataset distribution
plt.figure(figsize=(10, 6))
bars = plt.bar(summary_df['Dataset'], summary_df['Count'], color=['skyblue', 'lightcoral', 'lightgreen', 'gold'])
plt.title('Dataset Distribution Across Categories', fontsize=16, fontweight='bold')
plt.ylabel('Number of Files', fontsize=12)
plt.xticks(rotation=45)
plt.grid(axis='y', alpha=0.3)

# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
             f'{int(height)}', ha='center', va='bottom')

plt.tight_layout()
plt.show()

# Analyze image properties
def analyze_image_properties(image_paths, category_name):
    """Analyze basic properties of images"""
    print(f"\n🔬 Analyzing {category_name} Images...")
    
    heights = []
    widths = []
    aspect_ratios = []
    sizes_kb = []
    modes = []
    
    for img_path in image_paths[:500]:  # Sample first 500 images for efficiency
        try:
            with Image.open(img_path) as img:
                width, height = img.size
                mode = img.mode
                
                heights.append(height)
                widths.append(width)
                aspect_ratios.append(width / height)
                sizes_kb.append(os.path.getsize(img_path) / 1024)
                modes.append(mode)
        except Exception as e:
            print(f"Error reading {img_path}: {e}")
            continue
    
    if not heights:  # If no images were processed
        # Return a pair so callers can still unpack (properties, stats_df)
        return None, None
    
    properties = {
        'heights': heights,
        'widths': widths,
        'aspect_ratios': aspect_ratios,
        'sizes_kb': sizes_kb,
        'modes': modes
    }
    
    # Create summary statistics
    stats_df = pd.DataFrame({
        'Property': ['Height', 'Width', 'Aspect Ratio', 'Size (KB)'],
        'Mean': [np.mean(heights), np.mean(widths), np.mean(aspect_ratios), np.mean(sizes_kb)],
        'Std': [np.std(heights), np.std(widths), np.std(aspect_ratios), np.std(sizes_kb)],
        'Min': [np.min(heights), np.min(widths), np.min(aspect_ratios), np.min(sizes_kb)],
        'Max': [np.max(heights), np.max(widths), np.max(aspect_ratios), np.max(sizes_kb)],
        'Median': [np.median(heights), np.median(widths), np.median(aspect_ratios), np.median(sizes_kb)]
    })
    
    print(f"Sample size: {len(heights)} images")
    print(f"Color modes: {Counter(modes)}")
    print(f"\nSummary Statistics for {category_name}:")
    print(stats_df.round(2))
    
    return properties, stats_df

# Get sample image paths
def get_sample_image_paths(directory, sample_size=500):
    """Get sample image paths from directory"""
    if os.path.exists(directory):
        # Sort so the sample is reproducible; os.listdir order is arbitrary
        all_files = sorted(os.path.join(directory, f) for f in os.listdir(directory)
                           if f.lower().endswith(('.png', '.jpg', '.jpeg', '.tif', '.tiff')))
        return all_files[:sample_size]
    return []

# Analyze authentic and forged images
authentic_samples = get_sample_image_paths(TRAIN_AUTHENTIC_PATH)
forged_samples = get_sample_image_paths(TRAIN_FORGED_PATH)

auth_props, auth_stats = analyze_image_properties(authentic_samples, "Authentic")
forge_props, forge_stats = analyze_image_properties(forged_samples, "Forged")

# Visualize image properties comparison
def plot_image_properties_comparison(auth_props, forge_props):
    """Create comparison plots for image properties"""
    if auth_props is None or forge_props is None:
        print("No image data to plot")
        return
    
    fig, axes = plt.subplots(2, 2, figsize=(15, 12))
    
    # Height distribution
    axes[0,0].hist(auth_props['heights'], bins=50, alpha=0.7, label='Authentic', color='skyblue')
    axes[0,0].hist(forge_props['heights'], bins=50, alpha=0.7, label='Forged', color='lightcoral')
    axes[0,0].set_xlabel('Height (pixels)')
    axes[0,0].set_ylabel('Frequency')
    axes[0,0].set_title('Image Height Distribution')
    axes[0,0].legend()
    axes[0,0].grid(alpha=0.3)
    
    # Width distribution
    axes[0,1].hist(auth_props['widths'], bins=50, alpha=0.7, label='Authentic', color='skyblue')
    axes[0,1].hist(forge_props['widths'], bins=50, alpha=0.7, label='Forged', color='lightcoral')
    axes[0,1].set_xlabel('Width (pixels)')
    axes[0,1].set_ylabel('Frequency')
    axes[0,1].set_title('Image Width Distribution')
    axes[0,1].legend()
    axes[0,1].grid(alpha=0.3)
    
    # Aspect ratio distribution
    axes[1,0].hist(auth_props['aspect_ratios'], bins=50, alpha=0.7, label='Authentic', color='skyblue')
    axes[1,0].hist(forge_props['aspect_ratios'], bins=50, alpha=0.7, label='Forged', color='lightcoral')
    axes[1,0].set_xlabel('Aspect Ratio (Width/Height)')
    axes[1,0].set_ylabel('Frequency')
    axes[1,0].set_title('Aspect Ratio Distribution')
    axes[1,0].legend()
    axes[1,0].grid(alpha=0.3)
    
    # File size distribution
    axes[1,1].hist(auth_props['sizes_kb'], bins=50, alpha=0.7, label='Authentic', color='skyblue')
    axes[1,1].hist(forge_props['sizes_kb'], bins=50, alpha=0.7, label='Forged', color='lightcoral')
    axes[1,1].set_xlabel('File Size (KB)')
    axes[1,1].set_ylabel('Frequency')
    axes[1,1].set_title('File Size Distribution')
    axes[1,1].legend()
    axes[1,1].grid(alpha=0.3)
    
    plt.tight_layout()
    plt.show()

plot_image_properties_comparison(auth_props, forge_props)

In [None]:
# Analyze mask properties
def analyze_masks(masks_path):
    """Analyze the mask files (.npy)"""
    print("\n" + "="*50)
    print("🎭 MASK ANALYSIS")
    print("="*50)
    
    if not os.path.exists(masks_path):
        print("Masks directory not found!")
        return None
    
    mask_files = [f for f in os.listdir(masks_path) if f.endswith('.npy')]
    print(f"Found {len(mask_files)} mask files")
    
    if not mask_files:
        return None
    
    mask_properties = {
        'num_masks_per_file': [],
        'mask_shapes': [],
        'mask_areas': [],
        'coverage_ratios': []
    }
    
    # Analyze sample of masks
    for mask_file in mask_files[:200]:  # Sample for efficiency
        try:
            mask_path = os.path.join(masks_path, mask_file)
            mask_data = np.load(mask_path)
            
            # Handle different mask formats
            if mask_data.ndim == 2:
                # Single mask
                mask_properties['num_masks_per_file'].append(1)
                mask_properties['mask_shapes'].append(mask_data.shape)
                area = np.sum(mask_data > 0)  # Assuming binary mask
                mask_properties['mask_areas'].append(area)
                if mask_data.size > 0:
                    mask_properties['coverage_ratios'].append(area / mask_data.size)
            elif mask_data.ndim == 3:
                # Multiple masks, assumed stacked along the first axis: (num_masks, H, W)
                num_masks = mask_data.shape[0]
                mask_properties['num_masks_per_file'].append(num_masks)
                for i in range(num_masks):
                    single_mask = mask_data[i]
                    mask_properties['mask_shapes'].append(single_mask.shape)
                    area = np.sum(single_mask > 0)
                    mask_properties['mask_areas'].append(area)
                    if single_mask.size > 0:
                        mask_properties['coverage_ratios'].append(area / single_mask.size)
                    
        except Exception as e:
            print(f"Error loading mask {mask_file}: {e}")
            continue
    
    if mask_properties['mask_areas']:
        print(f"\nMask Analysis Summary (sample size: {len(mask_properties['mask_areas'])})")
        print(f"Number of masks per file: {Counter(mask_properties['num_masks_per_file'])}")
        print(f"Average mask area: {np.mean(mask_properties['mask_areas']):.2f} pixels")
        print(f"Average coverage ratio: {np.mean(mask_properties['coverage_ratios']):.4f}")
        print(f"Min coverage: {np.min(mask_properties['coverage_ratios']):.6f}")
        print(f"Max coverage: {np.max(mask_properties['coverage_ratios']):.4f}")
    
    return mask_properties

mask_props = analyze_masks(TRAIN_MASKS_PATH)

# Visualize sample images and masks
def visualize_samples(authentic_path, forged_path, masks_path, num_samples=5):
    """Visualize sample authentic images, forged images, and their masks"""
    print("\n" + "="*50)
    print("🖼️ SAMPLE VISUALIZATION")
    print("="*50)
    
    # Get sample files
    authentic_files = [f for f in os.listdir(authentic_path) if f.lower().endswith(('.png', '.jpg', '.jpeg'))][:num_samples]
    forged_files = [f for f in os.listdir(forged_path) if f.lower().endswith(('.png', '.jpg', '.jpeg'))][:num_samples]
    
    fig, axes = plt.subplots(3, num_samples, figsize=(4*num_samples, 12))
    
    if num_samples == 1:
        axes = axes.reshape(3, 1)
    
    # Plot authentic images
    for i, img_file in enumerate(authentic_files):
        img_path = os.path.join(authentic_path, img_file)
        img = Image.open(img_path)
        axes[0, i].imshow(img)
        axes[0, i].set_title(f'Authentic: {img_file[:15]}...')
        axes[0, i].axis('off')
    
    # Plot forged images
    for i, img_file in enumerate(forged_files):
        img_path = os.path.join(forged_path, img_file)
        img = Image.open(img_path)
        axes[1, i].imshow(img)
        axes[1, i].set_title(f'Forged: {img_file[:15]}...')
        axes[1, i].axis('off')
        
        # Try to find corresponding mask
        mask_name = os.path.splitext(img_file)[0] + '.npy'
        mask_path = os.path.join(masks_path, mask_name)
        if os.path.exists(mask_path):
            try:
                mask = np.load(mask_path)
                if mask.ndim == 3:
                    mask = mask.max(axis=0)  # Merge stacked masks into one for display
                axes[2, i].imshow(mask, cmap='hot')
                coverage = np.sum(mask > 0) / mask.size if mask.size > 0 else 0
                # Coverage goes in the title because axis('off') below hides axis labels
                axes[2, i].set_title(f'Mask: {mask_name[:15]}...\nCoverage: {coverage:.4f}')
            except Exception as e:
                axes[2, i].text(0.5, 0.5, f'Mask Error\n{e}', 
                               ha='center', va='center', transform=axes[2, i].transAxes)
        else:
            axes[2, i].text(0.5, 0.5, 'Mask Not Found', 
                           ha='center', va='center', transform=axes[2, i].transAxes)
        axes[2, i].axis('off')
    
    plt.tight_layout()
    plt.show()

# Display samples
visualize_samples(TRAIN_AUTHENTIC_PATH, TRAIN_FORGED_PATH, TRAIN_MASKS_PATH, num_samples=5)

# Analyze class distribution
print("\n" + "="*50)
print("📈 CLASS DISTRIBUTION ANALYSIS")
print("="*50)

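total_train_images = file_counts['Authentic Train'] + file_counts['Forged Train']
# Guard against division by zero when the training directories are missing or empty
authentic_ratio = file_counts['Authentic Train'] / total_train_images if total_train_images else 0.0
forged_ratio = file_counts['Forged Train'] / total_train_images if total_train_images else 0.0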

print(f"Total training images: {total_train_images}")
print(f"Authentic images: {file_counts['Authentic Train']} ({authentic_ratio:.2%})")
print(f"Forged images: {file_counts['Forged Train']} ({forged_ratio:.2%})")

# Plot class distribution
plt.figure(figsize=(8, 6))
classes = ['Authentic', 'Forged']
counts = [file_counts['Authentic Train'], file_counts['Forged Train']]
colors = ['lightgreen', 'lightcoral']

plt.pie(counts, labels=classes, colors=colors, autopct='%1.1f%%', startangle=90)
plt.title('Class Distribution in Training Set', fontsize=14, fontweight='bold')
plt.show()

# Summary statistics
print("\n" + "="*50)
print("📋 EDA SUMMARY")
print("="*50)
print("✓ Dataset structure verified")
print("✓ Image properties analyzed (dimensions, aspect ratios, file sizes)")
print("✓ Class distribution calculated")
print("✓ Mask properties explored")
print("✓ Sample visualization completed")

if auth_props and forge_props:
    print(f"\nKey Insights:")
    print(f"- Authentic images analyzed: {len(auth_props['heights'])}")
    print(f"- Forged images analyzed: {len(forge_props['heights'])}")
    print(f"- Average image dimensions: {np.mean(auth_props['widths']):.0f}x{np.mean(auth_props['heights']):.0f} pixels")
    print(f"- Class balance: {authentic_ratio:.1%} authentic vs {forged_ratio:.1%} forged")

if mask_props and mask_props['coverage_ratios']:
    print(f"- Average mask coverage: {np.mean(mask_props['coverage_ratios']):.4f}")
    print(f"- This indicates the typical proportion of image area that is forged")

print("\n🎯 Next steps: Use these insights to guide preprocessing and model selection!")
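
One way the mask coverage statistic can feed into model training is as a positive-class weight for a pixel-wise loss. The sketch below is an illustration only, not a method this notebook commits to. The function name `positive_class_weight`, the `max_weight` cap, and the example coverage values are all made up for the demo.

```python
import numpy as np

def positive_class_weight(coverage_ratios, max_weight=50.0):
    """Background-to-foreground pixel ratio, clipped to max_weight.

    coverage_ratios: per-mask fraction of pixels marked as forged.
    max_weight: hypothetical cap so that very sparse masks do not
                produce an extreme weight.
    """
    mean_cov = float(np.mean(coverage_ratios))
    if mean_cov <= 0.0:
        return max_weight
    return min((1.0 - mean_cov) / mean_cov, max_weight)

example_coverage = [0.02, 0.05, 0.08]  # made-up values, not measured from the dataset
weight = positive_class_weight(example_coverage)
print(f"{weight:.2f}")  # prints 19.00
```

With the real data, `mask_props['coverage_ratios']` from the mask analysis would replace `example_coverage`.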

# Evaluation Metrics

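### Why Accuracy Can Be Misleading in Image Forgery Detection

In segmentation tasks on biomedical images, such as forgery detection, accuracy is often a deceptive metric that can lead to false confidence. When forged pixels make up only a small fraction of an image, a model that labels every pixel as background still scores high. Here's why: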

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, f1_score, jaccard_score

# Demonstration of why accuracy can be misleading
def demonstrate_accuracy_paradox():
    # Simulate a biomedical image scenario
    # Typical case: 95% background, 5% forged regions
    y_true = np.random.choice([0, 1], size=10000, p=[0.95, 0.05])
    
    # Naive model that predicts everything as background
    y_pred_naive = np.zeros_like(y_true)
    
    # Good model with 90% recall on forged regions
    y_pred_good = y_true.copy()
    forged_indices = np.where(y_true == 1)[0]
    # Simulate 10% false negatives in forged regions
    false_negatives = np.random.choice(forged_indices, size=int(0.1 * len(forged_indices)), replace=False)
    y_pred_good[false_negatives] = 0
    
    # Calculate metrics
    acc_naive = accuracy_score(y_true, y_pred_naive)
    acc_good = accuracy_score(y_true, y_pred_good)
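    # zero_division=0: the naive model predicts no positives, so precision is undefined
    f1_naive = f1_score(y_true, y_pred_naive, zero_division=0)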
    f1_good = f1_score(y_true, y_pred_good)
    
    print("📊 The Accuracy Paradox in Medical Image Segmentation")
    print("="*60)
    print(f"Scenario: 95% background pixels, 5% forged region pixels")
    print(f"\nNaive Model (always predicts background):")
    print(f"  Accuracy: {acc_naive:.4f} ({acc_naive*100:.2f}%)")
    print(f"  F1 Score: {f1_naive:.4f}")
    
    print(f"\nGood Model (90% recall on forged regions):")
    print(f"  Accuracy: {acc_good:.4f} ({acc_good*100:.2f}%)")
    print(f"  F1 Score: {f1_good:.4f}")
    
    print(f"\nKey Insight:")
    print(f"  The naive model appears to have excellent accuracy ({acc_naive*100:.2f}%)")
    print(f"  but completely fails at the actual task (F1 = 0)")
    print(f"  The good model has slightly lower accuracy but actually detects forgeries!")

demonstrate_accuracy_paradox()

## Core Evaluation Metrics for Segmentation

### 1. Intersection over Union (IoU) / Jaccard Index


In [None]:
def calculate_iou(mask1, mask2):
    """
    Calculate Intersection over Union between two binary masks
    
    IoU = |A ∩ B| / |A ∪ B|
    
    Where:
    - A ∩ B: Intersection (pixels where both masks are 1)
    - A ∪ B: Union (pixels where either mask is 1)
    """
    intersection = np.logical_and(mask1, mask2).sum()
    union = np.logical_or(mask1, mask2).sum()
    
    if union == 0:
        return 1.0  # Both masks are empty
    return intersection / union

# Example calculation
def demonstrate_iou():
    # Example masks
    y_true = np.array([
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]
    ])
    
    y_pred = np.array([
        [0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0]
    ])
    
    iou = calculate_iou(y_true, y_pred)
    
    print(f"\n🎯 Intersection over Union (IoU) Example")
    print("="*40)
    print("Ground Truth Mask:")
    print(y_true)
    print("\nPredicted Mask:")
    print(y_pred)
    print(f"\nIoU = {iou:.4f}")
    
    # Visualization
    plt.figure(figsize=(12, 4))
    
    plt.subplot(1, 3, 1)
    plt.imshow(y_true, cmap='Blues')
    plt.title('Ground Truth')
    plt.colorbar()
    
    plt.subplot(1, 3, 2)
    plt.imshow(y_pred, cmap='Reds')
    plt.title('Prediction')
    plt.colorbar()
    
    plt.subplot(1, 3, 3)
    intersection = np.logical_and(y_true, y_pred)
    union = np.logical_or(y_true, y_pred)
    plt.imshow(union.astype(int) - intersection.astype(int), cmap='RdYlBu')
    plt.title('Union - Intersection\n(Blue: Union only)')
    plt.colorbar()
    
    plt.tight_layout()
    plt.show()
    
    return iou

iou_score = demonstrate_iou()
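
The example masks above are small enough to check by hand. The sketch below recomputes the same IoU from pixel counts, using the identity IoU = TP / (TP + FP + FN). The variable names are chosen here for illustration.

```python
import numpy as np

# Same 4x5 masks as in the IoU example
y_true = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
])
y_pred = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # forged pixels found: 4
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # background flagged as forged: 0
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # forged pixels missed: 2

iou_from_counts = tp / (tp + fp + fn)  # 4 / 6
print(f"TP={tp}, FP={fp}, FN={fn}, IoU={iou_from_counts:.4f}")
# prints TP=4, FP=0, FN=2, IoU=0.6667
```

True negatives never appear in the formula, which is why IoU is not inflated by large background regions.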

### 2. Dice Coefficient / F1 Score


In [None]:
def calculate_dice_coefficient(mask1, mask2):
    """
    Calculate Dice Coefficient (F1 Score for segmentation)
    
    Dice = (2 * |A ∩ B|) / (|A| + |B|)
    
    This is equivalent to the F1 score where:
    - True Positives: |A ∩ B|
    - False Positives: |B - A|
    - False Negatives: |A - B|
    """
    intersection = np.logical_and(mask1, mask2).sum()
    mask1_sum = mask1.sum()
    mask2_sum = mask2.sum()
    
    if mask1_sum + mask2_sum == 0:
        return 1.0  # Both masks are empty
    
    return (2.0 * intersection) / (mask1_sum + mask2_sum)

def demonstrate_dice_vs_iou():
    """
    Show relationship between Dice and IoU
    """
    # Create sample masks
    y_true = np.random.randint(0, 2, (100, 100))
    y_pred = np.random.randint(0, 2, (100, 100))
    
    dice = calculate_dice_coefficient(y_true, y_pred)
    iou = calculate_iou(y_true, y_pred)
    
    print(f"\n🎲 Dice Coefficient vs IoU")
    print("="*40)
    print(f"Dice Coefficient: {dice:.4f}")
    print(f"IoU Score: {iou:.4f}")
    
    # Mathematical relationship
    if dice > 0:
        calculated_iou = dice / (2 - dice)
        print(f"Mathematical relationship: IoU = Dice / (2 - Dice)")
        print(f"Calculated IoU from Dice: {calculated_iou:.4f}")
    
    return dice, iou

dice_score, iou_score = demonstrate_dice_vs_iou()
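
The relationship printed above follows directly from the two definitions. Writing $I = |A \cap B|$ and $U = |A \cup B|$, and using $|A| + |B| = U + I$:

$$
\text{Dice} = \frac{2I}{|A| + |B|} = \frac{2I}{U + I} = \frac{2\,\text{IoU}}{1 + \text{IoU}}
\quad\Longrightarrow\quad
\text{IoU} = \frac{\text{Dice}}{2 - \text{Dice}}
$$

As a quick check, a Dice of 0.5 corresponds to an IoU of $0.5 / 1.5 \approx 0.333$, which is roughly what two independent random masks with half their pixels set produce. Because the mapping is monotonic, the two metrics rank any pair of predictions on the same image in the same order. They can still disagree once scores are averaged over many images.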

### 3. Competition-Specific F1 Variant


In [None]:
import pandas as pd
def competition_f1_variant(y_true, y_pred, beta=1):
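    """
    Pixel-level F-beta score, used here as a stand-in for the competition's F1 variant.
    The official metric may add run-length decoding and special-case handling,
    so check the competition's evaluation page before relying on this.
    
    F_β = (1 + β²) * (precision * recall) / (β² * precision + recall)
    """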
    # Convert to binary if needed
    y_true = (y_true > 0.5).astype(np.uint8)
    y_pred = (y_pred > 0.5).astype(np.uint8)
    
    # Calculate TP, FP, FN
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    
    # Calculate precision and recall
    precision = tp / (tp + fp + 1e-7)
    recall = tp / (tp + fn + 1e-7)
    
    # F-beta score
    f_beta = (1 + beta**2) * (precision * recall) / ((beta**2 * precision) + recall + 1e-7)
    
    return f_beta, precision, recall

def analyze_metric_sensitivity():
    """
    Analyze how different metrics respond to various prediction scenarios
    """
    scenarios = {
        'Perfect Prediction': (np.ones(100), np.ones(100)),
        'All Wrong': (np.ones(100), np.zeros(100)),
        'Partial Overlap': (np.concatenate([np.ones(50), np.zeros(50)]), 
                           np.concatenate([np.ones(30), np.zeros(70)])),
        'Empty Masks': (np.zeros(100), np.zeros(100))
    }
    
    results = []
    
    for scenario, (true_mask, pred_mask) in scenarios.items():
        accuracy = accuracy_score(true_mask, pred_mask)
        dice = calculate_dice_coefficient(true_mask, pred_mask)
        iou = calculate_iou(true_mask, pred_mask)
        f1, precision, recall = competition_f1_variant(true_mask, pred_mask)
        
        results.append({
            'Scenario': scenario,
            'Accuracy': accuracy,
            'Dice/F1': dice,
            'IoU': iou,
            'Competition_F1': f1,
            'Precision': precision,
            'Recall': recall
        })
    
    # Create results table
    results_df = pd.DataFrame(results)
    print(f"\n📈 Metric Sensitivity Analysis")
    print("="*60)
    print(results_df.round(4).to_string(index=False))
    
    return results_df

metric_analysis = analyze_metric_sensitivity()
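
In the table above, the "Empty Masks" row is worth a second look: `calculate_dice_coefficient` and `calculate_iou` return 1.0 when both masks are empty, while `competition_f1_variant` returns 0 because its true-positive count is zero. The sketch below shows one way to make an F-beta score agree with the other two in that case. Scoring an empty/empty pair as 1.0 is an assumption made here, not a confirmed rule of the official metric, and the name `f_beta_empty_aware` is hypothetical.

```python
import numpy as np

def f_beta_empty_aware(y_true, y_pred, beta=1.0):
    """F-beta from pixel counts; returns 1.0 when both masks are empty (assumption)."""
    y_true = np.asarray(y_true) > 0.5
    y_pred = np.asarray(y_pred) > 0.5

    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)

    if tp + fp + fn == 0:
        return 1.0  # nothing to find and nothing predicted

    b2 = beta ** 2
    # Equivalent to (1 + b2) * P * R / (b2 * P + R), without epsilon terms
    return float((1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp))

empty_score = f_beta_empty_aware(np.zeros(100), np.zeros(100))
partial_score = f_beta_empty_aware(
    np.concatenate([np.ones(50), np.zeros(50)]),
    np.concatenate([np.ones(30), np.zeros(70)]),
)
print(empty_score, partial_score)  # prints 1.0 0.75
```

Whichever convention is chosen, applying it consistently across Dice, IoU, and F-beta keeps authentic images from pulling the averaged metrics in different directions.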

### Practical Implementation for the Competition


In [None]:
# Complete evaluation metric implementation
class ForgeryDetectionMetrics:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.scores = {}
    
    def rle_decode(self, mask_rle, shape):
        """
        Decode run-length encoded mask
        """
        if mask_rle == 'authentic' or pd.isna(mask_rle):
            return np.zeros(shape, dtype=np.uint8)
        
        s = mask_rle.split()
        starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
        starts -= 1
        ends = starts + lengths
        mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
        
        for lo, hi in zip(starts, ends):
            mask[lo:hi] = 1
        
        return mask.reshape(shape)
    
    def calculate_metrics(self, y_true_masks, y_pred_masks):
        """
        Calculate all relevant metrics for the competition
        """
        dice_scores = []
        iou_scores = []
        f1_scores = []
        
        for true_mask, pred_mask in zip(y_true_masks, y_pred_masks):
            # Ensure binary masks
            true_binary = (true_mask > self.threshold).astype(np.uint8)
            pred_binary = (pred_mask > self.threshold).astype(np.uint8)
            
            dice = calculate_dice_coefficient(true_binary, pred_binary)
            iou = calculate_iou(true_binary, pred_binary)
            f1, precision, recall = competition_f1_variant(true_binary, pred_binary)
            
            dice_scores.append(dice)
            iou_scores.append(iou)
            f1_scores.append(f1)
        
        self.scores = {
            'dice_mean': np.mean(dice_scores),
            'dice_std': np.std(dice_scores),
            'iou_mean': np.mean(iou_scores),
            'iou_std': np.std(iou_scores),
            'f1_mean': np.mean(f1_scores),
            'f1_std': np.std(f1_scores),
            'individual_dice': dice_scores,
            'individual_iou': iou_scores
        }
        
        return self.scores
    
    def plot_metric_distribution(self):
        """
        Plot distribution of metrics across test samples
        """
        fig, axes = plt.subplots(1, 2, figsize=(12, 5))
        
        # Dice distribution
        axes[0].hist(self.scores['individual_dice'], bins=20, alpha=0.7, color='skyblue')
        axes[0].axvline(self.scores['dice_mean'], color='red', linestyle='--', label=f'Mean: {self.scores["dice_mean"]:.3f}')
        axes[0].set_xlabel('Dice Coefficient')
        axes[0].set_ylabel('Frequency')
        axes[0].set_title('Distribution of Dice Scores')
        axes[0].legend()
        axes[0].grid(alpha=0.3)
        
        # IoU distribution
        axes[1].hist(self.scores['individual_iou'], bins=20, alpha=0.7, color='lightcoral')
        axes[1].axvline(self.scores['iou_mean'], color='red', linestyle='--', label=f'Mean: {self.scores["iou_mean"]:.3f}')
        axes[1].set_xlabel('IoU Score')
        axes[1].set_ylabel('Frequency')
        axes[1].set_title('Distribution of IoU Scores')
        axes[1].legend()
        axes[1].grid(alpha=0.3)
        
        plt.tight_layout()
        plt.show()

# Example usage
metrics_calculator = ForgeryDetectionMetrics()

print("\n🎯 Key Takeaways for the Competition:")
print("="*50)
print("1. ✅ USE: Dice/F1 Score & IoU - They handle class imbalance well")
print("2. ✅ USE: Competition's F1 variant - Intended to mirror the official evaluation")
print("3. ❌ AVOID: Raw Accuracy - Misleading with imbalanced data")
print("4. 📊 MONITOR: Both Precision and Recall - Balance detection quality")
print("5. 🔍 ANALYZE: Metric distributions - Don't just look at averages")

print("\nFor this competition, focus on optimizing the Dice/F1 score as it")
print("directly measures how well your model detects and segments forgeries,")
print("ignoring the easy background predictions that inflate accuracy.")
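
The class above only decodes run-length strings. A submission pipeline also needs the reverse step, so the sketch below adds an encoder that mirrors `rle_decode` as written in this notebook: 1-indexed starts, row-major flattening, space-separated "start length" pairs, and the string `'authentic'` for an empty mask. Whether the official submission format uses exactly this convention is an assumption to verify against the competition's data description. A standalone copy of the decoder is included only so the round trip can be run in isolation.

```python
import numpy as np

def rle_encode(mask):
    """Encode a binary mask as 'start length' pairs (1-indexed, row-major)."""
    pixels = np.asarray(mask, dtype=np.uint8).flatten()
    if pixels.sum() == 0:
        return 'authentic'  # mirrors the empty-mask marker handled by rle_decode
    # Pad with zeros so runs touching either end still produce a start and an end
    padded = np.concatenate([[0], pixels, [0]])
    changes = np.where(padded[1:] != padded[:-1])[0] + 1
    changes[1::2] -= changes[::2]  # turn run ends into run lengths
    return ' '.join(str(x) for x in changes)

def rle_decode(mask_rle, shape):
    """Standalone copy of the notebook's decoding logic, for the round-trip check."""
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    if mask_rle != 'authentic':
        values = np.asarray(mask_rle.split(), dtype=int)
        starts, lengths = values[0::2] - 1, values[1::2]
        for lo, n in zip(starts, lengths):
            mask[lo:lo + n] = 1
    return mask.reshape(shape)

demo_mask = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
], dtype=np.uint8)

encoded = rle_encode(demo_mask)
print(encoded)  # prints 7 3 12 3
print(np.array_equal(rle_decode(encoded, demo_mask.shape), demo_mask))  # prints True
```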

### This comprehensive metrics section explains why traditional accuracy is problematic for this task and provides practical implementations of the appropriate metrics for evaluating your forgery detection models.

<h1 style="text-align:center; color:#3C91E6; font-size:2.2em; margin-bottom:0;">🧠 Image Segmentation for Forgery Detection</h1>

<p style="text-align:center; color:#6C757D;">Comprehensive review of image segmentation techniques, from traditional to deep learning approaches.</p>

<hr>

<h2>🏛️ Traditional Segmentation Methods</h2>

<p>Before the rise of deep learning, image segmentation relied on mathematical models and low-level features like intensity and texture. The following table summarizes their key characteristics, which are useful for understanding the fundamentals of image analysis:</p>

<table style="width:100%; border-collapse:collapse; text-align:left;">
  <thead style="background-color:#E8F0FE;">
    <tr>
      <th style="padding:8px; border:1px solid #ccc;">Method Type</th>
      <th style="padding:8px; border:1px solid #ccc;">Key Idea</th>
      <th style="padding:8px; border:1px solid #ccc;">Strengths</th>
      <th style="padding:8px; border:1px solid #ccc;">Weaknesses in Medical/Forgery Context</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="padding:8px; border:1px solid #ccc;">Thresholding</td>
      <td style="padding:8px; border:1px solid #ccc;">Converts a grayscale image to binary based on a pixel value threshold.</td>
      <td style="padding:8px; border:1px solid #ccc;">Computationally efficient, simple to implement.</td>
      <td style="padding:8px; border:1px solid #ccc;">Struggles with intensity variations, noise, and complex structures.</td>
    </tr>
    <tr>
      <td style="padding:8px; border:1px solid #ccc;">Edge-Based</td>
      <td style="padding:8px; border:1px solid #ccc;">Detects boundaries between regions by finding sharp intensity changes.</td>
      <td style="padding:8px; border:1px solid #ccc;">Good at identifying clear contours.</td>
      <td style="padding:8px; border:1px solid #ccc;">Sensitive to noise; struggles with discontinuous or blurred edges common in medical images.</td>
    </tr>
    <tr>
      <td style="padding:8px; border:1px solid #ccc;">Region-Based</td>
      <td style="padding:8px; border:1px solid #ccc;">Groups pixels with similar properties into contiguous regions.</td>
      <td style="padding:8px; border:1px solid #ccc;">Can produce coherent regions.</td>
      <td style="padding:8px; border:1px solid #ccc;">Often requires manual seed points; performance depends on initial choices.</td>
    </tr>
    <tr>
      <td style="padding:8px; border:1px solid #ccc;">Clustering</td>
      <td style="padding:8px; border:1px solid #ccc;">Groups pixels into clusters based on feature similarity (e.g., K-Means).</td>
      <td style="padding:8px; border:1px solid #ccc;">Unsupervised; no need for labeled data.</td>
      <td style="padding:8px; border:1px solid #ccc;">Struggles with complex shapes and high variability of forged regions.</td>
    </tr>
  </tbody>
</table>
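
<p>To make the thresholding row concrete, here is a minimal NumPy sketch of Otsu's method, which picks the threshold that maximizes between-class variance of the histogram. The function name <code>otsu_threshold</code> and the synthetic test image are illustrative assumptions, not competition data.</p>

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes between-class variance.

    gray: 2-D uint8 array. Pixels with value > t are treated as foreground.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                      # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))        # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                    # thresholds that leave one class empty
    return int(np.argmax(sigma_b))

# Synthetic bimodal image (assumption for illustration): dark background, bright square.
rng = np.random.default_rng(0)
img = rng.normal(60, 10, size=(64, 64))
img[16:48, 16:48] = rng.normal(180, 10, size=(32, 32))
img = np.clip(img, 0, 255).astype(np.uint8)

t = otsu_threshold(img)
mask = img > t
print("threshold:", t, "foreground pixels:", int(mask.sum()))
```

<p>This works here because the two intensity modes are well separated. A copy-moved region has, by construction, the same intensity statistics as its source, which is why a global threshold cannot isolate it.</p>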

<h2>🧠 Deep Learning Revolution</h2>

<p>Deep learning, particularly Convolutional Neural Networks (CNNs), has dramatically improved segmentation performance by automatically learning hierarchical features from data.</p>

<p>Fully Convolutional Networks (FCNs) were a pivotal step, demonstrating that networks could perform dense pixel-wise prediction, which is the foundation of modern segmentation models.</p>

<h2>🔬 Key Deep Learning Architectures</h2>

<ul>
  <li><strong>U-Net and its Variants:</strong> The U-Net architecture, with its encoder-decoder structure and skip connections, has become a cornerstone in medical image segmentation. It is highly effective even with limited training data.</li>
  <li><strong>VANet:</strong> Designed for polyp segmentation in colonoscopy images, it enhances boundary perception, a critical feature for precise forgery masking.</li>
  <li><strong>VM-UNet and Mamba-UNet:</strong> Recent architectures that build state-space (Mamba) blocks into a U-Net layout to capture global context, as an alternative to Vision Transformer self-attention.</li>
  <li><strong>Generative Adversarial Networks (GANs):</strong> Used to generate synthetic data and improve segmentation, especially in limited annotated datasets.</li>
  <li><strong>Transformers:</strong> Excel at modeling long-range dependencies, valuable for identifying large or context-dependent forgeries.</li>
  <li><strong>Segment Anything Model (SAM):</strong> A foundation model that can segment previously unseen objects from minimal prompts. Adapted versions such as DeSAM target medical image segmentation.</li>
</ul>

<h2>⚙️ Optimization Methods for Training</h2>

<p>Choosing the right optimizer is crucial for efficiently training these deep models. While Stochastic Gradient Descent (SGD) is a classic choice, adaptive optimizers are often preferred:</p>

<ul>
  <li><strong>Adam:</strong> Combines momentum and adaptive learning rates for each parameter.</li>
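  <li><strong>AdamW:</strong> Decouples weight decay from the gradient update, which often generalizes better than Adam with an L2 penalty.</li>
  <li><strong>NovoGrad:</strong> A layer-wise adaptive optimizer that keeps one second-moment estimate per layer rather than per parameter, which lowers optimizer memory.</li>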
</ul>
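
<p>The difference between Adam with an L2 penalty and AdamW is easiest to see in a single update step. The NumPy sketch below is a hand-written illustration under stated assumptions (the function name <code>adam_step</code> and the toy weights are hypothetical); in practice you would use your framework's built-in optimizer.</p>

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8,
              weight_decay=0.0, decoupled=True):
    """One update step. decoupled=True is AdamW; False is Adam with an L2 penalty."""
    if not decoupled:
        g = g + weight_decay * w                 # L2: decay is folded into the gradient
    m = b1 * m + (1 - b1) * g                    # first-moment estimate
    v = b2 * v + (1 - b2) * g * g                # second-moment estimate
    m_hat = m / (1 - b1 ** t)                    # bias correction
    v_hat = v / (1 - b2 ** t)
    decay = lr * weight_decay * w if decoupled else 0.0
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) - decay
    return w, m, v

# Toy case: zero data gradient, so only the weight decay acts on the weights.
w0 = np.array([1.0, -2.0])
zeros = np.zeros_like(w0)

w_adamw, _, _ = adam_step(w0, zeros, zeros, zeros, t=1, lr=0.1,
                          weight_decay=0.1, decoupled=True)
w_l2, _, _ = adam_step(w0, zeros, zeros, zeros, t=1, lr=0.1,
                       weight_decay=0.1, decoupled=False)
print("AdamW:    ", w_adamw)   # each weight shrinks in proportion to its size
print("Adam + L2:", w_l2)      # the adaptive scaling turns the decay into a fixed-size step
```

<p>With AdamW the decay is proportional to each weight, whereas in the L2 variant the adaptive normalization rescales the decay term along with the gradient.</p>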

<h2>📊 Practical Considerations for the Competition</h2>

<ul>
  <li><strong>Leverage Pre-trained Models and Benchmarks:</strong> Use PyTorch or TensorFlow frameworks and pre-trained models. Benchmarks like MedSegBench provide robust standards for evaluation.</li>
  <li><strong>Address the "Black Box" Problem with XAI:</strong> Apply Explainable AI (XAI) methods to visualize and interpret model decisions for trust and transparency.</li>
  <li><strong>Start Simple and Iterate:</strong> Begin with U-Net as a baseline, then explore advanced architectures like Vision Transformers or U-Net++ to optimize performance.</li>
</ul>

<hr>




<h2 style="text-align:center; color:#3C91E6; font-size:2em; margin-top:10px;">⚔️ U-Net vs Mask R-CNN: Which Model Should You Choose?</h2>

<p style="font-size:1.05em; color:#444; text-align:justify;">
Based on current research and benchmarking studies, both <strong>U-Net</strong> and <strong>Mask R-CNN</strong> are top-tier candidates for segmentation in this competition. 
U-Net often achieves <strong>higher pixel-level accuracy</strong>, while Mask R-CNN offers the flexibility of <strong>instance-level detection</strong>. 
The table below highlights their main differences to help you make an informed choice.
</p>

<table style="width:100%; border-collapse:collapse; text-align:left; margin-top:15px;">
  <thead style="background-color:#E8F0FE;">
    <tr>
      <th style="padding:10px; border:1px solid #ccc;">Feature</th>
      <th style="padding:10px; border:1px solid #ccc;">U-Net <span style="color:#3C91E6;">(Recommended)</span></th>
      <th style="padding:10px; border:1px solid #ccc;">Mask R-CNN</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="padding:10px; border:1px solid #ccc;">Architecture Type</td>
      <td style="padding:10px; border:1px solid #ccc;">Semantic Segmentation</td>
      <td style="padding:10px; border:1px solid #ccc;">Instance Segmentation</td>
    </tr>
    <tr>
      <td style="padding:10px; border:1px solid #ccc;">Core Strength</td>
      <td style="padding:10px; border:1px solid #ccc;">Pixel-level classification with highly precise boundaries</td>
      <td style="padding:10px; border:1px solid #ccc;">Distinguishes between individual object instances</td>
    </tr>
    <tr>
      <td style="padding:10px; border:1px solid #ccc;">Ideal For</td>
      <td style="padding:10px; border:1px solid #ccc;">This competition's goal: detecting forged or manipulated regions</td>
      <td style="padding:10px; border:1px solid #ccc;">Scenes with multiple, distinct objects</td>
    </tr>
    <tr>
      <td style="padding:10px; border:1px solid #ccc;">Key Advantage</td>
      <td style="padding:10px; border:1px solid #ccc;">Often reports higher Dice/IoU scores in biomedical segmentation studies</td>
      <td style="padding:10px; border:1px solid #ccc;">Provides both bounding boxes and segmentation masks</td>
    </tr>
    <tr>
      <td style="padding:10px; border:1px solid #ccc;">Performance (Example)</td>
      <td style="padding:10px; border:1px solid #ccc;">Dice: 0.96, IoU: 0.97 <br><small>(Panoramic Radiographs, as reported; IoU above Dice is unusual for the same masks, so verify against the source study)</small></td>
      <td style="padding:10px; border:1px solid #ccc;">Dice: 0.87, IoU: 0.74 <br><small>(Panoramic Radiographs)</small></td>
    </tr>
    <tr>
      <td style="padding:10px; border:1px solid #ccc;">Computational Cost</td>
      <td style="padding:10px; border:1px solid #ccc;">Generally lower: lightweight and fast</td>
      <td style="padding:10px; border:1px solid #ccc;">Higher: more complex due to multi-stage detection</td>
    </tr>
  </tbody>
</table>
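
<p>When comparing reported Dice and IoU figures such as those in the table, it helps to remember that for a single pair of binary masks the two are tied by IoU = Dice / (2 - Dice), so IoU never exceeds Dice. The sketch below checks this on a toy example; the helper names are illustrative assumptions.</p>

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice and IoU for two binary masks; two empty masks count as a perfect match."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

def dice_to_iou(dice):
    """Convert a single-mask Dice score to the equivalent IoU."""
    return dice / (2.0 - dice)

# Toy masks: two 4x4 squares offset by one row (12 of 16 pixels overlap).
target = np.zeros((8, 8), dtype=bool)
target[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 2:6] = True

d, j = dice_and_iou(pred, target)
print(f"Dice={d:.2f}  IoU={j:.2f}  Dice->IoU={dice_to_iou(d):.2f}")
```

<p>The exact conversion holds per mask pair. Scores averaged over a dataset still satisfy IoU &le; Dice, but the averaged values need not follow the formula exactly.</p>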

<h3 style="margin-top:30px; color:#3C91E6;">🛠️ How to Implement U-Net (Step-by-Step)</h3>

<p style="color:#444; text-align:justify;">
For the highest segmentation accuracy, <strong>start with a standard U-Net</strong> implementation. Once it is working reliably, you can enhance it with modern variants such as <strong>U-Net++</strong>, <strong>Attention U-Net</strong>, or <strong>ResUNet</strong> for improved boundary precision and generalization.
</p>

<h4 style="color:#333;">Option 1: Build a Standard U-Net</h4>

<p style="color:#555;">This foundational approach constructs the classic encoder-decoder structure with skip connections that preserve fine spatial details:</p>




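import tensorflow as tf
# Import only the layers used below instead of a wildcard import
from tensorflow.keras.layers import (
    Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
)
from tensorflow.keras.models import Model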

def build_unet(input_size=(128, 128, 1)):
    inputs = Input(input_size)
    
    # Encoder (Contracting Path)
    c1 = Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
    c1 = Conv2D(64, (3, 3), activation='relu', padding='same')(c1)
    p1 = MaxPooling2D((2, 2))(c1)

    c2 = Conv2D(128, (3, 3), activation='relu', padding='same')(p1)
    c2 = Conv2D(128, (3, 3), activation='relu', padding='same')(c2)
    p2 = MaxPooling2D((2, 2))(c2)

    # Bottleneck
    b = Conv2D(256, (3, 3), activation='relu', padding='same')(p2)
    b = Conv2D(256, (3, 3), activation='relu', padding='same')(b)

    # Decoder (Expanding Path)
    u1 = Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(b)
    u1 = concatenate([u1, c2])
    u1 = Conv2D(128, (3, 3), activation='relu', padding='same')(u1)
    u1 = Conv2D(128, (3, 3), activation='relu', padding='same')(u1)

    u2 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(u1)
    u2 = concatenate([u2, c1])
    u2 = Conv2D(64, (3, 3), activation='relu', padding='same')(u2)
    u2 = Conv2D(64, (3, 3), activation='relu', padding='same')(u2)

    outputs = Conv2D(1, (1, 1), activation='sigmoid')(u2)
    
    model = Model(inputs=[inputs], outputs=[outputs])
    return model

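# Build and compile the model.
# Plain accuracy is dominated by background pixels, so track foreground IoU instead
# (tf.keras.metrics.BinaryIoU is available in recent TensorFlow releases).
model = build_unet()
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=[tf.keras.metrics.BinaryIoU(target_class_ids=[1], threshold=0.5, name='iou')],
)
model.summary()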
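
<p>One practical constraint of the model above: it has two 2x2 pooling stages, so each skip connection only lines up with its upsampled counterpart when the input height and width are divisible by 4. The pure-Python sketch below traces the spatial sizes to show this; the helper names are hypothetical, and it assumes floor-division pooling and exact 2x upsampling, as in the Keras layers used above.</p>

```python
def unet_shapes(height, width, depth=2):
    """Trace spatial sizes through `depth` 2x2 max-pools and stride-2 upsamplings."""
    encoder = [(height, width)]
    for _ in range(depth):
        h, w = encoder[-1]
        encoder.append((h // 2, w // 2))         # pooling floors odd sizes
    decoder = []
    h, w = encoder[-1]
    for level in range(depth - 1, -1, -1):
        h, w = h * 2, w * 2                      # transposed conv doubles the size
        decoder.append(((h, w), encoder[level])) # (upsampled size, skip size)
    return encoder, decoder

def skips_align(height, width, depth=2):
    """True if every upsampled map matches the skip connection it is concatenated with."""
    _, decoder = unet_shapes(height, width, depth)
    return all(up == skip for up, skip in decoder)

for size in (128, 130, 132):
    print(size, "->", "ok" if skips_align(size, size) else "shape mismatch at concatenate")
```

<p>If your images have arbitrary sizes, resize or pad them to a multiple of 4 (or of 2 to the power of the number of pooling stages for deeper variants) before feeding the network.</p>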

<section>
  <h2 style="color:#1F6FEB; margin-bottom:0.2em;">Option 2: Advanced U-Net with Modern Enhancements</h2>
  <p style="color:#333; font-size:1em; line-height:1.45;">
    To push a U-Net baseline toward state-of-the-art performance on the forgery detection task, combine loss, pretraining and architectural improvements from recent literature.  
    Below are practical, research-backed recommendations with short explanations and implementation tips.
  </p>

  <div style="margin-top:0.8em; padding:12px; background:#F6FBFF; border-left:4px solid #1F6FEB; border-radius:8px;">
    <strong style="color:#1F6FEB;">Summary: What to try</strong>
    <ul style="margin-top:8px; color:#333;">
      <li>Optimize with a <strong>Dice (or Dice + BCE) loss</strong> to match the competition metric.</li>
      <li>Use <strong>masked pretraining</strong> (self-supervised reconstruction) to leverage unlabeled data and improve feature learning.</li>
      <li>Experiment with modern U-Net variants such as <strong>MS-UNet</strong> and <strong>U-Tunnel-Net</strong> for better feature preservation and efficiency.</li>
    </ul>
  </div>

  <h3 style="color:#0B66C3; margin-top:1em;">1) Loss: Dice (and Combo), Why and How</h3>
  <p style="color:#333; line-height:1.45;">
    Dice directly measures overlap between predicted and ground-truth masks and correlates with the competition F1-style metric. In practice, combining Dice with a pixel-wise loss (BCE) stabilizes training:
  </p>

  <pre style="background:#F8F9FA; padding:12px; border-radius:8px; overflow:auto; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, 'Roboto Mono', monospace;">
# PyTorch: simple combined Dice + BCE loss
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    # Expects raw logits; sigmoid is applied here, so do not pass probabilities.
    pred = torch.sigmoid(logits).reshape(-1)
    target = target.reshape(-1).float()
    intersection = (pred * target).sum()
    return 1 - (2. * intersection + eps) / (pred.sum() + target.sum() + eps)

def bce_dice_loss(logits, target, bce_weight=0.5):
    bce = F.binary_cross_entropy_with_logits(logits, target.float())
    d_loss = dice_loss(logits, target)
    return bce_weight * bce + (1 - bce_weight) * d_loss
  </pre>
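
<p>One reason to pair Dice with BCE shows up on authentic images, whose ground-truth mask is empty. The NumPy sketch below mirrors the soft Dice formula above (taking probabilities directly, which is an assumption made for illustration) and compares it with BCE on an all-background target. The function names and toy probability maps are hypothetical.</p>

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss on probabilities (same formula as above, after the sigmoid)."""
    prob = prob.reshape(-1).astype(np.float64)
    target = target.reshape(-1).astype(np.float64)
    inter = (prob * target).sum()
    return float(1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps))

def bce_loss(prob, target, tiny=1e-12):
    """Mean binary cross-entropy on probabilities."""
    prob = np.clip(prob, tiny, 1.0 - tiny)
    return float(-(target * np.log(prob) + (1 - target) * np.log(1 - prob)).mean())

empty_target = np.zeros((32, 32))          # authentic image: no forged pixels
confident = np.full((32, 32), 0.01)        # almost-correct prediction
sloppy = np.full((32, 32), 0.30)           # clearly worse prediction

dice_c = soft_dice_loss(confident, empty_target)
dice_s = soft_dice_loss(sloppy, empty_target)
bce_c = bce_loss(confident, empty_target)
bce_s = bce_loss(sloppy, empty_target)
print(f"Dice loss: confident={dice_c:.6f}  sloppy={dice_s:.6f}")
print(f"BCE loss:  confident={bce_c:.4f}  sloppy={bce_s:.4f}")
```

<p>On an empty target the Dice loss stays close to 1 for both predictions, so it barely distinguishes them, while BCE clearly rewards the confident one. That is the stabilizing signal the combined loss adds.</p>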

  <h3 style="color:#0B66C3; margin-top:1em;">2) Masked Pretraining (Self-Supervised): What and Why</h3>
  <p style="color:#333; line-height:1.45;">
    Masked pretraining asks the network to reconstruct randomly masked image patches. This forces the encoder and decoder to learn strong context and texture priors without labels. Benefits:
  </p>
  <ul style="color:#333;">
    <li>Improves sample efficiency when labeled masks are scarce.</li>
    <li>Helps the model learn domain-specific textures typical of biomedical images (microscopy, gels, charts).</li>
    <li>Often yields measurable gains after fine-tuning on the labeled task.</li>
  </ul>

  <p style="color:#333;">
    <strong>Practical recipe:</strong> pretrain the U-Net to predict the original pixels for randomly masked patches (e.g., 25-50% of pixels masked). Use L1 or L2 reconstruction loss. Then fine-tune with the segmentation loss on labeled masks.
  </p>
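
<p>The masking step of this recipe can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: non-overlapping square patches, image sides divisible by the patch size, hidden pixels set to zero, and the L1 loss computed on hidden pixels only. The function names and parameters are hypothetical.</p>

```python
import numpy as np

def mask_random_patches(image, patch=16, ratio=0.4, rng=None):
    """Hide a random subset of non-overlapping patches.

    image: (H, W) or (H, W, C) float array with H and W divisible by `patch`.
    Returns (masked_image, pixel_mask), where pixel_mask is True for hidden pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    gh, gw = h // patch, w // patch
    n_hide = int(round(ratio * gh * gw))
    hidden = np.zeros(gh * gw, dtype=bool)
    hidden[rng.choice(gh * gw, size=n_hide, replace=False)] = True
    grid = hidden.reshape(gh, gw)
    # Expand the patch grid to pixel resolution
    pixel_mask = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)
    masked = image.copy()
    masked[pixel_mask] = 0.0
    return masked, pixel_mask

def masked_l1(reconstruction, original, pixel_mask):
    """L1 reconstruction loss restricted to the hidden pixels."""
    return float(np.abs(reconstruction - original)[pixel_mask].mean())

rng = np.random.default_rng(0)
image = rng.random((64, 64)) + 0.1            # stand-in for a normalized image
masked, pixel_mask = mask_random_patches(image, patch=16, ratio=0.4, rng=rng)
print("hidden fraction:", pixel_mask.mean())
print("loss if the network outputs the masked input unchanged:",
      masked_l1(masked, image, pixel_mask))
```

<p>During pretraining, <code>masked</code> is the network input and <code>image</code> is the reconstruction target.</p>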

  <h3 style="color:#0B66C3; margin-top:1em;">3) Modern U-Net Variants to Try</h3>
  <p style="color:#333; line-height:1.45;">
    Replace or augment the vanilla U-Net building blocks with these modern ideas to improve receptive field, boundary accuracy, and parameter efficiency:
  </p>

  <table style="width:100%; border-collapse:collapse; margin-top:0.6em;">
    <thead style="background:#EEF6FF;">
      <tr>
        <th style="padding:8px; text-align:left; border:1px solid #E1ECFF;">Variant</th>
        <th style="padding:8px; text-align:left; border:1px solid #E1ECFF;">Core Idea</th>
        <th style="padding:8px; text-align:left; border:1px solid #E1ECFF;">Why it helps</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="padding:8px; border:1px solid #E1ECFF;">MS-UNet</td>
        <td style="padding:8px; border:1px solid #E1ECFF;">Multi-scale feature fusion across encoder/decoder levels</td>
        <td style="padding:8px; border:1px solid #E1ECFF;">Captures both fine details and global context with fewer parameters</td>
      </tr>
      <tr>
        <td style="padding:8px; border:1px solid #E1ECFF;">U-Tunnel-Net</td>
        <td style="padding:8px; border:1px solid #E1ECFF;">Improved feature preservation via enhanced skip-path design and residual tunnels</td>
        <td style="padding:8px; border:1px solid #E1ECFF;">Better restoration of subtle textures and boundaries, which suits small forged patches</td>
      </tr>
      <tr>
        <td style="padding:8px; border:1px solid #E1ECFF;">Attention U-Net</td>
        <td style="padding:8px; border:1px solid #E1ECFF;">Gates that reweight encoder features before skip connections</td>
        <td style="padding:8px; border:1px solid #E1ECFF;">Focuses model capacity on relevant regions, reducing false positives</td>
      </tr>
    </tbody>
  </table>
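
<p>To illustrate the gating idea in the Attention U-Net row, here is a simplified NumPy sketch of an additive attention gate. It assumes the gating signal has already been resized to the skip feature map's resolution, treats 1x1 convolutions as matrix products over the channel axis, and uses random weights in place of learned ones. All names are illustrative.</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, w_x, w_g, psi):
    """Reweight skip features x using gating features g.

    x:   (H, W, Cx) encoder (skip) features
    g:   (H, W, Cg) decoder (gating) features at the same resolution
    w_x: (Cx, F), w_g: (Cg, F), psi: (F, 1) act as 1x1 convolutions
    """
    q = np.maximum(x @ w_x + g @ w_g, 0.0)   # additive attention followed by ReLU
    alpha = sigmoid(q @ psi)                 # (H, W, 1) attention coefficients in (0, 1)
    return x * alpha, alpha                  # gated features passed to the decoder

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
g = rng.normal(size=(8, 8, 6))
w_x = rng.normal(size=(4, 3))
w_g = rng.normal(size=(6, 3))
psi = rng.normal(size=(3, 1))

gated, alpha = attention_gate(x, g, w_x, w_g, psi)
print(gated.shape, alpha.shape, float(alpha.min()), float(alpha.max()))
```

<p>Each spatial location gets one coefficient, so the gate can only attenuate skip features, never amplify them.</p>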

  <h3 style="color:#0B66C3; margin-top:1em;">4) Implementation Tips & Workflow</h3>
  <ol style="color:#333;">
    <li><strong>Pretrain</strong> with masked reconstruction on all available images (train + unlabeled data) for several epochs.</li>
    <li><strong>Initialize</strong> the segmentation U-Net with encoder weights from the pretrained model.</li>
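    <li><strong>Train</strong> with the combined Dice + BCE loss, using strong augmentations (elastic, brightness/contrast, and copy-paste) that preserve biological plausibility. If you use copy-paste, update the mask to cover the pasted region, since an unlabeled paste is itself a copy-move forgery.</li>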
    <li><strong>Post-process</strong> predictions with morphological ops (open/close) and small-object removal to reduce noise.</li>
    <li><strong>Calibrate</strong> the probability threshold on a validation split to balance false positives vs false negatives (important for "authentic" images).</li>
  </ol>
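
<p>The threshold calibration step can be sketched as a simple sweep over candidate thresholds on a validation split. The example below is a minimal NumPy illustration: the function names and the toy validation set are assumptions, and it scores a correctly empty prediction on an authentic image as Dice 1.0, which may differ from the official competition metric.</p>

```python
import numpy as np

def dice_score(pred, target):
    """Dice for binary masks; two empty masks count as a perfect match."""
    inter = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    return 1.0 if total == 0 else float(2.0 * inter / total)

def calibrate_threshold(probs, masks, candidates=None):
    """Return (threshold, mean Dice) for the best candidate on the validation set."""
    if candidates is None:
        candidates = np.linspace(0.1, 0.9, 17)
    best_t, best_score = None, -1.0
    for t in candidates:
        score = float(np.mean([dice_score(p > t, m) for p, m in zip(probs, masks)]))
        if score > best_score:
            best_t, best_score = float(t), score
    return best_t, best_score

# Toy validation set: one forged image and one authentic image,
# both with a raised background probability of 0.33.
forged_prob = np.full((16, 16), 0.33)
forged_prob[4:10, 4:10] = 0.80
forged_mask = np.zeros((16, 16), dtype=bool)
forged_mask[4:10, 4:10] = True
authentic_prob = np.full((16, 16), 0.33)
authentic_mask = np.zeros((16, 16), dtype=bool)

best_t, best_score = calibrate_threshold(
    [forged_prob, authentic_prob], [forged_mask, authentic_mask]
)
print("best threshold:", best_t, "mean Dice:", best_score)
```

<p>Low thresholds flag the whole authentic image as forged, so the sweep settles on a value above the background level.</p>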

  <div style="margin-top:0.8em; padding:12px; background:#FFF8E8; border-left:4px solid #F5A623; border-radius:8px;">
    <strong>Quick practical note:</strong> if GPU memory is limited, prefer MS-UNet or attention modules over very deep transformer blocks, since they can give strong gains at lower compute cost.
  </div>

</section>


<h3>🚀 Step-by-Step Implementation Guide</h3>

<ol>
  <li>
    <strong>1️⃣ Data Preparation</strong><br>
    - Organize your dataset into <code>train</code>, <code>validation</code>, and <code>test</code> folders.<br>
    - Apply data augmentation techniques (flips, rotations, brightness adjustments) to improve generalization.<br>
    - Normalize pixel values to the [0, 1] range for stable training.
  </li>
  
  <li>
    <strong>2️⃣ Model Construction</strong><br>
    - Build your U-Net architecture using PyTorch or TensorFlow.<br>
    - Implement skip connections to preserve spatial features.<br>
    - Optionally, use pretrained encoders like <code>ResNet34</code> or <code>EfficientNet-B0</code> for faster convergence.
  </li>
  
  <li>
    <strong>3️⃣ Loss Function &amp; Metrics</strong><br>
    - Combine <code>Dice Loss</code> with <code>Binary Cross-Entropy</code> for balanced learning.<br>
    - Track metrics such as <strong>Dice Score</strong>, <strong>IoU</strong>, <strong>Precision</strong>, and <strong>Recall</strong> after each epoch.<br>
    - Implement early stopping based on validation Dice score.
  </li>

  <li>
    <strong>4️⃣ Training Strategy</strong><br>
    - Start with a small learning rate (e.g., <code>1e-4</code>) and use a scheduler like <code>ReduceLROnPlateau</code>.<br>
    - Train for 30-50 epochs depending on dataset size.<br>
    - Monitor loss and metrics using <code>TensorBoard</code> or <code>Weights & Biases</code> for better visualization.
  </li>

  <li>
    <strong>5️⃣ Post-Processing</strong><br>
    - Apply thresholding (start at 0.5, then tune on the validation set) to convert predicted probability maps into binary masks.<br>
    - Use morphological operations (opening/closing) to remove small artifacts.<br>
    - Optionally, smooth the probability map with a Gaussian filter before thresholding to get cleaner boundaries.
  </li>

  <li>
    <strong>6️⃣ Evaluation &amp; Comparison</strong><br>
    - Compare results between <strong>Standard U-Net</strong> and <strong>Enhanced U-Net</strong> variants.<br>
    - Visualize sample predictions with overlaid ground truth masks.<br>
    - Report <strong>Dice</strong>, <strong>IoU</strong>, and the competition's <strong>F1 variant</strong> in a summary table for clarity.
  </li>

  
</ol>
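
<p>For the data preparation step, the detail that most often goes wrong is applying a geometric transform to the image but not to its mask. The NumPy sketch below normalizes to [0, 1] and applies the same random rotation and flips to both, while the brightness change touches the image only. The function names and the brightness range are illustrative assumptions.</p>

```python
import numpy as np

def normalize(image_uint8):
    """Scale 8-bit pixel values to the [0, 1] range."""
    return image_uint8.astype(np.float32) / 255.0

def paired_augment(image, mask, rng):
    """Apply identical geometric transforms to image and mask; photometric ones to the image only."""
    k = int(rng.integers(0, 4))                        # number of 90-degree rotations
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    if rng.random() < 0.5:                             # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                             # vertical flip
        image, mask = image[::-1], mask[::-1]
    gain = float(rng.uniform(0.8, 1.2))                # brightness jitter, image only
    image = np.clip(image * gain, 0.0, 1.0)
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

# Toy sample: the "forged" region is the only non-zero area of the image.
rng = np.random.default_rng(0)
raw = np.zeros((32, 32), dtype=np.uint8)
raw[4:12, 20:28] = 200
mask = raw > 0

aug_image, aug_mask = paired_augment(normalize(raw), mask, rng)
print("mask still aligned with image:", bool(np.array_equal(aug_image > 0, aug_mask)))
```

<p>For copy-move detection, prefer transforms applied to the whole image like these; anything that alters only part of an image risks creating or destroying the very duplication the mask describes.</p>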

<p><strong>💡 Pro Tip:</strong> Always validate results on unseen data and monitor overfitting. Visual inspection is key in segmentation tasks; numbers alone don't tell the full story!</p>
