# Day 27 — "Multi-Modal Fusion: Optical + SAR + DEM in Remote Sensing"

Fusion is not just stacking data. It is learning which sensor to trust for each location and condition.


In [1]:
# Ensure repo root is on sys.path for local imports
import sys
from pathlib import Path

repo_root = Path.cwd()
if not (repo_root / "days").exists():
    for parent in Path.cwd().resolve().parents:
        if (parent / "days").exists():
            repo_root = parent
            break

sys.path.insert(0, str(repo_root))
print(f"Using repo root: {repo_root}")


Using repo root: /media/abdul-aziz/sdb7/masters_research/math_course_dlcv


## 1. Core Intuition

- Optical captures color and texture but fails under clouds.
- SAR captures structure and moisture and sees through clouds, but its images are degraded by speckle noise.
- DEM provides terrain context but carries no temporal change.

Fusion learns *when* to trust each modality.


## 2. Modality Strengths

| Modality | Good at | Struggles with |
| --- | --- | --- |
| Optical (S2) | Land cover, texture | Clouds, shadows |
| SAR (S1) | Structure, moisture | Speckle, geometry artifacts |
| DEM | Elevation, slope | No temporal info |


## 3. Fusion Strategies

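- **Early fusion**: concatenate co-registered inputs along the channel axis and feed one encoder; a simple baseline.
- **Mid-level fusion**: one encoder per modality, then fuse the feature maps; often the most robust default.
- **Late fusion**: merge predictions from separate models; degrades gracefully when a modality is missing.
- **Attention fusion**: learn weights per modality and spatial location.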
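
To make the first three strategies concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the course implementation: the channel counts are made up, and `toy_encoder` and `toy_head` are hypothetical stand-ins (random projections) for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy co-registered inputs with shape (channels, height, width).
optical = rng.normal(size=(4, 8, 8))  # assumed: 4 optical bands
sar = rng.normal(size=(2, 8, 8))      # assumed: 2 SAR polarizations
dem = rng.normal(size=(1, 8, 8))      # elevation


def toy_encoder(x, out_channels, rng):
    """Hypothetical encoder: random 1x1 projection followed by ReLU."""
    w = rng.normal(size=(out_channels, x.shape[0]))
    return np.maximum(np.einsum("oc,chw->ohw", w, x), 0.0)


def toy_head(features, rng):
    """Hypothetical prediction head: random projection to one sigmoid map."""
    w = rng.normal(size=features.shape[0])
    logits = np.einsum("c,chw->hw", w, features)
    return 1.0 / (1.0 + np.exp(-logits))


# Early fusion: stack all channels into a single input for one encoder.
early_input = np.concatenate([optical, sar, dem], axis=0)

# Mid-level fusion: encode each modality separately, then concatenate features.
feats = [toy_encoder(x, 8, rng) for x in (optical, sar, dem)]
mid_features = np.concatenate(feats, axis=0)

# Late fusion: one prediction per modality, merged here by a plain average.
late_pred = np.mean([toy_head(f, rng) for f in feats], axis=0)

print(early_input.shape, mid_features.shape, late_pred.shape)
```

The only structural difference between the three is where the merge happens: on raw channels, on feature maps, or on predictions.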


## 4. Attention-Based Fusion

Learned attention weights can downplay cloudy optical regions, amplify SAR edges, and elevate DEM in terrain-driven classes.
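
One common way to realize this is a softmax over per-pixel modality scores. The sketch below is a hedged illustration and not necessarily how the repo's `attention_fusion` works: `softmax_attention_fusion` is a hypothetical helper, and the scores are set by hand from a toy cloud mask, whereas a real model would predict them with a small network.

```python
import numpy as np


def softmax_attention_fusion(features, scores):
    """Fuse (M, H, W) modality features with per-pixel softmax weights.

    `scores` has the same shape as `features`; a higher score means more trust.
    """
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    fused = (weights * features).sum(axis=0)
    return fused, weights


rng = np.random.default_rng(0)
features = rng.normal(size=(3, 8, 8))  # assumed order: optical, SAR, DEM

# Hand-set scores: penalize the optical channel wherever the toy mask says "cloud".
scores = np.zeros((3, 8, 8))
cloud_mask = np.zeros((8, 8), dtype=bool)
cloud_mask[:4, :] = True
scores[0, cloud_mask] = -5.0

fused, weights = softmax_attention_fusion(features, scores)
print("Optical weight under cloud:", weights[0, 0, 0])
print("Optical weight in clear sky:", weights[0, 7, 7])
```

Under the cloud mask the optical weight collapses toward zero and SAR and DEM share the remainder; in clear pixels all three modalities get equal weight.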


## 5. Python — Fusion Summaries

`days/day27/code/fusion_strategies.py` provides the fusion strategy summaries (`FUSIONS`) and a simple attention fusion demo (`attention_fusion`).


In [2]:
import numpy as np

from days.day27.code.fusion_strategies import FUSIONS, attention_fusion

# Print a one-line summary of each fusion strategy
for fusion in FUSIONS:
    print(f"{fusion.name}: {fusion.description}")

# Synthetic 8x8 feature maps, one per modality (fixed seed for reproducibility)
rng = np.random.default_rng(0)
f_opt = rng.normal(0.2, 0.1, size=(8, 8))
f_sar = rng.normal(0.4, 0.2, size=(8, 8))
f_dem = rng.normal(0.1, 0.05, size=(8, 8))

# Fuse the three maps and report the mean of the fused result
fused = attention_fusion(f_opt, f_sar, f_dem)
print("Fused mean:", fused.mean())


Early: Concatenate modalities at input and feed one encoder
Mid-level: Separate encoders, then fuse feature maps
Late: Fuse predictions from separate models
Attention: Learn weights per modality and spatial location
Fused mean: 0.25537533853681216


## 6. Visualization — Fusion Schematic

`days/day27/code/visualizations.py` draws early, mid, and late fusion diagrams.


In [3]:
from days.day27.code.visualizations import plot_fusion_schematic

RUN_FIGURES = False

if RUN_FIGURES:
    plot_fusion_schematic()
else:
    print("Set RUN_FIGURES = True to regenerate Day 27 figures inside days/day27/outputs/.")


Set RUN_FIGURES = True to regenerate Day 27 figures inside days/day27/outputs/.


## 7. Normalization & Alignment

Normalize **per modality**, and resample all inputs onto a common grid (same CRS, extent, and pixel size) before fusing:

- Optical: atmospheric correction, per-band stats.
- SAR: log-scale + speckle handling.
- DEM: normalize elevation and add slope/aspect.
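
The per-modality steps can be sketched in a few lines of NumPy. This is a rough illustration under assumptions: the helper names are hypothetical, statistics are computed per scene (a real pipeline would more likely use dataset-level statistics), speckle filtering and atmospheric correction are omitted, and the aspect formula is only one of several conventions in use.

```python
import numpy as np


def zscore_per_band(x, eps=1e-6):
    """Standardize each band of a (C, H, W) array to zero mean, unit variance."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)


def sar_to_db(backscatter, eps=1e-6):
    """Convert linear SAR backscatter to decibels (log scale)."""
    return 10.0 * np.log10(np.maximum(backscatter, eps))


def slope_aspect(dem, pixel_size=10.0):
    """Slope and aspect in radians from a 2-D DEM; pixel_size is in meters."""
    dz_dy, dz_dx = np.gradient(dem, pixel_size)
    slope = np.arctan(np.hypot(dz_dx, dz_dy))
    aspect = np.arctan2(-dz_dx, dz_dy)  # assumed convention; others exist
    return slope, aspect


rng = np.random.default_rng(0)
optical = rng.uniform(0.0, 1.0, size=(4, 8, 8))
sar_linear = rng.uniform(0.01, 1.0, size=(2, 8, 8))
# Tilted plane: elevation rises 10 m per 10 m pixel along x.
dem = np.tile(np.arange(8) * 10.0, (8, 1))

optical_n = zscore_per_band(optical)
sar_n = zscore_per_band(sar_to_db(sar_linear))
slope, aspect = slope_aspect(dem, pixel_size=10.0)
print("Slope of the tilted plane (degrees):", np.degrees(slope[0, 0]))
```

Log-scaling SAR before standardizing matters because linear backscatter is heavily skewed; standardizing it directly lets a few bright scatterers dominate the statistics.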


## 8. Losses & Metrics

- Losses: BCE + Dice, Focal + Dice, contrastive alignment.
- Metrics: IoU/mIoU, F1 for rare classes, Boundary F1 for urban edges.
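
As a reference point for the first loss and first metric, here is a minimal NumPy sketch of BCE + Dice and binary IoU. The function names are hypothetical, the two loss terms are weighted equally (an assumption; the weighting is a tunable choice), and a training framework would use differentiable tensor ops instead of NumPy.

```python
import numpy as np


def bce_dice_loss(probs, target, eps=1e-7):
    """Binary cross-entropy plus soft Dice loss, equally weighted."""
    probs = np.clip(probs, eps, 1.0 - eps)
    bce = -np.mean(target * np.log(probs) + (1.0 - target) * np.log(1.0 - probs))
    intersection = np.sum(probs * target)
    dice = (2.0 * intersection + eps) / (np.sum(probs) + np.sum(target) + eps)
    return bce + (1.0 - dice)


def iou(pred_mask, target_mask):
    """Intersection over union for two boolean masks."""
    inter = np.logical_and(pred_mask, target_mask).sum()
    union = np.logical_or(pred_mask, target_mask).sum()
    return inter / union if union > 0 else 1.0


# Toy 4x4 example: the target covers the top two rows, the prediction the top three.
target = np.zeros((4, 4))
target[:2, :] = 1.0
pred = np.zeros((4, 4))
pred[:3, :] = 1.0

print("Loss (perfect prediction):", bce_dice_loss(target, target))
print("Loss (over-segmentation):", bce_dice_loss(pred, target))
print("IoU:", iou(pred > 0.5, target > 0.5))
```

In the toy case the intersection is 8 pixels and the union 12, so the IoU is 2/3.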


## 9. Failure Modes

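- Treating SAR like optical, for example skipping log-scaling and speckle handling.
- Early fusion without per-modality normalization.
- Overfitting to one modality so the model ignores the others.
- Ignoring missing data, such as cloud-masked optical pixels or absent acquisitions.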


## 10. Mini Exercises

1. Compare optical-only vs optical+SAR on cloudy scenes.
2. Add DEM slope/aspect and measure terrain-driven gains.
3. Replace concatenation with attention fusion.
4. Drop one modality during training for robustness.
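
As a possible starting point for exercise 4, the sketch below zeroes out one randomly chosen modality per training sample. `modality_dropout` and the `p_drop` parameter are hypothetical names, and filling with zeros is an assumption; a learned "missing" token or the per-band mean are alternatives.

```python
import numpy as np


def modality_dropout(modalities, rng, p_drop=0.3):
    """With probability p_drop, zero out one randomly chosen modality.

    `modalities` maps a name to a (C, H, W) array; the inputs are not modified.
    Returns the augmented dict and the name of the dropped modality (or None).
    """
    out = {name: arr.copy() for name, arr in modalities.items()}
    dropped = None
    if rng.random() < p_drop:
        dropped = str(rng.choice(sorted(out)))
        out[dropped][...] = 0.0
    return out, dropped


rng = np.random.default_rng(0)
batch = {
    "optical": np.ones((4, 8, 8)),
    "sar": np.ones((2, 8, 8)),
    "dem": np.ones((1, 8, 8)),
}

# p_drop=1.0 forces a drop so the effect is visible in this demo.
augmented, dropped = modality_dropout(batch, rng, p_drop=1.0)
print("Dropped:", dropped)
print({name: float(arr.sum()) for name, arr in augmented.items()})
```

Applied during training only, this pushes the model to make usable predictions when a sensor is unavailable at inference time.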


## 11. Key Takeaways

- Fusion is about complementarity, not channel count.
- Mid-level fusion is often the most robust default.
- Attention enables dynamic trust across sensors.
- Normalization and alignment are critical.
