# Day 28 — "Foundation Models & Embeddings for EO (Prithvi, AlphaEarth, SatMAE)"

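Foundation models for Earth observation (EO) are pretrained on large archives of mostly unlabeled satellite imagery, so they learn general features of Earth's surface structure and dynamics. Their embeddings compress multi-band, multi-temporal patches into semantic vectors you can reuse across tasks.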


In [1]:
# Ensure repo root is on sys.path for local imports
import sys
from pathlib import Path

repo_root = Path.cwd()
if not (repo_root / "days").exists():
    for parent in Path.cwd().resolve().parents:
        if (parent / "days").exists():
            repo_root = parent
            break

sys.path.insert(0, str(repo_root))
print(f"Using repo root: {repo_root}")


Using repo root: /media/abdul-aziz/sdb7/masters_research/math_course_dlcv


## 1. Core Intuition

- Raw bands are letters; embeddings are the semantic meaning of Earth patches.
- Foundation models separate representation learning from downstream tasks.


## 2. What Is an Embedding?

An embedding is a compact, fixed-length vector that summarizes a patch: its land cover, texture, temporal behavior, and context. Patches whose vectors lie close together tend to look and behave alike.
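To make the idea concrete, here is a minimal sketch of a hand-built "embedding". It assumes a hypothetical helper, `toy_embed`, that pools each band into its mean and standard deviation; real foundation models learn far richer features, so treat this only as an illustration of patch-to-vector compression.

```python
import numpy as np


def toy_embed(patch: np.ndarray) -> np.ndarray:
    """Map an (H, W, C) patch to a unit-length vector of per-band statistics.

    Hypothetical stand-in for a learned encoder, not a real model.
    """
    means = patch.mean(axis=(0, 1))  # average value per band
    stds = patch.std(axis=(0, 1))    # spread per band, a crude texture cue
    vec = np.concatenate([means, stds]).astype(np.float32)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


rng = np.random.default_rng(0)
patch = rng.random((16, 16, 4)).astype(np.float32)  # synthetic 4-band patch
embedding = toy_embed(patch)
print(patch.size, "pixel values ->", embedding.size, "embedding values")
```

The 16 × 16 × 4 patch holds 1024 values, while the toy vector holds 8. Normalizing to unit length is a design choice that makes distances between vectors easier to compare.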


## 3. Three Major EO Foundation Approaches

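- **Prithvi** (IBM and NASA): spatio-temporal masked modeling on Harmonized Landsat Sentinel-2 (HLS) time series, aimed at Earth dynamics.
- **AlphaEarth** (AlphaEarth Foundations, Google DeepMind): global semantic embeddings intended to describe every place.
- **SatMAE**: masked autoencoding adapted to multi-spectral and temporal satellite imagery, aimed at spatial structure.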
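Masked modeling, which Prithvi and SatMAE both build on, can be sketched in a few lines: hide most of the image tokens, predict them from the visible ones, and score the prediction only on the hidden part. The sketch below is not code from any of these models. The `patchify` helper, the 0.75 mask ratio, and the mean-of-visible-tokens "decoder" are illustrative assumptions standing in for a trained transformer.

```python
import numpy as np


def patchify(image: np.ndarray, size: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping size x size tokens."""
    h, w, c = image.shape
    grid = image.reshape(h // size, size, w // size, size, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(-1, size * size * c)


rng = np.random.default_rng(0)
image = rng.random((16, 16, 4)).astype(np.float32)  # synthetic 4-band image
tokens = patchify(image, size=4)                    # 16 tokens of 64 values each

mask_ratio = 0.75                                   # illustrative choice
n_masked = int(mask_ratio * len(tokens))
order = rng.permutation(len(tokens))
masked_idx, visible_idx = order[:n_masked], order[n_masked:]

# Stand-in "decoder": predict every masked token as the mean of the visible ones.
prediction = tokens[visible_idx].mean(axis=0)

# The reconstruction loss is computed on the masked tokens only.
loss = float(np.mean((tokens[masked_idx] - prediction) ** 2))
print(f"{len(visible_idx)} visible, {len(masked_idx)} masked, loss = {loss:.4f}")
```

In a real model the encoder sees only the visible tokens, and minimizing this loss over many images is what forces it to learn spatial (and, with time series, temporal) structure.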


## 4. Geometry of Embedding Space

Distances encode semantic differences, clusters reveal land-use types, and directions can represent transitions (urbanization, deforestation).
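These geometric ideas can be sketched with hand-picked vectors. The three-dimensional "embeddings" below are hypothetical values chosen for illustration, and `cosine_distance` and `progress` are helper names introduced here; neither comes from any of the models above.

```python
import numpy as np


def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Return 1 - cosine similarity: 0 for identical directions, larger when they differ."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical 3-D embeddings, hand-picked for illustration only.
forest = np.array([0.90, 0.10, 0.20])
forest_2 = np.array([0.85, 0.15, 0.25])
urban = np.array([0.10, 0.90, 0.30])

print("forest vs forest_2:", round(cosine_distance(forest, forest_2), 3))
print("forest vs urban:   ", round(cosine_distance(forest, urban), 3))

# A "transition direction": the unit vector pointing from forest to urban.
direction = urban - forest
direction /= np.linalg.norm(direction)


def progress(e: np.ndarray) -> float:
    """Fraction of the forest-to-urban path covered by embedding e along the direction."""
    return float(np.dot(e - forest, direction) / np.dot(urban - forest, direction))


partly_cleared = 0.6 * forest + 0.4 * urban
print("progress toward urban:", round(progress(partly_cleared), 2))
```

Projecting onto a direction like this is one simple way to turn "how urbanized is this place?" into a single number, under the assumption that the transition is roughly linear in embedding space.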


## 5. Python — Embedding Demo (NumPy)

`days/day28/code/embeddings_demo.py` creates toy embeddings and measures distances.


In [2]:
from days.day28.code.embeddings_demo import MODELS, embed_patch, embedding_distance
import numpy as np

for model in MODELS:
    print(f"{model.name}: {model.focus} | best for {model.best_for}")

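# Synthetic 16x16 patch with 4 bands, values in [0, 1)
rng = np.random.default_rng(0)
patch_a = rng.random((16, 16, 4)).astype(np.float32)

# Copy the patch and brighten a 4x4 window in every band to simulate a local change
patch_b = patch_a.copy()
patch_b[4:8, 6:10, :] += 0.4
patch_b = np.clip(patch_b, 0, 1)  # keep values in the valid range

# Embed both patches and measure how far apart they land
emb_a = embed_patch(patch_a)
emb_b = embed_patch(patch_b)
print("Embedding distance:", embedding_distance(emb_a, emb_b))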


Prithvi: Spatio-temporal masked modeling | best for Time-series tasks
AlphaEarth: Global EO embeddings | best for Fast transfer & search
SatMAE: Masked autoencoding | best for Spatial representation learning
Embedding distance: 0.13914865255355835


## 6. Visualization — Embedding Drift

`days/day28/code/visualizations.py` plots embedding distance as change intensity increases.


In [3]:
from days.day28.code.visualizations import plot_embedding_drift

RUN_FIGURES = False

if RUN_FIGURES:
    plot_embedding_drift()
else:
    print("Set RUN_FIGURES = True to regenerate Day 28 figures inside days/day28/outputs/.")


Set RUN_FIGURES = True to regenerate Day 28 figures inside days/day28/outputs/.
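The drift idea can also be checked numerically without plotting. The sketch below does not reproduce `plot_embedding_drift` or `embed_patch`, whose internals are not shown here. It assumes a hypothetical stand-in embedding, `toy_embed`, built from per-band means and standard deviations, and uses Euclidean distance.

```python
import numpy as np


def toy_embed(patch: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in embedding: per-band mean and standard deviation."""
    return np.concatenate([patch.mean(axis=(0, 1)), patch.std(axis=(0, 1))])


rng = np.random.default_rng(0)
base = rng.random((16, 16, 4))  # synthetic 4-band patch, values in [0, 1)
base_emb = toy_embed(base)

intensities = np.linspace(0.0, 0.5, 6)  # change strengths 0.0, 0.1, ..., 0.5
distances = []
for delta in intensities:
    changed = base.copy()
    changed[4:8, 6:10, :] += delta        # brighten a 4x4 window in all bands
    changed = np.clip(changed, 0.0, 1.0)  # keep values in the valid range
    distances.append(float(np.linalg.norm(toy_embed(changed) - base_emb)))

for delta, dist in zip(intensities, distances):
    print(f"intensity {delta:.1f} -> distance {dist:.4f}")
```

With no change the distance is exactly zero, and stronger changes move the embedding further from the baseline. A learned embedding need not respond this smoothly, which is one reason to inspect a drift curve before picking a change threshold.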


## 7. Embeddings in Practice

- Classification: train a lightweight classifier on embeddings.
- Clustering: group land-use types with KMeans, optionally after reducing dimensionality with UMAP.
- Change detection: threshold embedding distance over time.
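Two of these uses can be sketched on synthetic data. Everything below is an illustrative assumption: the 8-dimensional Gaussian "embeddings", the class centers, the nearest-centroid classifier (chosen as about the lightest classifier possible), and the threshold value. With real embeddings you would tune the threshold on known examples of change.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Synthetic embeddings: two well-separated land-cover classes (hypothetical values).
centers = {"forest": np.full(dim, -1.0), "urban": np.full(dim, 1.0)}


def sample(label: str, n: int) -> np.ndarray:
    """Draw n noisy embeddings around the center of the given class."""
    return centers[label] + 0.3 * rng.standard_normal((n, dim))


# Classification: nearest centroid fitted on only 5 labeled samples per class.
centroids = {label: sample(label, 5).mean(axis=0) for label in centers}


def classify(e: np.ndarray) -> str:
    """Return the label whose centroid is closest to embedding e."""
    return min(centroids, key=lambda label: np.linalg.norm(e - centroids[label]))


test_x = np.vstack([sample("forest", 50), sample("urban", 50)])
test_y = ["forest"] * 50 + ["urban"] * 50
accuracy = float(np.mean([classify(e) == y for e, y in zip(test_x, test_y)]))
print("nearest-centroid accuracy:", accuracy)

# Change detection: one location over 6 years, forest for years 0-3, urban for years 4-5.
series = np.vstack([sample("forest", 4), sample("urban", 2)])
step = np.linalg.norm(np.diff(series, axis=0), axis=1)  # distance between consecutive years
threshold = 3.0  # illustrative; tune on real data
changed_at = [int(i) + 1 for i in np.flatnonzero(step > threshold)]
print("change detected at year index:", changed_at)
```

The classifier needs only a handful of labels because the embedding space already separates the classes, which is the practical payoff of separating representation learning from the downstream task.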


## 8. Advantages vs Raw Pixels

| Aspect | Raw Pixels | Embeddings |
| --- | --- | --- |
| Dimensionality | Very high | Compact |
| Labels needed | Many | Few |
| Transfer | Low | High |
| Robustness to noise | Sensitive | Stronger |
| Downstream compute | Heavy | Light |


## 9. Limitations & Cautions

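Embeddings are not magic: their spatial resolution is fixed by the pretrained model, individual dimensions are harder to interpret than spectral bands, and domain shift (new sensors, regions, or seasons) still matters.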


## 10. Mini Exercises

1. Compare classification using embeddings only vs spectral indices only (e.g., NDVI).
2. Visualize embeddings with UMAP/t-SNE.
3. Detect urban growth via embedding distance over years.
4. Cluster embeddings and interpret clusters geographically.


## 11. Key Takeaways

- Foundation models learn the grammar of Earth.
- Embeddings store semantic meaning, not raw reflectance.
- Prithvi, AlphaEarth, and SatMAE target different strengths.
- Embeddings enable fast, transferable downstream models.
