Stop downlinking the boring tiles. A small on-board VLM ships only structured verdicts of what actually changed.
A LoRA fine-tune of LiquidAI/LFM2.5-VL-1.6B for satellite-imagery change detection across 7 event schemas. Submitted to the DPhi Space x Liquid AI hackathon, May 2026. Repo slug stays liquid_space for stable links.
Result: F1 = 0.95 on the held-out v2_val set (700 rows). That's +0.16 over the natural-distribution baseline, with precision climbing from 0.69 to 0.98 and recall holding at 0.92.
| schema | F1 |
|---|---|
| IceExtentChange | 1.00 |
| ThermalHotspot | 0.97 |
| VegetationStress | 0.97 |
| UrbanChangeDelta | 0.95 |
| WetlandChange | 0.95 |
| FloodExtentChange | 0.91 |
| ConstructionProgress | 0.90 |
| overall | 0.95 |
Given a pair of Sentinel-2 RGB tiles (before, after) and a per-schema predicate prompt, the model returns either a JSON object matching the schema spec or the literal string null. Seven schemas cover the hackathon's change-event categories: construction progress, flood extent, ice retreat, thermal hotspots, urban delta, vegetation stress, wetland change. Schemas live at schemas/*.json.
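For a sense of the output shape, a positive verdict might look something like the sketch below. The field names here are illustrative only; the binding per-event contracts are the JSON files in schemas/*.json, and a no-change pair yields the literal string null instead of an object.

```json
{
  "event": "FloodExtentChange",
  "change_detected": true,
  "direction": "expansion",
  "summary": "Open water now covers fields that were dry in the before tile."
}
```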
- Model (LoRA adapter): danvinci/lfm25vl_mosaic_change_detection
- Dataset (reformat): danvinci/mosaic_sen2_cc_change_detection
The dataset repo carries the full 29,281-row TRAIN split and the 700-row balanced VAL split, both as parquet with embedded PIL images.
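To poke at the data locally, a minimal sketch with the datasets library (split and column names are whatever the Hub repo defines; inspect the printed schema rather than trusting the names assumed here):

```python
from datasets import load_dataset

# Pulls both parquet splits from the Hub; images arrive as embedded PIL
# objects, so no separate decode step is needed.
ds = load_dataset("danvinci/mosaic_sen2_cc_change_detection")
print(ds)  # shows the available split names and row counts
for name, split in ds.items():
    print(name, split.features)  # column schema, incl. the image columns
```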
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoProcessor
import torch
base = AutoModelForCausalLM.from_pretrained(
"LiquidAI/LFM2.5-VL-1.6B",
torch_dtype=torch.bfloat16,
trust_remote_code=True,
).eval()
model = PeftModel.from_pretrained(base, "danvinci/lfm25vl_mosaic_change_detection")
processor = AutoProcessor.from_pretrained("LiquidAI/LFM2.5-VL-1.6B", trust_remote_code=True)

Generation defaults: do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05. Greedy decoding silently underperforms on this task (cost us about 0.4 F1 on early runs).
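A minimal generation sketch under those defaults. The message layout and image handling below are assumptions about the processor's chat template; the actual prompt construction and parsing live in training/prompts.py (predicate_prompt() / parse_output()).

```python
import json
from PIL import Image

before = Image.open("tile_before.png")   # placeholder paths
after = Image.open("tile_after.png")
prompt = "..."  # per-schema predicate prompt, see training/prompts.py

messages = [{"role": "user", "content": [
    {"type": "image", "image": before},
    {"type": "image", "image": after},
    {"type": "text", "text": prompt},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
)
out = model.generate(
    **inputs, max_new_tokens=256,
    do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05,
)
text = processor.batch_decode(
    out[:, inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)[0].strip()
verdict = None if text == "null" else json.loads(text)
```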
A self-hosted web UI with a curated AOI gallery and a multi-schema result panel. SolidJS frontend, FastAPI backend, two services in docker-compose.
# 1. Clone
git clone https://github.com/danvinci/liquid_space
cd liquid_space
# 2. Pull the model (~3 GB base + 60 MB adapter)
hf download LiquidAI/LFM2.5-VL-1.6B \
--local-dir ./models/LiquidAI/LFM2.5-VL-1.6B
hf download danvinci/lfm25vl_mosaic_change_detection \
--local-dir ./training/adapter_v6/adapter
# 3. Bring up both services
docker compose up
# 4. Open
open http://localhost:5180

Click any pin on the map. The backend serves cached pre-recorded model verdicts from sat_service/cache/pool/, replayed by the streaming UI in t_after order. Two pins (Po Valley, Riyadh) carry real Sentinel-2 archive tile pairs fetched via SimSat; the other six show held-out MOSAIC reference pairs. See the About modal for the full breakdown.
The backend picks its device automatically: cuda > mps > cpu. On Apple Silicon the model loads in ~15 s and a single schema generation is ~17 s.
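That selection order boils down to roughly the sketch below (not the backend's literal code, just the cuda > mps > cpu preference it describes):

```python
import torch

def pick_device() -> str:
    # Preference order used by the backend: cuda > mps > cpu.
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
```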
Lighter than the full container, especially on Apple Silicon where Docker has no GPU passthrough:
python -m venv .venv && source .venv/bin/activate
pip install -r sat_service/requirements.txt
PYTHONPATH=. uvicorn sat_service.main:app --port 8050
# in another shell
cd web && bun install && bun run dev

Open http://localhost:5180. The backend reports its device on GET /health.
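A quick way to confirm which device the backend landed on (the exact response shape is whatever /health returns; only the endpoint and port are taken from the setup above):

```bash
curl http://localhost:8050/health
# should report the selected device (cuda / mps / cpu)
```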
The 8 demo AOIs ship with their pool entries pre-cached under sat_service/cache/pool/<aoi_id>/, so a fresh clone is ready to demo without rebuilding. The Modal entrypoint precompute_demo_cache and the local scripts/build_simsat_timelines.py rebuild this cache from scratch (needs Modal auth + adapter on the runs volume, or a running SimSat stack).
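Sketch of the two rebuild paths; the Modal entrypoint is assumed here to live in training/modal_app.py, and neither command shows any extra flags the scripts may expect:

```bash
# Remote: regenerate the demo cache on Modal
# (needs Modal auth + the adapter on the runs volume).
modal run training/modal_app.py::precompute_demo_cache

# Local: rebuild the SimSat timelines against a running SimSat stack.
python scripts/build_simsat_timelines.py
```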
| Path | Contents |
|---|---|
| training/ | Modal-based fine-tuning pipeline |
| training/modal_app.py | bootstrap → prepare_dataset → train → eval → publish |
| training/prompts.py | predicate_prompt() + parse_output(): the contract |
| training/predictors/ | base + adapter wrappers used by eval |
| training/smoke_test.py | end-to-end check: load → infer → parse on 5 rows |
| training/DEPLOY.md | runbook for the Modal pipeline |
| schemas/ | 7 schema JSONs (the public per-event contracts) |
| scripts/ | Julia data-prep utilities (MOSAIC reformat, AOI scout) |
| eval_results/ | all eval JSONs (per run, per eval set) + comparison tools |
| sat_service/ | FastAPI demo backend (loads base + adapter, /infer-all) |
| web/ | SolidJS demo frontend (AOI gallery, result panel) |
| docker-compose.yml | two-service local stack for the demo |
The training pipeline runs on Modal (modal run training/modal_app.py::<fn>). Reproducing the v6 result needs an H100. The v6 fine-tune ran 4 epochs over 29,281 rows in about 5 hours wall-clock, finishing at train_loss 0.012.
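The stage names map onto Modal entrypoints, so reproducing v6 is roughly the sequence below (assuming the entrypoints are named exactly as the stages, and omitting any CLI flags the individual stages take):

```bash
modal run training/modal_app.py::bootstrap
modal run training/modal_app.py::prepare_dataset
modal run training/modal_app.py::train    # needs an H100; ~5 h wall-clock for 4 epochs
modal run training/modal_app.py::eval
modal run training/modal_app.py::publish
```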
Code is MIT (see LICENSE). The LoRA adapter inherits the LFM Open License v1.0 from the base model: permissive, citation required, free commercial use under $10M revenue. The dataset reformat is research-use-only with source citation, in line with the upstream MOSAIC-SEN2-CC dataset's gray-zone license posture; cite the IEEE JSTARS 2025 paper.
@article{karaca2025robust,
author={Tuzlupinar, Busra and Ozelbas, Enes and Amasyali, Mehmet Fatih and Karaca, Ali Can},
journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
title={Introducing MOSAIC-SEN2-CC: A Multispectral Dataset and Adaptation Framework for Remote Sensing Change Captioning},
year={2025}, volume={18}, pages={25410-25426},
doi={10.1109/JSTARS.2025.3615113}
}