# Nodes-Only Map Versions (Markdown Notebook)

> Scope: **nodes only** (cities as points). Segments/edges are handled in a separate function and are **out of scope here**.

---

## V0 — Uniform Synthetic Nodes

**Goal:** Minimal, reproducible node set in a rectangular region.
**Inputs:** seed, N, bbox (km), population range.
**Method:** Sample positions uniformly inside bbox; draw populations as `pop ~ Uniform[min, max]`.
**Outputs:** `id, x_km, y_km, pop`.
**Validation:** N matches; inside bbox; pop in range.
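
The V0 method and its three checks can be sketched with the standard library alone. The function names, the `(x_min, y_min, x_max, y_max)` bbox layout, and the choice of integer populations are assumptions made for illustration, not part of the spec.

```python
import random


def generate_uniform_nodes(seed, n, bbox_km, pop_range):
    """Sample n nodes uniformly inside bbox_km = (x_min, y_min, x_max, y_max)."""
    rng = random.Random(seed)  # private RNG: same seed, same node set
    x_min, y_min, x_max, y_max = bbox_km
    pop_min, pop_max = pop_range
    return [
        {
            "id": i,
            "x_km": rng.uniform(x_min, x_max),
            "y_km": rng.uniform(y_min, y_max),
            "pop": rng.randint(pop_min, pop_max),  # integer populations (assumption)
        }
        for i in range(n)
    ]


def validate_uniform_nodes(nodes, n, bbox_km, pop_range):
    """Return True when the V0 checks pass: count, bounds, population range."""
    x_min, y_min, x_max, y_max = bbox_km
    pop_min, pop_max = pop_range
    return (
        len(nodes) == n
        and all(x_min <= p["x_km"] <= x_max and y_min <= p["y_km"] <= y_max for p in nodes)
        and all(pop_min <= p["pop"] <= pop_max for p in nodes)
    )
```

Calling `generate_uniform_nodes` twice with the same arguments should return identical lists, which is the reproducibility property V0 asks for.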

---

## V1 — City Size Classes

**Goal:** Urban hierarchy.
**Changes:** Assign `class ∈ {large, medium, small}` with quotas (e.g., 10/30/60%). Draw populations from a distinct lognormal distribution per class.
**Validation:** Class counts match; medians ordered `large > medium > small`.
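
One way to meet the quotas exactly and keep the medians ordered is sketched below. The per-class medians and sigmas in `CLASS_PARAMS` are hypothetical placeholders, and the largest-remainder rounding is an assumption about how fractional quotas might be resolved.

```python
import math
import random
import statistics

# Hypothetical lognormal parameters per class: (median population, sigma of log-pop).
CLASS_PARAMS = {
    "large": (1_000_000, 0.5),
    "medium": (100_000, 0.5),
    "small": (10_000, 0.5),
}


def assign_classes(n, quotas, rng):
    """Return n shuffled class labels whose counts follow quotas (largest remainder)."""
    raw = {cls: n * share for cls, share in quotas.items()}
    counts = {cls: int(value) for cls, value in raw.items()}
    leftover = n - sum(counts.values())
    # Hand the remaining slots to the classes with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda cls: raw[cls] - counts[cls], reverse=True)
    for cls in by_remainder[:leftover]:
        counts[cls] += 1
    labels = [cls for cls, k in counts.items() for _ in range(k)]
    rng.shuffle(labels)
    return labels


def sample_pop(cls, rng):
    """Draw one population; the median of a lognormal is exp(mu)."""
    median, sigma = CLASS_PARAMS[cls]
    return int(rng.lognormvariate(math.log(median), sigma))


def medians_ordered(pops_by_class):
    """V1 validation: median(large) > median(medium) > median(small)."""
    med = {cls: statistics.median(p) for cls, p in pops_by_class.items()}
    return med["large"] > med["medium"] > med["small"]
```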

### V1.1 — Single-Core Density Field

**Goal:** Spatial realism with a capital-like core.
**Changes:** Rejection sampling from a 2D Gaussian density. Parameters: `core_location`, `core_sigma_km`, `density_strength`.
**Outputs:** (optional) `core_dist_km`.
**Validation:** Mean distance to the core is below that of a uniform (V0) baseline.
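
A minimal rejection sampler for the single-core field might look like the sketch below. How `density_strength` enters the density is not specified here, so the blend `(1 - s) + s * gaussian` with `s` in `[0, 1]` is an assumption: `s = 0` reproduces the uniform baseline and `s = 1` gives a pure Gaussian core.

```python
import math
import random


def sample_single_core(n, bbox_km, core_location, core_sigma_km, density_strength, rng):
    """Rejection-sample n points from a uniform/Gaussian blend centred on the core."""
    x_min, y_min, x_max, y_max = bbox_km
    cx, cy = core_location
    nodes = []
    while len(nodes) < n:
        # Propose uniformly, then accept with probability equal to the relative density.
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        gaussian = math.exp(-d2 / (2.0 * core_sigma_km ** 2))
        accept = (1.0 - density_strength) + density_strength * gaussian
        if rng.random() < accept:
            nodes.append({"id": len(nodes), "x_km": x, "y_km": y,
                          "core_dist_km": math.sqrt(d2)})
    return nodes


def mean_core_dist(nodes):
    """Average of core_dist_km, used to compare against the uniform baseline."""
    return sum(p["core_dist_km"] for p in nodes) / len(nodes)
```

The validation then reduces to comparing `mean_core_dist` for a run with `density_strength > 0` against a run with `density_strength = 0` on the same bbox.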

---

## V2 — Multi-Core & Corridors

**Goal:** Multiple metros and directional corridors (still nodes-only).
**Changes:** Mixture of K Gaussians; optional elongated kernels for corridors.
**Outputs:** `core_id`.
**Validation:** Cluster sizes near targets; distances between cores exceed the spread of nodes within each core.
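
The mixture can also be sampled directly rather than by rejection: pick a core by weight, then draw from that core's kernel. The sketch below takes that route and models a corridor as an elongated, rotated Gaussian. The core dictionary keys (`sigma_major_km`, `sigma_minor_km`, `angle_deg`, `weight`) are hypothetical names, and out-of-bbox draws are simply redrawn, which slightly shifts cluster shares for cores near the edge.

```python
import math
import random


def sample_multi_core(n, bbox_km, cores, rng):
    """Sample n nodes from a weighted mixture of (possibly elongated) Gaussian cores."""
    x_min, y_min, x_max, y_max = bbox_km
    weights = [core["weight"] for core in cores]
    nodes = []
    while len(nodes) < n:
        core_id = rng.choices(range(len(cores)), weights=weights)[0]
        core = cores[core_id]
        # Draw in the kernel's own axes, then rotate by angle_deg into map axes.
        u = rng.gauss(0.0, core["sigma_major_km"])
        v = rng.gauss(0.0, core["sigma_minor_km"])
        theta = math.radians(core["angle_deg"])
        x = core["x_km"] + u * math.cos(theta) - v * math.sin(theta)
        y = core["y_km"] + u * math.sin(theta) + v * math.cos(theta)
        if x_min <= x <= x_max and y_min <= y <= y_max:  # redraw anything outside bbox
            nodes.append({"id": len(nodes), "x_km": x, "y_km": y, "core_id": core_id})
    return nodes


def cluster_shares(nodes, k):
    """Fraction of nodes attached to each of the k cores."""
    return [sum(1 for p in nodes if p["core_id"] == i) / len(nodes) for i in range(k)]
```

A round core sets `sigma_major_km == sigma_minor_km`; a corridor uses a large major sigma, a small minor sigma, and an `angle_deg` along the corridor's direction.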

### V2.1 — Terrain Masks (Synthetic)

**Goal:** Keep cities off water/mountains.
**Changes:** Binary masks for water/mountain; reject masked samples; (optional) synthetic elevation.
**Outputs:** `mask_violation=false`, optional `elev_m`.
**Validation:** 0 violations; distance-to-coast summary if used.
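
Mask rejection can be sketched with masks expressed as plain predicates over `(x, y)`. The lake and mountain shapes below are hypothetical stand-ins for whatever binary masks are actually generated, and `max_tries` is an added safeguard against masks that cover the whole bbox.

```python
import random


def in_water(x, y):
    """Hypothetical lake: a disc of radius 60 km centred at (150, 150)."""
    return (x - 150.0) ** 2 + (y - 150.0) ** 2 < 60.0 ** 2


def in_mountains(x, y):
    """Hypothetical range: a north-south band between x = 350 km and x = 400 km."""
    return 350.0 <= x <= 400.0


def sample_with_masks(n, bbox_km, masks, rng, max_tries=100_000):
    """Uniform sampling that rejects any point covered by at least one mask."""
    x_min, y_min, x_max, y_max = bbox_km
    nodes = []
    for _ in range(max_tries):
        if len(nodes) == n:
            break
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if any(mask(x, y) for mask in masks):
            continue  # masked sample: reject and redraw
        nodes.append({"id": len(nodes), "x_km": x, "y_km": y, "mask_violation": False})
    if len(nodes) < n:
        raise RuntimeError("masks too restrictive: could not place all nodes")
    return nodes
```

Re-evaluating every mask on the final node set, independently of the sampler, is what makes the "0 violations" check meaningful.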

---

## V3 — Administrative Regions & Quotas

**Goal:** Balanced distribution by region (states/provinces).
**Changes:** Partition bbox into R polygons; enforce per-region node and class quotas.
**Outputs:** `region_id`.
**Validation:** Quotas satisfied; regional population totals within tolerance.
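
As a simplified illustration, the sketch below partitions the bbox into a regular grid of rectangles instead of general polygons and enforces node quotas only; class quotas would be layered on top per region. The function names and the grid layout are assumptions.

```python
import random
from collections import Counter


def grid_regions(bbox_km, cols, rows):
    """Split the bbox into cols x rows rectangles, returned row by row."""
    x_min, y_min, x_max, y_max = bbox_km
    w = (x_max - x_min) / cols
    h = (y_max - y_min) / rows
    return [
        (x_min + c * w, y_min + r * h, x_min + (c + 1) * w, y_min + (r + 1) * h)
        for r in range(rows)
        for c in range(cols)
    ]


def sample_by_region(regions, node_quotas, rng):
    """Place exactly node_quotas[i] nodes uniformly inside region i."""
    nodes = []
    for region_id, (rect, quota) in enumerate(zip(regions, node_quotas)):
        x0, y0, x1, y1 = rect
        for _ in range(quota):
            nodes.append({
                "id": len(nodes),
                "region_id": region_id,
                "x_km": rng.uniform(x0, x1),
                "y_km": rng.uniform(y0, y1),
            })
    return nodes


def quotas_satisfied(nodes, node_quotas):
    """V3 validation: per-region node counts equal the requested quotas."""
    counts = Counter(p["region_id"] for p in nodes)
    return all(counts.get(i, 0) == q for i, q in enumerate(node_quotas))
```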

### V3.1 — Socioeconomic Attributes

**Goal:** Enrich nodes for later demand models.
**Changes:** Sample `gdp_idx`, `sector_service_pct`, `tourism_score` conditioned on class/region; allow correlation with `pop`.
**Validation:** Correlation checks (e.g., `corr(pop, gdp_idx)` within the configured band).
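
One simple way to impose a target correlation is to mix the standardized log-population with independent Gaussian noise. The sketch below does this for `gdp_idx` only and ignores class/region conditioning; working on `log(pop)` rather than raw `pop` and the parameter name `rho` are assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def add_gdp_idx(nodes, rho, rng):
    """Attach gdp_idx with expected correlation rho to log(pop)."""
    logs = [math.log(node["pop"]) for node in nodes]
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))
    for node, log_pop in zip(nodes, logs):
        z = (log_pop - mean) / std  # standardized log-population
        noise = rng.gauss(0.0, 1.0)
        node["gdp_idx"] = rho * z + math.sqrt(1.0 - rho ** 2) * noise
    return nodes
```

The validation band should then be checked against the same quantity the generator targets, here `pearson(log(pop), gdp_idx)`; the correlation with raw `pop` will generally differ.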

---

## V4 — Urban Extents & Footprint Radius

**Goal:** Distinguish metros vs. towns (single centroid per city).
**Changes:** Compute `urban_radius_km ~ c * pop^α`; (optional) metropolitan cluster flag if extents overlap.
**Outputs:** `urban_radius_km`, `metro_cluster_id` (optional).
**Validation:** Radii monotonic with population; overlap rates controlled.
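
The radius law and the overlap-based cluster flag can be sketched as follows. The constants `c = 0.01` and `alpha = 0.5` are placeholders rather than calibrated values, and grouping overlapping extents with a union-find is one possible reading of "metropolitan cluster".

```python
import math


def urban_radius_km(pop, c=0.01, alpha=0.5):
    """Footprint radius from the power law c * pop^alpha (placeholder constants)."""
    return c * pop ** alpha


def add_extents(nodes, c=0.01, alpha=0.5):
    """Attach urban_radius_km and a metro_cluster_id shared by overlapping extents."""
    for node in nodes:
        node["urban_radius_km"] = urban_radius_km(node["pop"], c, alpha)

    parent = list(range(len(nodes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            a, b = nodes[i], nodes[j]
            gap = math.hypot(a["x_km"] - b["x_km"], a["y_km"] - b["y_km"])
            if gap < a["urban_radius_km"] + b["urban_radius_km"]:
                parent[find(i)] = find(j)  # extents overlap: merge clusters

    for i, node in enumerate(nodes):
        node["metro_cluster_id"] = find(i)
    return nodes
```

The pairwise loop is quadratic in the number of nodes, which is acceptable for small synthetic maps; a spatial index would be the natural replacement for larger ones.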

### V4.1 — Temporal Population Profiles

**Goal:** Proto-demand without edges.
**Changes:** Store `pop_day`, `pop_night`, `seasonality[12]` derived from class/sector mix.
**Outputs:** `pop_base, pop_day, pop_night, seasonality[]`.
**Validation:** Mass conservation per region and time slice (e.g., regional `pop_day` and `pop_night` totals each match the regional `pop_base` total).
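
A possible shape for these profiles is sketched below, under several assumptions: daytime factors depend on class only (the `DAY_FACTOR` values are hypothetical), night population equals the base population, daytime population is rescaled so each region's total is conserved, and seasonality is a single cosine with mean 1. The sector mix is not modelled.

```python
import math
from collections import defaultdict

# Hypothetical daytime multipliers: large cities attract commuters, small towns lose them.
DAY_FACTOR = {"large": 1.20, "medium": 1.00, "small": 0.85}


def add_temporal_profiles(nodes, season_amplitude=0.1, peak_month=6):
    """Attach pop_base, pop_day, pop_night, and a 12-value seasonality profile."""
    base_total = defaultdict(float)
    raw_total = defaultdict(float)
    for node in nodes:
        node["pop_base"] = node["pop"]
        node["pop_night"] = float(node["pop"])  # assumption: residents sleep at home
        node["pop_day"] = node["pop"] * DAY_FACTOR[node["class"]]
        base_total[node["region_id"]] += node["pop"]
        raw_total[node["region_id"]] += node["pop_day"]

    for node in nodes:
        region = node["region_id"]
        # Rescale so the regional daytime total equals the regional base total.
        node["pop_day"] *= base_total[region] / raw_total[region]
        # Twelve equally spaced cosine samples sum to zero, so the mean stays at 1.
        node["seasonality"] = [
            1.0 + season_amplitude * math.cos(2.0 * math.pi * (m - peak_month) / 12.0)
            for m in range(12)
        ]
    return nodes
```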

---

## V5 — Calibration to Empirical Laws

**Goal:** Synthetic but tuned to global regularities.
**Changes:** Rank–size (Zipf) calibration for top K cities; spatial K‑function/nearest‑neighbor targets; simple optimizer.
**Validation:** KS-distance (or similar) vs. targets below threshold.
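
The rank-size side of the calibration can be measured with two small helpers, sketched below: a least-squares estimate of the Zipf exponent on the top K cities, and a KS-style distance between observed and target cumulative population shares. Both function names and the choice of cumulative shares as the compared quantity are assumptions; the optimizer and the spatial K-function targets are not covered.

```python
import math


def zipf_exponent(pops, top_k):
    """Estimate q in pop ~ rank^(-q) by least squares on log(rank) vs log(pop)."""
    top = sorted(pops, reverse=True)[:top_k]
    xs = [math.log(rank) for rank in range(1, len(top) + 1)]
    ys = [math.log(p) for p in top]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx


def rank_size_distance(pops, top_k, target_exponent=1.0):
    """Max gap between observed and target cumulative shares over the top_k ranks."""
    top = sorted(pops, reverse=True)[:top_k]
    target = [1.0 / rank ** target_exponent for rank in range(1, len(top) + 1)]
    obs_total, tgt_total = sum(top), sum(target)
    obs_cum = tgt_cum = worst = 0.0
    for observed, wanted in zip(top, target):
        obs_cum += observed / obs_total
        tgt_cum += wanted / tgt_total
        worst = max(worst, abs(obs_cum - tgt_cum))
    return worst
```

A simple optimizer could then adjust the population-generation parameters until `rank_size_distance` drops below the configured threshold.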

---

## V6 — Real-World Import (Nodes-Only)

**Goal:** Replace synthetic with real city centroids & populations.
**Changes:** Import from public datasets; normalize to schema/CRS; deduplicate.
**Outputs:** add `name, iso_code, official_pop_year`.
**Validation:** Coverage; missing attribute rate; duplicates by name/coords.
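
The normalization and deduplication steps might be sketched as below. The input record keys (`name`, `lon`, `lat`, `pop`) are hypothetical, since no specific dataset is named. The projection is a plain equirectangular approximation around a reference point, which is only reasonable for small regions, and a record counts as a duplicate here only when both the normalized name matches and the coordinates fall within `dedup_km`.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def lonlat_to_local_km(lon, lat, lon0, lat0):
    """Equirectangular approximation around (lon0, lat0); small regions only."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_KM
    y = math.radians(lat - lat0) * EARTH_RADIUS_KM
    return x, y


def import_nodes(records, lon0, lat0, dedup_km=1.0):
    """Map raw records to the node schema, skipping same-name records closer than dedup_km."""
    nodes = []
    for rec in records:
        x, y = lonlat_to_local_km(rec["lon"], rec["lat"], lon0, lat0)
        name_key = rec["name"].strip().casefold()
        duplicate = any(
            kept["name"].casefold() == name_key
            and math.hypot(kept["x_km"] - x, kept["y_km"] - y) <= dedup_km
            for kept in nodes
        )
        if duplicate:
            continue  # keep the first occurrence only
        nodes.append({
            "id": len(nodes),
            "name": rec["name"].strip(),
            "iso_code": rec.get("iso_code"),
            "official_pop_year": rec.get("official_pop_year"),
            "x_km": x,
            "y_km": y,
            "pop": rec["pop"],
        })
    return nodes
```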

### V6.1 — Hybrid Augmentation

**Goal:** Real cities + synthetic infill where data is sparse.
**Changes:** Generate small-town infill following regional quotas/masks.
**Validation:** Real nodes untouched; infill respects constraints.
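
A minimal infill loop is sketched below. It uses a minimum gap to existing nodes as a stand-in constraint; regional quotas and terrain masks would plug into the same acceptance test. The `source` field, the `min_gap_km` parameter, and the small-town population range are assumptions for illustration.

```python
import math
import random


def add_infill(real_nodes, n_infill, bbox_km, min_gap_km, rng, max_tries=100_000):
    """Return real nodes (copied, tagged) plus synthetic infill kept min_gap_km away."""
    x_min, y_min, x_max, y_max = bbox_km
    nodes = [dict(node, source="real") for node in real_nodes]  # originals untouched
    next_id = max((node["id"] for node in real_nodes), default=-1) + 1
    added = 0
    for _ in range(max_tries):
        if added == n_infill:
            break
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        # Accept only if far enough from every node placed so far, real or synthetic.
        if all(math.hypot(x - p["x_km"], y - p["y_km"]) >= min_gap_km for p in nodes):
            nodes.append({
                "id": next_id + added,
                "x_km": x,
                "y_km": y,
                "pop": rng.randint(500, 5_000),  # hypothetical small-town range
                "source": "synthetic",
            })
            added += 1
    if added < n_infill:
        raise RuntimeError("could not place all infill nodes")
    return nodes
```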

---

## V7 — Scenario Knobs (Nodes-Only)

**Goal:** Parametric scenarios affecting nodes.
**Changes:** Regional/class growth factors; activation flags; new-town scenarios.
**Outputs:** `scenario_id, active_flag, pop_scenario`.
**Validation:** Delta reports per scenario; reproducibility via seed+config.
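
Growth factors and activation flags can be applied as a pure function of the base nodes and a scenario description, as sketched below. The scenario dictionary layout (`id`, `growth_by_class`, `growth_by_region`, `deactivate`) is hypothetical, and new-town scenarios are not covered.

```python
def apply_scenario(nodes, scenario):
    """Return new rows with scenario_id, active_flag, pop_scenario; inputs are not mutated."""
    by_class = scenario.get("growth_by_class", {})
    by_region = scenario.get("growth_by_region", {})
    deactivated = scenario.get("deactivate", ())
    rows = []
    for node in nodes:
        # Class and region factors multiply; a missing entry means no change.
        factor = by_class.get(node.get("class"), 1.0) * by_region.get(node.get("region_id"), 1.0)
        rows.append({
            **node,
            "scenario_id": scenario["id"],
            "active_flag": node["id"] not in deactivated,
            "pop_scenario": round(node["pop"] * factor),
        })
    return rows


def delta_report(rows):
    """Total population before and after, counting only active nodes in the scenario."""
    base = sum(row["pop"] for row in rows)
    scenario_total = sum(row["pop_scenario"] for row in rows if row["active_flag"])
    return {
        "scenario_id": rows[0]["scenario_id"],
        "pop_base_total": base,
        "pop_scenario_total": scenario_total,
        "delta": scenario_total - base,
    }
```

Because the function has no randomness of its own, the same base dataset and scenario config always yield the same report.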

---

## Common Standards (Applies Across Versions)

* **Schema (SV 1.x):** `id, x_km, y_km, pop` plus optional fields (`class, core_id, region_id, gdp_idx, tourism_score, urban_radius_km, ...`).
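* **CRS:** Declare the CRS explicitly. For synthetic boxes use `LOCAL_KM`; for real-world data adopt `EPSG:4326` (longitude/latitude in degrees) or `EPSG:3857` (Web Mercator, metres), and convert units so that fields such as `x_km, y_km` stay consistent with the declared CRS.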
* **Reproducibility:** Persist seed + full YAML config in `meta.json`.
* **Validation Pack:** node count, bounds, spacing, class quotas, pop percentiles, mask violations, calibration metrics.
* **Preview:** Scatter sized by `pop`, optionally colored by `class/core/region`, with legend.
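
The reproducibility bullet can be illustrated with a small writer for `meta.json`. The sketch assumes the YAML config has already been parsed into a plain dictionary (YAML parsing needs a third-party library and is left out), and the exact key names in the file are assumptions.

```python
import json
from pathlib import Path


def write_meta(out_dir, seed, config, schema_version, dataset_version):
    """Persist seed, full config, and version tags next to the dataset."""
    meta = {
        "seed": seed,
        "config": config,  # already-parsed config dictionary
        "schema_version": schema_version,
        "dataset_version": dataset_version,
    }
    path = Path(out_dir) / "meta.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys gives a stable file that diffs cleanly between runs.
    path.write_text(json.dumps(meta, indent=2, sort_keys=True), encoding="utf-8")
    return path
```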

---

### Changelog Discipline

* **`schema_version` (SV):** bump when columns/semantics/units change.
* **`dataset_version` (DV):** bump for content changes under the same schema (seed/params).

> Tip: Name output directories like `maps/sv1.0/dv0.3_v2_multicore_k3` so datasets are self-describing and easy to diff.
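
A small helper can keep directory names consistent with this convention. The function name and its parameters are hypothetical; only the resulting path pattern comes from the tip above.

```python
from pathlib import PurePosixPath


def dataset_dir(schema_version, dataset_version, label, root="maps"):
    """Build a self-describing output path such as maps/sv1.0/dv0.3_v2_multicore_k3."""
    return str(PurePosixPath(root) / f"sv{schema_version}" / f"dv{dataset_version}_{label}")
```

For example, `dataset_dir("1.0", "0.3", "v2_multicore_k3")` yields the path shown in the tip.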
