[Optimization 3/4] New h3_clustering.ipynb — H3 map clustering demo #4

@rdhyee

Description

Priority: 6

Context

No clustering visualization exists in the notebooks. Lonboard currently renders every individual point, which overwhelms the map at ~6M points. The H3 res6 column enables zoom-adaptive clustering.

Data Files on R2

| File | URL | Size |
| --- | --- | --- |
| Wide + H3 | https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_wide_h3.parquet | 292 MB |
| Facet summaries | https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_facet_summaries.parquet | 2 KB |

H3 columns: h3_res4 (BIGINT), h3_res6 (BIGINT), h3_res8 (BIGINT). 11.96M rows have H3 values.
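Since the columns are BIGINTs, the notebook may need to convert them for tools that exchange H3 indices as hex strings (h3-py, the h3geo web viewers). A minimal round-trip sketch; the BIGINT value is a hypothetical stand-in, not taken from the actual file:

```python
# H3 indices are 64-bit integers; many H3 tools accept/emit them as
# lowercase hex strings. Round-trip between the two representations:
h3_int = 604189641121202175   # hypothetical BIGINT value, stand-in for a real cell
h3_str = f"{h3_int:x}"        # hex-string form used by h3-py and web tools
assert int(h3_str, 16) == h3_int
print(h3_str)
```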

File to Create

examples/basic/h3_clustering.ipynb

Notebook Structure

Cell 1: Introduction (markdown)

Explain H3 hierarchical hexagonal indexing, why it's useful for geospatial clustering, link to h3geo.org.
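Worth noting in the intro: the parent/child hierarchy is encoded directly in the 64-bit index, which is why coarsening is cheap. A sketch of parent derivation based on the bit layout documented at h3geo.org (resolution in bits 52-55, one 3-bit digit per resolution level, unused digits set to 7); a real notebook should use h3-py's `cell_to_parent` instead:

```python
def h3_cell_to_parent(h: int, parent_res: int) -> int:
    """Coarsen an H3 cell index to a lower resolution (sketch).

    Based on the H3 bit layout documented at h3geo.org: the resolution
    lives in bits 52-55, and digit r occupies bits 3*(15-r)..3*(15-r)+2,
    with digits beyond the cell's resolution set to 7 ("unused").
    Prefer h3-py's cell_to_parent in production code.
    """
    res = (h >> 52) & 0xF
    if parent_res > res:
        raise ValueError("parent resolution must be coarser than the cell's")
    h = (h & ~(0xF << 52)) | (parent_res << 52)  # rewrite the resolution field
    for r in range(parent_res + 1, res + 1):     # mark finer digits as unused
        h |= 0b111 << (3 * (15 - r))
    return h
```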

Cell 2: Setup and H3 stats

import duckdb

con = duckdb.connect()
wide_h3_url = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_wide_h3.parquet"

# Show H3 column distribution
stats = con.sql(f"""
    SELECT
        COUNT(*) as total,
        COUNT(h3_res4) as with_h3,
        COUNT(DISTINCT h3_res4) as cells_res4,
        COUNT(DISTINCT h3_res6) as cells_res6,
        COUNT(DISTINCT h3_res8) as cells_res8
    FROM read_parquet('{wide_h3_url}')
    WHERE otype = 'MaterialSampleRecord'
""").df()
stats

Cell 3: Cluster at res6 (~3.2km hexagons)

clusters = con.sql(f"""
    SELECT
        h3_res6,
        COUNT(*) as n,
        AVG(latitude) as lat,
        AVG(longitude) as lon,
        MODE(source) as dominant_source  -- column name 'source' assumed; adjust to the actual provider column
    FROM read_parquet('{wide_h3_url}')
    WHERE otype = 'MaterialSampleRecord' AND h3_res6 IS NOT NULL
    GROUP BY h3_res6
""").df()

print(f"{len(clusters):,} clusters from {clusters.n.sum():,} samples")
print(f"Cluster sizes: min={clusters.n.min()}, median={clusters.n.median():.0f}, max={clusters.n.max():,}")

Cell 4: Lonboard clustered visualization

from lonboard import Map, ScatterplotLayer
import geopandas as gpd
import numpy as np

# Scale radius (meters) by log of count, clamped so small and huge clusters stay visible
clusters['radius'] = np.clip(np.log2(clusters['n']) * 500, 500, 50000)

# Color by dominant source (RGB)
source_colors = {
    'SESAR': [0, 100, 255],
    'OpenContext': [0, 200, 100],
    'GEOME': [255, 165, 0],
    'Smithsonian': [148, 0, 211]
}
colors = np.array(
    [source_colors.get(s, [128, 128, 128]) for s in clusters['dominant_source']],
    dtype=np.uint8,
)

# Lonboard renders from GeoDataFrames; build point geometries from lon/lat
gdf = gpd.GeoDataFrame(
    clusters,
    geometry=gpd.points_from_xy(clusters['lon'], clusters['lat']),
    crs="EPSG:4326",
)

layer = ScatterplotLayer.from_geopandas(
    gdf,
    get_radius=clusters['radius'].to_numpy(),
    get_fill_color=colors,
    opacity=0.6,
    pickable=True,
)
Map(layer)

Cell 5: Compare resolutions side-by-side

Show how res4 vs res6 vs res8 produce different clustering granularity. Include a table comparing cluster counts and a note about when to use each.

Cell 6: Benchmark — clustering vs full points

Time comparison: loading 112K clusters vs 6M individual points into Lonboard.
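The benchmark can follow a simple wall-clock pattern; the two commented-out loader calls are hypothetical placeholders for building the Lonboard layers from clustered vs. full point data:

```python
import time

def timed(label, fn):
    """Run fn once, print its wall-clock time, and return its result."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result

# Hypothetical loaders -- in the notebook these would build the Lonboard maps:
# clusters_map = timed("112K clusters", lambda: Map(make_cluster_layer()))
# full_map     = timed("6M points",     lambda: Map(make_full_point_layer()))
demo = timed("demo workload", lambda: sum(range(1_000_000)))
```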

Cell 7: Regional drill-down demo

Select a res4 cell, then show its res6 children, then res8. Demonstrate hierarchical zoom.
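Because each row in the wide parquet carries all three H3 columns, drill-down is a plain filter on the coarser column. A toy sketch with arbitrary integer cell IDs (the real notebook would filter the clustered dataframe by the cell the user selected):

```python
import pandas as pd

# Toy stand-in: each row keeps its parent cells, so drilling down from a
# res4 cell to its res6 children is just a filter plus a group-by.
df = pd.DataFrame({
    "h3_res4": [100, 100, 100, 101],
    "h3_res6": [200, 200, 201, 202],
    "n":       [5,   3,   2,   7],
})

parent = 100                                   # the res4 cell the user selected
children = (df[df.h3_res4 == parent]
            .groupby("h3_res6", as_index=False)["n"].sum())
print(children)  # res6 cells 200 (n=8) and 201 (n=2)
```

The same filter-then-group step repeats from res6 down to res8 for the final zoom level.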

Acceptance Criteria

  • New notebook created at examples/basic/h3_clustering.ipynb
  • Clear H3 explanation for newcomers
  • Lonboard map showing clusters colored by source
  • Multi-resolution comparison (res4/6/8)
  • Performance benchmark included
  • Notebook runs end-to-end without errors
