# 11 — GEDI L4C WSCI Extraction (footprint‑level)
Search and download **GEDI04_C (L4C) v2** granules via NASA Earthdata (ORNL DAAC),
then extract **WSCI** for an AOI and export GeoJSON/CSV.

Refs:
- Product page: https://gedi.umd.edu/gedi-l4c-footprint-level-waveform-structural-complexity-index-released/
- CMR concept (ORNL DAAC): https://cmr.earthdata.nasa.gov/search/concepts/C3049900163-ORNL_CLOUD.html


## Setup
Install extra deps once:
```bash
pip install earthaccess h5py geopandas shapely pyproj pandas
```
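You need a free **Earthdata Login** account. `earthaccess.login()` looks for credentials in environment variables or `~/.netrc` and prompts interactively if it finds none.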

In [ ]:
import json
import os
import pandas as pd
from shapely.geometry import box, Point
import geopandas as gpd
import h5py
import earthaccess

# ---- Parameters ----
aoi_bbox = [-57.0, -3.0, -54.0, -1.0]  # [min_lon, min_lat, max_lon, max_lat] — TODO: update
time_range = ("2019-04-01", "2025-12-31")
out_dir = "data/gedi_l4c"
os.makedirs(out_dir, exist_ok=True)

earthaccess.login()
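# earthaccess documents bounding_box as a tuple, so convert the list before searching.
results = earthaccess.search_data(
    short_name="GEDI04_C", version="2", temporal=time_range, bounding_box=tuple(aoi_bbox)
)
print("Granules:", len(results))
# If this prints 0, the collection short name at ORNL DAAC may differ from the file prefix;
# searching with concept_id="C3049900163-ORNL_CLOUD" (see Refs) is an alternative.
files = earthaccess.download(results, out_dir)
files[:3]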
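
A swapped or out-of-range AOI tends to produce an empty search rather than an error, so it can help to check the box before calling `search_data`. The helper below is a minimal sketch; `validate_bbox` is a hypothetical name, and it assumes the AOI does not cross the antimeridian.

```python
def validate_bbox(bbox):
    """Check a [min_lon, min_lat, max_lon, max_lat] box and return it as a tuple of floats.

    Assumption: the AOI does not cross the antimeridian, so min_lon < max_lon.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly 4 values")
    min_lon, min_lat, max_lon, max_lat = (float(v) for v in bbox)
    if not (-180.0 <= min_lon < max_lon <= 180.0):
        raise ValueError("longitudes must satisfy -180 <= min_lon < max_lon <= 180")
    if not (-90.0 <= min_lat < max_lat <= 90.0):
        raise ValueError("latitudes must satisfy -90 <= min_lat < max_lat <= 90")
    return (min_lon, min_lat, max_lon, max_lat)


# Example with the AOI used in this notebook.
checked = validate_bbox([-57.0, -3.0, -54.0, -1.0])
print(checked)
```

The returned tuple can be passed directly as `bounding_box=checked`.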

## Extract WSCI footprints from HDF5
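GEDI HDF5 granules store footprints in per-beam top-level groups (`BEAM0000`, `BEAM0001`, ...). For each beam we read latitude, longitude, and **WSCI**. Dataset names can differ in case and suffix, so confirm them against the granule before relying on the output.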
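
Because the exact dataset names are easy to get wrong, it may be worth listing what a granule actually contains before extracting anything. The sketch below walks an HDF5 file with `h5py`'s `visititems` and returns the datasets whose leaf name contains one of a few keywords. `list_datasets` and its default keywords are hypothetical helpers, not part of any GEDI tooling.

```python
import h5py


def list_datasets(path, keywords=("wsci", "lat", "lon")):
    """Return (full_name, shape, dtype) for datasets whose leaf name contains a keyword.

    Matching is case-insensitive; pass keywords=None to list every dataset.
    """
    found = []

    def visitor(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if not isinstance(obj, h5py.Dataset):
            return
        leaf = name.rsplit("/", 1)[-1].lower()
        if keywords is None or any(k.lower() in leaf for k in keywords):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as h5:
        h5.visititems(visitor)
    return found
```

Calling `list_datasets(files[0])` on one downloaded granule and printing the result shows which latitude, longitude, and WSCI names to use in the extraction cell.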

In [ ]:
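import numpy as np

# Candidate dataset names, tried in order. The lowercase and *_lowestmode variants
# follow other GEDI products and are an assumption; confirm them against the granule.
LAT_NAMES = ("lat", "lat_lowestmode")
LON_NAMES = ("lon", "lon_lowestmode")
WSCI_NAMES = ("WSCI", "wsci")

def first_present(group, names):
    """Return the first name in `names` that is a dataset in `group`, else None."""
    for name in names:
        if name in group and isinstance(group[name], h5py.Dataset):
            return name
    return None

frames = []
for fp in files:
    matched = False
    try:
        with h5py.File(fp, "r") as h5:
            # Scan top-level groups for one that holds lat, lon, and WSCI datasets.
            for grp in h5.keys():
                g = h5[grp]
                if not isinstance(g, h5py.Group):
                    continue
                lat_k = first_present(g, LAT_NAMES)
                lon_k = first_present(g, LON_NAMES)
                wsci_k = first_present(g, WSCI_NAMES)
                if lat_k and lon_k and wsci_k:
                    # Read whole arrays at once instead of appending row by row.
                    frames.append(pd.DataFrame({
                        "lat": np.asarray(g[lat_k][:], dtype=float),
                        "lon": np.asarray(g[lon_k][:], dtype=float),
                        "WSCI": np.asarray(g[wsci_k][:], dtype=float),
                    }))
                    matched = True
    except Exception as e:
        print("Skip file (could not read):", fp, e)
        continue
    if not matched:
        print("Skip file (structure not matched):", fp)

# Fall back to an empty frame so the steps below do not fail on a missing column.
empty = pd.DataFrame(columns=["lat", "lon", "WSCI"], dtype=float)
df = pd.concat(frames, ignore_index=True) if frames else empty
print(df.shape)
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326"
)
aoi = gpd.GeoDataFrame({"id": [0]}, geometry=[box(*aoi_bbox)], crs="EPSG:4326")
gdf = gdf.loc[gdf.within(aoi.iloc[0].geometry)]
gdf.to_file(os.path.join(out_dir, "gedi_wsci_points.geojson"), driver="GeoJSON")
# Export the same AOI-filtered footprints to CSV so both files agree.
gdf.drop(columns="geometry").to_csv(os.path.join(out_dir, "gedi_wsci_points.csv"), index=False)
print("Exported GeoJSON/CSV:", len(gdf))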

### Next
- Join WSCI stats to segments from the ΔdB map (Notebook 10) and compare against nearby controls.
- Log a second evidence line for the **same candidate network** using GEDI as the independent method.
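
The join in the first item could be sketched as a spatial join followed by a per-segment aggregate. Everything below is illustrative: the `segment_id` column, the polygon geometry of the segments, and the toy coordinates are assumptions, since the Notebook 10 outputs are not shown here. In practice `points` would be the exported WSCI footprints and `segments` the Notebook 10 layer (line segments would first need a buffer, in a projected CRS, before a `within` join).

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the exported WSCI footprints.
points = gpd.GeoDataFrame(
    {"WSCI": [9.0, 11.0, 10.0]},
    geometry=gpd.points_from_xy([-56.5, -56.4, -55.5], [-2.5, -2.4, -1.5]),
    crs="EPSG:4326",
)

# Hypothetical stand-in for the segments (assumed here to be polygons).
segments = gpd.GeoDataFrame(
    {"segment_id": ["seg_a", "seg_b"]},
    geometry=[box(-56.6, -2.6, -56.3, -2.3), box(-55.6, -1.6, -55.4, -1.4)],
    crs="EPSG:4326",
)

# Attach each footprint to the segment that contains it, then summarise per segment.
joined = gpd.sjoin(points, segments, how="inner", predicate="within")
stats = (
    joined.groupby("segment_id")["WSCI"]
    .agg(["count", "mean", "median"])
    .reset_index()
)
print(stats)
```

Running the same aggregate over control polygons gives the comparison baseline mentioned above.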