# GOES-R Lightning Data: Max Flash Point Extraction

This notebook processes GOES-R lightning NetCDF files to extract one "max flash point" per file: the location of the grid cell with the highest total optical energy. It converts the satellite-projected coordinates to geographic coordinates (latitude and longitude), collects the resulting points, and exports them as a GeoJSON file for visualization and further analysis.

The data used in this notebook comes from the **GOES-R Series Geostationary Lightning Mapper (GLM)** Level 2 NetCDF products provided by NOAA/NESDIS. These datasets capture lightning flash information across the Americas with high spatial and temporal resolution.

For more information, visit the official NOAA GLM product page:  
https://www.nesdis.noaa.gov/GOES-R-series/geostationary-lightning-mapper


---

## How to Retrieve GOES-R Lightning Data from NASA Earthdata

You can download GOES-R GLM data from NASA's Earthdata Search portal:

🔗 https://search.earthdata.nasa.gov/

### Steps:
1. Go to [search.earthdata.nasa.gov](https://search.earthdata.nasa.gov/).
2. In the search bar, enter: **"GLM L2 Lightning Detection GOES"** or a more specific product name like `"OR_GLM-L2-LCFA_G16"` or `"GLMF"`.
3. Use the map to narrow the spatial region (if desired).
4. Set your date range in the filter panel.
5. Browse the results and click **Download All** or use the **Customize** button to select specific files.
6. Sign in when prompted. A free [Earthdata Login](https://urs.earthdata.nasa.gov/users/new) is required to access and download files.

The downloaded `.nc` files can then be processed with the script in this notebook. The script reads the `Total_Optical_energy` variable and skips any file that does not contain it.
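
Before processing, it can help to check which satellite and time window each downloaded file covers. The sketch below parses that from the file name using only the standard library. It assumes the usual GOES-R naming pattern (`<env>_<product>_G<sat>_s<start>_e<end>_c<created>.nc`, with timestamps as year, day of year, hour, minute, second, and tenths of a second); the function name and the example file name are hypothetical, and files that do not match the pattern simply return `None`.

```python
import re
from datetime import datetime, timezone

# Assumed naming pattern: <env>_<product>_G<sat>_s<start>_e<end>_c<created>.nc
_GOES_NAME = re.compile(
    r"^(?P<env>[A-Z]+)_(?P<product>[A-Za-z0-9-]+)_G(?P<sat>\d{2})"
    r"_s(?P<start>\d{14})_e(?P<end>\d{14})_c(?P<created>\d{14})\.nc$"
)


def parse_goes_filename(name):
    """Return product, satellite, and UTC start time, or None if the name does not match."""
    match = _GOES_NAME.match(name)
    if match is None:
        return None
    # First 13 digits: year, day of year, hour, minute, second. The 14th is tenths of a second.
    start = datetime.strptime(match["start"][:13], "%Y%j%H%M%S")
    return {
        "product": match["product"],
        "satellite": f"G{match['sat']}",
        "start": start.replace(tzinfo=timezone.utc),
    }


# Hypothetical file name, for illustration only
example = "OR_GLM-L2-LCFA_G16_s20231520000000_e20231520000200_c20231520000215.nc"
info = parse_goes_filename(example)
print(info)
```

A check like this could be used to confirm that a folder only contains files from the expected year or satellite before running the full script.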

---

## Script Overview

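- Opens every `.nc` file in the configured input folder, using up to four worker threads.
- Identifies the grid cell with the maximum `Total_Optical_energy` (a proxy for lightning intensity) and skips files where that variable is missing, empty, or not positive.
- Converts that cell's `x`/`y` scan angles (radians) to latitude/longitude: the angles are multiplied by the satellite height to get projection coordinates in metres, then transformed from the geostationary CRS for GOES-East (-75°) or GOES-West (-137°), chosen from the file's `orbital_slot` attribute.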
- Compiles all maximum flash points into a GeoDataFrame.
- Exports the combined points as a GeoJSON file for easy use in GIS software and web mapping.
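
The coordinate conversion in the list above can be cross-checked without pyproj. The following is a minimal sketch of the inverse geostationary projection written out with plain trigonometry, following the navigation equations commonly given for the GOES-R fixed grid. It is not the method the script uses (the script relies on pyproj), and the function name and constant names are illustrative assumptions. The ellipsoid and height constants match the values in the script's CRS definitions.

```python
import math

# Constants assumed to match the script's CRS definition (GRS80 ellipsoid)
R_EQ = 6378137.0         # semi-major axis, metres
R_POL = 6356752.31414    # semi-minor axis, metres
SAT_HEIGHT = 35786023.0  # satellite height above the ellipsoid, metres
H = SAT_HEIGHT + R_EQ    # distance from Earth's centre to the satellite, metres


def scan_angle_to_lonlat(x, y, lon_0):
    """Convert scan angles x, y (radians) to (lon, lat) in degrees.

    lon_0 is the sub-satellite longitude in degrees.
    Returns None when the line of sight misses the Earth.
    """
    cos_x, sin_x = math.cos(x), math.sin(x)
    cos_y, sin_y = math.cos(y), math.sin(y)
    ratio = R_EQ**2 / R_POL**2

    # Quadratic for the distance from the satellite to the ellipsoid surface
    a = sin_x**2 + cos_x**2 * (cos_y**2 + ratio * sin_y**2)
    b = -2.0 * H * cos_x * cos_y
    c = H**2 - R_EQ**2
    disc = b**2 - 4.0 * a * c
    if disc < 0:
        return None  # scan angle points off the Earth's disk

    r_s = (-b - math.sqrt(disc)) / (2.0 * a)
    s_x = r_s * cos_x * cos_y
    s_y = -r_s * sin_x
    s_z = r_s * cos_x * sin_y

    lat = math.atan(ratio * s_z / math.hypot(H - s_x, s_y))
    lon = math.radians(lon_0) - math.atan(s_y / (H - s_x))
    return math.degrees(lon), math.degrees(lat)


# The centre of the grid should map to the sub-satellite point
print(scan_angle_to_lonlat(0.0, 0.0, -75.0))
```

Comparing a few points from this function against the pyproj transformer is one way to confirm that the `lon_0` and `sweep` settings are the ones a given file expects.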


---

**Michael Huff**  
GIS Analyst & Developer

*Feel free to reach out for questions or collaboration!*

michaelhuff17@gmail.com


In [1]:
import os
import xarray
import numpy as np
import geopandas as gpd
from shapely.geometry import Point
from pyproj import CRS, Transformer
from concurrent.futures import ThreadPoolExecutor

# -------------------
# CONFIG
# -------------------
input_folder = r"E:\GOES-R Lightning Data\2023"
output_file = r"E:\GOES-R Lightning Data\Processed\2023_flashes_processed.geojson"
sat_height = 35786023  # satellite height above the ellipsoid, in metres
wgs84 = CRS.from_epsg(4326)

# -------------------
# Build reusable transformers
# -------------------
transformers = {
    "east": Transformer.from_crs(
        CRS.from_proj4(
            f"+proj=geos +h={sat_height} +lon_0=-75 +sweep=x "
            "+a=6378137 +b=6356752.31414"
        ),
        wgs84,
        always_xy=True,
    ),
    "west": Transformer.from_crs(
        CRS.from_proj4(
            f"+proj=geos +h={sat_height} +lon_0=-137 +sweep=x "
            "+a=6378137 +b=6356752.31414"
        ),
        wgs84,
        always_xy=True,
    ),
}

# -------------------
# Worker function
# -------------------
def process_file(filepath):
    try:
        with xarray.open_dataset(filepath) as ds:
            if "Total_Optical_energy" not in ds:
                return None

            slot = ds.attrs.get("orbital_slot", "").lower()
            transformer = transformers["west"] if "west" in slot else transformers["east"]

            energy = ds["Total_Optical_energy"]
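            # With xarray's default decoding, _FillValue is already masked to NaN and
            # moved to .encoding, so this usually falls back to 0 and masks zero-energy cells.
            fill_value = energy.attrs.get("_FillValue", 0)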
            energy = energy.where(energy != fill_value)

            if np.all(np.isnan(energy)):
                return None

            # Load the grid into memory once, then locate the maximum while ignoring NaNs
            values = energy.values
            max_index = np.unravel_index(np.nanargmax(values), values.shape)
            max_value = float(values[max_index])

            if np.isnan(max_value) or max_value <= 0:
                return None

            x = float(ds["x"].values[max_index[1]])
            y = float(ds["y"].values[max_index[0]])
            lon, lat = transformer.transform(x * sat_height, y * sat_height)

            return {
                "filename": os.path.basename(filepath),
                "energy": max_value,
                "lon": lon,
                "lat": lat,
            }
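    except Exception as exc:
        # Report the failure instead of silently dropping the file
        print(f"Skipping {os.path.basename(filepath)}: {exc}")
        return None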

# -------------------
# Main
# -------------------
files = [
    os.path.join(input_folder, f)
    for f in os.listdir(input_folder)
    if f.endswith(".nc")
]

results = []
# Limit workers to avoid crashes (adjust as needed)
with ThreadPoolExecutor(max_workers=4) as executor:
    for res in executor.map(process_file, files):
        if res:
            results.append(res)

if results:
    gdf = gpd.GeoDataFrame(
        results,
        geometry=[Point(r["lon"], r["lat"]) for r in results],
        crs="EPSG:4326",
    )
    gdf.to_file(output_file, driver="GeoJSON")
    print(f"Saved {len(results)} flash points to {output_file}")
else:
    print("No valid flashes processed.")


Saved 92505 flash points to E:\GOES-R Lightning Data\Processed\2023_flashes_processed.geojson
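
To make the shape of the exported file concrete, here is a small sketch that builds an equivalent GeoJSON `FeatureCollection` with the standard library from records shaped like the dictionaries returned by `process_file`. The helper name and the two sample records are hypothetical and are not taken from the real output; the script itself writes the file through geopandas.

```python
import json


def to_feature_collection(records):
    """Build a GeoJSON FeatureCollection of points from result dictionaries."""
    features = []
    for record in records:
        features.append(
            {
                "type": "Feature",
                # Keep every field as a property, as the GeoDataFrame columns would be
                "properties": dict(record),
                # GeoJSON stores coordinates as [longitude, latitude]
                "geometry": {
                    "type": "Point",
                    "coordinates": [record["lon"], record["lat"]],
                },
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Hypothetical records, for illustration only
sample = [
    {"filename": "file_a.nc", "energy": 1.5, "lon": -80.25, "lat": 25.5},
    {"filename": "file_b.nc", "energy": 0.75, "lon": -120.0, "lat": 35.0},
]
collection = to_feature_collection(sample)
print(json.dumps(collection, indent=2))
```

Each feature carries the source file name and energy value, so points can be styled or filtered by energy in GIS software or a web map.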
