# ee-utils

A Python package for efficiently extracting satellite imagery patches from Google Earth Engine (GEE). This library provides high-level utilities for:

- **Patch Extraction**: Download georeferenced image patches centred on arbitrary coordinates
- **Neighbourhood Sampling**: Retrieve grids of contiguous patches around a point of interest
- **Parallel Downloads**: Concurrent requests via Earth Engine's high-volume API for rapid bulk exports
- **Multi-sensor Support**: Pre-configured data accessors for Landsat 4-9, Sentinel-2, and spatial covariates
- **Covariate Extraction**: Extract point-level values from climate, population, and land cover datasets

## Installation

Clone the repository and install:

```bash
git clone https://github.com/jordanimahori/ee-utils.git
cd ee-utils
python -m pip install .
```

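**Requirements**: Python 3.10+, plus the dependencies listed in `pyproject.toml` (earthengine-api, numpy, pandas, matplotlib, pillow, requests, tqdm, pyproj). The Quick Start below also imports `rasterio` to read exported GeoTIFFs; it is not in that list and may need to be installed separately.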

For development (linting and tests):

```bash
python -m pip install -e ".[dev]"
```

## Package Layout

```
src/ee_utils/
├── geo.py                # CRS helpers (get_utm_epsg, wgs84_to_utm)
├── sampling.py           # Core extraction functions (get_patch, extract_patches, etc.)
├── viz.py                # Visualisation utilities (plot_neighbourhood, preview_patch)
└── imagery/
    ├── landsat.py        # LandsatSR: cloud-masked Landsat 4-9 composites
    ├── sentinel2.py      # Sentinel2SR: Cloud Score+ masked Sentinel-2 imagery
    └── covariates.py     # SpatialCovariates: climate, population, land cover layers
```

## Quick Start

The following examples demonstrate the core functionality of the package.

In [None]:
import ee
import matplotlib.pyplot as plt
import numpy as np
import rasterio

from ee_utils import (
    LandsatSR,
    extract_patches,
    get_neighbourhood,
    get_patch,
    get_utm_epsg,
    plot_neighbourhood,
)

In [None]:
# Image patch parameters
SCALE = 30  # spatial resolution of export in metres per pixel
PATCH_SIZE = 224  # patch width and height in pixels

### Authentication

Before using the package, authenticate with Google Earth Engine. If you haven't authenticated before, `ee.Authenticate()` will open a browser window for OAuth login.

In [None]:
# Initialise using concurrency-optimised API route
ee.Authenticate()
ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com")

### CRS Helpers

The `get_utm_epsg()` function automatically determines the appropriate UTM zone for any WGS84 coordinate. This ensures patches are exported in a local projected CRS with minimal distortion.

In [None]:
# Get the EPSG code of the UTM zone covering Manila
manila = (120.9842, 14.5995)  # (lon, lat)
manila_crs = get_utm_epsg(manila)
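
For intuition, the standard zone arithmetic can be sketched in a few lines. This is not the package's implementation: `utm_epsg_sketch` is a hypothetical name, the string return format is an assumption, and the special-case zones around Norway and Svalbard are ignored.

```python
import math


def utm_epsg_sketch(pt):
    """Return the WGS84 / UTM EPSG code for a (lon, lat) pair.

    Hypothetical helper for illustration; not the package's get_utm_epsg().
    """
    lon, lat = pt
    # UTM zones are 6 degrees wide and numbered 1-60 eastward from 180°W.
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # EPSG:326xx covers the northern hemisphere, EPSG:327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return f"EPSG:{base + zone}"


print(utm_epsg_sketch((120.9842, 14.5995)))  # Manila -> EPSG:32651
```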

## Extracting a Single Patch

The workhorse utility is `get_patch()`, which underlies most functions in this package. It is a high-level wrapper around `ee.data.computePixels` (for computed `ee.Image` objects) and `ee.data.getPixels` (for Earth Engine assets).

**Key features:**
- Works with both Asset IDs (e.g., `"LANDSAT/LC09/C02/T1_L2/LC09_116050_20240110"`) and computed `ee.Image` objects
- Automatic UTM projection when `patch_crs` is omitted
- Returns the patch in the requested format: a NumPy array by default, or a GeoTIFF via `file_format="GEO_TIFF"`
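
To make the patch geometry concrete, the sketch below builds the kind of pixel-grid dictionary that `ee.data.computePixels` and `ee.data.getPixels` accept. It assumes the centre point has already been projected into the patch CRS. `patch_grid` is a hypothetical helper, and whether `get_patch()` assembles its request in exactly this way is an assumption.

```python
def patch_grid(center_xy, patch_size, patch_scale, crs_code):
    """Build a pixel-grid dict for a square patch centred on a projected point.

    Hypothetical helper; `center_xy` is (easting, northing) in `crs_code` units.
    """
    x, y = center_xy
    half_extent = patch_size * patch_scale / 2  # half the patch width in metres
    return {
        "dimensions": {"width": patch_size, "height": patch_size},
        "affineTransform": {
            "scaleX": patch_scale,          # pixel width
            "shearX": 0,
            "translateX": x - half_extent,  # x of the top-left corner
            "shearY": 0,
            "scaleY": -patch_scale,         # negative: rows run north to south
            "translateY": y + half_extent,  # y of the top-left corner
        },
        "crsCode": crs_code,
    }


grid = patch_grid(
    (500000.0, 1614000.0), patch_size=224, patch_scale=30, crs_code="EPSG:32651"
)
print(grid["affineTransform"]["translateX"])  # 496640.0
```

A 224-pixel patch at 30 m per pixel therefore covers 6.72 km on each side.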

In [None]:
# Get patch around Manila
patch = get_patch(
    pt=manila,
    patch_crs=manila_crs,
    patch_size=224,
    patch_scale=30,
    asset_id="LANDSAT/LC09/C02/T1_L2/LC09_116050_20240110",
)

# Stack red, green, blue (SR_B4, SR_B3, SR_B2 on Landsat 9) into (H, W, 3);
# stretch DNs to [0, 1]
rgb = np.stack([patch["SR_B4"], patch["SR_B3"], patch["SR_B2"]], axis=-1).astype(
    "float32"
)
rgb = np.clip((rgb - 7300) / (18000 - 7300), 0, 1)

plt.imshow(rgb)
plt.show()
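
The stretch limits above are raw digital numbers (DNs). Landsat Collection 2 Level-2 surface reflectance bands convert to reflectance with a scale factor of 0.0000275 and an offset of -0.2, so 7300 and 18000 correspond to reflectances of roughly 0.0 and 0.3. A minimal sketch of that conversion follows; `dn_to_reflectance` is a hypothetical helper, not part of the package.

```python
SR_SCALE = 0.0000275  # multiplicative scale factor for Collection 2 Level-2 SR bands
SR_OFFSET = -0.2      # additive offset for the same bands


def dn_to_reflectance(dn):
    """Convert a surface-reflectance DN (scalar or NumPy array) to reflectance."""
    return dn * SR_SCALE + SR_OFFSET


# Reflectance equivalents of the visualisation limits used above
for dn in (7300, 18000):
    print(dn, round(dn_to_reflectance(dn), 5))
```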

## Neighbourhood Extraction

This package also provides higher-level utilities that build on `get_patch()` to extract many patches. The `get_neighbourhood()` function fetches a grid of contiguous patches around a point.

**Parameters:**
- `levels`: Number of rings around the centre patch (e.g., `levels=2` gives a 5×5 grid of 25 patches)
- `concurrent_requests`: Number of parallel requests (default: 20)
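
The relationship between `levels` and the grid can be sketched as follows. `neighbourhood_offsets` is a hypothetical helper, and the row-major, north-west-first ordering is an assumption rather than the package's documented behaviour.

```python
def neighbourhood_offsets(levels, patch_size, patch_scale):
    """Return (dx, dy) centre offsets in metres for each patch in the grid.

    Hypothetical helper; rows are listed north to south, columns west to east.
    """
    step = patch_size * patch_scale  # width of one patch in metres
    ring = range(-levels, levels + 1)
    return [(col * step, -row * step) for row in ring for col in ring]


offsets = neighbourhood_offsets(levels=2, patch_size=224, patch_scale=30)
print(len(offsets))  # 25
print(offsets[0])    # (-13440, 13440)
```

Each side of the grid holds `2 * levels + 1` patches, so adjacent patch centres sit exactly one patch width apart and the patches tile without gaps or overlap.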

In [None]:
# Get image patches around Manila; visualise RGB bands.
asset_patches = get_neighbourhood(
    asset_id="LANDSAT/LC09/C02/T1_L2/LC09_116050_20240110",
    pt=manila,
    patch_scale=30,
    patch_crs=manila_crs,
    levels=2,
)
plot_neighbourhood(
    patches=asset_patches,
    levels=2,
    bands=["SR_B4", "SR_B3", "SR_B2"],  # red, green, blue on Landsat 9
    vis_min=7300,
    vis_max=18000,
)

## Creating Cloud-Free Composites

As the example above shows, single-date imagery often contains clouds. A temporal composite built with the `LandsatSR` class mitigates this.

### LandsatSR

The `LandsatSR` class provides a convenient interface for working with Landsat Collection 2 Level-2 Surface Reflectance data:

- **Multi-platform pooling**: Combines imagery from Landsat 4, 5, 7, 8, and 9 with harmonised band names
- **Cloud masking**: Automatic QA_PIXEL-based cloud and shadow masking
- **Band rescaling**: Optional rescaling to surface reflectance values (0-1 range)
- **Thermal band linking**: Joins thermal bands from TOA collections
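
As an illustration of QA_PIXEL-based masking, the sketch below tests the cloud (bit 3) and cloud shadow (bit 4) flags of a single QA value. Which bits `LandsatSR` actually checks is an assumption here, and `is_clear` is a hypothetical helper.

```python
CLOUD_BIT = 3         # QA_PIXEL bit 3: cloud
CLOUD_SHADOW_BIT = 4  # QA_PIXEL bit 4: cloud shadow


def is_clear(qa_pixel):
    """Return True when neither the cloud nor the cloud-shadow bit is set."""
    mask = (1 << CLOUD_BIT) | (1 << CLOUD_SHADOW_BIT)
    return (qa_pixel & mask) == 0


clear_value = 21824                            # bits 3 and 4 are both unset
cloudy_value = clear_value | (1 << CLOUD_BIT)  # same value with the cloud bit set
print(is_clear(clear_value))   # True
print(is_clear(cloudy_value))  # False
```

In Earth Engine the same test is typically expressed on whole images with `ee.Image.bitwiseAnd` followed by `eq(0)`.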

In [None]:
# Create a median composite for 2024 from all available Landsat imagery (raw DNs, no rescaling)
landsat = LandsatSR(
    start_date="2024-01-01", end_date="2024-12-31", rescale_bands=False
).images.median()

In [None]:
# Extract patches in the area around Manila; plot a true-colour composite
patches = get_neighbourhood(
    image=landsat, pt=manila, patch_scale=30, patch_crs=manila_crs, levels=2
)
plot_neighbourhood(
    patches=patches,
    levels=2,
    bands=["RED", "GREEN", "BLUE"],
    vis_min=7300,
    vis_max=18000,
)

## Bulk Patch Export

For large-scale exports, use `extract_patches()` to download many patches in parallel and save them as GeoTIFFs.

**Parameters:**
- `pts`: List of (lon, lat) coordinates
- `patch_ids`: Unique identifiers for each patch (used as filenames)
- `output_dir`: Directory to save GeoTIFFs
- `concurrent_requests`: Number of parallel downloads (default: 40)
- `overwrite_patches`: If `False`, skip existing files (useful for resuming interrupted exports)
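
The interaction between `concurrent_requests` and `overwrite_patches` can be pictured with a thread pool that skips files already on disk. This is a sketch under stated assumptions, not the package's implementation: `export_all` and `fetch` are hypothetical names, and `fetch` stands in for whatever call returns the GeoTIFF bytes for one patch.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def export_all(patch_ids, fetch, output_dir, concurrent_requests=40, overwrite=False):
    """Write one file per patch ID in parallel; return the IDs actually fetched.

    `fetch` is a hypothetical callable mapping a patch ID to GeoTIFF bytes.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Skip patches already on disk unless overwriting was requested.
    todo = [p for p in patch_ids if overwrite or not (out / f"{p}.tif").exists()]

    def download(patch_id):
        (out / f"{patch_id}.tif").write_bytes(fetch(patch_id))
        return patch_id

    with ThreadPoolExecutor(max_workers=concurrent_requests) as pool:
        return list(pool.map(download, todo))


# Demonstration with stub data in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    first = export_all(["Manila", "Tokyo"], lambda pid: b"stub", tmp)
    second = export_all(["Manila", "Tokyo", "Beijing"], lambda pid: b"stub", tmp)
    print(first, second)  # ['Manila', 'Tokyo'] ['Beijing']
```

The second call fetches only the patch that is missing, which is what makes resuming an interrupted export cheap.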

In [None]:
# Patch centres as (lon, lat) pairs, with a matching identifier for each
patch_centroids = [
    (120.9842, 14.5995),
    (139.6917, 35.6895),
    (103.8198, 1.3521),
    (116.4074, 39.9042),
]
patch_ids = ["Manila", "Tokyo", "Singapore", "Beijing"]

# Extract patches from requested locations
extract_patches(
    image=landsat,
    pts=patch_centroids,
    patch_ids=patch_ids,
    patch_size=224,
    patch_scale=30,
    output_dir="nb_outputs",
    overwrite_patches=True,
)

In [None]:
with rasterio.open("nb_outputs/Beijing.tif") as f:
    patch = f.read()

patch[1, 0:10, 0:10]  # show a 10x10 pixel extract from the second band (index 1)

## Additional Features

### Sentinel-2 Support

The `Sentinel2SR` class provides similar functionality for Sentinel-2 imagery with Cloud Score+ masking:

```python
from ee_utils import Sentinel2SR

s2 = Sentinel2SR(
    start_date="2024-01-01",
    end_date="2024-12-31",
    bands=["B2", "B3", "B4", "B8"],  # Blue, Green, Red, NIR
    clear_threshold=0.60  # Cloud Score+ threshold
)
composite = s2.images.median()
```

### Spatial Covariates

The `SpatialCovariates` class and `extract_spatial_covariates()` function enable extraction of point-level covariate values from multiple Earth Engine datasets:

- **TerraClimate**: Temperature (min/max)
- **CHIRPS**: Precipitation
- **VIIRS**: Nighttime lights
- **GHS**: Population and built-up area (JRC Global Human Settlement Layer)
- **Dynamic World**: Land cover probabilities
- **OpenLandMap**: Soil pH

```python
from ee_utils import SpatialCovariates, extract_spatial_covariates

covariates = SpatialCovariates(year=2024)
df = extract_spatial_covariates(
    image=covariates.all_covariates_comp,
    pts=[(120.98, 14.60), (139.69, 35.69)],
    patch_ids=["Manila", "Tokyo"],
    patch_scale=1000
)
```


## API Reference

### Core Functions

| Function | Description |
|----------|-------------|
| `get_patch()` | Extract a single patch as NumPy array or GeoTIFF |
| `get_neighbourhood()` | Extract a grid of contiguous patches |
| `extract_patches()` | Bulk export patches to GeoTIFF files |
| `extract_spatial_covariates()` | Extract point values as a DataFrame |

### CRS Utilities

| Function | Description |
|----------|-------------|
| `get_utm_epsg()` | Get UTM EPSG code for a WGS84 coordinate |
| `wgs84_to_utm()` | Convert WGS84 to UTM coordinates |

### Data Accessors

| Class | Description |
|-------|-------------|
| `LandsatSR` | Cloud-masked Landsat 4-9 Collection 2 Level-2 imagery |
| `Sentinel2SR` | Cloud Score+ masked Sentinel-2 Harmonized imagery |
| `SpatialCovariates` | Annual composite covariate layers |

## License

MIT License.