# Downloading Sentinel-2 Tiles

This notebook downloads the scenes selected by `04_maji_search.ipynb` from the Copernicus Data Space Ecosystem (CDSE) S3 storage.

> For implementation details, see `02_research_download.ipynb`.

**Prerequisites**

1. Run `04_maji_search.ipynb` first to generate `DATA/search_results/best_scenes.parquet`.
2. Register for a free CDSE account at [dataspace.copernicus.eu](https://dataspace.copernicus.eu/).
3. Generate an S3 access key at [eodata-s3keysmanager.dataspace.copernicus.eu](https://eodata-s3keysmanager.dataspace.copernicus.eu/).
4. Save the credentials in a `.env` file (see below).

In [None]:
import os
from pathlib import Path

import geopandas as gpd
from dotenv import load_dotenv

from maji import create_s3_session, download_tiles, DOWNLOAD_BANDS

print(f"Bands to download: {DOWNLOAD_BANDS}")

## Load Search Results

Load the scenes saved by `04_maji_search.ipynb`.

In [None]:
RESULTS_DIR = Path("../DATA/search_results")

# --- Options ---
# Set to True to download only the tile with most AOI overlap (faster demo)
# Set to False to download all tiles
SINGLE_TILE = True

# AOI bounding box (must match 04_maji_search.ipynb)
BBOX = (40.6, -2.6, 41.6, -1.6)  # Faza region, Kenya

# Load scenes
best = gpd.read_parquet(RESULTS_DIR / "best_scenes.parquet")
print(f"Loaded {len(best)} scene(s) from {RESULTS_DIR / 'best_scenes.parquet'}")

# Optionally filter to single tile with most AOI overlap
if SINGLE_TILE and len(best) > 1:
    import warnings
    from shapely.geometry import box
    aoi = box(*BBOX)
    best = best.copy()
    # Suppress CRS warning — we only need relative comparison, not absolute area
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", message=".*geographic CRS.*")
        best["aoi_overlap"] = best.geometry.intersection(aoi).area
    best = best.nlargest(1, "aoi_overlap").drop(columns=["aoi_overlap"])
    print("  → Filtered to 1 tile (most AOI overlap)")

print()
for _, row in best.iterrows():
    print(f"  {row['mgrs_tile']}  {row['datetime'].strftime('%Y-%m-%d')}  cloud: {row['cloud_cover']:.1f}%")
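
The selection above ranks tile footprints by the area they share with the AOI. The same idea can be sketched without geopandas by treating each footprint as an axis-aligned bounding box. This is a simplification: real footprints are polygons, and the tile IDs and extents below are hypothetical values made up for illustration.

```python
from typing import Dict, Tuple

# (min_lon, min_lat, max_lon, max_lat), in degrees
BBox = Tuple[float, float, float, float]


def overlap_area(a: BBox, b: BBox) -> float:
    """Area of the intersection of two axis-aligned boxes, in squared degrees."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    # A negative width or height means the boxes do not overlap on that axis.
    return max(width, 0.0) * max(height, 0.0)


def tile_with_most_overlap(tiles: Dict[str, BBox], aoi: BBox) -> str:
    """Return the ID of the tile whose box shares the largest area with the AOI."""
    return max(tiles, key=lambda tile_id: overlap_area(tiles[tile_id], aoi))


aoi = (40.6, -2.6, 41.6, -1.6)  # same AOI as BBOX in the cell above

# Hypothetical tile extents, for illustration only.
tiles = {
    "TILE_A": (40.0, -2.8, 41.0, -1.8),
    "TILE_B": (40.9, -2.7, 41.9, -1.7),
}

best_tile = tile_with_most_overlap(tiles, aoi)
print(best_tile)
```

As in the notebook cell, the areas are in squared degrees, which is adequate for ranking tiles against each other but not for reporting absolute area.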

## S3 Authentication

Load the CDSE credentials from a `.env` file. If you do not have one yet, create a `.env` file in the project root containing:

```
CDSE_ACCESS_KEY=your-access-key
CDSE_SECRET_KEY=your-secret-key
```

In [None]:
# Load credentials from .env file
load_dotenv(dotenv_path="../.env", override=True)

CDSE_ACCESS_KEY = os.environ.get("CDSE_ACCESS_KEY", "")
CDSE_SECRET_KEY = os.environ.get("CDSE_SECRET_KEY", "")

if not CDSE_ACCESS_KEY or not CDSE_SECRET_KEY:
    print("WARNING: CDSE credentials not found.")
    print("Set CDSE_ACCESS_KEY and CDSE_SECRET_KEY in ../.env")
else:
    print(f"Credentials loaded: {CDSE_ACCESS_KEY[:8]}...")

session = create_s3_session(CDSE_ACCESS_KEY, CDSE_SECRET_KEY, verbose=True)
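
The cell above only prints a warning when the credentials are missing and then still calls `create_s3_session` with empty strings. A stricter variant could fail fast instead. The following is a minimal sketch; `require_credentials` is a hypothetical helper, not part of `maji`.

```python
import os
from typing import Mapping, Tuple


def require_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (access_key, secret_key), or raise if either is missing or blank."""
    names = ("CDSE_ACCESS_KEY", "CDSE_SECRET_KEY")
    missing = [name for name in names if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            f"Missing CDSE credentials: {', '.join(missing)}. Set them in ../.env"
        )
    return env[names[0]], env[names[1]]
```

Calling `require_credentials()` right after `load_dotenv(...)` would stop the notebook with a clear message before the session is created.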

## Download Tiles

Download each scene as a multi-band GeoTIFF. Output files are saved to `DATA/<mgrs_tile>/<date>_S2L2A.tif`.

In [None]:
DATA_DIR = Path("../DATA")

print(f"Downloading {len(best)} scene(s) to {DATA_DIR}")
print(f"Bands: {DOWNLOAD_BANDS}")
print()

paths = download_tiles(
    scenes=best,
    data_dir=DATA_DIR,
    session=session,
    overwrite=False,
    verbose=True,
)

print()
print(f"Downloaded {len(paths)} file(s):")
for p in paths:
    size_mb = p.stat().st_size / 1e6
    print(f"  {p.relative_to(DATA_DIR)} ({size_mb:.1f} MB)")
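
To see which outputs are already on disk, for example before re-running the download, the layout described above (`DATA/<mgrs_tile>/<date>_S2L2A.tif`) can be scanned with a glob. A minimal sketch; `list_downloaded` is a hypothetical helper, not part of `maji`, and it assumes every tile directory sits directly under the data directory.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def list_downloaded(data_dir: Path) -> Dict[str, List[Path]]:
    """Group existing `<mgrs_tile>/<date>_S2L2A.tif` files by tile directory name."""
    by_tile: Dict[str, List[Path]] = defaultdict(list)
    # Only files one level below data_dir that end in _S2L2A.tif are matched,
    # so other folders such as search_results are ignored.
    for tif in sorted(data_dir.glob("*/*_S2L2A.tif")):
        by_tile[tif.parent.name].append(tif)
    return dict(by_tile)
```

In the notebook this could be called as `list_downloaded(DATA_DIR)`; an empty dictionary means nothing has been downloaded yet.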

## Verify Output

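Open the first downloaded GeoTIFF and print its size, dimensions, band count, CRS, data type, and band descriptions to confirm it has the expected structure.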

In [None]:
import rasterio

if paths:
    with rasterio.open(paths[0]) as src:
        print(f"File: {paths[0].name}")
        print(f"Size: {paths[0].stat().st_size / 1e6:.1f} MB")
        print(f"Dimensions: {src.width} x {src.height} pixels")
        print(f"Bands: {src.count}")
        print(f"CRS: {src.crs}")
        print(f"Dtype: {src.dtypes[0]}")
        print()
        print("Band descriptions:")
        for i in range(1, src.count + 1):
            desc = src.descriptions[i - 1] or f"Band {i}"
            print(f"  {i}: {desc}")
else:
    print("No files downloaded.")
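
The printout above is checked by eye. If the band descriptions in the file carry the same names as the entries of `DOWNLOAD_BANDS` (an assumption, since the cell falls back to `Band N` when a description is empty), the comparison could be automated. A minimal sketch; `missing_bands` is a hypothetical helper and the band names in the example are illustrative.

```python
from typing import List, Optional, Sequence


def missing_bands(
    descriptions: Sequence[Optional[str]], expected: Sequence[str]
) -> List[str]:
    """Return the expected band names that are absent from the band descriptions."""
    present = {d for d in descriptions if d}  # skip empty or None descriptions
    return [band for band in expected if band not in present]


# Illustrative values: a file with three bands, one of them undescribed.
example = missing_bands(("B02", "B03", None), ["B02", "B03", "B04"])
print(example)
```

Inside the `with rasterio.open(...)` block this could be called as `missing_bands(src.descriptions, DOWNLOAD_BANDS)`; an empty list would indicate that every requested band is labelled in the file.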

## Next Steps

The downloaded GeoTIFFs are ready for inference. See `06_maji_inference.ipynb` for the flood detection workflow.

```python
from maji import load_model, run_inference

model = load_model("path/to/model.pt")
prediction = run_inference(model, paths[0])
```