# Download Sentinel-2 data
* Iterate over each region in the supplied `regions.geojson` input
* For each region, query the Planetary Computer STAC catalog and load the matching scenes into a data cube
  * Group scenes by solar date and subset them to the area provided in `regions.geojson`
* Save one scene per solar date for each region
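
Grouping by solar date rather than UTC date keeps acquisitions from one overpass together even when the overpass straddles midnight UTC. The sketch below illustrates the idea using only the standard library. `solar_date` and `group_by_solar_date` are hypothetical helper names, the scene IDs and timestamps are made up, and the 240 seconds-per-degree offset is an approximation of local solar time, not necessarily the exact rule applied inside `download_cogs`.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta, timezone


def solar_date(utc_time: datetime, longitude: float) -> date:
    """Approximate the local solar date of an acquisition.

    The Earth rotates 360 degrees in 24 hours, so each degree of
    longitude shifts local solar time by 240 seconds.
    """
    offset = timedelta(seconds=int(longitude * 240))
    return (utc_time + offset).date()


def group_by_solar_date(scenes, longitude):
    """Group (scene_id, utc_time) pairs by approximate solar date."""
    groups = defaultdict(list)
    for scene_id, utc_time in scenes:
        groups[solar_date(utc_time, longitude)].append(scene_id)
    return dict(groups)


# Hypothetical scenes for a region at 157.5 degrees east.
# scene-a and scene-b fall on different UTC dates but the same solar date.
scenes = [
    ("scene-a", datetime(2022, 8, 11, 23, 59, 50, tzinfo=timezone.utc)),
    ("scene-b", datetime(2022, 8, 12, 0, 0, 10, tzinfo=timezone.utc)),
    ("scene-c", datetime(2022, 8, 17, 0, 0, 5, tzinfo=timezone.utc)),
]
groups = group_by_solar_date(scenes, longitude=157.5)
for day, ids in sorted(groups.items()):
    print(day, ids)
```

With these inputs, `scene-a` and `scene-b` end up in the same group even though their UTC dates differ, which is the behaviour the "save one scene per solar date" step relies on.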

Regions (those in **bold** were selected for this analysis):
* **Gambia-flooding-8-11-2022**
* Hurricane-Fiona-9-19-2022
* Hurricane-Ian-9-26-2022
* **Indonesia-Earthquake22**
* **Kahramanmaras-turkey-earthquake-23**
* New-Zealand-Flooding22
* New-Zealand-Flooding23
* Sudan-flooding-8-22-2022
* **afghanistan-earthquake22**
* **cyclone-emnati22**
* kentucky-flooding-7-29-2022
* pakistan-flooding22
* southafrica-flooding22
* tonga-volcano21
* **volcano-indonesia21**
* yellowstone-flooding22
* **baltimore-nd**

In [18]:
# Standard library imports
import json
import os
from pathlib import Path

# Third-party imports
import dask
import dask.distributed
import dask.utils
from datacube.utils.cog import write_cog
from dotenv import load_dotenv
import geopandas as gpd
import numpy as np
from odc.stac import configure_rio, stac_load
import pandas as pd
import planetary_computer as pc
from pystac_client import Client
import rasterio as rio
from rasterio.mask import mask as rio_mask
from rasterio.session import AWSSession
import xarray as xr
from IPython.display import display

# Local imports
from utils import to_float
from download import download_cogs


print("Load environment variables from .env file.")
load_dotenv()
USGS_API_KEY = os.environ["USGS_API_KEY"]
USGS_TOKEN_NAME = os.environ["USGS_TOKEN_NAME"]
USGS_USERNAME = os.environ["USGS_USERNAME"]
USGS_PASSWORD = os.environ["USGS_PASSWORD"]
AWS_ACCESS_KEY = os.environ["AWS_ACCESS_KEY"]
AWS_SECRET_KEY = os.environ["AWS_SECRET_KEY"]
NASA_EARTHDATA_S3_ACCESS_KEY = os.environ["NASA_EARTHDATA_S3_ACCESS_KEY"]
NASA_EARTHDATA_S3_SECRET_KEY = os.environ["NASA_EARTHDATA_S3_SECRET_KEY"]
NASA_EARTHDATA_S3_SESSION = os.environ["NASA_EARTHDATA_S3_SESSION"]
NASA_EARTHDATA_USERNAME = os.environ["NASA_EARTHDATA_USERNAME"]
NASA_EARTHDATA_PASSWORD = os.environ["NASA_EARTHDATA_PASSWORD"]

DATA_DIR = None  # Set this to a pathlib.Path pointing at your data root before running.
RES = 10
STAC_ENDPOINT = "https://planetarycomputer.microsoft.com/api/stac/v1"
COLLECTIONS = ["sentinel-2-l2a"]
COLLECTION_BANDS = ["blue", "green", "red", "nir08", "swir16", "swir22", "qa"]
OUTPUT_BANDS = ["blue", "green", "red", "nir08", "swir16", "swir22", "ndvi", "qa"]

if DATA_DIR is None:
    raise ValueError("You must specify a data directory.")

os.environ["GDAL_DISABLE_READDIR_ON_OPEN"] = "FALSE"

Load environment variables from .env file.


## Define directory paths
* `DATA_DIR` is a key input for this and all other notebooks. It is the root folder, and all output written by the notebooks is stored beneath it.

In [19]:
# Define directory paths
raw_dir = DATA_DIR / "raw"
interim_dir = DATA_DIR / "interim"
processed_dir = DATA_DIR / "processed"
region_dir = raw_dir / "regions"
dst_dir = interim_dir / "cogs"
dst_dir.mkdir(exist_ok=True, parents=True)
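
Because the paths above are built with the `/` operator, `DATA_DIR` needs to be a `pathlib.Path` rather than a plain string. A minimal sketch of the resulting layout follows, using a temporary directory as a stand-in for a real data root (the temporary directory is an assumption made purely for illustration):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)  # stand-in for DATA_DIR
    raw_dir = data_dir / "raw"  # defined but not created here
    dst_dir = data_dir / "interim" / "cogs"
    dst_dir.mkdir(exist_ok=True, parents=True)

    # List everything that now exists under the data root.
    created = sorted(p.relative_to(data_dir).as_posix() for p in data_dir.rglob("*"))
    print(created)
```

Only the `interim/cogs` branch is created. The `raw` folder is expected to exist already, since the later cells read `cfg.json` and `regions.geojson` from it.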

## Load regions
* The regions GeoJSON is another key input for this and other notebooks, and for `task.py`, which executes the same tasks as the notebooks. It lists the regions to download and process in a format that conforms to the [GeoJSON standard](https://geojson.org/).

In [20]:
print("[Download]: Load input region geojson and config.")
with open(raw_dir / "cfg.json") as f:
    cfg = json.load(f)
regions = gpd.read_file(raw_dir / "regions.geojson")
regions["time_range"] = regions["s2_start"] + "/" + regions["s2_end"]

Load input region geojson and config.
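
The `time_range` column built above joins each region's `s2_start` and `s2_end` properties into a `start/end` interval string, a form that STAC datetime searches accept. The sketch below shows what one region feature might look like and how the interval is derived from it, using only the standard library. The `name` property key and the polygon coordinates are placeholders, not values taken from the real `regions.geojson`.

```python
import json

feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {
                "name": "af-kharkamar-2022",  # hypothetical property key
                "s2_start": "2016-01-01",
                "s2_end": "2023-04-29",
            },
            "geometry": {
                "type": "Polygon",
                # Placeholder ring, not the real region footprint.
                "coordinates": [
                    [[0.0, 0.0], [0.1, 0.0], [0.1, 0.1], [0.0, 0.1], [0.0, 0.0]]
                ],
            },
        }
    ],
}

# Round-trip through JSON text, as if the file had been read from disk.
parsed = json.loads(json.dumps(feature_collection))
for feature in parsed["features"]:
    props = feature["properties"]
    time_range = f"{props['s2_start']}/{props['s2_end']}"
    print(props["name"], time_range)
```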


In [21]:
print("[Download]: Instantiate dask client.")
client = dask.distributed.Client()
configure_rio(cloud_defaults=True, client=client)
display(client)

Instantiate dask client.


Perhaps you already have a cluster running?
Hosting the HTTP server on port 58646 instead


Connection method: Cluster object
Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:58646/status
Workers: 4
Total threads: 16
Total memory: 15.93 GiB
Status: running
Using processes: True
Scheduler comm: tcp://127.0.0.1:58647
Started: Just now

| Worker comm | Dashboard | Nanny | Threads | Memory | Local directory |
| --- | --- | --- | --- | --- | --- |
| tcp://127.0.0.1:58667 | http://127.0.0.1:58669/status | tcp://127.0.0.1:58650 | 4 | 3.98 GiB | C:\Users\Peter\AppData\Local\Temp\dask-worker-space\worker-q4tmmjtu |
| tcp://127.0.0.1:58677 | http://127.0.0.1:58678/status | tcp://127.0.0.1:58651 | 4 | 3.98 GiB | C:\Users\Peter\AppData\Local\Temp\dask-worker-space\worker-4iairlav |
| tcp://127.0.0.1:58668 | http://127.0.0.1:58673/status | tcp://127.0.0.1:58652 | 4 | 3.98 GiB | C:\Users\Peter\AppData\Local\Temp\dask-worker-space\worker-9ikzgyvy |
| tcp://127.0.0.1:58666 | http://127.0.0.1:58671/status | tcp://127.0.0.1:58653 | 4 | 3.98 GiB | C:\Users\Peter\AppData\Local\Temp\dask-worker-space\worker-fv0duftd |


## Download COGs

In [23]:
download_cogs(regions, dst_dir, cfg, stac_endpoint=STAC_ENDPOINT, collections=COLLECTIONS)

[af-kharkamar-2022]: 2016-01-01/2023-04-29.
[af-kharkamar-2022]: Search catalog.
[af-kharkamar-2022]: Found 431 items.
[af-kharkamar-2022]: Selected 293 items.
[af-kharkamar-2022]: Load items into data cube.
[af-kharkamar-2022]: Re-order bands.
[af-kharkamar-2022]: Write 289 TIFs.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-01-15.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-02-14.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-03-15.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-04-14.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-04-27.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-05-07.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-06-16.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-06-23.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-07-13.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-08-22.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_2016-09-11.tif.
[af-kharkamar-2022]: Wrote af-kharkamar-2022_20