Unfortunately, the CDSE STAC catalogue doesn't work as well as it could (see: [https://github.com/Kayrros/sentinel-2-jp2-tlm](https://github.com/Kayrros/sentinel-2-jp2-tlm)). Luckily, the CDSE OData API can be used to download the data for processing, provided that enough disk space is available and you know the tile identifier.

The first step is to list all available product names, which `src/get_tile_list.py` handles.

```bash
python src/get_tile_list.py -h

usage: get_tile_list.py [-h] [--product_type PRODUCT_TYPE] [--cloud_cover CLOUD_COVER] [--outpath OUTPATH]
                        tile_id start_date end_date

positional arguments:
  tile_id                      Tile id to query, for example 34VEM
  start_date                   Start date for query, format yyyy-mm-dd
  end_date                     End date for query, format yyyy-mm-dd

options:
  -h, --help                   show this help message and exit
  --product_type PRODUCT_TYPE  Which product type to download (default: 1C)
  --cloud_cover CLOUD_COVER    Maximum cloud cover percentage, default 20 (default: 20)
  --outpath OUTPATH            Where to save the resulting product id list, default '.' (default: .)
```
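For reference, a query of this kind could be expressed as an OData `$filter` along the following lines. This is only a sketch, not the actual implementation of `get_tile_list.py`: the helper names `build_filter` and `build_query_url` are hypothetical, and the attribute names (`productType`, `cloudCover`) and the catalogue URL are assumed to follow the public CDSE OData documentation. The code only builds the URL and sends no request.

```python
from urllib.parse import quote

# Assumed public CDSE OData catalogue endpoint.
CATALOGUE_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"


def build_filter(tile_id, start_date, end_date, product_type="1C", cloud_cover=20):
    """Build an OData $filter string for Sentinel-2 products of one tile (hypothetical helper)."""
    clauses = [
        "Collection/Name eq 'SENTINEL-2'",
        # Product names contain the tile id prefixed with 'T', e.g. _T34VEM_
        f"contains(Name,'_T{tile_id}_')",
        f"ContentDate/Start gt {start_date}T00:00:00.000Z",
        f"ContentDate/Start lt {end_date}T23:59:59.999Z",
        "Attributes/OData.CSC.StringAttribute/any("
        "att:att/Name eq 'productType' and "
        f"att/OData.CSC.StringAttribute/Value eq 'S2MSI{product_type}')",
        "Attributes/OData.CSC.DoubleAttribute/any("
        "att:att/Name eq 'cloudCover' and "
        f"att/OData.CSC.DoubleAttribute/Value le {float(cloud_cover):.2f})",
    ]
    return " and ".join(clauses)


def build_query_url(odata_filter, top=100):
    """Return the full catalogue URL for a filter (hypothetical helper)."""
    return f"{CATALOGUE_URL}?$filter={quote(odata_filter)}&$top={top}"


odata_filter = build_filter("34VEM", "2022-06-01", "2022-06-30", "2A", 10)
print(build_query_url(odata_filter))
```

The resulting URL can be fetched with any HTTP client. Catalogue searches of this kind are assumed not to require authentication, unlike the downloads.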

Downloading the data requires a CDSE account, which can be created [here](https://documentation.dataspace.copernicus.eu/Registration.html). After that, `src/download_and_convert.py` can be used to bulk download the individual products and resample them to 10 m resolution. Make sure to run it with `--save_scl`, as the scene classification layer (SCL) is used for cloud and nodata masking.

```bash
usage: download_and_convert.py [-h] [--save_scl] product_name_file creds_file dl_path outpath

Wrapper to both download and convert data with multiprocessing

positional arguments:
  product_name_file  Path to txt file containing the product ids
  creds_file         Path to credential files
  dl_path            tmp path to download to
  outpath            Path to output mosaic folder

options:
  -h, --help         show this help message and exit
  --save_scl         Whether to save SCL to mosaics (default: False)
```
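To illustrate why saving the SCL matters, here is a minimal sketch of SCL-based masking on a toy array. It uses plain NumPy rather than the project's own code, and the arrays are made up. The class list matches the one used in `make_cubes.py`.

```python
import numpy as np

# SCL classes treated as invalid:
# 0 no data, 1 saturated or defective, 8 cloud medium probability,
# 9 cloud high probability, 10 thin cirrus.
INVALID_SCL = [0, 1, 8, 9, 10]

# Toy 2x3 reflectance band and its scene classification layer (made-up values).
band = np.array([[0.10, 0.20, 0.30],
                 [0.40, 0.50, 0.60]])
scl = np.array([[4, 9, 5],
                [0, 6, 10]])

# Keep pixels whose SCL class is valid and set the rest to NaN.
masked = np.where(np.isin(scl, INVALID_SCL), np.nan, band)
print(masked)
```

Masked pixels become NaN, so a later NaN-skipping median ignores them.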

After all data has been downloaded, `src/make_cubes.py` computes monthly medians and saves them in `zarr` format, so that all data for one tile and year ends up in a single store. The core of the script is:

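```python
import os

import pandas as pd
import rioxarray as rxr
import xarray as xr

# datapath (a Path-like object), outpath and BAND_NAMES are defined elsewhere in the script
files = [f for f in os.listdir(datapath) if f.endswith('tif')]
arrays = []
dates = []
# Sort by the sensing date embedded in the file name
files.sort(key=lambda x: x.split('_')[2].split('T')[0])
for f in files:
    date = f.split('_')[2].split('T')[0]
    dates.append(pd.to_datetime(date))
    da = rxr.open_rasterio(datapath/f, chunks={'x': 1024, 'y': 1024})
    da = da.assign_coords(band=BAND_NAMES)
    # Mask no data (0), saturated or defective (1) and cloud classes (8, 9, 10)
    mask = da.sel(band='SCL')
    da = da.where(~mask.isin([0, 1, 8, 9, 10]))
    # Drop the last band (expected to be SCL)
    da = da.sel(band=da.band.values[:-1])
    arrays.append(da)
cube = xr.concat(arrays, dim='time').assign_coords(time=dates)
cube = cube.chunk({'time': -1, 'band': -1, 'y': 1024, 'x': 1024})
# Per-pixel median over all observations within each calendar month
monthly = cube.groupby('time.month').median(skipna=True)
monthly = monthly.to_dataset(dim='band')
monthly.to_zarr(outpath, mode='w')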
```
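The date parsing above relies on the sensing timestamp being the third underscore-separated field of the file name. The sketch below shows that parsing and the resulting month grouping with the standard library only. The file names are hypothetical and assume that the converted mosaics keep the Sentinel-2 product naming scheme.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical mosaic file names following the Sentinel-2 product naming scheme.
files = [
    "S2B_MSIL2A_20220703T095549_N0400_R122_T35WNT_20220703T113247.tif",
    "S2A_MSIL2A_20220615T100031_N0400_R122_T35WNT_20220615T134521.tif",
    "S2A_MSIL2A_20220625T100041_N0400_R122_T35WNT_20220625T161810.tif",
]


def sensing_date(filename):
    """Parse the sensing date from the third underscore-separated field."""
    return datetime.strptime(filename.split("_")[2].split("T")[0], "%Y%m%d")


# Group the file names by calendar month, in chronological order.
by_month = defaultdict(list)
for f in sorted(files, key=sensing_date):
    by_month[sensing_date(f).month].append(f)

for month, names in by_month.items():
    print(month, len(names))
```

Grouping by month number merges the same month across different years, so this approach fits best when each run covers a single year, as in the shell script in this README.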

The full data processing chain can be run with this shell script:

```bash
#!/bin/bash

basepath=<set_your_own_datapath>

tiles=("35WNT")
years=("2022" "2021" "2020")
mosaic_path=$basepath/s2-data/tiles
median_path=$basepath/s2-data/medians

for year in ${years[@]}; do
    echo Processing year $year
    # -p creates missing parent directories and does not fail on reruns
    mkdir -p $mosaic_path/$year
    mkdir -p $median_path/$year
    for tile in ${tiles[@]}; do
        mkdir -p $mosaic_path/$year/$tile
        mkdir -p $median_path/$year/$tile
        # Get file list for tile
        python $basepath/src/get_tile_list.py $tile \
            ${year}-01-01 ${year}-12-31 \
            --cloud_cover 10 --product_type 2A \
            --outpath $mosaic_path/$year
        python $basepath/src/download_and_convert.py \
               $mosaic_path/$year/${tile}_tileids.txt \
               $basepath/secret/creds.json \
               $basepath/s2-data/products \
               $mosaic_path/$year/$tile \
               --save_scl

        echo Making monthly medians
        
        # Make monthly medians
        python $basepath/src/make_cubes.py \
            $mosaic_path/$year/$tile \
            $median_path/$year/$tile.zarr

    done
done
```