# Mirror GHS-composite-S2

This section covers creating a local mirror of the UK portion of the cloud-free composite Sentinel-2 mosaic created by the European Commission. The official website is:

> https://ghsl.jrc.ec.europa.eu/ghs_s2composite.php

The paper describing the dataset is:

> Corbane, C., Politis, P., Kempeneers, P., Simonetti, D., Soille, P., Burger, A., ... & Kemper, T. (2020). [A global cloud free pixel-based image composite from Sentinel-2 data](https://www.sciencedirect.com/science/article/pii/S2352340920306314). *Data in Brief*, 105737.

In [1]:
import sys
sys.path.insert(0, "../")
import utils
import geopandas
from dask import dataframe as dd
from dask.system import cpu_count

## Set up

Before accessing and downloading each GeoTIFF, let's set up the target folder:

In [2]:
local_dir = "../../data/"

The UTM tiles required to cover GB are the following:

In [3]:
gb_utm_tiles = ["30U", "31U", "29V", "30V"]

The metadata for the grid of tiles and their URLs is available as a GeoJSON. We read the file and exclude every tile that does not cover GB:

In [4]:
meta_p = "GHS-composite-S2.geojson"
meta = geopandas.read_file(meta_p)
meta = meta[meta["UTMtile"].isin(gb_utm_tiles)]

## Download scenes

All of the scenes will be stored in the same folder (`local_dir`) in the OSGB 1936 British National Grid ([`EPSG:27700`](http://epsg.io/27700)). This aligns with the rest of the data in the project and also allows setting up a single virtual raster (see next section).

### Generate a column with target file

In [5]:
meta["dst_path"] = meta["URL"].apply(lambda x: local_dir+x.split("/")[-1])
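The same target path can be derived a little more defensively by parsing the URL rather than splitting the raw string. The sketch below is illustrative only: `target_path` is a hypothetical helper that is not part of `utils`, and the host in `example_url` is made up (only the file name matches one of the tiles).

```python
import posixpath
from urllib.parse import urlparse


def target_path(url, local_dir):
    """Join `local_dir` with the file name found at the end of `url`.

    Hypothetical helper, not part of the project's `utils` module.
    """
    # Parsing the URL first ignores any query string or fragment
    file_name = posixpath.basename(urlparse(url).path)
    # `posixpath.join` copes with `local_dir` with or without a trailing slash
    return posixpath.join(local_dir, file_name)


# Made-up host, for illustration only
example_url = (
    "https://example.com/tiles/"
    "S2_percentile_UTM_148-0000069888-0000023296.tif"
)
print(target_path(example_url, "../../data/"))
```

With the notebook's table, this could be applied as `meta["URL"].apply(target_path, local_dir=local_dir)`.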

### Parallel download/reprojection

In parallel:
- Download each file to its target location
- Reproject to OSGB grid

In [6]:
# Ship `meta` to Dask
dmeta = dd.from_pandas(meta[["dst_path", "UTMtile", "URL"]],
                       npartitions=cpu_count()
                      )
# Apply in parallel
dout = dmeta.apply(utils.download_reproject, 
                   axis=1,
                   meta=("Output", None),
                   progressbar=False,
                   remove_intermediate=False,
                  )
_ = dout.compute()

04/11/2020 17:23:27 | Working on Tile 29V - File: S2_percentile_UTM_148-0000069888-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 29V - File: S2_percentile_UTM_148-0000000000-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 29V - File: S2_percentile_UTM_148-0000046592-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 30U - File: S2_percentile_UTM_209-0000069888-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 30U - File: S2_percentile_UTM_209-0000046592-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 30U - File: S2_percentile_UTM_209-0000023296-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 29V - File: S2_percentile_UTM_148-0000023296-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 30U - File: S2_percentile_UTM_209-0000000000-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 30V - File: S2_percentile_UTM_149-0000069888-0000023296.tif
04/11/2020 17:23:27 | Working on Tile 30V - File: S2_percentile_UTM_149-0000046592-0000023296.tif
04/11/2020 17:23:27 
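The `utils.download_reproject` helper is not shown in this notebook, so its exact behaviour is not documented here. As a rough sketch of what a per-tile download and reprojection step could look like, the code below downloads a file with the standard library and calls the `gdalwarp` command-line tool to reproject it to `EPSG:27700`. All function names are hypothetical, the `_osgb` suffix is inferred from the file pattern used in the `.vrt` step, and the real helper takes a table row plus `progressbar` and `remove_intermediate` options that are omitted here.

```python
import os
import subprocess
import urllib.request


def osgb_path(dst_path):
    """Name of the reprojected file: `<name>_osgb<ext>` (assumed convention)."""
    root, ext = os.path.splitext(dst_path)
    return root + "_osgb" + ext


def gdalwarp_command(src_path, out_path, t_srs="EPSG:27700"):
    """Build the `gdalwarp` call that reprojects `src_path` into `t_srs`."""
    return ["gdalwarp", "-t_srs", t_srs, src_path, out_path]


def download_reproject_sketch(url, dst_path):
    """Download `url` to `dst_path`, then reproject it to the OSGB grid.

    Hypothetical stand-in for `utils.download_reproject`; it requires
    network access and the GDAL command-line tools to be installed.
    """
    urllib.request.urlretrieve(url, dst_path)
    out_path = osgb_path(dst_path)
    subprocess.run(gdalwarp_command(dst_path, out_path), check=True)
    return out_path
```

Keeping the command construction separate from the call that executes it makes the naming and arguments easy to inspect without downloading anything.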

### Generate `.vrt` file

**NOTE** - The command below does not seem to work from the notebook. Running the equivalent command from a terminal, using paths that do not backtrack through parent folders (`../`), should work.

In [10]:
! gdalbuildvrt $local_dir"GHS-composite-S2.vrt" $local_dir"S2_percentile_UTM_*_osgb.tif"

0...10...20...30...40...50...60...70...80...90...100 - done.
ERROR 4: ../../data/S2_percentile_UTM_*_osgb.tif: No such file or directory
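One possible workaround, offered as an untested sketch rather than a confirmed fix, is to expand the wildcard in Python and pass the resulting file list to `gdalbuildvrt` explicitly, so that the command no longer depends on how the shell treats the quoted pattern. The helper name `vrt_command` is hypothetical; the file pattern and output name are taken from the command above.

```python
import glob
import os
import subprocess


def vrt_command(local_dir,
                pattern="S2_percentile_UTM_*_osgb.tif",
                vrt_name="GHS-composite-S2.vrt"):
    """Build a `gdalbuildvrt` call with the wildcard expanded in Python."""
    tiles = sorted(glob.glob(os.path.join(local_dir, pattern)))
    if not tiles:
        # Fail early with a clear message instead of passing a literal `*`
        raise FileNotFoundError(
            f"No files matching {pattern!r} in {local_dir!r}"
        )
    return ["gdalbuildvrt", os.path.join(local_dir, vrt_name)] + tiles


# Uncomment to build the virtual raster (requires the GDAL tools):
# subprocess.run(vrt_command("../../data/"), check=True)
```

For very long file lists, `gdalbuildvrt` also accepts a text file of inputs through its `-input_file_list` option.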
