# Sentinel Hub folder format

Uncomment the following line to install eotdl if needed.

In [None]:
# !pip install eotdl

When we download imagery through the Sentinel Hub client, the EOTDL environment by default saves every image in a folder named `<id>_<date>/<request_id>`. We can see this by inspecting the path of one of the downloaded rasters. If you missed how to download imagery from Sentinel Hub, see the [previous notebook](12_sh_download.ipynb).

In [13]:
from glob import glob

rasters = glob('data/sentinel_2/*/*/*.tiff')
rasters[0]

'data/sentinel_2/Jaca_2020-01-09/3edb5693f0d781595997bffd04b92a67/response.tiff'
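The pieces of this nomenclature can be recovered from a path with the standard library alone. The following is a minimal sketch and not part of EOTDL: the helper name `parse_sh_path` is hypothetical, and it assumes that the date is the last underscore-separated part of the folder name.

```python
from pathlib import PurePosixPath

def parse_sh_path(path):
    """Split a '<id>_<date>/<request_id>/<file>' path into its parts (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    request_id = parts[-2]
    # rpartition splits on the last underscore, so ids containing underscores stay whole
    location_id, _, date = parts[-3].rpartition('_')
    return {'id': location_id, 'date': date, 'request_id': request_id}

parse_sh_path('data/sentinel_2/Jaca_2020-01-09/3edb5693f0d781595997bffd04b92a67/response.tiff')
```

For the path shown above, this yields the id `Jaca`, the date `2020-01-09` and the request id `3edb5693f0d781595997bffd04b92a67`.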

To keep a logical structure and ensure that the dataset is digestible by the EOTDL environment, we must make sure that the project structure is compatible and that every image has an associated metadata file with the necessary information about it. This metadata is used later for STAC generation and for labeling with SCANEO.

To do so, the EOTDL environment provides a `Folder Formatter` that does exactly that: it extracts the images into a more human-readable folder structure, which is also necessary if we want to label them using SCANEO. What we need is a single folder containing all the images, each with an associated `json` file holding the necessary metadata, such as the acquisition date, the type, the bounding box, and so on.

Let's format the folder structure. We can choose between `structured_format_folders`, which places every image in its own directory, and `unestructured_format_folders`, which places all the images together in one given directory. We are going to use the latter, as we need all the images in a single directory to label them using SCANEO.

In [14]:
from eotdl.curation.folder_formatters import SHFolderFormatter

formatter = SHFolderFormatter('data/sentinel_2')
formatter.root

formatter.unestructured_format_folders()   # By default, it uses the path given in the constructor
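To make the effect of the flat layout concrete, the sketch below copies nested rasters into one folder and writes a small `json` file next to each. It is an illustration only, not the EOTDL implementation: the function name `flatten_sketch`, the sequential numbering order and the single `date` metadata key are all assumptions, and the real metadata files hold more fields.

```python
import json
import shutil
from pathlib import Path

def flatten_sketch(root, dst):
    """Copy '<id>_<date>/<request_id>/*.tiff' rasters into one flat folder (illustration only)."""
    root, dst = Path(root), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    counters = {}  # per-id running index used to build names like 'Jaca_1'
    for raster in sorted(root.glob('*/*/*.tiff')):
        location_id, _, date = raster.parts[-3].rpartition('_')
        counters[location_id] = counters.get(location_id, 0) + 1
        stem = f'{location_id}_{counters[location_id]}'
        shutil.copy(raster, dst / f'{stem}.tif')
        # Assumed metadata layout: only the date parsed from the folder name
        (dst / f'{stem}.json').write_text(json.dumps({'date': date}))
    return sorted(p.name for p in dst.iterdir())
```

Unlike the formatter, this sketch copies into a separate destination folder and leaves the downloaded files untouched, so it is safe to try on a scratch directory.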

Now, if we list the raster paths again, we will see that the folder structure is much more readable.

In [18]:
rasters = glob('data/sentinel_2/*.tif')
rasters[:5]

['data/sentinel_2/Jaca_1.tif',
 'data/sentinel_2/Jaca_2.tif',
 'data/sentinel_2/Jaca_3.tif',
 'data/sentinel_2/Jaca_4.tif']

We can look for their metadata files, too.

In [16]:
jsons = glob('data/sentinel_2/*.json')
jsons[:5]

['data/sentinel_2/Jaca_4.json',
 'data/sentinel_2/Jaca_2.json',
 'data/sentinel_2/Jaca_3.json',
 'data/sentinel_2/Jaca_1.json']
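Since every image is expected to have an associated metadata file, it can be worth checking that the two listings really match. Here is a minimal sketch using only the standard library; the helper name `missing_metadata` is hypothetical, and it assumes that each metadata file shares the file name of its raster, as in the listings above.

```python
from pathlib import Path

def missing_metadata(folder, raster_ext='.tif'):
    """Return the names of rasters in `folder` that have no same-named .json file."""
    folder = Path(folder)
    return sorted(
        raster.name
        for raster in folder.glob(f'*{raster_ext}')
        if not raster.with_suffix('.json').exists()
    )

missing_metadata('data/sentinel_2')  # an empty list means every raster has its metadata file
```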

What if we want to see a random image? We can display it and explore it interactively using [leafmap](https://leafmap.org/).

In [19]:
import leafmap.leafmap as leafmap
from random import choice

m = leafmap.Map()
raster = choice(rasters)
m.add_raster(raster, bands=[5, 4, 3], layer_name='Raster')
m

Map(center=[20, 0], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_text…