# Sentinel-2 Image Downloading 

This notebook downloads Sentinel-2 images from Google Earth Engine and formats them for preprocessing.

### How to Install

1. Create and activate the conda environment.

```
conda env create -f processing_environment.yml
conda activate ee
```
  
2. Install the Jupyter kernel.

```
python -m ipykernel install --user --name ee --display-name "ee kernel"
```

3. In a new JupyterLab notebook, select the kernel 'ee kernel'.

Source on how to install ee: https://developers.google.com/earth-engine/python_install-conda

### How to Add New Areas

In utils/gee_settings.py:
1. In the 'areas' list, add the area name with spaces removed, e.g. Villa del Rosario > villadelrosario
2. In the BBOX dict, add the bounding box as a list of 4 numbers: the upper-left and lower-right corners
3. In the CLOUD_PARAMS dict, specify the cloud filter and whether clouds will be masked
4. In admin2RefN, add the area's name as it appears in the Admin Boundary shapefile
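
As a rough illustration of the four steps above, the entries for one area might look like the sketch below. All values are placeholders. The coordinate order inside the bounding box and the `(cloud_pct, mask)` tuple per year range are assumptions inferred from how the notebook reads `CLOUD_PARAMS[ea][year]`. The helper `missing_settings` is hypothetical and not part of `gee_settings.py`.

```python
# Hypothetical settings for one area; every value here is a placeholder.
areas = ['villadelrosario']

BBOX = {
    # Assumed order: upper-left x, upper-left y, lower-right x, lower-right y
    'villadelrosario': [-72.50, 7.90, -72.40, 7.80],
}

CLOUD_PARAMS = {
    # Per year range: (maximum cloud cover in percent, mask clouds or not)
    'villadelrosario': {
        '2020-2021': (40, True),
        '2022-2023': (40, True),
    },
}

admin2RefN = {
    # Name as written in the Admin Boundary shapefile (placeholder)
    'villadelrosario': 'Villa del Rosario',
}


def missing_settings(area):
    """Return the names of the settings that have no entry for `area`."""
    settings = {'BBOX': BBOX, 'CLOUD_PARAMS': CLOUD_PARAMS, 'admin2RefN': admin2RefN}
    return [name for name, table in settings.items() if area not in table]


for area in areas:
    print(area, missing_settings(area))
```

A check like this can catch an area that was added to one dict but forgotten in another before any download task is started.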

Once the downloaded file appears in gs://immap-gee:
1. Check whether the area is split into multiple files
2. If so, add the area to the multi-part list in the "Input params" section
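
The multi-part check above could be sketched as follows, given a list of file names from the bucket. This assumes that split exports carry a tile suffix of two 10-digit offsets (as in `gee_{area}_{year}0000000000-0000000000`, used later in this notebook) and a `.tif` extension. The function name `find_multipart_areas` and the sample listing are hypothetical.

```python
import re

# Assumed naming: gee_<area>_<YYYY-YYYY>[<10 digits>-<10 digits>].tif
EXPORT_NAME = re.compile(
    r'^gee_(?P<area>[^_]+)_(?P<years>\d{4}-\d{4})(?P<tile>\d{10}-\d{10})?\.tif$'
)


def find_multipart_areas(filenames):
    """Return the sorted areas that have at least one tile-suffixed file."""
    multipart_areas = set()
    for name in filenames:
        match = EXPORT_NAME.match(name)
        if match and match.group('tile'):
            multipart_areas.add(match.group('area'))
    return sorted(multipart_areas)


# Hypothetical bucket listing, for illustration only
listing = [
    'gee_chia_2019-2020.tif',
    'gee_pereira_2019-20200000000000-0000000000.tif',
    'gee_pereira_2019-20200000000000-0000009472.tif',
]
print(find_multipart_areas(listing))
```

The returned names are the ones to add to the multi-part list. File names that do not follow the assumed pattern are ignored.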

## Imports of Required Packages and Setup


### Import Python packages

In [11]:
import os
import sys
import geopandas as gpd
from fiona.crs import to_string
import pathlib
from tqdm import tqdm


### Import customized modules

In [12]:
### Add local modules to the path
src = os.path.abspath('../scripts')
if src not in sys.path:
    # Insert at the front so the local modules take precedence
    sys.path.insert(0, src)


In [13]:
from gee import sen2median, deflatecrop1
from gee_settings import BBOX, CLOUD_PARAMS, admin2RefN
from mkdir import check_create_dir


## Setup useful directories

In [4]:
### Define working base path
root = os.path.abspath("../../../../Abidjan/outputs")
root


'/Users/ldjeutsch/Offliners/PYTHON-DATA-SCI/UM-Project/Abidjan/outputs'

In [5]:
# Define working path 
data_dir = os.path.join(root, "data")
adm_dir = os.path.join(data_dir, "admin_bounds")
img_dir = os.path.join(data_dir, "images")
tmp_dir = os.path.join(data_dir, "tmp")


In [6]:
### Check and create output data directory if needed
list_directories = [data_dir, adm_dir, img_dir, tmp_dir]
for path in list_directories:
    check_create_dir(path) 


The folder '/Users/ldjeutsch/Offliners/PYTHON-DATA-SCI/UM-Project/Abidjan/outputs/data' already exists
The folder '/Users/ldjeutsch/Offliners/PYTHON-DATA-SCI/UM-Project/Abidjan/outputs/data/admin_bounds' already exists
The folder '/Users/ldjeutsch/Offliners/PYTHON-DATA-SCI/UM-Project/Abidjan/outputs/data/images' already exists
The folder '/Users/ldjeutsch/Offliners/PYTHON-DATA-SCI/UM-Project/Abidjan/outputs/data/tmp' already exists


In [7]:
### Get area shape file (uncomment before running "Deflate and crop", which uses `gdf`)
# gdf = gpd.read_file(os.path.join(adm_dir, 'admin_bounds.gpkg'))
# fcrs = to_string({'init': 'epsg:4326', 'no_defs': True})
# gdf.crs = fcrs


## Input params

In [8]:
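PRODUCT = 'COPERNICUS/S2_SR'  # L2A
years = ['2020-2021', '2022-2023']


def get_minmaxdt(year_str):
    """Return (start, end) dates for a 'YYYY-YYYY' string: 1 Sep of the first year, 31 Dec of the second."""
    list_ = year_str.split('-')
    return list_[0] + '-09-01', list_[1] + '-12-31'


# Areas to download; each name must be a key in BBOX and CLOUD_PARAMS.
# While this list is empty, the download loop below does nothing.
ibadan_eas = []

ibadan_lga = ['Ibadan North', 'Ibadan North West', 'Ibadan North East', 'Ibadan South West', 'Ibadan South East']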
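
To make the date window concrete, here is a minimal sketch of the same mapping that `get_minmaxdt` performs, with a basic sanity check added. The function name `year_range_to_dates` and the validation are illustrative assumptions, not part of the notebook's modules.

```python
from datetime import date


def year_range_to_dates(year_str):
    """Map 'YYYY-YYYY' to ISO dates: 1 Sep of the first year, 31 Dec of the second."""
    start_year, end_year = (int(part) for part in year_str.split('-'))
    if end_year < start_year:
        raise ValueError(f'end year precedes start year in {year_str!r}')
    return date(start_year, 9, 1).isoformat(), date(end_year, 12, 31).isoformat()


print(year_range_to_dates('2020-2021'))  # ('2020-09-01', '2021-12-31')
```

Each entry in `years` therefore covers a 16-month window, over which the imagery is aggregated for that period.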


## Download from GEE

In [5]:
for ea in ibadan_eas:
    for year in years:
        cloud_pct, mask = CLOUD_PARAMS[ea][year]
        min_dt, max_dt = get_minmaxdt(year)
        sen2median(
            BBOX[ea], 
            FILENAME = f'gee_{ea}_{year}', 
            min_dt = min_dt, 
            max_dt = max_dt,
            cloud_pct = cloud_pct, 
            mask = mask,
            PRODUCT = PRODUCT,
            verbose = 1
        )


Processing gee_pereira_2015-2016
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_pereira_2017-2018
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_pereira_2019-2020
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_chia_2015-2016
using COPERNICUS/S2
Filtering to images with cloud cover < 20
with mask
Task started
Processing gee_chia_2017-2018
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_chia_2019-2020
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_pamplona_2015-2016
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_pamplona_2017-2018
using COPERNICUS/S2
Filtering to images with cloud cover < 40
with mask
Task started
Processing gee_pamplona_2019-2020
using COPERNICUS/S2
Filtering

## Deflate and crop

In [6]:
# create shapefiles for cropping
for area in areas:
    area1 = gdf[gdf['admin2RefN'] == admin2RefN[area]]
    # join with os.path.join so the file lands inside adm_dir
    area1.to_file(os.path.join(adm_dir, area + '.shp'))


In [7]:
# collect filenames to be processed
files_ = []

for area in areas:
    for year in years:
        if area in multipart:
            # just get the largest part
            files_.append(f'gee_{area}_{year}0000000000-0000000000')
        else:
            files_.append(f'gee_{area}_{year}')


In [None]:
for f in tqdm(files_):
    deflatecrop1(
        raw_filename = f, 
        output_dir = img_dir, 
        adm_dir = adm_dir,
        tmp_dir = tmp_dir,
        bucket = 'gs://immap-images/20200613/',
        clear_local = True
    )


  3%|▎         | 2/69 [05:19<2:58:26, 159.81s/it]
