# South Sudan Data Layers

This notebook prepares the data layers for South Sudan. The layers are processed and then uploaded: raster tiles to an S3 bucket and vector layers to Mapbox.

## Data Hierarchy

The data hierarchy is documented in [this spreadsheet](https://docs.google.com/spreadsheets/d/1RdJCjygAiWu2zBMGRF0ayigzrA2WhaObWMNdlkllgSQ/edit?usp=sharing).

## Data Access
**Input Data**

The data is stored in the following Google Cloud Storage bucket (sourced from GMV):
- https://console.cloud.google.com/storage/browser/wbhydross_deliverables
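The processing logs in this notebook refer to the bucket both as `gs://wbhydross_deliverables/...` and as `https://storage.googleapis.com/wbhydross_deliverables/...`. A minimal sketch of converting between the two forms follows. The helper name `gs_to_https` is hypothetical, and the resulting URL only works for objects that are publicly readable, which is an assumption here.

```python
from urllib.parse import quote


def gs_to_https(gs_uri: str) -> str:
    """Convert gs://bucket/key into https://storage.googleapis.com/bucket/key."""
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {gs_uri!r}")
    bucket, _, key = gs_uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"Missing bucket name in {gs_uri!r}")
    # Percent-encode the object key (for example spaces) but keep "/" separators.
    return f"https://storage.googleapis.com/{bucket}/{quote(key)}"
```

Some object keys in this bucket contain spaces (for example `02- Meteorological datasets`), which is why the key is percent-encoded.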

**Output Data**
- Raster layers: S3 bucket
- Vector layers: Mapbox


## Setup

### Library import


In [1]:
# imports
import os
import sys
from pathlib import Path
from pprint import pprint

# Add the local source directories under ../src/ to the import path
sys.path.append("../src/")
sys.path.append("../src/animations")
sys.path.append("../src/datasets")
sys.path.append("../src/helpers")
sys.path.append("../src/datasets/factory")

from datasets.datasets import dataset_database
from datasets.processing import LayerProcessing
from helpers.mapbox_uploader import upload_to_mapbox
from helpers.s3_uploader import upload_files_to_s3_parallel
from helpers.settings import get_settings

In [2]:
# Load settings with environment variables
settings = get_settings()
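The implementation of `get_settings` is not shown in this notebook. As a rough illustration of the idea, the sketch below reads the two attributes used later (`MAPBOX_USER` and `MAPBOX_TOKEN`) from environment variables and fails early if either is missing. The class `MapboxSettings` and the function `load_mapbox_settings` are hypothetical names, not the project's API.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MapboxSettings:
    """Hypothetical container for the Mapbox credentials used in this notebook."""

    MAPBOX_USER: str
    MAPBOX_TOKEN: str


def load_mapbox_settings(env: Optional[Mapping[str, str]] = None) -> MapboxSettings:
    """Build settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [key for key in ("MAPBOX_USER", "MAPBOX_TOKEN") if not env.get(key)]
    if missing:
        # Fail before any processing starts rather than at upload time.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return MapboxSettings(env["MAPBOX_USER"], env["MAPBOX_TOKEN"])
```

Passing the mapping explicitly keeps the function easy to exercise without touching the real environment.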

# Data Acquisition

## Dataset information

In [3]:
datasets = dataset_database.datasets()
pprint(datasets)

{'Agricultural drought exposure': <datasets.datasets.Dataset object at 0x7f47e9f27b90>,
 'Agricultural drought hazard': <datasets.datasets.Dataset object at 0x7f47e9f27c20>,
 'Boundaries': <datasets.datasets.Dataset object at 0x7f47e9f27bc0>,
 'Contextual layers': <datasets.datasets.Dataset object at 0x7f47e9f27b60>,
 'EO-based flood exposure': <datasets.datasets.Dataset object at 0x7f47e9f27b30>,
 'EO-based flood hazard': <datasets.datasets.Dataset object at 0x7f47e9f27b00>,
 'Hydrographic data': <datasets.datasets.Dataset object at 0x7f47e9f27ad0>,
 'Hydrometeorological Data': <datasets.datasets.Dataset object at 0x7f47e9f27aa0>,
 'Meteorological drought exposure': <datasets.datasets.Dataset object at 0x7f47e9f27a70>,
 'Meteorological drought hazard': <datasets.datasets.Dataset object at 0x7f47e9f27a40>,
 'Model-based flood exposure': <datasets.datasets.Dataset object at 0x7f47e9f27a10>,
 'Model-based flood hazard': <datasets.datasets.Dataset object at 0x7f47e9f27950>,
 'Populated in

# Static Layers
## Floods and Droughts Layers
### Create layers

In [4]:
datasets_list = [
    "Model-based flood hazard",
    "EO-based flood hazard",
    "Meteorological drought hazard",
    "Agricultural drought hazard",
    "Model-based flood exposure",
    "EO-based flood exposure",
    "Meteorological drought exposure",
    "Agricultural drought exposure",
]

dict_path = "../data/processed/datasets_dict.json"

layer_processing = LayerProcessing(datasets, datasets_list, dict_path)
layer_processing.create_layers()

  0%|          | 0/8 [00:00<?, ?it/s]

Model-based flood hazard


100%|██████████| 10/10 [00:00<00:00, 180013.05it/s]


EO-based flood hazard


100%|██████████| 2/2 [00:00<00:00, 50533.78it/s]


Meteorological drought hazard


100%|██████████| 1/1 [00:00<00:00, 28149.69it/s]


Agricultural drought hazard


100%|██████████| 1/1 [00:00<00:00, 16008.79it/s]


Model-based flood exposure


100%|██████████| 20/20 [00:00<00:00, 321402.61it/s]


EO-based flood exposure


100%|██████████| 10/10 [00:00<00:00, 283398.92it/s]


Meteorological drought exposure


100%|██████████| 8/8 [00:00<00:00, 277309.36it/s]


Agricultural drought exposure


100%|██████████| 8/8 [00:00<00:00, 138084.08it/s]
100%|██████████| 8/8 [00:00<00:00, 536.48it/s]
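Each entry in `datasets_list` must match a key of the `datasets` mapping exactly, including capitalisation. The sketch below is one way to catch a mismatch before processing starts. The function name `check_dataset_names` is hypothetical, and it assumes only that `datasets` behaves like a dictionary keyed by dataset name, as the printed output above suggests.

```python
from difflib import get_close_matches
from typing import Dict, Iterable, List, Mapping


def check_dataset_names(registry: Mapping[str, object], requested: Iterable[str]) -> Dict[str, List[str]]:
    """Map each unknown requested name to up to three similar known names."""
    known = list(registry)
    problems: Dict[str, List[str]] = {}
    for name in requested:
        if name not in registry:
            # Suggest near matches to make typos and case differences easy to spot.
            problems[name] = get_close_matches(name, known, n=3, cutoff=0.6)
    return problems
```

An empty result means every requested dataset exists; otherwise the suggestions point to the likely intended name.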


### Raster layers

#### Upload raster tiles to S3 bucket

In [4]:
directory_path = "../data/processed/RasterTiles/"
bucket_folder = "raster-tiles"

all_folders = os.listdir(directory_path)

for folder in all_folders:
    local_directory = os.path.join(directory_path, folder)
    # Skip stray files; only tile folders are uploaded
    if os.path.isdir(local_directory):
        print(f"Uploading folder {folder}:")
        upload_files_to_s3_parallel(local_directory, f"{bucket_folder}/{folder}")

Uploading folder EFH_flood_extent_(2017-2022):


100%|██████████| 21320/21320 [02:13<00:00, 159.98it/s]


Uploading folder ADE_cropland_and_grassland:


100%|██████████| 21125/21125 [02:14<00:00, 157.64it/s]


Uploading folder EFE_population:


100%|██████████| 20221/20221 [02:06<00:00, 160.39it/s]


Uploading folder MDE_built-up_surface:


100%|██████████| 21125/21125 [02:10<00:00, 161.58it/s]


Uploading folder MFH_fluvial_flood_depth_10-yr_rp:


100%|██████████| 23061/23061 [02:22<00:00, 162.24it/s]


Uploading folder MDH_combined_spi_and_spei_indices:


100%|██████████| 24929/24929 [02:33<00:00, 162.01it/s]


Uploading folder MFH_fluvial_flood_depth_50-yr_rp:


100%|██████████| 23061/23061 [02:23<00:00, 161.16it/s]


Uploading folder MFH_pluvial_flood_depth_20-yr_rp:


100%|██████████| 23061/23061 [02:23<00:00, 161.19it/s]


Uploading folder ADH_combined_sndvi_and_sma_indices:


100%|██████████| 21125/21125 [02:10<00:00, 161.28it/s]


Uploading folder MFE_population_5-yr_rp:


100%|██████████| 20799/20799 [02:08<00:00, 162.24it/s]


Uploading folder MFE_cropland_and_grassland_5-yr_rp:


100%|██████████| 20799/20799 [02:21<00:00, 146.77it/s]


Uploading folder MFH_pluvial_flood_depth_100-yr_rp:


100%|██████████| 23061/23061 [02:24<00:00, 159.83it/s]


Uploading folder MFH_fluvial_flood_depth_20-yr_rp:


100%|██████████| 23061/23061 [02:22<00:00, 162.27it/s]


Uploading folder MFE_built-up_surface_5-yr_rp:


100%|██████████| 20799/20799 [02:08<00:00, 161.49it/s]


Uploading folder MFH_pluvial_flood_depth_50-yr_rp:


100%|██████████| 23061/23061 [02:22<00:00, 161.54it/s]


Uploading folder MFH_fluvial_flood_depth_100-yr_rp:


100%|██████████| 23061/23061 [02:25<00:00, 158.80it/s]


Uploading folder MFH_pluvial_flood_depth_5-yr_rp:


100%|██████████| 23061/23061 [02:21<00:00, 162.95it/s]


Uploading folder EFE_cropland_and_grassland:


100%|██████████| 20221/20221 [02:05<00:00, 160.52it/s]


Uploading folder CL_land_cover:


100%|██████████| 21125/21125 [02:12<00:00, 158.95it/s]


Uploading folder HD_digital_elevation_model:


100%|██████████| 21125/21125 [02:14<00:00, 156.92it/s]


Uploading folder ADE_built-up_surface:


100%|██████████| 21125/21125 [02:09<00:00, 162.65it/s]


Uploading folder MDE_cropland_and_grassland:


100%|██████████| 24929/24929 [02:34<00:00, 161.53it/s]


Uploading folder MFH_fluvial_flood_depth_5-yr_rp:


100%|██████████| 23061/23061 [02:22<00:00, 162.16it/s]


Uploading folder EFE_built-up_surface:


100%|██████████| 20221/20221 [02:04<00:00, 161.82it/s]


Uploading folder ADE_population:


100%|██████████| 21125/21125 [02:10<00:00, 161.83it/s]


Uploading folder MFE_cropland_and_grassland_100-yr_rp:


100%|██████████| 20799/20799 [02:12<00:00, 156.88it/s]


Uploading folder MFE_built-up_surface_100-yr_rp:


100%|██████████| 20799/20799 [02:09<00:00, 161.22it/s]


Uploading folder MFE_population_100-yr_rp:


100%|██████████| 20799/20799 [02:09<00:00, 160.58it/s]


Uploading folder MFH_pluvial_flood_depth_10-yr_rp:


100%|██████████| 23061/23061 [02:23<00:00, 160.44it/s]


Uploading folder CL_population:


100%|██████████| 21125/21125 [02:11<00:00, 160.49it/s]


Uploading folder EFH_flood_max_frequency_(2017-2022):


100%|██████████| 21320/21320 [02:13<00:00, 160.09it/s]
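The source of `upload_files_to_s3_parallel` is not included here. The sketch below shows one common way such a helper can be structured: walk the tile folder, derive an object key for each file, and hand the uploads to a thread pool. The names `plan_uploads`, `upload_parallel` and the `upload_one` callback are hypothetical; the actual helper may work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List, Tuple


def plan_uploads(local_directory: str, key_prefix: str) -> List[Tuple[Path, str]]:
    """Pair every file under local_directory with its destination object key."""
    root = Path(local_directory)
    return [
        (path, f"{key_prefix}/{path.relative_to(root).as_posix()}")
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]


def upload_parallel(
    local_directory: str,
    key_prefix: str,
    upload_one: Callable[[Path, str], None],
    max_workers: int = 16,
) -> int:
    """Call upload_one(path, key) for every file using a thread pool; return the file count."""
    jobs = plan_uploads(local_directory, key_prefix)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so exceptions raised in workers surface here.
        list(pool.map(lambda job: upload_one(*job), jobs))
    return len(jobs)
```

With boto3, `upload_one` could wrap the S3 client's `upload_file(Filename, Bucket, Key)` method; the bucket name would come from the project settings, which are not shown in this notebook.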


### Vector layers

#### Upload layers to Mapbox

In [5]:
directory_path = Path("../data/processed/VectorLayers/")

all_files = os.listdir(directory_path)

for file_name in all_files:
    local_file = directory_path / file_name

    # Upload to Mapbox
    upload_name = upload_to_mapbox(
        local_file,
        file_name,
        settings.MAPBOX_USER,
        settings.MAPBOX_TOKEN,
    )
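`os.listdir` returns every entry in the directory, including hidden files, subfolders and intermediate files. A small filter, sketched below, could restrict the loop to the files that are meant to be uploaded. The function name `list_uploadable_files` is hypothetical, and the default `.mbtiles` suffix is an assumption about what the vector folder contains.

```python
from pathlib import Path
from typing import Iterable, List


def list_uploadable_files(directory: str, suffixes: Iterable[str] = (".mbtiles",)) -> List[Path]:
    """Return regular, non-hidden files in directory whose suffix is allowed, sorted by name."""
    allowed = {suffix.lower() for suffix in suffixes}
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file()
        and not path.name.startswith(".")
        and path.suffix.lower() in allowed
    )
```

Sorting the result makes the upload order reproducible between runs.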

## Contextual Layers
### Create layers

In [11]:
datasets_list = ["Contextual layers"]

dict_path = "../data/processed/datasets_dict.json"

layer_processing = LayerProcessing(datasets, datasets_list, dict_path)
layer_processing.create_layers()

  0%|          | 0/1 [00:00<?, ?it/s]

Contextual layers


INFO:helpers.raster_processor:Applying styles
Application path not initialized
INFO:helpers.raster_processor:Converting to GeoTIFF


Processing Land cover from Contextual layers


INFO:helpers.raster_processor:Converting to Cloud-Optimized GeoTIFF
Reading input: /home/iker/Vizzuality/Proiektuak/wims-south-sudan/data-processing/data/processed/RasterLayers/CL_land_cover.tif

Adding overviews...
Updating dataset tags...
Writing output to: /home/iker/Vizzuality/Proiektuak/wims-south-sudan/data-processing/data/processed/RasterLayers/CL_land_cover.tif
INFO:helpers.raster_processor:Processing complete. Output saved to ../data/processed/RasterLayers/CL_land_cover.tif


Creating tiles ...


INFO:helpers.raster_processor:Applying styles


Processing Population from Contextual layers


Application path not initialized
INFO:helpers.raster_processor:Converting to GeoTIFF
INFO:helpers.raster_processor:Converting to Cloud-Optimized GeoTIFF
Reading input: /home/iker/Vizzuality/Proiektuak/wims-south-sudan/data-processing/data/processed/RasterLayers/CL_population.tif

Adding overviews...
Updating dataset tags...
Writing output to: /home/iker/Vizzuality/Proiektuak/wims-south-sudan/data-processing/data/processed/RasterLayers/CL_population.tif
INFO:helpers.raster_processor:Processing complete. Output saved to ../data/processed/RasterLayers/CL_population.tif


Creating tiles ...




Processing Seasonal cattle grazing areas from Contextual layers
Loading data from https://storage.googleapis.com/wbhydross_deliverables/D3-Database/01-Population_Assets_Infrastructures/Pastures-REACH/WBHYDROSSD_REACH-SeasonalCattleGrazingAreas_4326_SouthSudan_2020_20240103.shp...


INFO:root:Creating JSON file...
INFO:pyogrio._io:Created 80 records
INFO:root:Creating mbtiles file...
For layer 0, using name "CL_seasonal_cattle_grazing_areas"
../data/processed/RasterLayers/CL_seasonal_cattle_grazing_areas.json:8: Found ] at top level
../data/processed/RasterLayers/CL_seasonal_cattle_grazing_areas.json:11: Reached EOF without all containers being closed
In JSON object {"type":"FeatureCollection","name":"CL_seasonal_cattle_grazing_areas","crs":{"type":"name","properties":{"name":"urn:ogc:def:crs:OGC:1.3:CRS84"}},"features":[]}
80 features, 473626 bytes of geometry, 591 bytes of separate metadata, 1108 bytes of string pool
Choosing a maxzoom of -z1 for features about 252062 feet (76829 meters) apart
Choosing a maxzoom of -z10 for resolution of about 440 feet (134 meters) within features
  99.9%  10/612/497  
100%|██████████| 3/3 [30:06<00:00, 602.01s/it]
100%|██████████| 1/1 [30:06<00:00, 1806.02s/it]
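The output file names in the logs appear to follow a pattern: the initials of the dataset name, an underscore, then the layer name in lower case with spaces replaced by underscores (for example "Land cover" from "Contextual layers" becomes `CL_land_cover`). The sketch below reproduces that pattern. It is inferred from the logged names only; the function `layer_id` is hypothetical and the project's own naming code may differ.

```python
def layer_id(dataset_name: str, layer_name: str) -> str:
    """Build an identifier such as CL_land_cover from a dataset name and a layer name."""
    # Initials of the whitespace-separated words; hyphenated words count as one word.
    prefix = "".join(word[0].upper() for word in dataset_name.split())
    # Lower-case the layer name and join its words with underscores.
    slug = "_".join(layer_name.lower().split())
    return f"{prefix}_{slug}"


print(layer_id("Contextual layers", "Seasonal cattle grazing areas"))
print(layer_id("Hydrographic data", "Digital Elevation Model"))
```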


## Boundaries, Waterbodies and Infrastructure 
### Create layers

In [4]:
datasets_list = [
    "Boundaries",
    "Hydrographic data",
    "Populated infrastructures",
    "Transportation Network Infrastructures",
    "Water-related infrastructures",
]

dict_path = "../data/processed/datasets_dict.json"

layer_processing = LayerProcessing(datasets, datasets_list, dict_path)
layer_processing.create_layers()

  0%|          | 0/5 [00:00<?, ?it/s]

Boundaries


100%|██████████| 5/5 [00:00<00:00, 90394.48it/s]


Hydrographic data


INFO:helpers.raster_processor:Applying styles
Application path not initialized
Application path not initialized
Application path not initialized
Application path not initialized


Processing Digital Elevation Model from Hydrographic data


Application path not initialized
Application path not initialized
INFO:helpers.raster_processor:Converting to GeoTIFF
INFO:helpers.raster_processor:Converting to Cloud-Optimized GeoTIFF
Reading input: /home/iker/Vizzuality/Proiektuak/wims-south-sudan/data-processing/data/processed/RasterLayers/HD_digital_elevation_model.tif

Adding overviews...
Updating dataset tags...
Writing output to: /home/iker/Vizzuality/Proiektuak/wims-south-sudan/data-processing/data/processed/RasterLayers/HD_digital_elevation_model.tif
INFO:helpers.raster_processor:Processing complete. Output saved to ../data/processed/RasterLayers/HD_digital_elevation_model.tif


Creating tiles ...


100%|██████████| 2/2 [03:37<00:00, 108.87s/it]
 40%|████      | 2/5 [03:37<05:26, 108.87s/it]

Populated infrastructures


100%|██████████| 2/2 [00:00<00:00, 79891.50it/s]


Transportation Network Infrastructures


100%|██████████| 1/1 [00:00<00:00, 41943.04it/s]


Water-related infrastructures


100%|██████████| 2/2 [00:00<00:00, 82241.25it/s]
100%|██████████| 5/5 [03:37<00:00, 43.55s/it] 


# Animated Layers
## Hydrometeorological Data Layers
### Create layers

In [4]:
datasets_list = [
    "Hydrometeorological Data",
]

dict_path = "../data/processed/datasets_dict.json"

layer_processing = LayerProcessing(datasets, datasets_list, dict_path)
layer_processing.create_layers()

  0%|          | 0/1 [00:00<?, ?it/s]

Hydrometeorological Data




Processing Evapotranspiration from Hydrometeorological Data
Loading Zarr data from gs://wbhydross_deliverables/D3-Database/02- Meteorological datasets/Evapotranspiration-WaPOR/WBHYDROSSD_WaPOR_Evapotranspiration_100m_SouthSudan_2023_20240220.zarr...
Creating tiles ...



INFO:botocore.credentials:Found credentials in environment variables.
INFO:botocore.cr

KeyboardInterrupt: 
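The repeated `INFO:botocore.credentials` messages above come from Python's standard `logging` module. One way to keep them out of the cell output is to raise the level of that specific logger before creating the layers, as sketched below. The helper name `quiet_loggers` is hypothetical; `logging.getLogger` and `setLevel` are standard library calls.

```python
import logging
from typing import Iterable


def quiet_loggers(names: Iterable[str], level: int = logging.WARNING) -> None:
    """Raise the level of the named loggers so lower-priority messages are dropped."""
    for name in names:
        logging.getLogger(name).setLevel(level)


# Silence the per-client credential messages while keeping warnings and errors.
quiet_loggers(["botocore.credentials"])
```

Other loggers, such as `helpers.raster_processor`, are unaffected and keep reporting progress.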

### Upload raster tiles to S3 bucket

In [None]:
directory_path = "../data/processed/AnimatedTiles/"
bucket_folder = "animated-tiles"

all_folders = os.listdir(directory_path)

for folder in all_folders:
    local_directory = os.path.join(directory_path, folder)
    # Skip stray files; only tile folders are uploaded
    if os.path.isdir(local_directory):
        print(f"Uploading folder {folder}:")
        upload_files_to_s3_parallel(local_directory, f"{bucket_folder}/{folder}")