## OS-C ECB Stress Test 2022 Issue

[Investigate replication of ECB stress test methodology 2022](https://github.com/os-climate/physrisk/issues/126)

## Spanish flood data onboarding to S3

The data has 1 m resolution, and the file for the 10-year return period alone is about 1000 GB (1 TB). Since that resolution is extremely high, we can downsample it to 100 m. [Data source link](http://centrodedescargas.cnig.es/CentroDescargas/index.jsp#)

The Spanish government provides the data in chunks (tiles), so some pre-processing is needed to combine them into a single raster file in S3. First, the shape and affine transformation of the S3 raster file must be determined. Second, every source file must be read in 100x100 windows and inserted into the new raster file.
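One possible reading of "read in 100x100 windows" is a block reduction: each 100x100 block of 1 m pixels collapses into one 100 m cell. The sketch below is only an illustration with NumPy. The function name `downsample_max` is hypothetical, taking the maximum depth per block is an assumption (the notebook itself later picks a single sample per cell instead), and nodata values are not handled.

```python
import numpy as np


def downsample_max(tile, factor=100):
    """Collapse each factor x factor block of `tile` into its maximum value.

    Rows and columns that do not fill a complete block are dropped.
    """
    h, w = tile.shape
    h_trim, w_trim = h - h % factor, w - w % factor
    blocks = tile[:h_trim, :w_trim].reshape(
        h_trim // factor, factor, w_trim // factor, factor
    )
    # Reduce over the two "inside the block" axes
    return blocks.max(axis=(1, 3))


# Tiny example: a 4x4 tile reduced with 2x2 blocks
small = np.arange(16).reshape(4, 4)
print(downsample_max(small, factor=2))
```

Using the maximum keeps the worst-case depth in each cell; a mean or a single sampled pixel would be equally valid choices depending on how the indicator is meant to be used.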

The shape and affine transformation of the raster file can be derived from the map provided by the Spanish government.

The coordinate system used is [EPSG 25830](https://epsg.org/crs_25830/ETRS89-UTM-zone-30N.html?sessionkey=cedqtluqe0).

Upper-left corner: (-110000, 4914036)  
Upper-right corner: (1095561, 4914036)  
Lower-left corner: (-110000, 3900000)  
Lower-right corner: (1095561, 3900000)  

The range is approximate and can be narrowed.

Width: 1095561 + 110000 = 1205561 m  
Height: 4914036 - 3900000 = 1014036 m

The translation (offset) vector of the affine transformation is the lower-left corner: (-110000, 3900000).
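The arithmetic above can be checked with a few lines of plain Python. This is a minimal sketch that assumes the approximate bounds listed above and a lower-left origin; the helper names `grid_shape`, `to_cell` and `to_coords` are hypothetical and are not part of the `hazard` package.

```python
import math

# Approximate bounds in EPSG 25830 (metres), taken from the corners above
X_MIN, Y_MIN = -110000, 3900000
X_MAX, Y_MAX = 1095561, 4914036


def grid_shape(resolution):
    """Number of cells along x and y for a given cell size in metres."""
    n_x = math.ceil((X_MAX - X_MIN) / resolution)
    n_y = math.ceil((Y_MAX - Y_MIN) / resolution)
    return n_x, n_y


def to_cell(x, y, resolution):
    """Index of the cell containing the point (x, y)."""
    return int((x - X_MIN) // resolution), int((y - Y_MIN) // resolution)


def to_coords(i, j, resolution):
    """Lower-left corner of cell (i, j)."""
    return X_MIN + i * resolution, Y_MIN + j * resolution


print(grid_shape(100))  # (12056, 10141)
print(to_cell(500000, 4400000, 100))  # (6100, 5000)
```

At 100 m the grid has 12056 x 10141 cells, compared with 1205561 x 1014036 cells at the native 1 m resolution.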

The Canary Islands are to be treated as a separate raster file.

To determine the bounds exactly rather than approximately, we can take the lat-lon bounds of peninsular Spain and transform them to EPSG 25830, then repeat the process for the Canary Islands.
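A possible way to do that transformation with `pyproj` is sketched below. It assumes the lat-lon bounds are given in WGS84 (EPSG:4326) and only transforms the four corners, so the result is an approximation of the true projected envelope. The function name `bounds_to_epsg25830` is hypothetical, and no real bounds for Spain are filled in here.

```python
from pyproj import Transformer


def bounds_to_epsg25830(lon_min, lat_min, lon_max, lat_max):
    """Project a lon/lat bounding box to EPSG 25830 using its four corners."""
    # always_xy=True fixes the axis order to (lon, lat) in and (x, y) out
    transformer = Transformer.from_crs("EPSG:4326", "EPSG:25830", always_xy=True)
    corners = [
        (lon_min, lat_min),
        (lon_min, lat_max),
        (lon_max, lat_min),
        (lon_max, lat_max),
    ]
    xs, ys = zip(*(transformer.transform(lon, lat) for lon, lat in corners))
    return min(xs), min(ys), max(xs), max(ys)
```

Because the edges of a lat-lon box are curved after projection, densifying the edges before taking the minimum and maximum would give a tighter answer than the corners alone.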

## Create Zarr from shape and Affine transformation

<span style="color:blue">Note: the file must be located in /hazard/src/ for the dependencies to work</span>

In [5]:
import sys
import os
import s3fs
import zarr
import numpy as np
import rasterio
import math
import xarray as xr

from pyproj.crs import CRS
from affine import Affine

In [6]:
from hazard.sources.osc_zarr import OscZarr

In [7]:
# https://console-openshift-console.apps.odh-cl1.apps.os-climate.org/k8s/ns/sandbox/secrets/physrisk-s3-keys
default_staging_bucket = "redhat-osc-physical-landing-647521352890"
# OSC_S3_ACCESS_KEY, OSC_S3_SECRET_KEY

# Hazard indicators bucket
# default_staging_bucket = 'physrisk-hazard-indicators'
prefix = "hazard"

# Access key and secret key are stored as env vars OSC_S3_HI_ACCESS_KEY and OSC_S3_HI_SECRET_KEY, resp.
s3 = s3fs.S3FileSystem(
    anon=False,
    key=os.environ["OSC_S3_ACCESS_KEY"],
    secret=os.environ["OSC_S3_SECRET_KEY"],
)

In [19]:
group_path = os.path.join(
    default_staging_bucket, prefix, "hazard_MV_prueba2.zarr"
).replace("\\", "/")
store = s3fs.S3Map(root=group_path, s3=s3, check=False)
root = zarr.group(store=store, overwrite=True)

In [4]:
s3.ls("redhat-osc-physical-landing-647521352890/hazard")

['redhat-osc-physical-landing-647521352890/hazard/hazard.zarr',
 'redhat-osc-physical-landing-647521352890/hazard/inventory.json',
 'redhat-osc-physical-landing-647521352890/hazard/wri']

In [41]:
s3.ls("redhat-osc-physical-landing-647521352890/hazard/hazard_MV_prueba2.zarr")

['redhat-osc-physical-landing-647521352890/hazard/hazard_MV_prueba2.zarr/.zgroup']

In [31]:
# Check the zarr file was created
group_path in s3.ls("redhat-osc-physical-landing-647521352890/hazard")

True

In [6]:
oscZ = OscZarr(bucket=default_staging_bucket, prefix="hazard", s3=s3, store=store)

In [7]:
# width: 1205561
# height: 1014036

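# Offset vector (lower-left corner): (-110000, 3900000)

meters_resolution = 100
x_adding = -110000
y_adding = 3900000
# Affine takes its coefficients in the order (a, b, c, d, e, f), where
# x = a * col + b * row + c and y = d * col + e * row + f.
transform = Affine(meters_resolution, 0, x_adding, 0, meters_resolution, y_adding)
crs = CRS.from_epsg(25830)
width = math.ceil(1205561 / meters_resolution)
height = math.ceil(1014036 / meters_resolution)
# The array is indexed x-first: (width, height)
shape = (width, height)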
return_periods = [10]

In [8]:
oscZ._zarr_create(
    path=group_path,
    shape=shape,
    transform=transform,
    crs=str(crs),
    overwrite=True,
    return_periods=return_periods,
)

<zarr.core.Array '/redhat-osc-physical-landing-647521352890/hazard/hazard_MV_prueba2.zarr' (1, 12056, 10141) float32>

In [42]:
# Create xr.DataArray from s3 stored zarr object
z = oscZ.root[group_path]
da = xr.DataArray(data=z)

In [11]:
# Read the file
# da = oscZ.read(path=group_path)
# da

# Raises a RuntimeError because of the coords when creating the DataArray

## Steps to populate hazard_MV_prueba.zarr for 1m resolution

### Step 1: Read ESP Government flood data

Returns the flood depth array and the x and y coordinate arrays in EPSG 25830.

In [12]:
def read_one_file(path_to_file):
    """
    Read one Spanish government flood data file.

    Parameters:
        path_to_file (str): full path to the tif file.

    Returns:
        fld_depth (numpy array): flattened flood depth at the (x1, y1) EPSG 25830 coordinates
        x1 (numpy array): flattened x coordinates of the pixel centres
        y1 (numpy array): flattened y coordinates of the pixel centres

    """

    # Use a context manager so the dataset is closed after reading
    with rasterio.open(path_to_file) as src:
        fld_depth = src.read()

        cols, rows = np.meshgrid(np.arange(src.width), np.arange(src.height))
        x1, y1 = rasterio.transform.xy(src.transform, rows, cols)

    return fld_depth.flatten(), np.array(x1).flatten(), np.array(y1).flatten()

In [13]:
path_to_file = r"C:\Users\mvazquez\Afirma Spain Dropbox\Manuel Vazquez Gandullo\Climate Risk\Jupyter_Notebooks\esp_gov_flood_data\ESNZSNCZIMPFT010E77.tif"
fld_depth, x1, y1 = read_one_file(path_to_file)

### Step 2: Use the Affine inverse to translate (x1, y1) to store (x, y)

Since the affine transformation matrix is meters_resolution times the identity, we just subtract the offset vector and divide by meters_resolution.

Finally, we keep only the points that lie closest to an integer grid position.

In [14]:
x = (x1 - x_adding) / meters_resolution
y = (y1 - y_adding) / meters_resolution

# Keep the points whose x fractional part is smallest (closest to a grid position)
x_ = x - x.astype(int)
x_ = x_ == x_.min()

# Keep the points whose y fractional part is smallest
y_ = y - y.astype(int)
y_ = y_ == y_.min()

# Points that satisfy both conditions
index_ = np.logical_and(x_, y_)

# Filter by index
x_coord = x[index_].astype(int)
y_coord = y[index_].astype(int)
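The exact comparison `x_ == x_.min()` in the cell above depends on floating-point values matching bit for bit, which may select fewer points than intended. The sketch below shows one tolerance-based variant of the same idea, run on synthetic pixel-centre coordinates. The function name `select_grid_points` and the `atol` value are assumptions made for illustration, not part of the notebook.

```python
import numpy as np


def select_grid_points(x1, y1, x_origin, y_origin, resolution, atol=1e-6):
    """Pick the samples whose fractional grid position is minimal on both axes.

    Returns a boolean mask over the inputs plus the integer grid indices
    of the selected samples.
    """
    x = (np.asarray(x1, dtype=float) - x_origin) / resolution
    y = (np.asarray(y1, dtype=float) - y_origin) / resolution
    frac_x = x - np.floor(x)
    frac_y = y - np.floor(y)
    # Compare with a tolerance instead of exact equality
    mask = np.isclose(frac_x, frac_x.min(), rtol=0, atol=atol) & np.isclose(
        frac_y, frac_y.min(), rtol=0, atol=atol
    )
    return mask, np.floor(x[mask]).astype(int), np.floor(y[mask]).astype(int)


# Synthetic 1 m pixel centres covering 300 m x 200 m
xs = 200000.5 + np.arange(300)
ys = 4000000.5 + np.arange(200)
gx, gy = np.meshgrid(xs, ys)
mask, ix, iy = select_grid_points(gx.ravel(), gy.ravel(), -110000, 3900000, 100)
print(mask.sum(), sorted(set(ix.tolist())), sorted(set(iy.tolist())))
```

With these synthetic inputs one sample is kept per 100 m cell, so a 300 m x 200 m patch yields six samples.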

### Step 3: Populate the raster file

In [26]:
# The array has shape (return_periods, width, height); index 0 is the 10-year return period
da.data[0, x_coord, y_coord] = fld_depth[index_]

In [27]:
oscZ.write(path=group_path, da=da)

In [None]:
# Example using root object. Better to use oscZ object

"""
create_dataset(name, **kwargs) method of zarr.hierarchy.Group instance
    Create an array.
    
    Arrays are known as "datasets" in HDF5 terminology. For compatibility
    with h5py, Zarr groups also implement the require_dataset() method.
    
    Parameters
    ----------
    name : string
        Array name.
    data : array-like, optional
        Initial data.
    shape : int or tuple of ints
        Array shape.
    chunks : int or tuple of ints, optional
        Chunk shape. If not provided, will be guessed from `shape` and
        `dtype`.
    dtype : string or dtype, optional
        NumPy dtype.
    compressor : Codec, optional
        Primary compressor.
    fill_value : object
        Default value to use for uninitialized portions of the array.



root.create_dataset(name='prueba',
                    data = np.array([[0,1], [1,6]]),
                    shape = (2,2),
                    chunks = (1000, 1000),
                    dtype = 'f4')

trans_members = [
    transform.a,
    transform.b,
    transform.c,
    transform.d,
    transform.e,
    transform.f,
]
# Append [0, 0, 1], the last row of the 3x3 homogeneous affine matrix
mat3x3 = [x * 1.0 for x in trans_members] + [0.0, 0.0, 1.0]
root.attrs["crs"] = str(crs)
root.attrs["transform_mat3x3"] = mat3x3
if return_periods is not None:
    root.attrs["index_values"] = return_periods
    root.attrs["index_name"] = "return period (years)"

# Read the file
root['prueba']
"""
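On the `transform_mat3x3` attribute in the example above: the six affine coefficients followed by `[0, 0, 1]` form a 3x3 matrix in homogeneous coordinates, flattened row by row. The sketch below applies such a matrix by hand. The function name `apply_mat3x3` is hypothetical, and the coefficient values assume a 100 m grid whose origin is the lower-left corner (-110000, 3900000).

```python
def apply_mat3x3(mat3x3, col, row):
    """Apply a row-major flattened 3x3 homogeneous matrix to (col, row, 1)."""
    a, b, c, d, e, f, g, h, i = mat3x3
    x = a * col + b * row + c
    y = d * col + e * row + f
    w = g * col + h * row + i  # equals 1 for an affine matrix
    return x / w, y / w


# Assumed coefficients: 100 m cells, origin at (-110000, 3900000)
mat3x3 = [100.0, 0.0, -110000.0, 0.0, 100.0, 3900000.0, 0.0, 0.0, 1.0]
print(apply_mat3x3(mat3x3, 0, 0))  # (-110000.0, 3900000.0)
print(apply_mat3x3(mat3x3, 6100, 5000))  # (500000.0, 4400000.0)
```

Storing the full 3x3 form lets a reader of the attribute compose or invert the transform with ordinary matrix operations.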

In [None]:
# Code to remove a file inside a bucket

"""
import boto3
boto_c = boto3.client('s3', aws_access_key_id=os.environ["OSC_S3_ACCESS_KEY"], aws_secret_access_key=os.environ["OSC_S3_SECRET_KEY"])

to_remove = boto_c.list_objects_v2(Bucket=default_staging_bucket, Prefix='hazard/hazard_MV_prueba.zarr')['Contents']

keys = [item['Key'] for item in to_remove]

for key_ in keys:
    boto_c.delete_object(Bucket=default_staging_bucket, Key=key_)
"""

In [21]:
s3.ls(
    "redhat-osc-physical-landing-647521352890/hazard/hazard.zarr/chronic_heat/osc/v1/mean_degree_days_above_18c_historical_1980"
)

['redhat-osc-physical-landing-647521352890/hazard/hazard.zarr/chronic_heat/osc/v1/mean_degree_days_above_18c_historical_1980/.zarray',
 'redhat-osc-physical-landing-647521352890/hazard/hazard.zarr/chronic_heat/osc/v1/mean_degree_days_above_18c_historical_1980/.zattrs',
 'redhat-osc-physical-landing-647521352890/hazard/hazard.zarr/chronic_heat/osc/v1/mean_degree_days_above_18c_historical_1980/0.0.0']

In [None]:
 'redhat-osc-physical-landing-647521352890/redhat-osc-physicalrisk-upload',
 'redhat-osc-physical-landing-647521352890/demo_test-0518145310',
 'redhat-osc-physical-landing-647521352890/demo_test-0518191846',
 'redhat-osc-physical-landing-647521352890/demo_test-0518201258',