## Data exploration 

Naming convention: `<region>_<rainfall-event>.npz`
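As a small illustration of the naming convention, the sketch below splits a file name into its region and rainfall-event parts. The helper name `parse_sample_name` is hypothetical (it is not part of this repository), and the sketch assumes the region name itself contains no underscore.

```python
from pathlib import Path

def parse_sample_name(path):
    """Split '<region>_<rainfall-event>.npz' into (region, event).

    Hypothetical helper; assumes the region contains no underscore.
    """
    stem = Path(path).stem               # file name without directory or extension
    region, event = stem.split('_', 1)   # split on the first underscore only
    return region, event

region, event = parse_sample_name('./data/sims_harvey.npz')
print(region, event)
```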

In [33]:
import numpy as np 

In [34]:
data = np.load('./data/sims_harvey.npz')

Each data sample is a dictionary-like object with the keys `static`, `edges`, and `cell_feats`.

In [35]:
# static, edges, cell_feats
data['static'].shape

(269168, 2)

`static` has two dimensions: number of nodes, and two features.

* feature 1: DEM (digital elevation model, i.e. terrain elevation)
* feature 2: Manning's coefficient of friction

In [36]:
data['edges'].shape

(268254, 2)

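`edges`: edge list of the region represented as a graph, with one row per edge.

Note: Since each node has at most one outgoing edge, the number of edges is close to the number of nodes (268254 vs. 269168 here). The small gap suggests that some nodes have no outgoing edge.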
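One way to check the out-degree of each node is to count how often it appears as an edge source. The sketch below does this on a toy edge list. It assumes that column 0 of `edges` holds the source node and column 1 the target node, which the notebook does not state explicitly.

```python
import numpy as np

# Toy edge list standing in for data['edges'].
# ASSUMPTION: column 0 is the source node, column 1 is the target node.
edges = np.array([[0, 1],
                  [1, 2],
                  [3, 2]])
num_nodes = 4

# Count how many edges leave each node.
out_degree = np.bincount(edges[:, 0], minlength=num_nodes)
print(out_degree)

# With at most one outgoing edge per node, there cannot be more edges than nodes.
print(out_degree.max() <= 1, len(edges) <= num_nodes)
```

On the real sample, the same two lines can be run with `data['edges']` and `data['static'].shape[0]` in place of the toy values.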

In [37]:
data['cell_feats'].shape

(269168, 50, 3)

`cell_feats` has three dimensions: number of nodes, time axis, and three features.

* feature 1: water depth 
* feature 2: surface water elevation (water depth + DEM) 
* feature 3: rainfall
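The relation between the features can be sketched on synthetic data: surface water elevation should equal water depth plus the DEM value of the same node at every time step. The array sizes and value ranges below are made up for illustration and do not come from the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, num_steps = 4, 5   # toy sizes, not the real ones

dem = rng.uniform(0.0, 10.0, size=num_nodes)                # stand-in for static[:, 0]
depth = rng.uniform(0.0, 2.0, size=(num_nodes, num_steps))  # water depth
rain = rng.uniform(0.0, 1.0, size=(num_nodes, num_steps))   # rainfall

# Assemble a synthetic cell_feats in the documented feature order.
cell_feats = np.stack([depth, depth + dem[:, None], rain], axis=-1)

water_depth = cell_feats[..., 0]    # [num_nodes, num_steps]
surface_elev = cell_feats[..., 1]   # [num_nodes, num_steps]

print(cell_feats.shape)
# DEM is static, so it is broadcast along the time axis.
print(np.allclose(surface_elev, water_depth + dem[:, None]))
```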

## Graph: Node indices  

*Note*: The rest of the notebook uses np.array instead of torch.Tensor

Load a DEM file and extract its mask of valid cells

In [38]:
from esri_data import read_esri_ascii_raster 

# There is a dem file for each watershed/region in the data folder.
dem = read_esri_ascii_raster('data/sims_dem.asc', r'\s')
dem_data = dem.matrix 
mask = (dem_data != dem.nodata_value)

Generate node indices: each valid node/cell is assigned an index from 0 to N - 1, where N is the total number of nodes. Cells outside the mask keep the placeholder index -1.

In [39]:
non_valid_idx = -1
node_indx = np.full(mask.shape, fill_value=non_valid_idx) # [H x W]
num_nodes = mask.sum()
node_indx[mask] = np.arange(num_nodes)
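To make the indexing concrete, the sketch below applies the same steps to a hand-written 2 x 3 mask. The toy mask is an assumption for illustration only. Boolean indexing visits cells in row-major order, so indices increase left to right, then top to bottom.

```python
import numpy as np

# Toy mask: True marks a valid cell, False marks a no-data cell.
mask = np.array([[True, False, True],
                 [False, True, True]])

non_valid_idx = -1
node_indx = np.full(mask.shape, fill_value=non_valid_idx)  # [H x W], integer dtype
num_nodes = mask.sum()
node_indx[mask] = np.arange(num_nodes)  # valid cells numbered in row-major order

print(node_indx)
```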

## Grid reconstruction from graph 

* A random array stands in for a model output here: its first dimension equals the number of nodes and its remaining dimensions are arbitrary. This shows that any per-node output of the model can be converted back to the grid representation.



In [40]:
random_dim1 = 30 
random_dim2 = 53
output_arr = np.random.randn(data['static'].shape[0], random_dim1, random_dim2)  # [N x random_dim1 x random_dim2]

# Create grid; cells outside the mask stay NaN instead of holding random noise
grid = np.full((*mask.shape[:2], random_dim1, random_dim2), np.nan)

# Assign output_arr to the valid cells of the grid
grid[mask] = output_arr
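The sketch below repeats the reconstruction on a toy mask and checks the round trip: indexing the grid with the same mask returns the per-node array in its original order. The toy mask, the trailing dimension of 2, and the choice of NaN for cells outside the mask are assumptions made for this illustration.

```python
import numpy as np

# Toy mask with three valid cells.
mask = np.array([[True, False],
                 [True, True]])
num_nodes = int(mask.sum())

# Stand-in for a per-node model output of shape [N x 2].
output_arr = np.arange(num_nodes * 2, dtype=float).reshape(num_nodes, 2)

# Graph -> grid: start from NaN so no-data cells are easy to spot.
grid = np.full((*mask.shape, 2), np.nan)   # [H x W x 2]
grid[mask] = output_arr

# Grid -> graph: the same mask recovers the per-node array.
recovered = grid[mask]                     # [N x 2]
print(grid.shape, recovered.shape)
print(np.array_equal(recovered, output_arr))
```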