# Reading hydro h5 output files

This notebook is an example of using pydsm to read DSM2 h5 output.

The time series are loaded as pandas DataFrames with a datetime index and columns named by variable type (e.g. flow, stage, ec). This is similar to the objects returned when reading with pyhecdss.

In addition to the state of the model as time series, the HDF file also contains the input tables as interpreted by DSM2. I say interpreted because it also has important derived tables, such as the virtual cross-sections, which are the geometry DSM2 finally uses even though the user specifies the physical geometry in the input files.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import h5py
# main import 
from pydsm.hydroh5 import HydroH5
# Turn on ones below if in debug or development mode
#%load_ext autoreload
#%autoreload 2

## Opening an H5 file
This provides the handle to the HDF5 file. 

In [None]:
filename='../../tests/data/historical_v82.h5'
hydro=HydroH5(filename)

## Hydro data file structure
DSM2 Hydro HDF5 stores data under three groups:

 * /hydro/data
 * /hydro/input
 * /hydro/geometry
 


### Channels

The method get_channels() returns a data frame indexed by internal channel index. The first column contains the external channel id that is referenced in the DSM2 input files.

In [None]:
hydro.get_channels()

### Reservoirs
The reservoirs table shows the name of the reservoirs

In [None]:
hydro.get_reservoirs()

### External Flows

These are the external flows defined in the input files. For example, all the boundary flow inputs, including the diversions, seepage, and returns at nodes, are available from this table.


?Need reference to dsm2 docs here

In [None]:
hydro.get_qext()

### Data Tables

These are tables that contain time series data. Each table has a corresponding get_* method; these are described below.

In [None]:
hydro.get_data_tables()

### Channel indices to numbers
The datasets under /hydro/data are typically indexed by time, channel index, and (where needed) upstream/downstream location.
The channel index can be mapped to the channel number by looking it up in /hydro/geometry/channel_number.
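
As a minimal sketch of that lookup, the snippet below builds a channel-number-to-index mapping from an array of channel numbers. The array values are made up for illustration; with a real file they would come from the /hydro/geometry/channel_number dataset. The helper name `build_channel_lookup` is hypothetical and is not part of pydsm, and the index it returns is simply the 0-based position along the channel axis of the array.

```python
import numpy as np

def build_channel_lookup(channel_numbers):
    """Map external channel number -> position (0-based) along the channel axis."""
    return {int(number): index for index, number in enumerate(channel_numbers)}

# Made-up stand-in for the contents of /hydro/geometry/channel_number
channel_numbers = np.array([1, 2, 3, 331, 441])
lookup = build_channel_lookup(channel_numbers)

# Position of channel 331 along the channel axis of a data array
index_of_331 = lookup[331]
```

In practice the get_* methods described below take channel numbers directly, so a lookup like this is only needed when working with the raw arrays.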


### Extracting time series data
Extracting data can then be done using the channel numbers. All data arrays have time as the first axis. The start time and time interval are available in the dataset attrs, along with other metadata.

Flow data shape is *time* x *channel index* x *channel location*

 * The start time is available in the attribute "START_TIME"
 * Mapping the channel index to channel numbers is explained above
 * The channel location (upstream/downstream) is available in /hydro/geometry/channel_location
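
To make the array layout concrete, here is a small sketch that slices a synthetic *time* x *channel index* x *channel location* array into a time-indexed pandas Series. Everything in it is an assumption for illustration: the array is generated rather than read from a file, the helper `to_series` is hypothetical, and the start time and interval are passed in explicitly (with a real file the start time would come from the "START_TIME" attribute, and the interval from the corresponding attribute, whose name is not given here).

```python
import numpy as np
import pandas as pd

def to_series(data, start_time, interval, channel_index, location_index):
    """Slice a (time, channel, location) array into a time-indexed Series."""
    index = pd.date_range(start=start_time, periods=data.shape[0], freq=interval)
    return pd.Series(data[:, channel_index, location_index], index=index)

# Synthetic stand-in for a flow dataset: 4 time steps, 3 channels, 2 locations
flow = np.arange(24, dtype=float).reshape(4, 3, 2)

# Assumed start time and 15-minute interval; indices chosen arbitrarily
series = to_series(flow, "1990-01-02 00:00", "15min",
                   channel_index=1, location_index=0)
```

Which location index means upstream and which means downstream should be checked against /hydro/geometry/channel_location rather than assumed.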

#### get_* methods

Each of the data tables has a corresponding get_* method. 
E.g. to get the channel flow data, use the methods below.

The time window is an optional argument that lets you retrieve only part of the time series.

In [None]:
up1 = hydro.get_channel_flow('1','upstream')
down1 = hydro.get_channel_flow('1','downstream')
pd.concat([up1,down1],axis=1)

Use the timewindow argument to retrieve only part of the time series

In [None]:
# Downstream flow for channel 2, restricted to the given time window
down2 = hydro.get_channel_flow(2,'downstream','05JAN1990 0000 - 07JAN1990 0445')
down2

In [None]:
hydro.get_channel_stage(1,'upstream','08JAN1990 - 10JAN1990')
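
The two cells above use time window strings of the form `START - END`, where each side is a date like `05JAN1990` with an optional `HHMM` time. The sketch below shows one way such a string could be split into start and end datetimes using only the standard library. It is an illustration of the string format, not pydsm's own parsing code, and `parse_time_window` is a hypothetical helper.

```python
from datetime import datetime

def parse_time_window(timewindow):
    """Split 'START - END' and parse each side as DDMONYYYY with optional HHMM."""
    def parse_one(text):
        text = text.strip()
        # Try the date-and-time form first, then the date-only form
        for fmt in ("%d%b%Y %H%M", "%d%b%Y"):
            try:
                return datetime.strptime(text, fmt)
            except ValueError:
                continue
        raise ValueError(f"Unrecognized time string: {text!r}")

    start_text, end_text = timewindow.split(" - ")
    return parse_one(start_text), parse_one(end_text)

start, end = parse_time_window("05JAN1990 0000 - 07JAN1990 0445")
```

The resulting pair could then be used to slice any datetime-indexed DataFrame, e.g. `df.loc[start:end]`.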

### Hydro Input Tables
The .h5 file in hydro contains many (though not all) input tables (*.inp). A complete listing of those tables can be read from the echo files. See this [notebook to read input](dsm2_read_input.ipynb)

In [None]:
hydro.get_input_tables()

To read the contents of any of the above tables, simply use the get_input_table method.

In [None]:
hydro.get_input_table('/hydro/input/channel')

### Hydro geometry input
The hydro file also contains geometry information, such as the mapping of internal channel ids to external ones.

In [None]:
hydro.get_geometry_tables()

Channel bottom elevations matter especially when looking at channel stage: the stage has to be used in conjunction with the channel bottom to calculate water depths.

In [None]:
channels=['1','331','441']
hydro.get_channel_bottom(channels)
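
As a rough sketch of the depth calculation mentioned above, depth can be taken as stage minus channel bottom elevation, assuming both are in the same units and vertical datum. The numbers below are made up, and the layout of the values returned by get_channel_bottom is not shown here, so a single hypothetical bottom elevation is used.

```python
import pandas as pd

# Made-up stage values standing in for the result of get_channel_stage
stage = pd.Series(
    [2.5, 3.0, 2.0],
    index=pd.date_range("1990-01-08", periods=3, freq="15min"),
)

# Hypothetical channel bottom elevation at the same location
bottom_elevation = -10.0

# Water depth = stage - channel bottom elevation
depth = stage - bottom_elevation
```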

Hydro does its computation at certain points, and those are available from the table below.

In [None]:
hydro.get_geometry_table('/hydro/geometry/hydro_comp_point')