# Reading qual hdf5 files from DSM2

This notebook is an example of using pydsm to read DSM2 Qual h5 output. Please see the [notebook for reading hydro h5](dsm2_read_hydro_h5.ipynb) first.

The time series are loaded as pandas DataFrames with a datetime index and columns of variable type (e.g. flow, stage, ec). This is similar to the objects that pyhecdss returns when reading.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import h5py
# main import 
from pydsm.qualh5 import QualH5
# Turn on ones below if in debug or development mode
#%load_ext autoreload
#%autoreload 2

## Opening an H5 file
Creating a QualH5 object provides the handle to the HDF5 file.

In [2]:
filename='../../tests/data/historical_v82_ec.h5'
qual=QualH5(filename)

# Qual data file structure
DSM2 Qual HDF5 stores data under two groups:
 * /input
 * /output

## Display channels

The method get_channels() returns a DataFrame indexed by the internal channel index. The first column contains the external channel id that is referenced in the DSM2 input files.

In [3]:
qual.get_channels()

Unnamed: 0,0
0,1
1,2
2,3
3,4
4,5
...,...
516,575
517,700
518,701
519,702
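
The lookup from external channel id to internal index can be sketched with a plain dictionary. This is a minimal illustration and not pydsm's implementation: the rows are copied from the output above, and `index_by_channel_id` and `internal_index` are hypothetical names introduced here.

```python
# Rows copied from the get_channels() output above:
# (internal channel index, external channel id)
sample_rows = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5),
    (516, 575), (517, 700), (518, 701), (519, 702),
]

# Invert the table so an external channel id looks up its internal index.
# Ids are normalized to strings so that 700 and '700' behave the same.
index_by_channel_id = {str(channel_id): index for index, channel_id in sample_rows}


def internal_index(channel_id):
    """Return the internal index for an external channel id (int or str)."""
    return index_by_channel_id[str(channel_id)]


print(internal_index(700))
print(internal_index('575'))
```

Note that the external ids are not contiguous (575 is followed by 700), which is why the internal index cannot be derived from the id by arithmetic.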


## Reservoirs
The reservoirs table lists the names of the reservoirs.

In [4]:
qual.get_reservoirs()

Unnamed: 0,name
0,bethel
1,clifton_court
2,discovery_bay
3,franks_tract
4,liberty
5,mildred


## Constituents
These are the simulated constituents for which data is available in the .h5 file.

In [5]:
qual.get_constituents()

Unnamed: 0,constituent_names
0,ec


## Get Data Tables

These are the tables that contain time series data. Each table has a
corresponding get_* method, described below.

In [6]:
qual.get_data_tables()

['channel avg concentration',
 'channel concentration',
 'reservoir concentration']
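
The table names above and the getter names used in this notebook appear to follow a simple pattern: spaces become underscores and `get_` is prefixed. The sketch below spells that out. `getter_name` is a hypothetical helper, and the getter for 'reservoir concentration' is inferred from the pattern rather than shown in this notebook.

```python
# Table names copied from the get_data_tables() output above.
tables = [
    'channel avg concentration',
    'channel concentration',
    'reservoir concentration',
]


def getter_name(table_name):
    """Build the assumed getter name for a data table name."""
    return 'get_' + table_name.replace(' ', '_')


for table in tables:
    print(table, '->', getter_name(table))
```

With an open handle, `getattr(qual, getter_name(table))` would look up the method by name, assuming a method with that name exists on the object.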

## Extracting time series data
Data can then be extracted using the channel numbers. All data arrays have time as the first axis. The start time and time interval are available in the attrs along with other metadata.

Constituent data shape is *time* x *constituent name* x *channel index* x *channel location*

* start time is available in the attribute "START_TIME"
* constituent name is one of the values returned by get_constituents(), described above
* the mapping from channel index to channel number is explained above under get_channels()
* channel location (upstream/downstream), if available
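
The layout described above can be sketched with a toy nested list. Everything in this block is an assumption made for illustration: the start time and hourly interval are typed in by hand rather than read from the file's attributes, the values are synthetic, and the upstream/downstream ordering of the last axis is assumed.

```python
from datetime import datetime, timedelta

# Assumed values for illustration; the real ones come from the table attributes.
start_time = datetime(1990, 1, 4, 0, 0)
interval = timedelta(hours=1)

n_time, n_constituent, n_channel, n_location = 3, 1, 2, 2

# Toy array laid out as time x constituent x channel index x channel location.
# Each synthetic value encodes its own position so indexing is easy to check.
data = [[[[t * 100 + ch * 10 + loc for loc in range(n_location)]
          for ch in range(n_channel)]
         for c in range(n_constituent)]
        for t in range(n_time)]

# Time is the first axis, so time stamps follow from start time and interval.
times = [start_time + t * interval for t in range(n_time)]

UPSTREAM, DOWNSTREAM = 0, 1  # assumed ordering of the location axis


def series(constituent_index, channel_index, location):
    """Return (time stamp, value) pairs for one constituent/channel/location."""
    return [(times[t], data[t][constituent_index][channel_index][location])
            for t in range(n_time)]


downstream = series(0, 1, DOWNSTREAM)
for stamp, value in downstream:
    print(stamp, value)
```

The get_* methods described next do this slicing and time indexing for you; the sketch only shows how the four axes relate to each other.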

### get_* methods

Each of the data tables has a corresponding get_* method.
For example, to get the channel concentration data, use the methods below.

The time window is an optional argument that retrieves only part of the time series.

In [7]:
up1 = qual.get_channel_concentration('ec', '434','upstream')
down1 = qual.get_channel_concentration('ec','434','downstream')
pd.concat([up1,down1],axis=1)

Unnamed: 0,434-upstream,434-downstream
1990-01-04 00:00:00,0.000000,0.000000
1990-01-04 01:00:00,0.039021,0.000000
1990-01-04 02:00:00,0.053210,0.000686
1990-01-04 03:00:00,0.101569,0.005568
1990-01-04 04:00:00,0.127697,0.061579
...,...,...
1990-01-30 20:00:00,1369.167969,3097.956055
1990-01-30 21:00:00,1016.531372,2442.458252
1990-01-30 22:00:00,682.779236,1874.719360
1990-01-30 23:00:00,491.581543,1415.053955


Use the timewindow argument to retrieve only part of the time series.

In [8]:
up2 = qual.get_channel_avg_concentration('ec', '380','10JAN1990 - 25JAN1990')
up2

Unnamed: 0,380
1990-01-10 00:00:00,123.189064
1990-01-10 01:00:00,123.963478
1990-01-10 02:00:00,125.721069
1990-01-10 03:00:00,127.292564
1990-01-10 04:00:00,128.298035
...,...
1990-01-24 19:00:00,189.090286
1990-01-24 20:00:00,189.432419
1990-01-24 21:00:00,190.366257
1990-01-24 22:00:00,192.871918


In [9]:
qual.get_channel_concentration('ec', 1,'upstream','08JAN1990 - 10JAN1990')

Unnamed: 0,1-upstream
1990-01-08 00:00:00,1138.650635
1990-01-08 01:00:00,1139.727905
1990-01-08 02:00:00,1139.729492
1990-01-08 03:00:00,1138.398071
1990-01-08 04:00:00,1133.848022
1990-01-08 05:00:00,1130.868042
1990-01-08 06:00:00,1130.011353
1990-01-08 07:00:00,1129.855347
1990-01-08 08:00:00,1129.834106
1990-01-08 09:00:00,1129.832275
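
The time window strings used above have the form `DDMMMYYYY - DDMMMYYYY`. One way such a string could be split into two datetimes with the standard library is sketched below. `parse_timewindow` is a hypothetical helper and not necessarily how pydsm parses the argument, and it relies on month abbreviations being parsed in an English locale.

```python
from datetime import datetime


def parse_timewindow(timewindow):
    """Split 'DDMMMYYYY - DDMMMYYYY' into (start, end) datetimes."""
    # The dates themselves contain no hyphen, so splitting on '-' is safe here.
    start_text, end_text = (part.strip() for part in timewindow.split('-'))
    start = datetime.strptime(start_text, '%d%b%Y')
    end = datetime.strptime(end_text, '%d%b%Y')
    if end < start:
        raise ValueError(f'time window ends before it starts: {timewindow!r}')
    return start, end


start, end = parse_timewindow('10JAN1990 - 25JAN1990')
print(start, end)
```

The sketch only parses the bounds. Whether the end of the window is included in the returned series is decided by pydsm and is not assumed here.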


## Qual Input Tables
The Qual .h5 file contains many (though not all) of the input tables (*.inp). A complete listing of those tables can be read from the echo files. See this [notebook to read input](dsm2_read_input.ipynb).

In [10]:
qual.get_input_tables()

['/input/envvar',
 '/input/group',
 '/input/group_member',
 '/input/input_climate',
 '/input/io_file',
 '/input/layers',
 '/input/node_concentration',
 '/input/output_channel',
 '/input/output_channel_source_track',
 '/input/output_reservoir',
 '/input/output_reservoir_source_track',
 '/input/rate_coefficient',
 '/input/reservoir_concentration',
 '/input/scalar',
 '/input/tidefile']

To read the contents of any of the above tables, simply use the get_input_table method.

In [11]:
qual.get_input_table('/input/tidefile')

Unnamed: 0,start_date,end_date,file
0,runtime,length,./output/historical_v82.h5
