# Reading Hydro echo files into pandas DataFrames

Hydro (and Qual) compile the input files and can write an echo file containing all the input data that goes into the model run. This file is very useful because it directly represents the input as seen by the model. That matters because the input system, with its layering, overrides, and include features, can be quite complex, and it is sometimes hard to see what actually makes it into the model run.

To create an echo file of the inputs, refer to the hydro and qual documentation. Below is an example:

```
 hydro -e hydro_main.inp 
  
 qual -e qual_main.inp
```

This notebook shows how to use the pydsm parser to read this echo file into a dictionary of pandas DataFrame objects.


In [None]:
import pandas as pd
import io
import re
# main import 
import pydsm
from pydsm.input import read_input,write_input

## Read input into a dictionary of pandas DataFrames

DSM2 input consists of named tables, e.g. CHANNEL. Each table has named columns followed by rows of values for those columns:

```
CHANNEL
CHAN_NO LENGTH MANNING DISPERSION UPNODE DOWNNODE
1	19500	0.035	360.0	1	2
2	14000	0.028	360.0	2	3
...
END
```
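
To make the table layout concrete, here is a minimal sketch of how one such block could be turned into a DataFrame using pandas alone. It is not the pydsm implementation: the `parse_tables` helper and the `SAMPLE` text are hypothetical, and the sketch assumes whitespace-separated fields with no quoted values.

```python
import io
import pandas as pd

# Hypothetical echo-file excerpt; values are illustrative only.
SAMPLE = """\
CHANNEL
CHAN_NO LENGTH MANNING DISPERSION UPNODE DOWNNODE
1 19500 0.035 360.0 1 2
2 14000 0.028 360.0 2 3
END
"""

def parse_tables(text):
    """Split NAME / header / rows / END blocks into a dict of DataFrames."""
    tables = {}
    block = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between tables
        if line == "END":
            # First line of the block is the table name, the rest is
            # the header row followed by the data rows.
            name, body = block[0], "\n".join(block[1:])
            tables[name] = pd.read_csv(io.StringIO(body), sep=r"\s+")
            block = []
        else:
            block.append(line)
    return tables

sketch_tables = parse_tables(SAMPLE)
print(list(sketch_tables.keys()))
print(sketch_tables["CHANNEL"].dtypes)
```

Letting `read_csv` infer the column types is what gives integer and float columns without any per-table schema; a real echo file may need more careful handling than this sketch provides.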

The *read_input* function reads the input file, parses the tables it finds into DataFrames, and returns a dictionary whose keys are the table names.

Let's see how this looks...

In [None]:
fname='../../tests/data/hydro_echo_historical_v82.inp'
tables=read_input(fname)

## Print list of all tables

In [None]:
print(list(tables.keys()))

## Display the DataFrame for a table, e.g. CHANNEL

In [None]:
display(tables['CHANNEL'])

It is a DataFrame, so you can query its dtypes, etc., just as with any pandas DataFrame.

In [None]:
print(tables['CHANNEL'].dtypes)

## Programmatic inspection of input
This is an important feature: pandas offers many ways to filter, describe, and join DataFrames, and these can be used to analyze the input file.

For example, display all channels with length > 20000 feet:

In [None]:
c=tables['CHANNEL']
print('Channels with length > 20000 ft:')
display(c[c.LENGTH>20000])
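
Filters can also be combined, and `describe` gives a quick statistical summary. The sketch below uses a small hypothetical stand-in for the CHANNEL table (the `chan` frame and its values are made up for illustration) so that it runs on its own; the same expressions should apply to `tables['CHANNEL']`.

```python
import pandas as pd

# Hypothetical stand-in for tables['CHANNEL']; values are illustrative only.
chan = pd.DataFrame({
    "CHAN_NO": [1, 2, 3, 4],
    "LENGTH": [19500, 14000, 22000, 25000],
    "MANNING": [0.035, 0.028, 0.030, 0.022],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses.
long_rough = chan[(chan.LENGTH > 20000) & (chan.MANNING >= 0.03)]
print(long_rough)

# The same filter written as a query string.
same = chan.query("LENGTH > 20000 and MANNING >= 0.03")

# Summary statistics for selected columns.
print(chan[["LENGTH", "MANNING"]].describe())
```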

## Combining input tables
Channels have cross sections, but those are defined in the 'XSECT_LAYER' table.


In [None]:
x=tables['XSECT_LAYER']
display(x)

These can be merged with the channel table on the common 'CHAN_NO' column to get a larger table with both channel and cross-section information.

In [None]:
fc=pd.merge(c,x,on='CHAN_NO')
for name, group in fc[fc.CHAN_NO==441].groupby('DIST'):
    print('DIST: ',name)
    display(group)
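
Note that `pd.merge` defaults to an inner join, so any channel without matching cross-section rows silently drops out of the result. As a sketch of how to check for that, the example below uses small hypothetical stand-ins for the two tables (`chan` and `xsect`, with made-up values) and the `how`, `indicator`, and `validate` options of `pd.merge`.

```python
import pandas as pd

# Hypothetical stand-ins for the CHANNEL and XSECT_LAYER tables.
chan = pd.DataFrame({"CHAN_NO": [1, 2, 3], "LENGTH": [19500, 14000, 22000]})
xsect = pd.DataFrame({
    "CHAN_NO": [1, 1, 2, 2],
    "DIST": [0.5, 0.5, 0.5, 0.5],
    "ELEV": [-10.0, 0.0, -8.0, 0.0],
    "AREA": [0.0, 1500.0, 0.0, 900.0],
})

# how='left' keeps every channel; indicator=True adds a _merge column
# showing whether a matching cross-section row was found.
merged = pd.merge(chan, xsect, on="CHAN_NO", how="left", indicator=True)
missing = merged.loc[merged["_merge"] == "left_only", "CHAN_NO"].tolist()
print("Channels without cross-section layers:", missing)

# validate raises MergeError if CHAN_NO is not unique in the channel table.
checked = pd.merge(chan, xsect, on="CHAN_NO", validate="one_to_many")
print(checked.groupby("CHAN_NO").size())
```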

## Visualizing input data
This information can also be plotted with pandas and the other visualization libraries that are available.

In [None]:
# Take the first cross section (first DIST group) of channel 441
groups=fc[fc.CHAN_NO==441].groupby('DIST')
dist,group=next(iter(groups))
# Plot each cross-section property against elevation
group.plot(y='ELEV',x='AREA',kind='line',label='AREA',title='AREA with ELEV')
group.plot(y='ELEV',x='WIDTH',kind='line',label='WIDTH',title='WIDTH with ELEV')
_=group.plot(y='ELEV',x='WET_PERIM',kind='line',label='WET_PERIM',title='WET_PERIM with ELEV')

## Writing input to file

Once the tables have been manipulated with pandas DataFrame functions, they can be written to a file for use as input to DSM2 model runs.
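
As an illustration of the kind of manipulation meant here, the sketch below edits Manning's n on a small hypothetical stand-in for the CHANNEL table. The `chan` frame, its values, and the 10% scaling are made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for tables['CHANNEL']; values are illustrative only.
chan = pd.DataFrame({
    "CHAN_NO": [1, 2, 3],
    "LENGTH": [19500, 14000, 22000],
    "MANNING": [0.035, 0.028, 0.030],
})

# Work on a copy so the table as read stays available for comparison.
edited = chan.copy()

# Set Manning's n for a single channel.
edited.loc[edited.CHAN_NO == 2, "MANNING"] = 0.030

# Scale Manning's n by 10% for every channel longer than 20000 ft.
is_long = edited.LENGTH > 20000
edited.loc[is_long, "MANNING"] = (edited.loc[is_long, "MANNING"] * 1.1).round(4)

# List the rows that differ from the original.
changed = edited[edited.MANNING != chan.MANNING]
print(changed)
```

In the notebook, the edited frame would be put back into the dictionary (for example `tables['CHANNEL'] = edited`) before the tables are written out.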

The code below writes the tables to a new input file:

In [None]:
write_input('../../tests/hydro_echo_historical_v82_copy.inp',tables)