<img src="Images/HSP2.png" />
This Jupyter Notebook Copyright 2016 by RESPEC, INC.  All rights reserved.

$\textbf{HSP}^{\textbf{2}}\ \text{and}\ \textbf{HSP2}\ $ Copyright 2016 by RESPEC INC. and released under this [License](LegalInformation/License.txt)

# Calleg TEST NOTEBOOK for HSP$^2$  (WORST CASE COMPARISONS)

This Notebook compares the results of running HSPF and HSP$^2$ for the basic hydrology (PWATER, IWATER, and HYDR) to confirm that HSP$^2$ calculates them correctly.

Calleg is a real watershed and has
+ 27 IMPLND segments,
+ 129 PERLND segments,
+ 119 RCHRES segments,
+ 9 years of simulation time with hourly time steps (78,888 timesteps)
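
As a quick sanity check on the timestep count, 78,888 hourly steps is consistent with nine calendar years that include two leap years. The sketch below assumes a hypothetical 1990 through 1998 period purely for illustration; the actual simulation dates are set in calleg.uci.

```python
from datetime import datetime

# Hypothetical simulation span (an assumption; the real dates come from the UCI file).
start = datetime(1990, 1, 1)
stop = datetime(1999, 1, 1)   # exclusive end: nine full calendar years later

# Hourly time steps = elapsed hours between start and stop.
hours = int((stop - start).total_seconds() // 3600)
print(hours)
```

For this assumed span the result is 78888, matching the count quoted above.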

### Required Python imports and setup

In [None]:
import numpy as np
import pandas as pd

pd.options.display.max_rows    = 25
pd.options.display.max_columns = 20
pd.options.display.float_format = '{:.4f}'.format  # display 4 digits after the decimal point

import matplotlib.pyplot as plt
%matplotlib inline

import hspfbintoolbox
import HSP2
import HSP2tools
HSP2tools.versions()

### Setup paths to the necessary datafiles
This assumes the calleg.uci and calleg.wdm files are located in the current working directory for this Notebook.

In [None]:
wdmpath = 'calleg.wdm'
ucipath = 'calleg.uci'
hdfpathx = 'callegx.h5'
hdfpath = 'calleg.h5'
hbnpath = 'calleg.hbn'

## Run HSPF

Use the BASINS 4.1 WinHspfLt executable to run calleg.uci.

This assumes the calleg.uci and calleg.wdm files are located in the current working directory for this Notebook.  This will create the binary output file, calleg.hbn, along with a number of other standard HSPF outputs.
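
Because the run assumes both input files sit in the working directory, a quick check before launching the executable can save a failed run. A minimal sketch, where `missing_inputs` is a hypothetical helper and not part of HSP2 or HSPF:

```python
import os

def missing_inputs(paths, directory='.'):
    """Return the entries of `paths` that are not regular files in `directory`."""
    return [p for p in paths if not os.path.isfile(os.path.join(directory, p))]

# File names taken from this Notebook; an empty list means both inputs were found.
print(missing_inputs(['calleg.uci', 'calleg.wdm']))
```

An empty list means it is safe to proceed; any names printed are the files that still need to be copied into place.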

In [None]:
!echo %date% - %time%

!C:\BASINS41\models\HSPF\bin\WinHspfLt.exe {ucipath}
    
!echo %date% - %time%

For development, save the best time for reference:

Now run HSP$^2$ on the calleg watershed

In [None]:
HSP2.run(hdfpath)

In [None]:
%timeit HSP2.run(hdfpath)

## Determine Available Calculated Results

Now use Tim Cera's hspfbintoolbox.py to determine which timeseries HSPF created and stored in the HBN binary file.

Time interval codes: {5: 'yearly', 4: 'monthly', 3: 'daily', 2: 'bivl'}.

No daily timeseries are available, so the analysis uses the monthly (4) timeseries.
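
To see which intervals a catalog actually contains, the interval code in each key can be tallied. The sketch below uses a small synthetic list in place of the real catalog and assumes the key layout used later in this Notebook: operation, segment, activity, name, interval code.

```python
from collections import Counter

# Interval codes as listed above.
INTERVALS = {5: 'yearly', 4: 'monthly', 3: 'daily', 2: 'bivl'}

# Synthetic stand-in for the catalog keys (assumed layout:
# operation, segment, activity, name, interval code).
keys = [
    ('IMPLND', 11, 'IWATER', 'SURO', 4),
    ('IMPLND', 11, 'IWATER', 'SURO', 5),
    ('PERLND', 21, 'PWATER', 'PERO', 4),
    ('RCHRES', 1, 'HYDR', 'ROVOL', 4),
]

# Count how many timeseries exist at each interval.
counts = Counter(INTERVALS[key[4]] for key in keys)
print(dict(counts))

# Segments that have a monthly IMPLND / IWATER / SURO series.
monthly = [str(k[1]) for k in keys
           if k[:1] + k[2:] == ('IMPLND', 'IWATER', 'SURO', 4)]
print(monthly)
```

An interval that never appears in the tally, such as daily here, has no series to compare against.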

In [None]:
# list() so the slice below also works where keys() returns a view
keys = list(hspfbintoolbox.catalog(hbnpath).keys())
keys[:5]  # show only the first 5 as a check

## Automate checking IMPLNDs for SURO

Extract the keys (calculated above) for IMPLND + IWATER + SURO. For each key, compute several columns. The last two columns show the signed and absolute percent difference between HSPF and HSP2 in the total SURO over the entire run.
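
The percent-difference measure is the difference of the two run totals, taken relative to the HSPF total. A minimal sketch on synthetic monthly values (not real Calleg results), where `percent_diff_of_sum` is a hypothetical helper name:

```python
import numpy as np

def percent_diff_of_sum(hspf, hsp2):
    """Signed percent difference of the run totals, relative to HSPF."""
    hspf_total = np.sum(hspf)
    hsp2_total = np.sum(hsp2)
    return 100.0 * (hspf_total - hsp2_total) / hspf_total

# Synthetic monthly totals, invented for illustration.
hspf = np.array([1.0, 2.0, 3.0, 4.0])
hsp2 = np.array([1.0, 2.0, 3.0, 4.1])

signed = percent_diff_of_sum(hspf, hsp2)
print(signed, abs(signed))
```

Here the totals are 10.0 and 10.1, so the signed difference is about -1 percent. A segment whose HSPF total is zero would need separate handling, since the ratio is then undefined.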

In [None]:
segments = [str(key[1]) for key in keys if key[0]=='IMPLND' and key[2]=='IWATER' and key[3]=='SURO' and key[4]==4]

dfimplnd = pd.DataFrame()
for seg in segments:  
    path = 'IMPLND,' + seg + ',IWATER,SURO'
    hspf = hspfbintoolbox.extract(hbnpath, 'monthly', path).values
           
    path = 'RESULTS/IMPLND' + '_I' + '{:0>3s}'.format(seg) + '/IWATER'
    hsp2 = pd.read_hdf(hdfpath, path)['SURO'].resample('MS').sum().values
    
    #dfimplnd.at[seg, 'Max Diff'] =  (hspf - hsp2).max()
    dfimplnd.at[seg, 'Sum of HSPF'] = hspf.sum()
    dfimplnd.at[seg, 'Sum of HSP2'] = hsp2.sum()
    dfimplnd.at[seg, '%diff of Sum'] = 100.0 * (hspf.sum() - hsp2.sum()) / hspf.sum()
    dfimplnd.at[seg, 'abs(%diff of Sum)'] = 100.0 * abs(hspf.sum() - hsp2.sum()) / hspf.sum()

dfimplnd = dfimplnd.sort_values(by=['abs(%diff of Sum)'])
dfimplnd

Look at the statistics for the percent difference column.

In [None]:
dfimplnd['%diff of Sum'].hist()

In [None]:
dfimplnd['%diff of Sum'].describe()

In [None]:
ils = dfimplnd.index[-1]
# Single-argument print() behaves the same in Python 2 and Python 3
print('WORST IMPLND SEGMENT IS {}'.format(ils))
print('%diff of the total SURO sum of {}'.format(dfimplnd.loc[ils, '%diff of Sum']))

### Define a function to read HSPF and HSP2 data, and plot together for IMPLND

In [None]:
def imp(ils, name, how='sum'):
    # Use Tim Cera's HBN reader to get the HSPF data  
    path = 'IMPLND,' + str(ils) + ',IWATER,' + name
    hspf = hspfbintoolbox.extract(hbnpath, 'monthly', path)

    # Now read the corresponding HSP2 data and convert to monthly, MS (Month Start), to match hspfbintoolbox data.
    path = '/RESULTS/IMPLND' + '_I' + '{:0>3s}'.format(str(ils)) + '/IWATER'                                                   
    hsp2 = pd.read_hdf(hdfpath, path)
    if how == 'sum':
        hsp2 = hsp2.resample('MS').sum()
    elif how == 'last':
        hsp2 = hsp2.resample('MS').last()
        
    hsp2 = hsp2[name]
    
    plt.figure(figsize=(10,8))
    plt.plot(hspf.index, hspf, label='HSPF', color='r')
    plt.plot(hsp2.index, hsp2, label='HSP2', color='b')
    plt.legend()
    plt.title('IMPLND ' + 'I' + '{:0>3s}'.format(str(ils)) + ', IWATER ' +  name)
    
    return hspf, hsp2
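
The `how` argument switches between two monthly aggregations. In this Notebook `'last'` is passed for storages (RETS, SURS, PERS, VOL), whose monthly value is the state at the end of the month, while fluxes are summed over the month. A minimal sketch on synthetic daily data, using only standard pandas calls:

```python
import numpy as np
import pandas as pd

# Synthetic daily data covering January and February 2000 (60 days).
index = pd.date_range('2000-01-01', periods=60, freq='D')
flux = pd.Series(np.ones(60), index=index)                    # e.g. a daily outflow
storage = pd.Series(np.arange(60, dtype=float), index=index)  # e.g. a stored volume

monthly_flux = flux.resample('MS').sum()          # total over each month
monthly_storage = storage.resample('MS').last()   # value at the end of each month

print(monthly_flux)
print(monthly_storage)
```

Both results are labeled with the first day of the month, which is what `'MS'` (Month Start) provides.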

#### IMPLND IWATER SURO, Monthly

In [None]:
hspf, hsp2 = imp(ils, 'SURO', 'sum')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### IMPLND IWATER IMPEV, Monthly

In [None]:
hspf, hsp2 = imp(ils, 'IMPEV')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### IMPLND IWATER PET, Monthly

In [None]:
hspf, hsp2 = imp(ils, 'PET')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### IMPLND IWATER RETS, Monthly

In [None]:
hspf, hsp2 = imp(ils, 'RETS', 'last')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### IMPLND IWATER SUPY, Monthly

In [None]:
hspf, hsp2 = imp(ils, 'SUPY')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### IMPLND IWATER SURS, Monthly

In [None]:
hspf, hsp2 = imp(ils, 'SURS', 'last')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

## Automate checking PERLNDs for PERO

### Define routine to read HSPF and HSP2 data and plot together

In [None]:
def per(pls, name, how='sum'):
    # Use Tim Cera's HBN reader to get the HSPF data  
    path = 'PERLND,' + str(pls) + ',PWATER,' + name
    hspf = hspfbintoolbox.extract(hbnpath, 'monthly', path)
    
    # Now read the corresponding HSP2 data and convert to monthly
    path = '/RESULTS/PERLND' +  '_P' + '{:0>3s}'.format(str(pls)) + '/PWATER'
    if how == 'sum':
        hsp2 = pd.read_hdf(hdfpath, path)[name].resample('MS').sum()
    elif how == 'last':
        hsp2 = pd.read_hdf(hdfpath, path)[name].resample('MS').last()
    
    plt.figure(figsize=(10,8))
    plt.plot(hspf.index, hspf, label='HSPF', color='r')
    plt.plot(hsp2.index, hsp2, label='HSP2', color='b')
    plt.legend()
    plt.title('PERLND ' + 'P' + '{:0>3s}'.format(str(pls))+ ', PWATER ' +  name)
    
    return hspf, hsp2

### Now find all available monthly data

In [None]:
segments = [str(key[1]) for key in keys if key[0]=='PERLND' and key[2]=='PWATER' and key[3]=='PERO' and key[4]==4]

dfperlnd = pd.DataFrame()
for seg in segments:  
    path = 'PERLND,' + seg + ',PWATER,PERO'
    hspf = hspfbintoolbox.extract(hbnpath, 'monthly', path).values
  
    path =  'RESULTS/PERLND' +  '_P' + '{:0>3s}'.format(str(seg)) + '/PWATER'   
    hsp2 = pd.read_hdf(hdfpath, path)['PERO'].resample('MS').sum().values

    #dfperlnd.at[seg, 'Max Diff'] =  (hspf - hsp2).max()
    dfperlnd.at[seg, 'Sum of HSPF'] = hspf.sum()
    dfperlnd.at[seg, 'Sum of HSP2'] = hsp2.sum()
    dfperlnd.at[seg, '%diff of Sum'] = 100.0 * (hspf.sum() - hsp2.sum()) / hspf.sum()
    dfperlnd.at[seg, 'abs(%diff of Sum)'] = 100.0 * abs(hspf.sum() - hsp2.sum()) / hspf.sum()

dfperlnd = dfperlnd.sort_values(by=['abs(%diff of Sum)'])    
dfperlnd

In [None]:
dfperlnd['%diff of Sum'].hist(bins=40)

In [None]:
dfperlnd['%diff of Sum'].describe()

The PERLND segments are ordered by ascending "abs(%diff of Sum)", so the last entry is the worst case (by this measure).
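
Picking the last row works only because of the ascending sort. A minimal sketch on an invented table shows the positional lookup next to `idxmax`, which gives the same answer without depending on row order:

```python
import pandas as pd

# Synthetic comparison table (values invented for illustration).
df = pd.DataFrame(
    {'%diff of Sum': [0.02, -0.50, 0.10]},
    index=['101', '102', '103'],
)
df['abs(%diff of Sum)'] = df['%diff of Sum'].abs()

# Ascending sort puts the largest absolute difference last.
df = df.sort_values(by='abs(%diff of Sum)')
worst_by_position = df.index[-1]

# Equivalent lookup that does not depend on the sort order.
worst_by_idxmax = df['abs(%diff of Sum)'].idxmax()

print(worst_by_position, worst_by_idxmax)
```

Both lookups return segment '102' for this table.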

In [None]:
pls = dfperlnd.index[-1]
# Single-argument print() behaves the same in Python 2 and Python 3
print('WORST PERLND SEGMENT IS {}'.format(pls))
print('%diff of the total PERO sum of {}'.format(dfperlnd.loc[pls, '%diff of Sum']))

#### PERLND PWATER AGWO

In [None]:
hspf, hsp2 = per(pls, 'AGWO')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER BASET

In [None]:
hspf, hsp2 = per(pls, 'BASET')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER CEPE

In [None]:
hspf, hsp2 = per(pls, 'CEPE')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER IFWI

In [None]:
hspf, hsp2 = per(pls, 'IFWI')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER IFWO

In [None]:
hspf, hsp2 = per(pls, 'IFWO')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER IGWI

In [None]:
hspf, hsp2 = per(pls, 'IGWI')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER INFIL

In [None]:
hspf, hsp2 = per(pls, 'INFIL')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER LZET

In [None]:
hspf, hsp2 = per(pls, 'LZET')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER PERC

In [None]:
hspf, hsp2 = per(pls, 'PERC')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER PERO

In [None]:
hspf, hsp2 = per(pls, 'PERO')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER PERS

In [None]:
hspf, hsp2 = per(pls, 'PERS', 'last')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER PET, Monthly

In [None]:
hspf, hsp2 = per(pls, 'PET')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER SUPY

In [None]:
hspf, hsp2 = per(pls, 'SUPY')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER SURO

In [None]:
hspf, hsp2 = per(pls, 'SURO')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER TAET

In [None]:
hspf, hsp2 = per(pls, 'TAET')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER UZET

In [None]:
hspf, hsp2 = per(pls, 'UZET')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### PERLND PWATER UZI

In [None]:
hspf, hsp2 = per(pls, 'UZI')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

## RCHRES

### Define routine to read HSPF and HSP2, plot together for RCHRES

In [None]:
def rch(rid, name, how='sum'):
    # Use Tim Cera's HBN reader to get the HSPF data  
    path = 'RCHRES,' + str(rid) + ',HYDR,' + name
    hspf = hspfbintoolbox.extract(hbnpath, 'monthly', path)
    
    # Now read the corresponding HSP2 data and convert to monthly
    path = '/RESULTS/RCHRES' +   '_R' + '{:0>3s}'.format(str(rid)) + '/HYDR'
    if how == 'sum':
        hsp2 = pd.read_hdf(hdfpath, path)[name].resample('MS').sum()
    elif how == 'last':
        hsp2 = pd.read_hdf(hdfpath, path)[name].resample('MS').last()
    
    plt.figure(figsize=(10,8))
    plt.plot(hspf.index, hspf, label='HSPF', color='r')
    plt.plot(hsp2.index, hsp2, label='HSP2', color='b')
    plt.legend()
    plt.title('RCHRES ' +   'R' + '{:0>3s}'.format(str(rid)) + ', HYDR ' +  name)
    
    return hspf, hsp2
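
The three reader functions build the HDF5 key the same way: the operation name, an underscore, the operation's first letter, and the segment number zero-padded to three characters. A minimal sketch of that pattern, where `results_path` is a hypothetical helper and not part of HSP2:

```python
def results_path(operation, segment, activity):
    """Build an HDF5 key such as '/RESULTS/RCHRES_R005/HYDR'."""
    prefix = operation[0]                       # 'I', 'P', or 'R'
    segid = '{:0>3s}'.format(str(segment))      # zero-pad to three characters
    return '/RESULTS/{}_{}{}/{}'.format(operation, prefix, segid, activity)

print(results_path('RCHRES', 5, 'HYDR'))
print(results_path('IMPLND', '27', 'IWATER'))
```

Converting the segment with `str()` first lets the helper accept either the integer from a catalog key or a string taken from a DataFrame index.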

### Automate checking RCHRESs for ROVOL

In [None]:
segments = [str(key[1]) for key in keys if key[0]=='RCHRES' and key[2]=='HYDR' and key[3]=='ROVOL' and key[4]==4]

dfrchres = pd.DataFrame()
for seg in segments:  
    path = 'RCHRES,' + seg + ',HYDR,ROVOL'
    hspf = hspfbintoolbox.extract(hbnpath, 'monthly', path).values

    path = 'RESULTS/RCHRES'+   '_R' + '{:0>3s}'.format(str(seg)) +'/HYDR' 
    hsp2 = pd.read_hdf(hdfpath, path)['ROVOL'].resample('MS').sum().values
    
    #dfrchres.at[seg, 'Max Diff'] =  (hspf - hsp2).max()
    dfrchres.at[seg, 'Sum of HSPF'] = hspf.sum()
    dfrchres.at[seg, 'Sum of HSP2'] = hsp2.sum()
    dfrchres.at[seg, '%diff of Sum'] = 100.0 * (hspf.sum() - hsp2.sum()) / hspf.sum()
    dfrchres.at[seg, 'abs(%diff of Sum)'] = 100.0 * abs(hspf.sum() - hsp2.sum()) / hspf.sum()    

dfrchres = dfrchres.sort_values(by ='abs(%diff of Sum)')    
dfrchres

In [None]:
dfrchres['%diff of Sum'].hist(bins=40)

In [None]:
dfrchres['%diff of Sum'].describe()

The RCHRES segments are ordered by ascending "abs(%diff of Sum)", so the last entry is the worst case (by this measure).

In [None]:
rid = dfrchres.index[-1]
# Single-argument print() behaves the same in Python 2 and Python 3
print('WORST RCHRES SEGMENT IS {}'.format(rid))
print('%diff of the total ROVOL sum of {}'.format(dfrchres.loc[rid, '%diff of Sum']))

In [None]:
dfrchres.loc[str(rid), :]  # full comparison row for the worst RCHRES segment

#### RCHRES HYDR IVOL

In [None]:
hspf, hsp2 = rch(rid, 'IVOL')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### RCHRES HYDR PRSUPY

In [None]:
hspf, hsp2 = rch(rid, 'PRSUPY')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### RCHRES HYDR ROVOL

In [None]:
hspf, hsp2 = rch(rid, 'ROVOL')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### RCHRES HYDR VOL

In [None]:
hspf, hsp2 = rch(rid, 'VOL', 'last')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])

#### RCHRES HYDR VOLEV

In [None]:
hspf, hsp2 = rch(rid, 'VOLEV')

In [None]:
plt.scatter(hspf, hsp2)
top = 1.05 * max(hspf.values.max(), hsp2.values.max())
plt.plot([0.0, top], [0.0, top])