<img src="AW&H2015.png" style="float: left">

<img src="flopylogo.png" style="float: center">

# History match the Freyberg model using two parameters, ``K`` and ``R``

#### Using only one parameter is __*very*__ (VERY) likely to be on the "oversimplified" part of our Goldilocks complexity curve:


<img src="Hunt1998_sweetspot.png" style="float: center">



But this is a fitting starting point for adding complexity. For many model forecasts, getting a representative simulation of the water budget is a primary objective. From the groundwater governing equation we can see what we need to be looking at:

<img src="GW_GE.png" style="float: left">

where $W^*$ represents the sources and sinks of water in the model. We've got K as an adjustable parameter already, but __Recharge__ is typically one of the most important sources of water to a groundwater system. Unfortunately, it is hard to measure in the field at the scale we need for a model, so we can't just fix it at a measured value (think structural error). Instead, it is typically a parameter adjusted during history matching.

#### Objectives for this notebook are to:

1) add a recharge parameter to the Freyberg model

2) explore the effect of adding a parameter to history matching, parameter uncertainty, and forecast uncertainty


# Quick reminder of what the model looks like:

It is a heterogeneous 2D areal (1-layer) model that is a step up in complexity from our xsec model. Recall it looks like this, as shown in the original Freyberg (1988) paper on the left, and a Groundwater Vistas version on the right (from the file in the GW_Vistas subdirectory).

<img src="Freyburg1988_fig1.png" style="float: left">

<img src="Freyberg_k_plot_GW_Vistas.png" style="float: right">

### Standard two code blocks to set the notebook up

In [1]:
%matplotlib inline
import os
import shutil
import pandas as pd
import matplotlib.pyplot as plt
import pyemu
import pestools as pt
import platform
if 'window' in platform.platform().lower():
    ppp = 'pest++'
    pestchek = 'pestchek'
else:
    ppp = './pestpp'
    pestchek = './pestchek'

setting random seed

GIS dependencies not installed. Please see readme for instructions on installation


In [2]:
# copy the model and PEST files for this activity into the working directory
base_dir = os.path.join("..","..","models","Freyberg","Freyberg_K_and_R")
assert os.path.exists(base_dir)
for f in os.listdir(base_dir):
    shutil.copy2(os.path.join(base_dir,f),f)

### Let's run PESTCHEK and see what it says about our freyberg.pst file

In [3]:
os.system("{0} freyberg.pst".format(pestchek))

0
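The `0` above is only the exit status returned by `os.system`; the PESTCHEK messages themselves go to the terminal window where you launched the notebook. As a minimal sketch (not part of the original activity), the standard-library `subprocess` module can capture that text so it can be printed inside the notebook. The helper name `run_and_capture` is made up for illustration, and the demonstration runs the Python interpreter rather than PESTCHEK so that it works anywhere.

```python
import subprocess
import sys

def run_and_capture(args):
    """Run a command and return (exit_code, combined stdout + stderr text)."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into stdout
        universal_newlines=True,    # decode bytes to str
    )
    return result.returncode, result.stdout

# Demonstration with the Python interpreter itself, which is always available.
code, text = run_and_capture([sys.executable, "-c", "print('hello from a subprocess')"])
print(code)
print(text)
```

In this notebook the equivalent call would presumably be `run_and_capture([pestchek, "freyberg.pst"])`, assuming the `pestchek` path set in the first code block resolves on your system.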

#### Curious: the PESTCHEK warning section says "All parameters belonging to the parameter group "rch" are either fixed or tied". PESTCHEK flags this as a warning because it is unusual to go to all the trouble of defining a parameter and then not let it be adjusted. There are times you may want to do exactly that, though, so it is classified as a warning and isn't going to stop you.

### But that is not what we want. We want to make recharge an adjustable parameter in this activity.

### Open the PEST control file freyberg.pst in your text editor.  

1) Look in the parameter data section

2) Find the parameter __rch1__ (the recharge for the calibration period) and make it adjustable (hint:  look at the other parameters) 

3) Save the file

4) Run PESTCHEK on the PEST control file in a separate terminal window or by executing the next code block and looking at the terminal window where you launched this notebook

In [6]:
os.system("{0} freyberg.pst".format(pestchek))

0

Did that parameter group "rch" warning go away?
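If you would rather check the edit programmatically than scan the terminal, here is a rough sketch that lists the transformation keyword (PARTRANS, the second column) for each line in the `* parameter data` section of a control file. The parameter names and values in `SAMPLE` are invented for illustration and are not the contents of freyberg.pst; the function name `parameter_transforms` is also hypothetical.

```python
# Invented control-file fragment; NOT the real freyberg.pst
SAMPLE = """\
* parameter data
par_a  log    factor  5.0      0.5      50.0     grp_a  1.0  0.0  1
par_b  fixed  factor  1.0e-04  5.0e-05  2.0e-04  grp_b  1.0  0.0  1
* observation groups
"""

def parameter_transforms(control_text):
    """Map parameter name -> PARTRANS for the '* parameter data' section."""
    transforms = {}
    in_section = False
    for line in control_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("*"):
            # section headers start with '*'; track whether we are in the right one
            in_section = stripped.lower() == "* parameter data"
            continue
        tokens = stripped.split()
        # full parameter lines have at least 7 columns; shorter lines are skipped
        if in_section and len(tokens) >= 7:
            transforms[tokens[0].lower()] = tokens[1].lower()
    return transforms

transforms = parameter_transforms(SAMPLE)
# parameters that are neither fixed nor tied can be adjusted
adjustable = [name for name, trans in transforms.items() if trans in ("none", "log")]
print(transforms)
print("adjustable:", adjustable)
```

To try it on the real file, pass `open("freyberg.pst").read()` instead of `SAMPLE`. With pyemu already imported, the `partrans` column of `pst.parameter_data` should hold the same information.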


5) When there are no errors, run PEST++ by executing the next block (look in your terminal window to see the run progress and wait for the 0 to show up below the code block before continuing).


In [7]:
os.system("{0} freyberg.pst".format(ppp))

0

### ``PEST++`` only ran the model one time - why?

#### This was of course in the warnings when we ran PESTCHEK. Let's do that again in the next block and pay attention to that part of the warnings section (look at the terminal window where you launched this notebook to see the PESTCHEK output). 

In [8]:
os.system("{0} freyberg.pst".format(pestchek))

0

In [9]:
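# load the control file, raise NOPTMAX so PEST++ is allowed
# up to 20 parameter-estimation iterations, and write it back out
pst = pyemu.Pst("freyberg.pst")
pst.control_data.noptmax = 20
pst.write("freyberg.pst")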

#### Let's run PESTCHEK again to make sure

In [10]:
os.system("{0} freyberg.pst".format(pestchek))

0

### Now that we've changed NOPTMAX, let's run PEST++ again.  We set NOPTMAX to 20 so the run is longer - watch your terminal window for progress.  Again, don't advance until you see a 0 returned below the code block.


#### As you watch your terminal window scroll by, look at the right edge where it reports the counter of the runs needed for each PEST++ iteration.  Why does it change?

In [None]:
os.system("{0} freyberg.pst".format(ppp))

### Let's see how we did for our fit:

In [None]:
df_obj = pd.read_csv("freyberg.iobj",index_col=0)
df_obj

(you can see this information unformatted by opening the .iobj file, which is a comma delimited ASCII PEST++ output file)



### Giving PEST additional knobs gives it more flexibility for fitting the observed data - is the final Phi lower than in the last activity?


### You can see individual residuals (and the weights you specified) in the freyberg.rei file.  Here we'll use Python to calculate the summary statistics for this new run:

In [None]:
res = pt.Res('freyberg.rei')

res.describe_groups('head_cal')

### And plot up the results again:

In [None]:
res.plot_one2one('head_cal',print_stats=['Mean', 'MAE', 'RMSE'])

res.plot_measured_vs_residual('head_cal')

### Hmm, looks familiar.....let's look at the parameter uncertainties:

In [None]:
df_paru = pd.read_csv("freyberg.par.usum.csv",index_col=0)
df_paru

### Well, at least that table is different.  Now let's compare these parameter uncertainties with the previous activity:

#### Here's the K only case from the last activity:

In [None]:
df_paru_single = pd.read_csv(os.path.join("..","freyberg_k","freyberg.par.usum.csv"),index_col=0)
df_paru_single

#### Here's the new K+R (left) next to the K only results (right)

In [None]:
df_paru_concat = pd.concat([df_paru,df_paru_single],join="outer",axis=1,keys=["k+r","k_only"])
df_paru_concat

In [None]:
for pname in df_paru_concat.index:
    ax = df_paru_concat.loc[pname,(slice(None),("prior_stdev","post_stdev"))].plot(kind="bar")
    ax.set_title(pname)
    plt.show()

### How does the uncertainty reduction for ``hk1`` change when ``rch1`` is included?

## Now let's look at the forecasts:

#### First, here's our previous results for the K only case:

In [None]:
df_foreu_single = pd.read_csv(os.path.join("..","freyberg_k","freyberg.pred.usum.csv"),index_col=0)
df_foreu_single.loc[:,"reduction"] = 100.0 *  (1.0 - (df_foreu_single.post_stdev / df_foreu_single.prior_stdev))
df_foreu_single

#### Here's our new run where K and R are parameters:

In [None]:
df_foreu = pd.read_csv("freyberg.pred.usum.csv",index_col=0)
df_foreu.loc[:,"reduction"] = 100.0 *  (1.0 - (df_foreu.post_stdev / df_foreu.prior_stdev))
df_foreu_concat = pd.concat([df_foreu,df_foreu_single],join="outer",axis=1,keys=["k+r","k_only"])
df_foreu_concat
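The `reduction` column above is just the percent drop in standard deviation from prior to posterior. A tiny worked example, with made-up numbers and a hypothetical helper name `percent_reduction`, shows how to read it: 0 means the observations told us nothing about that forecast, and values approaching 100 mean most of the prior uncertainty was removed.

```python
def percent_reduction(prior_stdev, post_stdev):
    """Percent reduction in standard deviation from prior to posterior."""
    return 100.0 * (1.0 - post_stdev / prior_stdev)

# made-up standard deviations, not values from the Freyberg run
print(percent_reduction(2.0, 0.5))  # 75.0
print(percent_reduction(2.0, 2.0))  # 0.0
```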

## Better to plot them:

### (blue is k and r, green is k only)

In [None]:
df_foreu_concat.loc[:,(slice(None),"reduction")].plot(kind="bar", legend=False)

### Which forecasts are influenced by the ``rch1`` parameter?  

### Which forecasts were more or less unchanged - why?

### Which case (``K`` or ``K+R``) provides the more robust uncertainty estimate?

# Wait, something is amiss.  

### Look at this slightly modified version of the groundwater governing equation from Anderson et al. (2015) below.  Is this problem well posed? That is, if recharge increased (represented by an increase in $W^*$) *and* at the same time K increased (represented by an increase in q), could they offset each other so that the right-hand side stays the same? What is this called?

 
  <img src="GW_GE2.jpg" style="float: center">
 
 
 ### Recall the Bravo et al. (2002) trough when calibrating K and R with only heads:
 
  <img src="Fig9.11a_bravo_trough.jpeg" style="float: center">
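The same trade-off can be seen in a much simpler setting than the Freyberg model. The sketch below is an illustration under stated assumptions, not the Bravo et al. (2002) analysis: steady 1D flow with uniform recharge `R`, a no-flow boundary at x = 0, a fixed head `h_L` at x = L, and transmissivity T = K * b. All numbers and the function name `heads` are invented. In this setup the heads depend only on the ratio R/K, so scaling both by the same factor leaves every head unchanged.

```python
def heads(K, R, b=10.0, L=1000.0, h_L=20.0, n=5):
    """Heads at n evenly spaced points for steady 1D flow with uniform recharge.

    No-flow boundary at x = 0, fixed head h_L at x = L, transmissivity T = K * b:
        h(x) = h_L + R * (L**2 - x**2) / (2 * T)
    """
    T = K * b
    xs = [i * L / (n - 1) for i in range(n)]
    return [h_L + R * (L**2 - x**2) / (2.0 * T) for x in xs]

base = heads(K=5.0, R=1.0e-4)
both_doubled = heads(K=10.0, R=2.0e-4)  # K and R both doubled: same R/K ratio
only_k = heads(K=10.0, R=1.0e-4)        # only K doubled: the ratio changes

# largest head difference relative to the base case
diff_both = max(abs(a - c) for a, c in zip(base, both_doubled))
diff_only_k = max(abs(a - c) for a, c in zip(base, only_k))
print(diff_both)
print(diff_only_k)
```

The first difference should be essentially zero while the second is not, which is the 1D analogue of the trough: a whole line of (K, R) pairs fits the heads equally well.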
  
  ****************
 

###  Even though estimating both R and K using only head observations is not possible, PEST++ gave you an answer.  How? (Hint:  look in the .rec file and see what happens to the parameters over the course of the run)


## Let's dig into the PEST result a bit more...

1) Compare the optimal rch1 parameter value in the __freyberg.rec__ file (search for "Optimal parameter values" without the quotation marks) to the rch1 parameter data the instructors supplied in the PEST control file.  

2) Where does the optimal parameter lie with respect to the bounds that were given for the parameter? 

3) Open the __freyberg.ipar__ file in a text editor.  What was the iteration history of rch1?
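For question 2, it can help to make "where does it lie with respect to the bounds" concrete. This is a small sketch with a hypothetical helper, `bound_status`, and made-up numbers; plug in the value and bounds you read from the .rec and .pst files.

```python
import math

def bound_status(value, lower, upper, rel_tol=1e-6):
    """Report whether a value sits on one of its bounds (within a relative tolerance)."""
    if math.isclose(value, lower, rel_tol=rel_tol):
        return "at lower bound"
    if math.isclose(value, upper, rel_tol=rel_tol):
        return "at upper bound"
    return "inside bounds"

# made-up values, not taken from the Freyberg run
print(bound_status(3.0, 1.0, 10.0))
print(bound_status(10.0, 1.0, 10.0))
```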

# Let's look at the correlation

In [None]:
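# load the Jacobian (and matching control file) into a Schur analysis object
sc = pyemu.Schur('freyberg.jcb')
# wrap the normal matrix (X^T Q X) as a covariance-like object for the adjustable parameters
cov = pyemu.Cov(sc.xtqx.x, names=sc.pst.adj_par_names)
# convert to correlation coefficients and show as a table
R = cov.to_pearson()
R.df()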
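To read that table, it helps to remember what a correlation coefficient is: each off-diagonal entry divided by the square roots of the two matching diagonal entries. Here is a plain-Python sketch of that normalization, assuming `to_pearson` applies the usual definition; the 2x2 matrix is invented and is not the Freyberg result.

```python
import math

def to_pearson(cov):
    """Convert a square covariance-like matrix (list of lists) to correlation coefficients."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]

# invented matrix for two parameters, not the Freyberg result
cov = [[4.0, 5.7],
       [5.7, 9.0]]
corr = to_pearson(cov)
for row in corr:
    print(row)
```

Off-diagonal values close to 1 or -1 mean the observations cannot tell the two parameters apart: a change in one can be offset by a change in the other.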

# Let's test how "optimal" these parameters are. 

1) Open freyberg.pst in a text editor

2) Change the upper bound of the rch1 parameter from 1.5500000000E-04 to 2.5500000000E-04

3) Re-run PEST++

4) Open the __freyberg.ipar__ file in a text editor

# Last points:

## Do you believe that value is optimal, or even defensible?  Should we believe the forecast uncertainty either?

# ADVANCED

## How could we make parameter estimation for this model tractable?  (e.g., what did we do in the xsec model?  What did Bravo et al. (2002) do?)