<img src="AW&H2015.png" style="float: left">

<img src="flopylogo.png" style="float: center">

# History match the Freyberg model using two parameters, ``K`` and ``R``, with head and flux observations

#### Where are we on the Goldilocks complexity curve? 

<img src="Hunt1998_sweetspot.png" style="float: center">



The runs so far were intentionally oversimplified to serve as a starting point for adding complexity. However, when we added just __*one more parameter*__, for a total of two, the uncertainty of some forecasts got appreciably __worse__. And these parameters cover the entire model domain, which is unrealistic for the natural world! Are we past the "sweet spot", and should we avoid any additional complexity even if our model looks nothing like reality?

Adding parameters in and of itself is not problematic. Rather, the problem is adding parameters that influence forecasts but are unconstrained by observations, so that they are free to wiggle and ripple uncertainty through to our forecasts. If observations are added that help constrain the parameters, the forecast will be more certain. That is, the natural flip side of adding parameters is constraining them, with data (first line of defense) or with soft knowledge and problem dimension reduction (SVD).
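
A minimal sketch of this idea, using linear (first-order) uncertainty analysis on a made-up two-parameter problem. Everything below is an illustrative assumption rather than a value from the Freyberg model: a head observation that only sees the difference between log K and log R, a flux observation that sees R directly, and a forecast that depends on R alone.

```python
# Illustrative linear uncertainty sketch; all numbers are assumptions.
# Parameters are [log K, log R], each with prior variance 1.0, uncorrelated.

def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def posterior_cov(prior_var, observations):
    """Posterior parameter covariance for a linear model.

    observations: list of (sensitivity_row, noise_stdev) tuples.
    Uses inverse(posterior) = inverse(prior) + sum(j j^T / noise_var).
    """
    precision = [[1.0 / prior_var[0], 0.0], [0.0, 1.0 / prior_var[1]]]
    for row, noise_stdev in observations:
        w = 1.0 / noise_stdev ** 2
        for i in range(2):
            for k in range(2):
                precision[i][k] += w * row[i] * row[k]
    return inv2(precision)

def forecast_var(cov, y):
    """Variance of a forecast with parameter sensitivity vector y."""
    return sum(y[i] * cov[i][k] * y[k] for i in range(2) for k in range(2))

prior_var = [1.0, 1.0]
head_obs = ([-1.0, 1.0], 0.1)   # heads constrain only (log R - log K)
flux_obs = ([0.0, 1.0], 0.1)    # a flux observation constrains R directly
forecast = [0.0, 1.0]           # a forecast that depends on R

var_prior = forecast_var([[1.0, 0.0], [0.0, 1.0]], forecast)
var_heads = forecast_var(posterior_cov(prior_var, [head_obs]), forecast)
var_both = forecast_var(posterior_cov(prior_var, [head_obs, flux_obs]), forecast)

print("prior:", var_prior)
print("heads only:", round(var_heads, 4))
print("heads + flux:", round(var_both, 4))
```

In this toy setup the head observation alone leaves about half of the prior forecast variance, because K and R can trade off against each other. Adding the observation that sees R directly removes most of what remains.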

Anderson et al. (2015) suggest that at a minimum groundwater models be history matched to heads and fluxes.  There is a flux observation in our PEST control file, but it was given zero weight.  Let's see what happens if we move our model to the minimum calibration of Anderson et al.

#### Objectives for this notebook are to:

1) Add a flux observation to the measurement objective function of our Freyberg model

2) Explore the effect of adding the observation to history matching, parameter uncertainty, and forecast uncertainty

In [None]:
%matplotlib inline
import os
import shutil
import pandas as pd
import matplotlib.pyplot as plt
import pyemu
import platform
import pestools as pt
if 'window' in platform.platform().lower():
    ppp = 'pest++'
    pestchek = 'pestchek'
else:
    ppp = './pestpp'
    pestchek = './pestchek'

In [None]:
base_dir = os.path.join("..","..","models","Freyberg","Freyberg_K_and_R")
assert os.path.exists(base_dir)
# copy every file from the base directory into the current working directory
for f in os.listdir(base_dir):
    shutil.copy2(os.path.join(base_dir, f), f)

In [None]:
os.system("{0} freyberg.pst".format(ppp))

``PEST++`` only ran the model one time - why?

In [None]:
pst = pyemu.Pst("freyberg.pst")
pst.observation_data

Let's give the observation ``rivflux_cal`` a non-zero weight
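
One common convention (an assumption here, not something this notebook prescribes) is to set a weight to the inverse of the standard deviation of the expected measurement noise. Under that reading, a weight of 0.01 says the flux observation is trusted to within roughly 100 flux units. The helper name below is hypothetical.

```python
def weight_from_stdev(noise_stdev):
    """Weight implied by a measurement-noise standard deviation (weight = 1 / stdev)."""
    if noise_stdev <= 0:
        raise ValueError("noise_stdev must be positive")
    return 1.0 / noise_stdev

# Hypothetical noise level: one standard deviation of 100 flux units
flux_weight = weight_from_stdev(100.0)
print(flux_weight)
```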

In [None]:
pst.observation_data.loc["rivflux_cal","weight"] = 0.01 #super subjective
pst.observation_data

### Now let's change NOPTMAX from 0 to 20 so we can see what the effect of weighting the flux target is

In [None]:
pst.control_data.noptmax = 20
pst.write("freyberg.pst")

### And we'll run the model - look at the terminal window where you launched this notebook to see the progress of PEST++.  Advance through the code blocks when you see a 0 returned.

In [None]:
os.system("{0} freyberg.pst".format(ppp))

Let's explore the results

In [None]:
df_obj = pd.read_csv("freyberg.iobj",index_col=0)
df_obj

## Egads!  Our Phi is huuuuuge!  Oh wait, we added a new observation, so we can't compare it to what we had with only head observations.
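
To see why the totals are not comparable, it helps to split Phi by observation group. This is a minimal sketch, assuming the usual weighted least-squares objective (the sum of squared weighted residuals). The residual values and the `forecast` group name are made up for illustration and are not read from `freyberg.rei`.

```python
from collections import defaultdict

# Hypothetical residual records: (group, measured, modelled, weight)
records = [
    ("head_cal", 15.2, 15.9, 5.0),
    ("head_cal", 17.8, 17.1, 5.0),
    ("flux_cal", -3400.0, -2100.0, 0.01),
    ("forecast", 0.0, -1500.0, 0.0),   # zero weight: contributes nothing to Phi
]

def phi_by_group(records):
    """Sum of squared weighted residuals for each observation group."""
    phi = defaultdict(float)
    for group, measured, modelled, weight in records:
        phi[group] += (weight * (measured - modelled)) ** 2
    return dict(phi)

components = phi_by_group(records)
total_phi = sum(components.values())
for group, value in components.items():
    print(f"{group:10s} {value:10.2f}")
print(f"{'total':10s} {total_phi:10.2f}")
```

A newly weighted group adds its own term to the total, so a larger Phi after adding an observation does not by itself mean the head fit got worse. Comparing the `head_cal` component between runs is the fairer test.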

###  Here's what you'd see if it was only head observations

In [None]:
res = pt.Res('freyberg.rei')

res.describe_groups('head_cal')

### Hmmm, the fit is not as good as before....Here's what the flux target (n=1) did

In [None]:
res.describe_groups('flux_cal')

In [None]:
res.plot_one2one('head_cal',print_stats=['Mean', 'MAE', 'RMSE'])

In [None]:
res.plot_measured_vs_residual('head_cal')

#### Okay, not just the summary statistics, this head fit even looks visually worse.  What did it do to our parameter uncertainty?

In [None]:
df_paru = pd.read_csv("freyberg.par.usum.csv",index_col=0)
df_paru

# Hold the phone - only K is showing here.  Did you run PESTCHEK before burning the silicon? 

 (Remember last notebook where we said:  "Let's run PESTCHEK and see what it says about our freyberg.pst file"?)

In [None]:
os.system("{0} freyberg.pst".format(pestchek))

#### Well, the instructors gave you the same PEST control file as in the last exercise! Someone should tell them that it was "curious" in the last notebook but is vexing now, because we again see that the PESTCHEK warning section says "All parameters belonging to the parameter group "rch" are either fixed or tied". That is flagged as a warning because PESTCHEK is wondering (with good reason in this case) why a parameter would not be adjustable after you went to all the trouble of defining it. But there are times you may want to do this, so it is classified as a warning and isn't going to stop you.
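
A minimal sketch of the kind of check behind that warning, assuming the standard column order of the "* parameter data" section (PARNME, PARTRANS, PARCHGLIM, PARVAL1, PARLBND, PARUBND, PARGP, ...). The two parameter lines, their values, and the group name `hk` are invented for illustration; they are not copied from `freyberg.pst`, and the function name is hypothetical.

```python
# Hypothetical "* parameter data" lines (values invented for illustration)
parameter_lines = [
    "hk1   log    factor  5.0  0.5  50.0  hk   1.0  0.0  1",
    "rch1  fixed  factor  1.0  0.5  2.0   rch  1.0  0.0  1",
]

def non_adjustable_groups(lines):
    """Return parameter groups in which every parameter is fixed or tied."""
    transforms_by_group = {}
    for line in lines:
        fields = line.split()
        # second field is PARTRANS, seventh field is PARGP
        partrans, pargp = fields[1].lower(), fields[6].lower()
        transforms_by_group.setdefault(pargp, []).append(partrans)
    return sorted(
        group
        for group, transforms in transforms_by_group.items()
        if all(t in ("fixed", "tied") for t in transforms)
    )

print(non_adjustable_groups(parameter_lines))
```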

#### But that is not what we want: we want to make recharge an adjustable parameter in this activity and redo our work (did we remember to mention the importance of running PESTCHEK?)

### Open the PEST control file freyberg.pst in your text editor.  

1) Look in the parameter data section

2) Find the parameter __rch1__ (the recharge for the calibration period) and make it adjustable (hint:  look at the other parameters) 

3) Save the file

4) Run PESTCHEK on the PEST control file in a separate terminal window or by executing the next code block and looking at the terminal window where you launched this notebook

In [None]:
os.system("{0} freyberg.pst".format(pestchek))

### Now to redo our steps from above....look at the terminal window where you launched this notebook to see the progress of PEST++.  Advance through the code blocks when you see a 0 returned.

In [None]:
os.system("{0} freyberg.pst".format(ppp))

### Again, let's look at results

In [None]:
df_obj = pd.read_csv("freyberg.iobj",index_col=0)
df_obj

### Well that Phi is very different!  Funny what a little parameter flexibility will get you......

In [None]:
res = pt.Res('freyberg.rei')

res.describe_groups('head_cal')

In [None]:
res.describe_groups('flux_cal')

In [None]:
res.plot_one2one('head_cal',print_stats=['Mean', 'MAE', 'RMSE'])
res.plot_measured_vs_residual('head_cal')

In [None]:
df_paru = pd.read_csv("freyberg.par.usum.csv",index_col=0)
df_paru

### Much better - thanks PESTCHEK.  Now let's compare the parameter uncertainty results with the flux observation (above) to the previous run, where the flux observation was zero-weighted (below):

In [None]:
df_paru_base = pd.read_csv(os.path.join("..","freyberg_k_and_r","freyberg.par.usum.csv"),index_col=0)
df_paru_base

The posterior standard deviation got worse for HK1 with the flux observation than without it, but RCH1 has a smaller standard deviation with the flux observation.  Why?


###  Here's the parameter uncertainty for the K and R parameters, side by side, heads+flux observation vs heads only

In [None]:
df_paru_concat = pd.concat([df_paru,df_paru_base],join="outer",axis=1,keys=["heads+fluxobs","heads_only"])
df_paru_concat

Interesting - a tradeoff with fit between the two types of observations...


###  Let's plot these up like before.  Here's the prior and posterior standard deviations


In [None]:
for pname in df_paru_concat.index:
    ax = df_paru_concat.loc[pname,(slice(None),("prior_stdev","post_stdev"))].plot(kind="bar")
    ax.set_title(pname)
    plt.show()


### Let's look at our forecasts - here's the K and R model with the flux observations:

In [None]:
df_foreu = pd.read_csv("freyberg.pred.usum.csv",index_col=0)
df_foreu.loc[:,"reduction"] = 100.0 *  (1.0 - (df_foreu.post_stdev / df_foreu.prior_stdev))

df_foreu

### Compare these results with the ``k_and_r`` model *without* the flux observation (below):

In [None]:
df_foreu_single = pd.read_csv(os.path.join("..","freyberg_k_and_r","freyberg.pred.usum.csv"),index_col=0)
df_foreu_single.loc[:,"reduction"] = 100.0 *  (1.0 - (df_foreu_single.post_stdev / df_foreu_single.prior_stdev))
df_foreu_single

### And here the forecast uncertainties are side by side

In [None]:
df_foreu_concat = pd.concat([df_foreu,df_foreu_single],join="outer",axis=1,keys=["heads+fluxobs","heads_only"])
df_foreu_concat

### and plotted

In [None]:
df_foreu_concat.loc[:,(slice(None),"reduction")].plot(kind="bar",legend=False)

### The information in the flux obs has reduced ``rivflux_fore`` forecast uncertainty dramatically, but has not really helped with ``travel_time`` or heads.  So at first blush we see that the same model/observation data set can make some forecasts better but not others.
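
The `reduction` column used in the comparison above is the percentage by which history matching shrinks the forecast standard deviation. Here is a small worked example with made-up standard deviations (not taken from `freyberg.pred.usum.csv`); the forecast labels are hypothetical.

```python
def percent_reduction(prior_stdev, post_stdev):
    """Percent reduction in standard deviation from prior to posterior."""
    return 100.0 * (1.0 - post_stdev / prior_stdev)

# Hypothetical forecasts: (prior_stdev, post_stdev)
examples = {
    "well_informed": (400.0, 40.0),      # data shrink the stdev tenfold
    "barely_informed": (400.0, 380.0),   # data barely move it
}
for name, (prior, post) in examples.items():
    print(name, round(percent_reduction(prior, post), 1))
```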

### But there is more to it than that - think about which observation helped which parameter and which forecast the most.  Is there a "birds of a feather" type of thing going on?
