<img src="AW&H2015.png" style="float: left">

<img src="flopylogo.png" style="float: center">

# History match the Freyberg model using a single ``K`` parameter

Freyberg (1988) is notable for discussing what is often called "point calibration" and "overfitting". The model is a heterogeneous 2D areal (1-layer) model that is a step up in complexity from our xsec model. Recall that it looks like this: the figure from the original Freyberg (1988) paper is on the left, and a Groundwater Vistas version is on the right (from the file in the GW_Vistas subdirectory).

<img src="Freburg1988_fig1.png" style="float: left">
<img src="Freyberg_k_plot_GW_Vistas.png" style="float: right">

As we discussed in our first notebook, one way to get to an optimal model is to start simple and add complexity. To do this right, we need to add our forecasts first along with the calibration targets, and look at both as more model complexity is added.  One of the findings of Freyberg (1988) is that some students made their models too complex, which diminished performance for the prediction of interest. Starting simple will keep us out of danger (at least for this notebook).

This version of the Freyberg model has 3 stress periods:  1 steady state -> 1 transient for 5 years -> 1 steady state

The river stage and node conductance change in each stress period (see .riv file)

The recharge starts out wetter in stress period one, then is drier in stress periods 2 and 3 (see .rch file)

Pumping well discharge changes each stress period (see .wel file)

In addition to heads and fluxes, MODPATH is also included in the PEST++ model run file to perform particle tracking after MODFLOW finishes

***********************************************
__So, the objectives of this notebook are to:__

1) Ease you into the Freyberg model as we'll be using it for the rest of the class

2) Revisit the PEST control (pst) file

3) Look at typical summary statistics and plots that describe our degree of fit

4) Look at how head data constraints ripple to different forecast types

## Standard two blocks to prep the notebook

In [None]:
%matplotlib inline
import os
import shutil
import pandas as pd
import matplotlib.pyplot as plt
import pyemu
import platform
import pestools as pt
if 'window' in platform.platform().lower():
    ppp = 'pest++'
    pestchek = 'pestchek'
    inschek = 'inschek'
    tempchek = 'tempchek'
else:
    ppp = './pestpp'
    pestchek = './pestchek'
    inschek = './inschek'
    tempchek = './tempchek'

In [None]:
base_dir = os.path.join("..","..","models","Freyberg","Freyberg_K")
assert os.path.exists(base_dir)
# Copy every model and PEST file into the working directory
for f in os.listdir(base_dir):
    shutil.copy2(os.path.join(base_dir, f), f)

pst = pyemu.Pst("freyberg.pst")
pst.control_data.noptmax = 0
pst.write("freyberg.pst")

## We've given you all the files you need to run PEST++ on this model.  Open up the .pst file.  To guide your eyes through the PEST control file, answer these questions:

1) How many parameters are we running? 

2) How many are adjustable? 

3) How many types of observations are included?

4) How many forecasts? What types?

5) How many template (tpl) files do we have?

6) How many instruction (ins) files do we have? 

## Now that you know the .ins and .tpl files, open them in a text editor (make sure you are looking at the ones in the /activities/freyberg_k subdirectory). In a separate terminal window, run TEMPCHEK, INSCHEK and PESTCHEK on the files we've given you.

### Okay, you've run the PEST utilities in a separate terminal window by now?  To speed things up we've given you a way to execute these utilities from within the Freyberg notebook.  We've included the equivalent version of what you just did in the next four code blocks.  Execute each code block, then look at the terminal window where you launched this notebook.

In [None]:
os.system("{0} freyberg.rch.tpl".format(tempchek))

In [None]:
os.system("{0} hk.ref.tpl".format(tempchek))

In [None]:
os.system("{0} freyberg.heads.ins freyberg.heads".format(inschek))

In [None]:
os.system("{0} freyberg.pst".format(pestchek))

# Okay, let's run this thing. 

## Because we call a program from within the Jupyter Notebook you have to look at the terminal window that you used to start the notebook to see the screen report of the run.  So, when executing this next block look at your terminal window to see the run.  It will say "Simulation complete..." when finished.

### NOTE:  Alternatively, wait until a "0" appears below this next block (it is returned when the run has finished) before going on.
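The "0" is the value returned by `os.system`, which is zero when the command exits successfully, and the screen report goes to the launching terminal because the notebook does not capture it. As a minimal sketch (not part of the original workflow), `subprocess.run` from the standard library can capture that report inside the notebook instead. A tiny Python child process stands in for the PEST++ executable here so the snippet runs anywhere; the helper name `run_and_capture` is hypothetical.

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run a command given as a list of strings; return (exit_code, captured_stdout)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout

# Stand-in command: a child Python process that prints a message and exits cleanly.
# In this notebook the command would instead be something like [ppp, "freyberg.pst"].
code, out = run_and_capture([sys.executable, "-c", "print('Simulation complete...')"])
print(code)         # exit code of the child process
print(out.strip())  # text the child process wrote to standard output
```

With this approach the run report can be inspected (or searched) as an ordinary Python string rather than read from the terminal.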

In [None]:
os.system("{0} freyberg.pst".format(ppp))

## ``PEST++`` only ran the model one time - why?

Yeah, that's right, the NOPTMAX=0 thing again.  We had that set to zero because we want to check the plumbing before burning the silicon. Did everything run (i.e., did you see "Simulation Complete..." in your terminal window)?  Like before, you *could* change NOPTMAX to 20 in a text editor.  But pyemu can do it for you with the next block.

In [None]:
pst = pyemu.Pst("freyberg.pst")
pst.control_data.noptmax = 20
pst.write("freyberg.pst")

#### "Trust but verify"....by running PESTCHEK

In [None]:
os.system("{0} freyberg.pst".format(pestchek))

### Now let's run it.  Just like before, you have to look at the terminal window that you used to start the notebook to see the screen report of the run.  So, when executing this next block look at your terminal window to see the run.  It will say "Simulation complete..." when finished.

Or wait until a "0" appears below this next block (it is returned when the run has finished) before going on.

In [None]:
os.system("{0} freyberg.pst".format(ppp))

### Let's explore the results

First let's look at the measurement objective function (Phi), which is calculated using the sum of squared weighted residuals.   

In [None]:
df_obj = pd.read_csv("freyberg.iobj",index_col=0)
df_obj

### Which is the only target group that matters?  How was that accomplished in the PEST control file?

For this problem, recall our objective function is calculated using this equation:


<img src="SOSWR_eq_AW&H2015.png" style="float: center">

where Phi is the "sum of squared weighted residuals" that we look to minimize, *whi* is the weight for the ith head observation; *hm* is the measured (observed) head target; *hs* is the simulated head; and n is the number of observations.
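To make the definition concrete, here is a minimal sketch of that sum in plain Python. The function name `weighted_ssr` and the three observation values are made up for illustration; they are not taken from the Freyberg files.

```python
def weighted_ssr(weights, measured, simulated):
    """Sum of squared weighted residuals: sum over i of (w_i * (hm_i - hs_i))**2."""
    return sum((w * (hm - hs)) ** 2
               for w, hm, hs in zip(weights, measured, simulated))

# Made-up values: three observations, the last one carrying a weight of zero
weights = [1.0, 1.0, 0.0]
measured = [10.0, 12.0, 5.0]
simulated = [9.0, 14.0, 100.0]

phi = weighted_ssr(weights, measured, simulated)
print(phi)  # the zero-weighted observation contributes nothing, however large its residual
```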

# Hey, we told PEST to try 20 parameter estimation upgrades but it stopped at 3!  What gives?!?

(hint: search the .rec file for OPTIMIZATION COMPLETE)
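If you would rather search from Python than from a text editor, a small helper like the one below does the job. This is only a sketch: `find_in_file` is a hypothetical helper, and the demo writes a made-up stand-in file so that the snippet runs on its own. In the notebook you would point it at the actual .rec file instead.

```python
import os
import tempfile

def find_in_file(path, needle):
    """Return (line_number, text) for every line containing needle, ignoring case."""
    hits = []
    with open(path, "r", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if needle.lower() in line.lower():
                hits.append((lineno, line.rstrip()))
    return hits

# Demo on a made-up stand-in file (its contents are NOT real PEST++ output)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.rec")
    with open(demo_path, "w") as f:
        f.write("iteration 1\nOPTIMIZATION COMPLETE\n(stand-in for the reported reason)\n")
    hits = find_in_file(demo_path, "optimization complete")

for lineno, text in hits:
    print(lineno, text)
```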

PEST and PEST++ will quit the parameter estimation process if one of these 4 conditions is met:

1) The maximum number of iterations specified in NOPTMAX is reached

2) The fit is no longer improving, based on a user-supplied closure criterion

3) The parameters are no longer changing, based on a user-supplied closure criterion

4) The user killed the run, usually with a ctrl-c  (happens quite frequently)
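The first three conditions can be sketched as a simple check on the history of Phi and of the parameter value. This is a deliberately simplified illustration, not the logic PEST or PEST++ actually implements: the function `should_stop`, its arguments, and the tolerance values are all hypothetical, and it assumes a single, nonzero parameter (like our one K).

```python
def should_stop(phi_history, par_history, noptmax,
                phi_tol=0.01, par_tol=0.01, n_stall=3):
    """Return the reason a stopping rule fired, or None to keep iterating.

    phi_history and par_history each hold one value per iteration,
    starting with the initial (iteration 0) value.
    """
    n_iter = len(phi_history) - 1
    # 1) Iteration limit reached
    if n_iter >= noptmax:
        return "noptmax reached"
    # 2) Phi has not improved by more than phi_tol (relative) over the last n_stall iterations
    if len(phi_history) > n_stall:
        recent = phi_history[-(n_stall + 1):]
        if recent[0] - min(recent[1:]) <= phi_tol * recent[0]:
            return "phi not improving"
    # 3) Relative parameter change stayed below par_tol over the last n_stall iterations
    if len(par_history) > n_stall:
        recent = par_history[-(n_stall + 1):]
        changes = [abs(b - a) / abs(a) for a, b in zip(recent, recent[1:])]
        if max(changes) <= par_tol:
            return "parameters not changing"
    return None

# Made-up history: a big improvement on the first iteration, then a plateau
reason = should_stop([100.0, 40.0, 39.9, 39.9, 39.9],
                     [5.0, 4.0, 4.5, 4.9, 4.95], noptmax=20)
print(reason)
```

The point of the sketch is that NOPTMAX is only an upper limit: a run that plateaus early ends early.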

##  Let's evaluate our fit using the observed-simulated residuals

In [None]:
res = pt.Res('freyberg.rei')

res.describe_groups('head_cal')

### These represent some *summary* statistics for the history matching. We only have one observation group (heads); if there were other observation types we'd have separate columns.  Note, however, that the model isn't considered *calibrated* until the parameters are verified to be reasonable.

### Let's plot measured against simulated values along with the 1:1 line (this plot belongs in almost every evaluation). The closer the points fall to the line, the better the fit.
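For reference, three of the most common summary statistics (mean residual, mean absolute error, and root mean squared error) can be computed by hand as in the sketch below. It assumes the convention residual = measured minus simulated and uses made-up head values; `residual_stats` is a hypothetical helper, and the exact definitions used inside pestools may differ (for example in how weights are handled).

```python
import math

def residual_stats(measured, simulated):
    """Mean residual, mean absolute error and RMSE, with residual = measured - simulated."""
    residuals = [m - s for m, s in zip(measured, simulated)]
    n = len(residuals)
    return {
        "Mean": sum(residuals) / n,                          # bias: systematic over- or under-simulation
        "MAE": sum(abs(r) for r in residuals) / n,           # typical size of the misfit
        "RMSE": math.sqrt(sum(r * r for r in residuals) / n) # like MAE, but penalizes large misfits more
    }

# Made-up heads for three observation wells
stats = residual_stats(measured=[10.0, 12.0, 14.0], simulated=[9.0, 13.0, 14.0])
print(stats)
```

A mean residual near zero alongside a large MAE or RMSE means the errors cancel rather than vanish, which is why the statistics are usually reported together.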

In [None]:
res.plot_one2one('head_cal',print_stats=['Mean', 'MAE', 'RMSE'])

### Not a bad fit!  Thanks PEST++.

### We can also look at the residual (y-axis) compared to the observation magnitude (x-axis).  The closer the circles are to the black line, the better the fit.  The mean residual is shown as a red line, and the pink zone spans 1 standard deviation on either side of the mean.

In [None]:
res.plot_measured_vs_residual('head_cal')

# Now let's look at what the calibration did for uncertainty

### First, let's look at the change in uncertainty for our one horizontal hydraulic conductivity (Kh) parameter

In [None]:
df_paru = pd.read_csv("freyberg.par.usum.csv")
df_paru

### NOTE: Because we log-transformed the Kh parameter, the uncertainty results are reported as base-10 logarithms in the dataframe above.  What you'll see in the MODFLOW input file is the non-log-transformed Kh value, which is 10^0.693406 ≈ 4.94
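The back-transformation is just a power of ten, as the sketch below shows. The mean is the log10 value quoted in the note above; the standard deviation of 0.1 log units and the helper `log10_interval_to_native` are made up for illustration and are not taken from the dataframe.

```python
def log10_interval_to_native(mean_log10, stdev_log10, n_sigma=2.0):
    """Convert a log10 mean and +/- n_sigma interval to native (untransformed) units."""
    lower = 10.0 ** (mean_log10 - n_sigma * stdev_log10)
    mid = 10.0 ** mean_log10
    upper = 10.0 ** (mean_log10 + n_sigma * stdev_log10)
    return lower, mid, upper

# Mean from the note above; the stdev is a hypothetical value for illustration only
lower, mid, upper = log10_interval_to_native(0.693406, 0.1)
print(lower, mid, upper)
```

Note that an interval that is symmetric in log space is not symmetric in native units: the upper bound lies further from the back-transformed mean than the lower bound does.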


# Now let's look at changes in model forecast uncertainty

In [None]:
df_predu = pd.read_csv("freyberg.pred.usum.csv",index_col=0)
df_predu

### Or maybe just easier to look at the graphs

In [None]:
for forecast in df_predu.index:
    ax = df_predu.loc[forecast,["prior_stdev","post_stdev"]].plot(kind="bar")
    ax.set_title(forecast)
    plt.show()

### And we can use summary statistics to quantify the reduction in forecast uncertainty

In [None]:
df_predu.loc[:,"percent_reduction"] = 100.0 * (1.0 - (df_predu.post_stdev / df_predu.prior_stdev))
df_predu.percent_reduction

### Does it make sense that the travel time uncertainty would be reduced in the same proportion as the head forecasts?

### We saw in the `xsec_response_surface` activity that if we only have head data we get a trough in the objective function and there is no best fit, but here we got a best fit.  How did we get away with this?