<img src="AW&H2015.png" style="float: left">

<img src="flopylogo.png" style="float: center">

# Setting up PEST:  Communicating with the model
Recall that, for all its sophistication, calibration by parameter estimation does the same things a modeler does in manual trial-and-error.  Here is a flow chart of manual trial-and-error history matching from *Applied Groundwater Modeling (2nd edition)* by Anderson et al. (2015): 

<img src="Fig9.1_manual_t&e.png" style="float: center">

### Compare this to the full overdetermined parameter estimation flow chart:


<img src="Fig9.9_full_overdetermined_PE_flowchart.png" style="float: center">

In this notebook we'll spend time on the steps needed to move beyond manual trial-and-error, so that the actions are automated once you set them up. PEST and PEST++ are called "universal" inverse codes because they can be bolted onto the outside of any model.  Well, any model they can talk to and run, which means the model needs to meet these two criteria:

 1. The model input and output are ASCII text files or can be converted to text files.
 2. The model can be run at the command line without user intervention (also known as "batch mode").
 
 
 ### In this exercise we will get under the hood and see how PEST communicates with the model.

To be more specific, in the blue box in Figure 9.9 above PEST performs certain steps before and after each forward model run.  They are exactly what a modeler does in a manual trial-and-error calibration, but PEST does them for you! Here are the steps that happen:

<img src="Fig9.8_PE_flowchart.png" style="float: center">

### Objectives:  

1) During this lesson we'll spend time on "the plumbing" that allows PEST to manipulate model input and output files - these are shown in the 1st and 3rd box in Figure 9.8.  

2) And - we'll run PEST++!

3) And we'll include a forecast in our PEST run

# Revisiting the xsec_trial_and_error model

## Recall that the profile model looks like this:

<img src="xsect_figure.png" style="float: center">

We want to run PEST so it does what you were doing by hand.  To do this we need to provide conduits that change a model input file and that extract model results after the forward run finishes. First we'll do some Python notebook prep (push shift-enter in the next code block)


In [1]:
import shutil, os


### First copy over the raw model files;  for this activity, we are providing a PEST control file and most of the files needed. 


In [2]:
base_dir = os.path.join("..","..","models","10par_xsec","raw_model_files")
[shutil.copy2(os.path.join(base_dir,f),f) for f in os.listdir(base_dir)]

['10par_xsec.bas',
 '10par_xsec.cbc',
 '10par_xsec.ddn',
 '10par_xsec.dis',
 '10par_xsec.hds',
 '10par_xsec.hds.init',
 '10par_xsec.hds.ins',
 '10par_xsec.list',
 '10par_xsec.lpf',
 '10par_xsec.nam',
 '10par_xsec.oc',
 '10par_xsec.pcg',
 '10par_xsec.wel',
 '10parxsec.wel',
 'botm_Layer_1.ref',
 'delc.ref',
 'delr.ref',
 'grid.csv',
 'hk_Layer_1.ref',
 'hk_Layer_1.ref.tpl',
 'ibound_Layer_1.ref',
 'inschek',
 'inschek.exe',
 'junk.out',
 'junk.out.ins',
 'mf2005',
 'mf2005.exe',
 'mfnwt',
 'model_top.ref',
 'mp6',
 'mp6.exe',
 'pest++.exe',
 'pestchek',
 'pestchek.exe',
 'pestpp',
 'runmodel.py',
 'simple.lpf.tpl',
 'strt_Layer_1.ref',
 'sweep',
 'sweep.exe',
 'tempchek',
 'tempchek.exe',
 'vka_Layer_1.ref',
 'xsect.pst',
 'xsect_simple.pst']

## Now we'll make *template (.TPL) and instruction (.INS)* files

# 1) Template files are used to create model input

### Template files simply replace parameter numerical values with a code variable, named in the PEST Control File
There needs to be one template file __for each model__ input file that has parameters that we want PEST to estimate. PEST will read in each template file, substitute its updated parameter values, then write one model input file for each TPL file it read.  In the PEST control file we list, on a separate line for each pair, each template file __and__ the associated model input file that we want PEST to create once it has updated the parameter estimates. So, say we had a MODFLOW input file named 'my_aquifer.lpf' for which we made a template file 'my_aquifer_lpf.tpl'. In the "model input/output" section of the PEST control file there will be a line containing this:

my_aquifer_lpf.tpl       my_aquifer.lpf

#### Open the PEST control file `xsect_simple.pst` in a text editor and find the template (tpl) file listed in the `* model input/output` section
#### Open that template file in a text editor 
### Rules for constructing TPL Files 

 1. The first line of the TPL file must identify that it is a template file by listing "`ptf ~`" where "`~`" is a "parameter delimiter" that tells PEST where a parameter sits in the file. We used a tilde here but it can be any symbol. __However__, whatever delimiter symbol is listed in the first line must be used consistently throughout that template file.
 2. The template file looks exactly like the original model input file __BUT__ parameters are substituted for the  model input(s) that we want PEST to estimate.  Parameters are identified by surrounding the parameter name listed in the PEST control (.pst) file with the delimiter.  For the "`~`" delimiter that we used above, and a horizontal K parameter named "`hk1`" listed as a parameter in a PEST .pst file, the template file would have "`~   hk1 ~`" __wherever that Kh value__ was listed in the original model input file. 
   * Note that the parameter name can be anywhere between the parameter delimiters
   * PEST will fill the space up to and including the parameter delimiters with a value, so make them as wide as possible for maximum precision
   
#### Example
"`~    hk2    ~`" will be replaced by the value for `hk2`. If that value is 3.14, PEST will write "`3.14000000000`" in its place.

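To make the substitution concrete, here is a minimal Python sketch of what filling one template line amounts to. It is an illustration only: `fill_template_line` is a hypothetical helper, it assumes simple alphanumeric parameter names and non-negative values, and the way PEST itself packs maximum precision into the field (described in the PEST manual) may differ in detail.

```python
import re

def fill_template_line(line, values, delim="~"):
    """Replace each delimited parameter field with a number of the same width."""
    pattern = re.compile(re.escape(delim) + r"\s*(\w+)\s*" + re.escape(delim))

    def substitute(match):
        width = len(match.group(0))              # field width includes both delimiters
        value = values[match.group(1)]           # look up the parameter by name
        decimals = max(width - len(str(int(value))) - 1, 0)
        text = "{0:.{1}f}".format(value, decimals)
        return text[:width].ljust(width)         # never write outside the field

    return pattern.sub(substitute, line)

filled = fill_template_line("~    hk2    ~", {"hk2": 3.14})
print(filled)   # 3.14000000000
```

The field `~    hk2    ~` is 13 characters wide, so the value is written with 13 characters. A narrower field would leave room for fewer digits, which is why wide fields are recommended.
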
#### The PEST manual explains more detail about how you can control the writing of model input (e.g. scientific notation, double precision, etc.); see http://www.pesthomepage.org/Downloads.php to get the most recent version of the manual.

### Checking a template file with the `TEMPCHEK` utility

Let's check to see if this template file is correct using TEMPCHEK.  TEMPCHEK is a handy PEST utility that allows us to error check our template files without having to do a full PEST run. You can see exactly which files TEMPCHEK expects, and in what order, by simply typing "TEMPCHEK" at the command line (`tempchek` for Windows and `./tempchek` for Mac).  You'll see:

```
TEMPCHEK Version 14.01. Watermark Numerical Computing.

TEMPCHEK is run using the command:

   tempchek tempfile [modfile [parfile]]

where

   "tempfile" is a PEST template file,
   "modfile" is an [optional] model input file to be written by TEMPCHEK, and
   "parfile" is an [optional] parameter value file.

```

#### Run `TEMPCHEK` __on the template file listed in  `xsect_simple.pst`__ and open the associated output file listed.  The TEMPCHEK output file is useful when you have many parameters in a template file. 

### Easiest way to make a template file? Modify an existing input file. Let's modify horizontal hydraulic conductivity (HK) in a new version of the XSECT model

 1. Look inside __xsect.pst__ to see which forward model is to be run. 
 2. It's MODFLOW, so read the NAM file to find whether LPF, BCF, or UPW is being used
 3. Open LPF (in this case) in a text editor to find if there are external files being used
 4. Make a copy of the input file that contains HK and name it the same as the input file but with .tpl extension
 5. Add a new line on top of your tpl file to tell PEST that it is a template file and what the delimiter is
 6. Substitute the variable hk1 surrounded by the delimiter you chose where appropriate
 7. Save the file and run TEMPCHEK 
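
Steps 4 to 6 can also be scripted. The sketch below is one possible approach, not part of the course materials: `make_template` is a hypothetical helper and `example_input` is a made-up stand-in for a model input file. It uses a plain text replacement, so it assumes the value being replaced does not also occur inside other numbers in the file.

```python
def make_template(input_text, value_text, par_name, delim="~", width=12):
    """Return template text: a 'ptf' header line plus the input text with
    every occurrence of value_text swapped for a delimited parameter field."""
    inner = par_name.center(width - 2)           # pad the name to widen the field
    field = "{0}{1}{0}".format(delim, inner)
    return "ptf {0}\n".format(delim) + input_text.replace(value_text, field)

example_input = "2.5 2.5 2.5\n"                  # made-up file contents
template = make_template(example_input, "2.5", "hk1")
print(template)
```

Writing `template` to a file with a `.tpl` extension and then running TEMPCHEK on it (step 7) confirms whether PEST can read the result.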
 

# 2) Instruction files extract results from model output
Similar to the template files, the names of the instruction files and the model output files they work on are listed after all the template files in the `* model input/output` section of the PEST control file.  As you might expect given the wide variation in model output files, creating instruction files is slightly more complicated than creating template files. There is essentially an internal scripting language for reading text files of model output, extracting the output of interest, and providing it directly to PEST.

#### Open the PEST control file `xsect_simple.pst` in a text editor and find the Instruction (ins) file listed in the `model input/output` section
#### Open that Instruction file in a text editor 

### Rules for INS Files 

 * The first line on an .ins file must be "`pif ~`" where "`~`" is a "marker delimiter"--a symbol that can be used to identify text to search for.  It is expected on this first line but it's not always used.
 * The scripting options are extensive but particular. Some options on how to navigate to the numerical data you want to read are:
   1. Using a line advance.  Given that PEST will start at the first line of the model output file, you can tell PEST to move down the file _`n`_ lines using the `l` character (=lowercase letter l) with a number.  So "`l1`" moves down one line, "`l3`" moves down 3 lines.  
   2. Using the marker delimiter, the INS file can search through a file until it finds a "primary marker". For example:  
   "`~VOLUMETRIC BUDGET FOR ENTIRE MODEL~`" can be used to search for budgets in a LST file  
   This is particularly well suited for output files (like a LST file) that have unpredictable lengths.  Note though that PEST will always start at the top of the file and go down, never up and never wrapping once it reaches the end.  This can be a problem when the order of some observations with respect to other observations is not consistent (e.g., some MODPATH output).  When searching for multiple observations that may vary in order in an output file, it is easiest to have multiple instruction files open the same model output file multiple times so you are always starting at the top of the file (PEST does not mind). 
   3. Next, you can search for a "secondary marker" within a line using the marker delimiter again. This navigates from the left of the line until the secondary marker is found.
   4. Once on the line you can specify which columns on a line to extract.  So a line in an instruction file that says `~101  138~ (depth_T2-90)46:58` means that PEST will look for `101  138` in the model output file (with the exact number of spaces between the two numbers) then extract columns 46:58 to determine the model output that equates to the target observation `depth_T2-90` that is listed in the PEST control file.   
   5. Finally, you can read in whitespace-delimited numerical data using "`!`" around the observation name:  
   for example, if the output file is:  
   ```
   Output file from run 5
   Run date: 1/1/2012
   Run time: 24.3 hours
   Converged in 350 iterations
   Head Observations:
   H1=33.345 H2=45.34
   ...
   ```  
   The instruction file would be like 
   ```
   pif ~
    ~Head Observations~
    l1 ~H1=~ !h1val! ~H2=~ !h2val!
   ```
   
 These are only a few of the most commonly used options but more options, and more detail on these, are available in the PEST manual.  
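
To see what the example instruction file above asks PEST to do, here is a rough Python equivalent of its three moves: primary marker, line advance, then secondary markers with whitespace-delimited reads. It is a sketch for intuition only; `read_head_observations` is a hypothetical function and does not reproduce PEST's actual parser.

```python
def read_head_observations(output_text):
    """Mimic:  ~Head Observations~  then  l1 ~H1=~ !h1val! ~H2=~ !h2val!"""
    lines = output_text.splitlines()
    # Primary marker: scan down from the top until a line contains the text.
    index = next(i for i, line in enumerate(lines) if "Head Observations" in line)
    # l1: advance one line.
    line = lines[index + 1]
    observations = {}
    position = 0
    for marker, name in (("H1=", "h1val"), ("H2=", "h2val")):
        # Secondary marker: search to the right of the current position.
        position = line.index(marker, position) + len(marker)
        token = line[position:].split()[0]       # whitespace-delimited read
        observations[name] = float(token)
        position += len(token)
    return observations

model_output = """Output file from run 5
Run date: 1/1/2012
Run time: 24.3 hours
Converged in 350 iterations
Head Observations:
H1=33.345 H2=45.34
"""
print(read_head_observations(model_output))   # {'h1val': 33.345, 'h2val': 45.34}
```

Like PEST, the function only ever moves down the file and to the right along a line, which is why the order of markers in an instruction file matters.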
   
   Let's check an instruction file using `INSCHEK`, a handy utility that allows us to check our instruction files without having to do a full PEST run. We've given you a file called `junk.out.ins` and an accompanying output file called `junk.out`. You can see what INSCHEK is looking for by simply typing "INSCHEK" at the command line.  You'll see: 
   

```
INSCHEK Version 14.01. Watermark Numerical Computing.

INSCHEK is run using the command:

    INSCHEK insfile [modfile]

where

    "insfile" is a PEST instruction file, and
    "modfile" is an [optional] model output file to be read by INSCHEK.


```

#### Run INSCHEK: 1) without the optional model output file/ look at output; and 2) with the optional model output file/look at output.  

(Note:  yes the author of PEST John Doherty knows how to spell! He could have made it INSCHECK but chose to be consistent across all his checking programs and for some, like TEMPCHEK above, proper spelling would not fit in the 8.3 filename format required at the time.  The good news is you only have one to remember - just think CHEK.)

## Good to know: The simplest INS file

In some cases, the output is orderly and easy to read (or you make it so by preprocessing -  e.g., what the Groundwater Vistas targpest.exe utility does). If it's all numeric, life is easy!

For a file with 5 head values like:  
```
1.1  1.2  1.3345 2e-6 5
```

The `INS` file would simply be:  
```
pif ~
l1 !h1! !h2! !h3! !h4! !h5!
```

# Your turn! Look at `10par_xsec.hds` and make an INS file to read it

 1. Using a text editor, determine how to navigate the head output file for the XSECT model
 2. Consult a) the observation section of the file `xsect.pst` to see what you should name the observations in the INS file; and b) the model input/output section to see what to name the instruction file.
 3. Note that line 1 of the file has observations `h01_1`, `h01_2`, ..., `h01_10` and line 2 has observations `h02_1`, `h02_2`, ..., `h02_10` (__hint:__ you can take advantage of the "simplest INS file" approach)
 4. Run `INSCHEK` to be sure all is well (i.e., the numbers extracted match those you see in the `10par_xsec.hds` file).

# Now that we have all this, let's check the plumbing by running PEST++!

Just like TEMPCHEK and INSCHEK, we also have a handy utility that we run on our PEST setup before pulling the trigger. __(note: never never never run PEST without running PESTCHEK first!!!)__ Just like TEMPCHEK and INSCHEK, you can see what PESTCHEK is looking for by simply typing `pestchek` (Windows) or `./pestchek` (Mac) at the command line.  If you did that you would see that we have to put this on the command line to check our PEST setup: __`pestchek xsect.pst`__ 

If errors are indicated, PEST won't run so we have to correct them. Warnings, on the other hand, highlight potentially good information about what you have specified in the control file but they don't require a change to run. However, the warnings may guide your eyes to things you are not intending so always read them too.

If no errors, run __`pest++ xsect.pst`__ (if Windows) or __`./pestpp xsect.pst`__ (if Mac).
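
If you would rather stay in the notebook than switch to a terminal, the same two commands can be launched from Python. This is a minimal sketch under some assumptions: `build_command` and `run_tool` are hypothetical helpers, and the executables are assumed to sit in the working directory under the names used above.

```python
import platform
import subprocess

def build_command(windows_name, other_name, control_file, system=None):
    """Return the command list for one program, e.g. ['./pestpp', 'xsect.pst']."""
    system = system or platform.system()
    executable = windows_name if system == "Windows" else "./" + other_name
    return [executable, control_file]

def run_tool(windows_name, other_name, control_file):
    """Run one program and print whatever it wrote to the screen."""
    result = subprocess.run(build_command(windows_name, other_name, control_file),
                            capture_output=True, text=True)
    print(result.stdout)
    return result

# In a notebook cell:
# run_tool("pestchek", "pestchek", "xsect.pst")   # read the report first
# run_tool("pest++", "pestpp", "xsect.pst")       # only if no errors were reported
```

The sketch does not try to detect errors automatically; reading the PESTCHEK report yourself is still the step that matters.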

MAC USERS: Remind Randy that he has a point about cross platform execution from PEST to make

### Look at the main PEST output, the `.rec` file, to see results. What other files were made and what's in them?

As mentioned in the PESTCHEK warning, the control file we gave you has `NOPTMAX=0`, which means the model is run only once, and then PEST++ processes all the output and reports the objective function phi. So, nothing too exciting with only one run.  However, we __always__ run with `NOPTMAX=0` first to "test the plumbing" of the template and instruction files, and to see if we like the contribution of observation groups to the total objective function. If we don't like the objective function distribution we can reweight, then re-run PEST++ with `NOPTMAX=0` again.  


# Finally - let's get a best fit for this simple cross-section problem
Now __change `NOPTMAX` to a value = 20__ (`NOPTMAX` is the first number listed in the 9th line of the PEST control file).  You can see its location below, taken from Appendix 1 from SIR 2010-5169 we handed out:

<img src="2010-5169_Appendix1_PST_file.png" style="float: center">

The full listing of the PEST control file and a description of each variable is in Appendix 1. __*However, most of these you never will need to touch - the defaults are fine!*__  NOPTMAX is one that you will routinely touch. 
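
Editing `NOPTMAX` by hand is quick, but the change can also be scripted. The sketch below relies only on what is stated above, that `NOPTMAX` is the first entry on the 9th line of the control file. `set_noptmax` is a hypothetical helper, and `placeholder` is made-up text standing in for a real control file. It rewrites line 9 with two spaces between entries, on the assumption that entries on that line only need to be separated by whitespace.

```python
def set_noptmax(control_text, noptmax):
    """Return control-file text with the first entry on line 9 replaced."""
    lines = control_text.splitlines()
    entries = lines[8].split()                   # 9th line of the file (index 8)
    entries[0] = str(noptmax)
    lines[8] = "  ".join(entries)
    return "\n".join(lines) + "\n"

# Made-up text: only the position of line 9 matters for this demonstration.
placeholder = "\n".join(["line {0}".format(i) for i in range(1, 9)]
                        + ["0  0.005  4  4  0.005  4", "line 10"]) + "\n"
updated = set_noptmax(placeholder, 20)
print(updated.splitlines()[8])   # 20  0.005  4  4  0.005  4

# With the real file:
# with open("xsect.pst") as f:
#     text = f.read()
# with open("xsect.pst", "w") as f:
#     f.write(set_noptmax(text, 20))
```

Whichever way you make the change, run PESTCHEK afterwards to confirm the control file is still valid.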


Now run PESTCHEK again - note that the NOPTMAX=0 warning is now gone.  

#### If no errors, run PEST++ again.

This will run parameter estimation on the model and will max out at 20 parameter upgrades. You may have figured out by now, but NOPTMAX stands for __N__umber of __OPT__imization iterations __MAX__imum --cryptic variable names were the price one had to pay when computer RAM was exceedingly small! 

### Look at the PEST `.rec` file - what changed? What new output files were created that weren't there with NOPTMAX=0?

# Define a forecast to invoke forecast uncertainty information in PEST++

Recall that in your xsec_trial_and_error activity you focused on a forecast, namely the head in node 8 during the second stress period:

<img src="xsect_problem_fig1.png" style="float: center">



If there is __*one thing*__ we want you to take away from this class it is this:  for most models there is a forecast/prediction that someone needs; rather than waiting until the end of the project, the forecast should be entered into your thinking and workflow *right at the beginning*. 

#### PEST++ made this very easy - for our model here simply add this line at the bottom of the PEST control `.pst` file: 

__`++forecasts(h02_8)`__  

The `++` at the beginning means this input will only be seen by PEST++; if you use PEST it will be ignored. The h02_8 refers to the 8th node/2nd stress period head observation that is listed in the PEST pst control file. (As you might expect, we can only identify a model output as a forecast *if it is also listed* in the observation section of the PEST control file - we can do this even if we don't have measured values for the forecasts! Stay tuned!).
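
Adding the line in a text editor is all that is needed, but for completeness here is a small Python sketch that appends it. `add_forecast` is a hypothetical helper and `placeholder` is made-up text, not the real control file. The function leaves the text unchanged if the same line is already present.

```python
def add_forecast(control_text, observation_name):
    """Append a ++forecasts() line for one observation if it is not already there."""
    line = "++forecasts({0})".format(observation_name)
    if line in control_text:
        return control_text
    return control_text.rstrip("\n") + "\n" + line + "\n"

placeholder = "pcf\n* control data\n"            # made-up stand-in for xsect.pst
with_forecast = add_forecast(placeholder, "h02_8")
print(with_forecast.splitlines()[-1])   # ++forecasts(h02_8)
```

Because the observation name must already appear in the observation section of the control file, it is worth checking the spelling of `h02_8` against that section before running PEST++.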

Now run PEST++ on __`xsect.pst`__ again and look at the forecast uncertainty output reported at the end of the `.rec` file.


# LAST THING IN THIS LESSON: Open the rec file and write down the forecast uncertainty for h02_8

(we'll be using it later)

__Note:__ other PEST++ specific input options can be found at https://github.com/dwelter/pestpp.  During this course we will touch on the most commonly used ones. 

# ADVANCED

### Define other hydraulic conductivity values as parameters - how did forecast uncertainty change?


# An ADVANCED aside for non-GUI people: What if we didn't already have a PEST Control File?

# We could make one from ASCII file parts. 
Here we are using capabilities of `pyemu` to read a set of template and instruction files and, from that, to create a populated PEST control file.  

__Note:__  This uses MODFLOW input and output files so you should have a complete MODFLOW forward run in the directory before running.

In [6]:
import os
import numpy as np
import pyemu

# only K
# read the initial heads as a 1-D array of 20 values, one per observation
observation_data = np.loadtxt("10par_xsec.hds.init").reshape(20)
observation_wghts = np.zeros_like(observation_data)
# only set nonzero weights for the observations at (zero-based) positions 3 and 5
observation_wghts[[3,5]] = 1.0
pst = pyemu.pst_utils.pst_from_io_files(["hk_Layer_1.ref.tpl"],["hk_Layer_1.ref"],
                                        ["10par_xsec.hds.ins"],["10par_xsec.hds"])
# stress period 1 heads are calibration targets; the rest are forecasts
pst.observation_data.obgnme = ['cal' if '01' in i else 'fore' for i in pst.observation_data.obsnme ]
# tag the groups of the two weighted observations with '_use'
pst.observation_data.obgnme = [i + '_use' if j in (3,5) else i
                       for j,i in enumerate(pst.observation_data.obgnme) ]
pst.observation_data.obsval = observation_data
pst.observation_data.weight = observation_wghts
pst.model_command = ['python runmodel.py']
pst.prior_information = pst.null_prior
pst.control_data.pestmode = "estimation"
pst.write("k.pst")


In [7]:
# create the runmodel.py script that will run MODFLOW with the correct namefile
with open(('runmodel.py'), 'w') as ofp:
    ofp.write("import os\n")
    ofp.write("import platform\n")
    ofp.write("if 'window' in platform.platform().lower():\n")
    ofp.write("    pref = ''\n")
    ofp.write("else:\n")
    ofp.write("    pref = './'\n")
    ofp.write("os.system('{0}mf2005 10par_xsec.nam'.format(pref))\n")
