# Freyberg USG Model PEST setup example
This notebook shows how to use pyEMU to set up a groundwater model for use with PEST, in this case using the unstructured-grid (USG) version of MODFLOW. We will cover the following topics:
- set up pilot points as parameters, including first-order Tikhonov regularization
- set up other model inputs as parameters
- set up simulated water levels as observations
- set up simulated water budget components as observations (or forecasts)
- create a PEST control file and adjust observation weights to balance the objective function
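
To make the last topic concrete: balancing the objective function means rescaling observation weights so that no single observation group dominates the sum of squared weighted residuals. The following is a minimal sketch in plain pandas, not the pyEMU workflow itself. The group names, residuals, and the `balance_weights` helper are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical observation table: two groups with very different residual magnitudes.
obs = pd.DataFrame({
    "obgnme": ["head"] * 4 + ["flux"] * 2,
    "residual": [0.5, -1.0, 0.25, 2.0, 150.0, -300.0],
    "weight": 1.0,
})

def balance_weights(obs, target_phi=1.0):
    """Rescale weights so each group's sum of squared weighted residuals equals target_phi."""
    obs = obs.copy()
    contrib = (obs["weight"] * obs["residual"]) ** 2
    group_phi = contrib.groupby(obs["obgnme"]).transform("sum")
    # Groups whose contribution is zero keep their weights (avoids dividing by zero).
    safe_phi = group_phi.where(group_phi > 0, 1.0)
    scale = np.where(group_phi > 0, np.sqrt(target_phi / safe_phi), 1.0)
    obs["weight"] = obs["weight"] * scale
    return obs

balanced = balance_weights(obs, target_phi=10.0)
weighted_sq = (balanced["weight"] * balanced["residual"]) ** 2
group_phi_after = weighted_sq.groupby(balanced["obgnme"]).sum()
print(group_phi_after)
```

Because the contribution of a group scales with the square of its weights, multiplying the weights by `sqrt(target_phi / group_phi)` brings every group to the same target contribution.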

Note that, in addition to `pyemu`, this notebook relies on `flopy`. `flopy` can be obtained (along with installation instructions) at https://github.com/modflowpy/flopy.


In [1]:
%matplotlib inline
import os
import shutil
import platform
import numpy as np
import pandas as pd
from matplotlib.patches import Rectangle as rect
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore", 
    message="ModflowDis.sr is deprecated. use Modflow.sr")
from mpl_toolkits.axes_grid1 import make_axes_locatable
import matplotlib as mpl
newparams = {'legend.fontsize':10, 'axes.labelsize':10,
             'xtick.labelsize':10, 'ytick.labelsize':10,
             'font.family':'Univers 57 Condensed', 
             'pdf.fonttype':42}
plt.rcParams.update(newparams)
import pyemu

## Model background
This example is based on the synthetic classroom model of Freyberg (1988). The model is a 2-dimensional MODFLOW model with 3 layers, 40 rows, and 20 columns. The model has 2 stress periods: an initial steady-state stress period used for calibration, and a 5-year transient stress period. The calibration period uses the recharge and well flux of Freyberg (1988); the last stress period uses 25% less recharge and 25% more pumping to represent future conditions for a forecast period.

This model has been modified using Gridgen to include a quadtree mesh at the location of the river.

Freyberg, David L. "An Exercise in Ground-Water Model Calibration and Prediction." Groundwater 26.3 (1988): 350-360.

In [6]:
#load the existing model and save it in a new dir and make sure it runs
import flopy
model_ws = os.path.join("freyberg_usg")
ml = flopy.modflow.Modflow.load("freyberg.usg.nam",model_ws=model_ws,verbose=False,version='mfusg',forgive=True,check=False)
ml.exe_name = "mfusg"
ml.model_ws = "temp"
EXE_DIR = os.path.join("..","bin")
if "window" in platform.platform().lower():
    EXE_DIR = os.path.join(EXE_DIR,"win")
elif "darwin" in platform.platform().lower():
    EXE_DIR = os.path.join(EXE_DIR,"mac")
else:
    EXE_DIR = os.path.join(EXE_DIR,"linux")

# copy every file in the platform-specific bin directory into the new workspace
for f in os.listdir(EXE_DIR):
    shutil.copy2(os.path.join(EXE_DIR,f),os.path.join("temp",f))

ml.write_input()
ml.run_model()


creating model workspace...
   temp

changing model workspace...
   temp
FloPy is using the following  executable to run the model: C:\Users\rossk\Desktop\github\pyemu\bin\win\mfusg.EXE

                                  MODFLOW-USG      
    U.S. GEOLOGICAL SURVEY MODULAR FINITE-DIFFERENCE GROUNDWATER FLOW MODEL
                             Version 1.5.00 02/27/2019                       

 Using NAME file: freyberg.usg.nam 
 Run start date and time (yyyy/mm/dd hh:mm:ss): 2021/06/03 14:45:31

 Solving:  Stress period:     1    Time step:     1    Groundwater Flow Eqn.
 Solving:  Stress period:     2    Time step:     1    Groundwater Flow Eqn.
 Solving:  Stress period:     3    Time step:     1    Groundwater Flow Eqn.
 Solving:  Stress period:     4    Time step:     1    Groundwater Flow Eqn.
 Solving:  Stress period:     5    Time step:     1    Groundwater Flow Eqn.
 Solving:  Stress period:     6    Time step:     1    Groundwater Flow Eqn.
 Solving:  Stress period:     7    Tim

(True, [])

In [11]:
def hdobj2data(hdsobj): 
    # convert usg hdsobj to array of shape (nper, nnodes)
    hds = []
    kstpkpers = hdsobj.get_kstpkper()
    for kstpkper in kstpkpers:
        data = hdsobj.get_data(kstpkper=kstpkper)
        fdata = []
        for lay in range(len(data)):
            fdata += data[lay].tolist()
        hds.append(fdata)

    return np.array(hds)
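
To illustrate what this conversion produces, here is a small self-contained sketch that does not need a model run. `FakeHeadUFile` is a hypothetical stand-in for the head-file object and `flatten_heads` is an illustrative equivalent of `hdobj2data`. It assumes, as the loop in `hdobj2data` implies, that `get_data` returns one 1-D array per layer, and that layers may have different node counts.

```python
import numpy as np

class FakeHeadUFile:
    """Hypothetical stand-in: 2 output times, 2 layers with 3 and 2 nodes."""
    def __init__(self):
        self._data = {
            (0, 0): [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])],
            (0, 1): [np.array([1.5, 2.5, 3.5]), np.array([4.5, 5.5])],
        }

    def get_kstpkper(self):
        return list(self._data.keys())

    def get_data(self, kstpkper):
        return self._data[kstpkper]

def flatten_heads(hdsobj):
    # For each output time, join the per-layer arrays into one row of node values.
    rows = [np.concatenate(hdsobj.get_data(kstpkper=k))
            for k in hdsobj.get_kstpkper()]
    return np.array(rows)

heads = flatten_heads(FakeHeadUFile())
print(heads.shape)  # (2, 5)
```

Each row holds every node of the model for one output time, with layer 1 nodes first, so a node number can be used as a column index once it is shifted to zero-based.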

In [18]:
node_df = pd.read_csv(os.path.join("Freyberg","misc","obs_nodes.dat"),sep=r"\s+")
hdsobj = flopy.utils.HeadUFile(os.path.join(ml.model_ws,"freyberg.usg.hds"))
hds = hdobj2data(hdsobj)
nper,nnodes = hds.shape
(nper,nnodes)

(25, 4497)

In [36]:
data = []
for i, dfrow in node_df.iterrows():
    name, node = dfrow['name'], dfrow['node']
    
    r = np.random.randn(nper)  # one standard-normal noise value per row of hds
    for sp in range(nper):
        
        hd = hds[sp,node-1]  # node numbers are 1-based; array columns are 0-based
        rhd = r[sp] + hd  # noisy "observed" value = simulated head + noise
        
        data.append([hd,rhd,name,node,sp])


obs_df = pd.DataFrame(data,columns=['simval','obsval','name','node','sp'])
obs_df.to_csv(os.path.join(ml.model_ws,'obs.csv'),index=False)
obs_df

Unnamed: 0,simval,obsval,name,node,sp
0,34.552040,34.259154,obs_01,107,0
1,34.601082,33.559407,obs_01,107,1
2,34.646980,35.943502,obs_01,107,2
3,34.673031,33.873621,obs_01,107,3
4,34.666801,35.415490,obs_01,107,4
...,...,...,...,...,...
295,33.163342,33.991730,obs_13,1348,20
296,33.149437,32.615261,obs_13,1348,21
297,33.196201,33.521894,obs_13,1348,22
298,33.287243,33.565243,obs_13,1348,23
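
Because `np.random.randn` is unseeded here, the `obsval` column changes on every run. The sketch below shows one way to make the synthetic observations reproducible with a seeded generator, building the table one node at a time. The small `hds` array, the `node_df` contents, and the seed value are hypothetical stand-ins, not the notebook's data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen only for repeatability

# Hypothetical stand-ins for `hds` (nper x nnodes) and `node_df`.
nper, nnodes = 3, 10
hds = np.arange(nper * nnodes, dtype=float).reshape(nper, nnodes)
node_df = pd.DataFrame({"name": ["obs_01", "obs_02"], "node": [2, 7]})

frames = []
for name, node in zip(node_df["name"], node_df["node"]):
    simval = hds[:, node - 1]  # 1-based node number -> 0-based column
    noise = rng.normal(loc=0.0, scale=1.0, size=nper)
    frames.append(pd.DataFrame({
        "simval": simval,
        "obsval": simval + noise,  # simulated head plus unit-variance noise
        "name": name,
        "node": node,
        "sp": np.arange(nper),
    }))

obs_df = pd.concat(frames, ignore_index=True)
print(obs_df)
```

Slicing a whole column of `hds` per node replaces the inner loop over output times, and the fixed seed means the same noise is drawn each time the cell is run.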


# Parameters

## pilot points

Here we import pilot point locations from a CSV file.


In [39]:
pp_df = pd.read_csv(os.path.join("Freyberg","misc","pp_usg.csv"))
pp_df['xy'] = pp_df.apply(lambda i: (i['x'],i['y']),axis=1)
pp_df.head()

Unnamed: 0,name,x,y,xy
0,pp_0000,620116.4,3372795.9,"(620116.4, 3372795.9)"
1,pp_0001,620866.4,3372795.9,"(620866.4, 3372795.9)"
2,pp_0002,621616.4,3372795.9,"(621616.4, 3372795.9)"
3,pp_0003,622366.4,3372795.9,"(622366.4, 3372795.9)"
4,pp_0004,623116.4,3372795.9,"(623116.4, 3372795.9)"


### Use the GSF to make a Spatial Reference structure

In [45]:
gsf = pyemu.gw_utils.GsfReader(os.path.join(model_ws,"freyberg.usg.gsf"))
gsf_df = gsf.get_node_data()
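
For an unstructured grid, a spatial reference can be as simple as a lookup from node number to cell-center coordinates, kept separately for each layer. The sketch below builds such a lookup from a small hypothetical node table. The column names `node`, `layer`, `x`, and `y` are assumptions (check `gsf_df.columns` for the real ones), and the choice of zero-based keys is also an assumption that depends on what the downstream functions expect.

```python
import pandas as pd

# Hypothetical node table; the real one comes from the GSF file.
node_table = pd.DataFrame({
    "node":  [1, 2, 3, 4],
    "layer": [1, 1, 2, 2],
    "x": [0.0, 250.0, 0.0, 250.0],
    "y": [500.0, 500.0, 500.0, 500.0],
})

coords_by_layer = {}
for layer, df_lay in node_table.groupby("layer"):
    # Zero-based layer key -> {zero-based node number: (x, y)}
    coords_by_layer[int(layer) - 1] = {
        int(n) - 1: (float(x), float(y))
        for n, x, y in zip(df_lay["node"], df_lay["x"], df_lay["y"])
    }

print(coords_by_layer)
```

Keeping one dictionary per layer makes it possible to interpolate pilot points onto each layer's nodes independently, which matters when layers do not share the same node layout.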

### set up pilot point locations

First, specify which pilot point names we want to use for each model layer (counting from 0). Here we will set up pilot points for ``hk``, ``sy`` and ``rech``. The ``rech`` pilot points will be used as a single multiplier array for all stress periods to account for potential spatial bias in recharge.

In [41]:
nlay = ml.nlay

sr_dict_by_layer = {}
for lay in range(nlay):
    