# Run PESTPP-OPT 

In this notebook, we will run pestpp-opt using the results of the pestpp-ies run.

In [None]:
%matplotlib inline
import os
import shutil
import numpy as np
import pandas as pd
import pyemu
import matplotlib.pyplot as plt
import helpers

In [None]:
_ = helpers.get_domain_map()

Set some important variables for this notebook:

In [None]:
num_workers = 10
master_d = "master_opt"

The existing pestpp-ies master directory, which we need in order to get the calibrated parameter values:

In [None]:
ies_d = "master_ies"
assert os.path.exists(ies_d)

Define the working directory where the opt-modified control file will be written:

In [None]:
working_d = "model_and_pest_files_opt"
if os.path.exists(working_d):
    shutil.rmtree(working_d)
shutil.copytree(ies_d, working_d)

Load the control file from the working directory:

In [None]:
assert pst.nobs > pst.nnz_obs
assert pst.nnz_obs > 0

Get a reference to the parameter data and call it "par"

Get the posterior parameter ensemble and look for the "base" realization

In [None]:
post_iter = pst.ies.phiactual.iteration.max()

Get the posterior parameter ensemble from the ies run and call it "pe"

If the "base" realization exists in the posterior parameter ensemble (hopefully!), assign its values to the `parval1` column in the control file

In [None]:
if "base" not in pe.index:
    print("no base realization, using uncalibrated model")
else:
    par.loc[pe.columns, "parval1"] = pe.loc["base", :].round(6).values
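The assignment above relies on label alignment: the ensemble columns are parameter names, so they can be used directly as row labels in the parameter data. A minimal sketch with made-up stand-ins for `par` and `pe` (the parameter names and values here are hypothetical, not from the real control file):

```python
import pandas as pd

# Hypothetical stand-in for pst.parameter_data, indexed by parameter name
par = pd.DataFrame({"parval1": [1.0, 1.0, 1.0]}, index=["hk1", "hk2", "wel1"])

# Hypothetical stand-in for the posterior ensemble: rows are realizations,
# columns are the adjustable parameters only
pe = pd.DataFrame(
    {"hk1": [2.1234567, 3.0], "hk2": [0.5, 0.7]}, index=["base", "0"]
)

if "base" not in pe.index:
    print("no base realization, using uncalibrated model")
else:
    # only the parameters present in the ensemble are overwritten
    par.loc[pe.columns, "parval1"] = pe.loc["base", :].round(6).values

print(par)
```

Parameters that are not in the ensemble (here `wel1`) keep their existing `parval1`.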

Check that the `parval1` values have been updated:

### Setting up the decision variable
Select the block of "par" that has "wel" in the `parnme` column, copy it and call it "wpar"


In [None]:
assert wpar.shape[0] > 0
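As a rough illustration of this kind of selection, here is a sketch on a toy table (the parameter names are invented for the example). Taking a `.copy()` means later edits to the subset do not touch the original frame:

```python
import pandas as pd

# Hypothetical parameter table; these names are made up for illustration
names = ["hk_1", "wel_kper0", "wel_pred_kper13"]
par = pd.DataFrame({"parnme": names, "parval1": [1.0, 1.0, 1.0]}, index=names)

# boolean mask on the name column, then copy the selected rows
wpar = par.loc[par.parnme.str.contains("wel"), :].copy()

# changing the copy leaves the original untouched
wpar.loc["wel_kper0", "parval1"] = 99.0
print(wpar)
print(par)
```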

Cast the `kper` column to "int", then make sure we are only adjusting wel pars that correspond to the future period

In [None]:
wpar["kper"] = wpar.kper.astype(int)
start_pred_kper = wpar.loc[wpar.parnme.str.contains("pred"), "kper"].min()
pred_wpar = wpar.loc[wpar.kper >= start_pred_kper, :]
assert pred_wpar.shape[0] > 0
print(pred_wpar.shape[0], "decision variables")
pred_wpar.index.to_list()

Mark these well parameters as "none" transformed (the "partrans" column), set their parameter group name to "decvar" (the "pargp" column), and set their upper and lower bounds (the "parubnd" and "parlbnd" columns) to 1.5 and 0.0:


In [None]:
par.loc[pred_wpar.parnme, "partrans"] = "none"
par.loc[pred_wpar.parnme, "pargp"] = "decvar"
par.loc[pred_wpar.parnme, "parubnd"] = 1.5
par.loc[pred_wpar.parnme, "parlbnd"] = 0.0

Modify their derivative calculation quantities to make sure we are using large enough perturbation increments, and set the initial rates of the predictive wells to zero:

In [None]:
predwel_wpar = wpar.loc[wpar.parnme.str.contains("pred"), "parnme"]
assert len(predwel_wpar) > 0

par.loc[predwel_wpar, "parval1"] = 0.0  # set the pred well rates to zero initially

pst.rectify_pgroups()
pst.parameter_groups.loc["decvar", "derinc"] = 0.2
pst.parameter_groups.loc["decvar", "inctyp"] = "absolute"
pst.pestpp_options["opt_dec_var_groups"] = "decvar"

Now modify the observation data section: set all obs to zero weight, then find constraints and set them accordingly.  Get a reference to the observation data called "obs" and set all weights in obs to zero:

In [None]:
assert obs.weight.sum() == 0

Find any observation names that have "forecast", "diff" and "riv-swgw" in them.  There should be exactly 1 of these

In [None]:
# find the river sw-gw exchange difference observation; we prefer it to be 0 so that historical and future are equal
fobs = obs.loc[
    (obs.obsnme.str.contains("forecast"))
    & (obs.obsnme.str.contains("diff"))
    & (obs.obsnme.str.contains("riv-swgw")),
    :,
]
assert fobs.shape[0] == 1

What is the value for this quantity in the calibrated model?

In [None]:
cal_diff = pst.ies.get("obsen", post_iter).loc["base", fobs.obsnme]
cal_diff

Set the observation in "fobs" to have a weight of 1.0, set the `obgnme` to "greater_than" and the `obsval` to 0.0.  This tells pestpp-opt that we want the future long-term average swgw exchange to be at least the same amount as the historic equivalent.

In [None]:
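# fobs.obsnme is a Series, so each .loc lookup returns a Series; reduce with .all()
assert (obs.loc[fobs.obsnme, "weight"] == 1).all()
assert (obs.loc[fobs.obsnme, "obgnme"] == "greater_than").all()
assert (obs.loc[fobs.obsnme, "obsval"] == 0.0).all()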
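The weight and constraint steps above can be sketched on a toy observation table. This is only an illustration: the observation names and values are hypothetical, and the frame is a plain pandas stand-in for `pst.observation_data`:

```python
import pandas as pd

# Hypothetical stand-in for pst.observation_data
names = ["hds_1", "forecast_diff_riv-swgw", "forecast_pred_wel"]
obs = pd.DataFrame(
    {
        "obsnme": names,
        "obsval": [10.0, -3.2, 55.0],
        "weight": [1.0, 2.0, 0.5],
        "obgnme": ["hds", "flux", "flux"],
    },
    index=names,
)

# step 1: zero-weight everything so nothing acts as a constraint
obs.loc[:, "weight"] = 0.0

# step 2: find the constraint by name fragments
is_con = (
    obs.obsnme.str.contains("forecast")
    & obs.obsnme.str.contains("diff")
    & obs.obsnme.str.contains("riv-swgw")
)
con_names = obs.loc[is_con, "obsnme"]

# step 3: a nonzero weight plus a "greater_than" group marks it as a constraint
obs.loc[con_names, "weight"] = 1.0
obs.loc[con_names, "obgnme"] = "greater_than"
obs.loc[con_names, "obsval"] = 0.0
print(obs)
```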

Find the observation that holds the sum of future gw pumping - this will be our objective function that we want to maximize.  The observation should have "forecast", "pred" and "wel" in the name.  There should be only one such observation.  Assign its name to a variable called `obj_name`

In [None]:
wobs = obs.loc[
    (obs.obsnme.str.contains("forecast"))
    & (obs.obsnme.str.contains("pred"))
    & (obs.obsnme.str.contains("wel")),
    :,
]
assert len(wobs) == 1
# obs.loc[wobs.obsnme,"weight"] = 1.0
# obs.loc[wobs.obsnme,"obgnme"] = "greater_than"
obj_name = wobs.obsnme.values[0]
obj_name

We also need to make sure that during each future stress period, we are meeting the long-term historic production rates.  
First find all observations with "bud" and "pwell--out" in the name:

In [None]:
wfobs = obs.loc[
    (obs.obsnme.str.contains("bud")) & (obs.obsnme.str.contains("pwell--out")), :
].copy()
wfobs.index.to_list()

Now cast the `datetime` column in "wfobs" using `pd.to_datetime()`

In [None]:
wfobs["datetime"] = pd.to_datetime(wfobs.datetime)

Now split these well-flux obs into historic and predictive/future by the year 2015

In [None]:
hist_wfobs = wfobs.loc[wfobs.datetime.dt.year < 2015, :]
pred_wfobs = wfobs.loc[wfobs.datetime.dt.year >= 2015, :]
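For reference, the datetime cast and the year-based split can be tried on a small made-up table first (the observation names and dates below are hypothetical):

```python
import pandas as pd

# Hypothetical well-flux observations with string dates
wfobs = pd.DataFrame(
    {"datetime": ["2014-12-31", "2015-01-31", "2016-06-30"]},
    index=["bud_a", "bud_b", "bud_c"],
)

# strings -> Timestamps, which exposes the .dt accessor
wfobs["datetime"] = pd.to_datetime(wfobs.datetime)

# everything before 2015 is historic, the rest is predictive
hist = wfobs.loc[wfobs.datetime.dt.year < 2015, :]
pred = wfobs.loc[wfobs.datetime.dt.year >= 2015, :]
print(hist.index.to_list(), pred.index.to_list())
```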

What was the historic maximum gw production rate (using the posterior ensemble base realization)?

In [None]:
hist_max = pst.ies.get("obsen", post_iter).loc["base", hist_wfobs.obsnme].max()
hist_max

In [None]:
pred_wfobs.index.to_list()

For the future/predictive well flux observations, set the `weight` to 1.0, the `obsval` to 90% of the historic max (assuming some future water conservation) and the `obgnme` to "greater_than"

In [None]:
obs.loc[pred_wfobs.obsnme, "weight"] = 1.0
obs.loc[pred_wfobs.obsnme, "obsval"] = hist_max * 0.9
obs.loc[pred_wfobs.obsnme, "obgnme"] = "greater_than"

Set `noptmax` to 1 and identify the objective function via the "opt_objective_function" argument.  Set the "opt_direction" to "max", telling pestpp-opt to maximize the future groundwater production

In [None]:
pst.control_data.noptmax = 1
pst.pestpp_options["opt_objective_function"] = obj_name
pst.pestpp_options["opt_direction"] = "max"

Save the control file

In [None]:
pst.write(os.path.join(working_d, "pest.pst"), version=2)

Run pestpp-opt using the `pyemu.os_utils.start_workers()` function:

In [None]:
if os.path.exists(master_d):
    shutil.rmtree(master_d)
os.makedirs(master_d)
pyemu.os_utils.start_workers(
    working_d,
    "pestpp-opt",
    "pest.pst",
    num_workers=num_workers,
    worker_root=master_d,
    master_dir=master_d,
)

# Post-processing PESTPP-OPT

Check the final and initial objective function value:

In [None]:
opt_phi_value = pd.read_csv(
    os.path.join(master_d, "pest.slp.iobj.csv"), index_col=0
).values[0][0]
hist_max, opt_phi_value

So we are able to extract more gw while also keeping the river sw-gw exchange from changing into the future - nice!

Plot the optimal decision variable values.  First read the optimal decision variables stored in the "pest.1.par" parameter value file into a dataframe called "opt_par_vals" using `pyemu.pst_utils.read_parfile()`:

In [None]:
ax = opt_par_vals.loc[wpar.parnme, "parval1"].plot(kind="bar", figsize=(10, 10))
ax.grid()

## Investigating the response matrix

Load the response matrix ("pest.1.jcb") using `pyemu.Matrix.from_binary()` into a variable called "rm" and then cast it to a dataframe using the `to_dataframe()` method:

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
cf = np.abs(rm.loc[pst.nnz_obs_names[0], :].values) / np.abs(rm.loc[obj_name, :].values)
ax.bar(np.arange(cf.shape[0]), cf)
ax.set_xticks(np.arange(cf.shape[0]))
_ = ax.set_xticklabels(rm.columns, rotation=90)
ax.set_title(" long-term difference sw-gw capture fraction")
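The quantity plotted above is a ratio of two rows of the response matrix: for each decision variable, the absolute change in the constraint per unit change in the objective. A small sketch with an invented response matrix (row names, column names and values are all hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical response matrix: rows are model outputs, columns are decision variables
rm = pd.DataFrame(
    [[-0.5, -0.2, 0.0], [1.0, 1.0, 1.0]],
    index=["riv_diff", "total_pumping"],
    columns=["wel_a", "wel_b", "wel_c"],
)

# sensitivity of the constraint divided by sensitivity of the objective
cf = np.abs(rm.loc["riv_diff", :].values) / np.abs(rm.loc["total_pumping", :].values)
print(dict(zip(rm.columns, cf)))
```

In this toy case, half of every unit pumped from `wel_a` is captured from the river, while `wel_c` has no effect on the river exchange at all.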

# Reliability

Plot the posterior histogram of the river swgw flux difference (the obs we used as a constraint) using the `pst.ies` results handler:

Tell pestpp-opt how to use the uncertainty. We can also take advantage of the linear assumption and reuse the previous results, so we don't need to do any additional model runs (!)

In [None]:
# save the posterior obs ensemble in the opt working dir
oe.to_csv(os.path.join(working_d, "obs_stack.csv"))
pst.pestpp_options["opt_obs_stack"] = "obs_stack.csv"
# don't try to update the chance estimates
pst.pestpp_options["opt_recalc_chance_every"] = 999

# reuse the response matrix
shutil.copy2(
    os.path.join(master_d, "pest.1.jcb"), os.path.join(working_d, "respmat.jcb")
)
pst.pestpp_options["base_jacobian"] = "respmat.jcb"

# reuse the initial residuals
shutil.copy2(
    os.path.join(master_d, "pest.1.jcb.rei"), os.path.join(working_d, "hotstart.rei")
)
pst.pestpp_options["hotstart_resfile"] = "hotstart.rei"

# don't do a final model run
pst.pestpp_options["opt_skip_final"] = True

# look for a 60% reliable solution
pst.pestpp_options["opt_risk"] = 0.6

par = pst.parameter_data
pst.write(os.path.join(working_d, "pest.pst"))
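To build some intuition for what a risk value does to a constraint, here is a rough sketch. It assumes the constraint uncertainty is normally distributed and converts the risk value into an offset via the standard normal quantile. This is an illustration only and not necessarily how pestpp-opt uses the stack internally; the stack values and the `risk_shift` helper are hypothetical:

```python
from statistics import NormalDist, stdev

# Hypothetical stack of simulated constraint values, one per realization
stack = [-0.8, -0.2, 0.1, 0.4, 1.0]
sigma = stdev(stack)


def risk_shift(risk, sigma):
    """Offset (in constraint units) implied by `risk`, assuming normality."""
    return NormalDist().inv_cdf(risk) * sigma


# 0.5 is risk neutral (no offset); larger values are more risk averse
for risk in (0.5, 0.6, 0.8):
    print(risk, risk_shift(risk, sigma))
```

Under this assumption, a "greater_than" constraint at a risk of 0.6 has to be exceeded by a small margin, and that margin grows as the risk value moves toward 1.0, which is why a more reliable solution generally costs some objective function value.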

Rerun pestpp-opt using the `pyemu.os_utils.run()` method in the working directory (we don't need parallel workers any more)

In [None]:
pyemu.os_utils.run("pestpp-opt pest.pst", cwd=working_d)

Visualize these results the same way we did earlier:

In [None]:
reliable_phi_value = pd.read_csv(
    os.path.join(working_d, "pest.slp.iobj.csv"), index_col=0
).values[0][0]
hist_max, opt_phi_value, reliable_phi_value

In [None]:
opt_par_vals = pyemu.pst_utils.read_parfile(os.path.join(working_d, "pest.1.par")).loc[
    wpar.parnme
]
ax = opt_par_vals["parval1"].plot(kind="bar", figsize=(10, 10))
ax.grid()

In [None]:
# how much water conservation and usage efficiency we expect in the future
obs.loc[pred_wfobs.obsnme, "obsval"] = hist_max * 1
# how reliable the optimal solution is
pst.pestpp_options["opt_risk"] = 0.8

In [None]:
pst.write(os.path.join(working_d, "pest.pst"))

In [None]:
pyemu.os_utils.run("pestpp-opt pest.pst", cwd=working_d)