In [1]:
import sys
sys.path.append('../')
import pestools as pt

# Set options to allow for better display in the iPython notebook
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
plt.rcParams.update({'figure.figsize':[14,8]})
pd.options.display.max_rows = 10

<center>ABSTRACT</center>

PESTools is an open-source Python package for processing and visualizing information associated with the parameter estimation software PEST. While PEST output can be reformatted for post-processing in spreadsheets or other menu-driven software packages, this approach can be error-prone and time-consuming. Managing information from highly parameterized models with thousands of parameters and observations presents additional challenges. PESTools consists of a set of Python object classes to facilitate efficient processing and visualization of PEST-related information. Processing and visualization of observation residuals, objective function contributions, parameter and observation sensitivities, parameter correlation and identifiability, and other common PEST outputs have been implemented. The use of dataframe objects (pandas Python package) facilitates rapid slicing and querying of large datasets, as well as the incorporation of ancillary information such as observation locations and times and measurement types. PESTools' object methods can easily be scripted with concise code, or alternatively, the use of IPython notebooks allows for live interaction with the information. PESTools is designed to not only streamline workflows, but also provide deeper insight into model behavior, enhance troubleshooting, and improve transparency in the calibration process.

# Examples of working with Pest

PESTools is designed with flexibility in mind. There are several ways to access and process data, allowing for both interactive exploration and more direct methods that process only the information the user is interested in.

At the top level of PESTools is the Pest class. The Pest class takes the path and basename of the PEST run as input. A file extension of .pst is assumed for the PEST control file, so including it is optional.

In [None]:
columbia = pt.Pest(r'../cc/Columbia')
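The optional extension can be pictured with a small helper. This is a minimal sketch and not PESTools' own code: the function name `control_file_path` is hypothetical, and it only illustrates how a basename given with or without `.pst` could resolve to the same control file.

```python
import os


def control_file_path(basename):
    """Return the control-file path for a PEST basename (hypothetical helper)."""
    root, ext = os.path.splitext(basename)
    # A trailing ".pst" (any case) already names the control file.
    if ext.lower() == '.pst':
        return basename
    # Otherwise append the assumed default extension.
    return basename + '.pst'


print(control_file_path('../cc/Columbia'))
print(control_file_path('../cc/Columbia.pst'))
```

Both calls resolve to the same file, which is the behavior the Pest class input is described as having.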

Once the Pest class is created, it can be used to access numerous PEST-related inputs and outputs. Many of the common datasets are exposed as attributes of the Pest class, most commonly returned in the form of a pandas DataFrame.

The parameter data and observation data sections of the PEST control file can be accessed using:

In [None]:
columbia.parameter_data

In [None]:
columbia.observation_data
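Because these tables are pandas DataFrames, ordinary pandas slicing and querying applies to them. The sketch below uses a toy table rather than real PESTools output. The column names (`pargp`, `parval1`, `parlbnd`, `parubnd`) follow the PEST control file variable names, and it is an assumption that PESTools exposes them under the same names.

```python
import pandas as pd

# Toy stand-in for a parameter-data table; names and values are made up.
pars = pd.DataFrame(
    {
        'pargp': ['kp', 'kp', 'rech', 'sfr_cond'],
        'parval1': [12.0, 98.0, 0.002, 1.0],
        'parlbnd': [0.1, 0.1, 0.0001, 0.01],
        'parubnd': [100.0, 100.0, 0.01, 100.0],
    },
    index=pd.Index(['kp_1', 'kp_2', 'r_1', 'sfr_1'], name='parnme'),
)

# Select every parameter in one group.
kp_pars = pars[pars['pargp'] == 'kp']

# Query on values: parameters within 5 percent of their upper bound.
near_upper = pars.query('parval1 >= 0.95 * parubnd')

# Count parameters per group.
counts = pars.groupby('pargp').size()

print(kp_pars)
print(near_upper)
print(counts)
```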

Other information from the PEST control file can be accessed through the pst attribute:

In [None]:
columbia.pst.prior_information

In [None]:
columbia.pst.nobs

Assuming that PEST has been run, additional data are accessible through the Pest class. The residual data are accessed with the res_df attribute:

In [None]:
columbia.res_df
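A residuals table like this is the starting point for objective function contributions. The sketch below is an illustration on toy data and does not use PESTools internals. It assumes column names modeled on a PEST residuals file and the usual weighted least-squares objective, where each observation contributes (weight × residual)² and the residual is measured minus modelled.

```python
import pandas as pd

# Toy residual table; column names are assumptions modeled on a PEST .res file.
res = pd.DataFrame(
    {
        'Group': ['heads', 'heads', 'flux'],
        'Measured': [10.0, 12.0, 5.0],
        'Modelled': [9.0, 12.5, 7.0],
        'Weight': [1.0, 2.0, 0.5],
    },
    index=['h1', 'h2', 'q1'],
)

# Residual = measured - modelled (assumed sign convention).
res['Residual'] = res['Measured'] - res['Modelled']

# Each observation contributes (weight * residual)^2 to the objective function.
res['phi'] = (res['Weight'] * res['Residual']) ** 2

# Contribution of each observation group, and the total.
phi_by_group = res.groupby('Group')['phi'].sum()
phi_total = res['phi'].sum()

print(phi_by_group)
print(phi_total)
```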

The Jacobian matrix can be read into a DataFrame:

In [None]:
columbia.jco_df

# Parameter Sensitivity

PESTools uses the data from the Jacobian matrix and the weights assigned in PEST to calculate the parameter sensitivity. Conveniently, there is no need to save or read in the parameter sensitivity file generated by PEST.
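To make the calculation concrete, here is a minimal sketch on toy data. It assumes the composite sensitivity definition from the PEST documentation: the square root of the diagonal of JᵀQJ, with Q the diagonal matrix of squared weights, divided by the number of observations with nonzero weight. The function name `composite_sensitivity` is hypothetical, and whether PESTools normalizes in exactly this way is an assumption.

```python
import numpy as np
import pandas as pd


def composite_sensitivity(jco, weights):
    """Composite sensitivity per parameter (hypothetical helper).

    jco: DataFrame with observations as rows and parameters as columns.
    weights: Series of observation weights indexed like the rows of jco.
    """
    # Only observations with nonzero weight count toward the normalization.
    n_obs = int((weights != 0).sum())
    # Diagonal of J^T Q J with Q = diag(weights**2), computed column by column.
    weighted_sq = (jco.mul(weights, axis=0) ** 2).sum(axis=0)
    return np.sqrt(weighted_sq) / n_obs


# Toy Jacobian: three observations, two parameters.
jco = pd.DataFrame(
    [[3.0, 0.0], [4.0, 1.0], [0.0, 2.0]],
    index=['o1', 'o2', 'o3'],
    columns=['p1', 'p2'],
)
# The third observation has zero weight and therefore drops out.
weights = pd.Series([1.0, 1.0, 0.0], index=jco.index)

sens = composite_sensitivity(jco, weights).sort_values(ascending=False)
print(sens)
```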

There are several ways to generate the ParSen class: either directly from the Pest class or independently by providing the path and basename of a PEST run.


In [None]:
parsen = pt.ParSen(r'../cc/Columbia', drop_regul = True)

# Alternatively, generate the ParSen class from the Pest class
#parsen = columbia.parsen(drop_regul = True)

Parameter sensitivity values can be accessed from the df attribute (df standing for DataFrame).

In [None]:
parsen.df

To access the most sensitive parameters, use the head(n) method, where n indicates the number of parameters to show. Similarly, the tail(n) method shows the least sensitive parameters.

In [None]:
parsen.head(10)

In [None]:
parsen.tail(10)

The parameter sensitivity of a single parameter is accessed with the par() method.

In [None]:
parsen.par('sfrc')

The parameter sensitivities of a single group can be accessed using the group() method.

In [None]:
parsen.group('kp')

The most sensitive parameters of the group are accessible with the optional argument n. A positive n returns the n most sensitive parameters of the group; a negative n returns the n least sensitive.

In [None]:
parsen.group('kp', n=-5)
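The signed-n convention can be sketched with plain pandas. This is an illustration on made-up values and not the PESTools implementation; the helper name `top_or_bottom` is hypothetical.

```python
import pandas as pd


def top_or_bottom(sens, n):
    """Return the n largest values if n > 0, the |n| smallest if n < 0."""
    if n > 0:
        return sens.nlargest(n)
    if n < 0:
        return sens.nsmallest(-n)
    # n == 0 selects nothing.
    return sens.iloc[0:0]


# Toy sensitivities for one parameter group.
sens = pd.Series({'kp_1': 0.8, 'kp_2': 0.1, 'kp_3': 0.5, 'kp_4': 0.3})

most = top_or_bottom(sens, 2)
least = top_or_bottom(sens, -2)
print(most)
print(least)
```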

The sum of all parameter sensitivities by group is accessed using the sum_group() method.

In [None]:
parsen.sum_group()
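Group totals of this kind are a straightforward pandas groupby. The sketch below uses a toy table with assumed column names (`pargp`, `sensitivity`) and also computes the group mean for comparison, since a sum favors groups with many parameters while a mean does not.

```python
import pandas as pd

# Toy sensitivities; parameter names, groups, and values are made up.
df = pd.DataFrame(
    {
        'pargp': ['kp', 'kp', 'rech', 'kz'],
        'sensitivity': [0.5, 0.25, 0.125, 1.0],
    },
    index=['kp_1', 'kp_2', 'r_1', 'kz_1'],
)

by_group = df.groupby('pargp')['sensitivity']

# Sum and mean per group, sorted from most to least sensitive.
group_sum = by_group.sum().sort_values(ascending=False)
group_mean = by_group.mean().sort_values(ascending=False)

print(group_sum)
print(group_mean)
```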

The ParSen class can be used to recalculate parameter sensitivities with individual observations or groups of observations removed. This can be useful for removing regularization information or drilling down to look at which parameters are most sensitive to individual observations or groups of observations.

In [None]:
parsen.plot(30)


### Plot the most sensitive parameters for a single group

In [None]:
parsen.plot(n=30, group = 'kp')


### Alternative labels can be defined with a dictionary.

In [None]:
alt_labels = {'r_col' : 'Full Parameter Description1', 'kh_cr' : 'K of the something'}
parsen.plot(n = 30, alt_labels = alt_labels)
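One plausible way such a dictionary is applied is a lookup that falls back to the original name when no alternative is defined. The sketch below is an assumption about the behavior, not PESTools code, and `apply_labels` is a hypothetical helper.

```python
def apply_labels(names, alt_labels):
    """Swap in an alternative label where one is defined; keep the name otherwise."""
    return [alt_labels.get(name, name) for name in names]


alt_labels = {'r_col': 'Full Parameter Description1', 'kh_cr': 'K of the something'}

# 'sfrc' has no entry in the dictionary, so it keeps its original name.
labels = apply_labels(['r_col', 'sfrc', 'kh_cr'], alt_labels)
print(labels)
```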

### Plot the sum of sensitivity by group

In [None]:
parsen.plot_sum_group()

### Use alternative group names and change the color

In [None]:
alt_labels = {'sfr_cond' : 'Stream Conductance', 'rech' : 'Recharge', 'kz' : 'Vertical K', 'kp' : 'Hydraulic Conductivity'}
parsen.plot_sum_group(alt_labels = alt_labels, color = "DarkGray")

### Plot the mean of sensitivity of group

In [None]:
parsen.plot_mean_group(alt_labels = alt_labels, color = 'LightGreen', linewidth = 0.0)

### The plot() method uses the 'Set2' cmap by default to assign colors to the different groups. A different cmap can be specified.

In [None]:
parsen.plot(n=30, cmap = 'jet')

### You can also use a dictionary keyed by group name to set the colors for groups

In [None]:
color_dict = {'kz': (0.89411765336990356, 0.10196078568696976, 0.10980392247438431, 1.0), 'kp': 'g', 'sfr_cond' : 'b',
              'rech' : 'k'}
parsen.plot(n=20, color_dict = color_dict)

###### Exclude observation groups

In [None]:
parsen.drop_groups(['head_poor', 'WCRs1'])
parsen.plot(n=20)

###### Only keep one or a few groups

In [None]:
parsen.keep_groups(['cc_streams'])
parsen.plot(n=20)
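Dropping or keeping observation groups amounts to selecting rows of the Jacobian before the sensitivities are recomputed. The sketch below shows the idea on toy data. It assumes the composite sensitivity definition from the PEST documentation and PEST-style column names (`obgnme`, `weight`); the helper `sensitivity_for` is hypothetical and does not reflect how PESTools stores its state.

```python
import numpy as np
import pandas as pd

# Toy Jacobian: three observations (rows), two parameters (columns).
jco = pd.DataFrame(
    [[3.0, 0.0], [4.0, 0.0], [0.0, 2.0]],
    index=['h1', 'h2', 'q1'],
    columns=['kp_1', 'sfrc'],
)
# Observation groups and weights, indexed like the Jacobian rows.
obs = pd.DataFrame(
    {'obgnme': ['head_poor', 'head_poor', 'cc_streams'], 'weight': [1.0, 1.0, 1.0]},
    index=jco.index,
)


def sensitivity_for(rows):
    """Composite sensitivity using only the selected observation rows."""
    w = obs.loc[rows, 'weight']
    weighted_sq = (jco.loc[rows].mul(w, axis=0) ** 2).sum(axis=0)
    return np.sqrt(weighted_sq) / int((w != 0).sum())


# Drop a group: use the rows whose group is not in the drop list.
dropped = sensitivity_for(~obs['obgnme'].isin(['head_poor']))
# Keep a group: use only the rows whose group is in the keep list.
kept = sensitivity_for(obs['obgnme'].isin(['head_poor']))

print(dropped)
print(kept)
```

In this toy case the stream observation is sensitive only to `sfrc` and the head observations only to `kp_1`, so the ranking flips depending on which group is retained.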

# Cor Class

In [None]:
cor = columbia.cor

In [None]:
cor.df

In [None]:
cor.plot_heatmap(label_rows=False, label_cols=False)

In [None]:
import seaborn

smaller_matrix = cor.pars(['kpkpx_tc2', 'kpkpx_tc34', 'kpkpx_ww60', 'kpkpx_ms14', 'kv_w', 'wr_drnc', 'kv_lo', 'sfrc', 'r_col'])
seaborn.heatmap(smaller_matrix)
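For readers unfamiliar with how a parameter correlation matrix relates to a covariance matrix, the sketch below works through the standard normalization on made-up numbers and then takes a square sub-matrix for a list of parameters. It is an illustration only: the covariance values are invented, and it is an assumption that PESTools derives its correlations this way.

```python
import numpy as np
import pandas as pd

# Toy covariance matrix for three parameters; values are made up.
names = ['kv_w', 'sfrc', 'r_col']
cov = pd.DataFrame(
    [[4.0, 2.0, 0.0], [2.0, 4.0, -1.0], [0.0, -1.0, 1.0]],
    index=names,
    columns=names,
)

# Correlation coefficient: cov_ij / sqrt(cov_ii * cov_jj).
std = np.sqrt(np.diag(cov.values))
cor_df = cov / np.outer(std, std)

# Square sub-matrix for a chosen list of parameters.
subset = ['kv_w', 'sfrc']
smaller = cor_df.loc[subset, subset]

print(cor_df)
print(smaller)
```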