# SWEEP

### Sweep (executable name ``pestpp-swp``) is a simple utility that evaluates sets of parameter values in serial (one at a time) or in parallel.  The parameter sets are stored in a CSV file.  ``pestpp-swp`` writes a new CSV file with the model-simulated values for the observations listed in the control file.  Let's check it out...

The usual boilerplate to set up a model - in this case, we are using the freyberg hk + rch model

In [None]:
%matplotlib inline
import os
import sys
sys.path.append("..")
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyemu



In [None]:
import freyberg_setup as fs
pst_name = fs.PST_NAME_KR
working_dir = fs.WORKING_DIR_KR
fs.setup_pest_kr()

### sweep accepts several optional ++ arguments in the control file:
### - ++sweep_parameter_csv_file(): the name of the csv file for rows of parameter values
### - ++sweep_output_csv_file(): the name of the csv file sweep writes with model outputs
### - ++sweep_chunk(): how many parameter sets to evaluate before writing to the output csv file


### Each of these is optional - default values are used if you don't supply them:
### - ++sweep_parameter_csv_file = "sweep_in.csv"
### - ++sweep_output_csv_file = "sweep_out.csv"
### - ++sweep_chunk = 500
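
The fallback behaviour described above can be sketched in plain Python. The `resolve_sweep_options` helper and the `SWEEP_DEFAULTS` dictionary are hypothetical names used only for illustration; they are not part of pyemu or PEST++, and the default values are simply the ones listed above.

```python
# Defaults as listed above; the names here are illustrative, not a PEST++ API.
SWEEP_DEFAULTS = {
    "sweep_parameter_csv_file": "sweep_in.csv",
    "sweep_output_csv_file": "sweep_out.csv",
    "sweep_chunk": 500,
}


def resolve_sweep_options(user_options):
    """Return the effective sweep options: defaults overridden by user values."""
    unknown = set(user_options) - set(SWEEP_DEFAULTS)
    if unknown:
        raise KeyError("unrecognized sweep option(s): {0}".format(sorted(unknown)))
    effective = dict(SWEEP_DEFAULTS)
    effective.update(user_options)
    return effective


# Only the input file is overridden; the other two fall back to the defaults.
options = resolve_sweep_options({"sweep_parameter_csv_file": "my_first_sweep.csv"})
for name, value in options.items():
    print("++{0}({1})".format(name, value))
```

Any option left out of the dictionary keeps its default, which is why an input file that is not named "sweep_in.csv" must be named explicitly.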


### Let's make a CSV file for sweep to use and write some random values to it.  You can do this by hand in Excel (!) or you can embrace the sweetness that is Python:

In [None]:
f =  open(os.path.join(working_dir,"my_first_sweep.csv"),'w')
    

### First, we need to write a header to the sweep input file, listing the run id and the parameter names...
### But wait!  We need to know the parameter names for the CSV file header. How can we get those?

In [None]:
pst = pyemu.Pst(os.path.join(working_dir,pst_name))

In [None]:
pst.parameter_data

In [None]:
f.write("run_id,rch_0,rch_1,hk,porosity\n")
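
Typing parameter names by hand invites typos. As a minimal sketch, the header and rows can instead be built from a list of names with the standard `csv` module. The `par_names` list below is hard-coded to match the header above so that the block stands alone; in the notebook the same list is available as `pst.par_names`. The `write_sweep_csv` helper is a hypothetical name, and the row values are arbitrary.

```python
import csv
import io


def write_sweep_csv(handle, par_names, rows):
    """Write a header of run_id plus parameter names, then one row per run."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["run_id"] + list(par_names))
    for run_id, values in rows:
        if len(values) != len(par_names):
            raise ValueError("run {0}: expected {1} values".format(run_id, len(par_names)))
        writer.writerow([run_id] + list(values))


par_names = ["rch_0", "rch_1", "hk", "porosity"]
buffer = io.StringIO()  # stands in for an open file handle
write_sweep_csv(buffer, par_names, [(0, [0.9, 0.8, 3.14159, 0.01])])
print(buffer.getvalue())
```

The length check catches a row with too few or too many values before the file ever reaches the model.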

Let's write one set of parameter values to this CSV file. Use whatever values you want...let's get crazy!

In [None]:
# your code here....
f.write("crap,-999,-999,1.0E+30,-999\n")

In [None]:
f.close()

### Let's go look at the file "my_first_sweep.csv"

### Cool...now let's run sweep (`pestpp-swp`)! But from the command line, like real modelers!

### Oh crap!  Why didn't that work?  Remember, we named our input csv "my_first_sweep.csv", but the ``++sweep_parameter_csv_file()`` arg was not set, so ``sweep`` is looking for "sweep_in.csv".  

### So either rename your CSV file or add that optional arg to the control file and rerun...

In [None]:
pst.pestpp_options["sweep_parameter_csv_file"] = "my_first_sweep.csv"

In [None]:
pst.write(os.path.join(working_dir,pst_name))

### Now go open up "sweep_out.csv".  What do you see?  If you used crazy parameter values, you have a failed run (look for a '0' or a '1' in the third column).  So ``sweep`` tracks run failures for you.  Let's fix the parameter values to be more reasonable and rerun sweep, this time all with python...
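
Checking the failure flag by eye does not scale to many runs. A minimal sketch of the same check in Python follows. It assumes that the third column holds the flag and that a nonzero value marks a failed run. The CSV text is made-up illustration data, not real ``sweep`` output, and the column names in it are assumptions.

```python
import csv
import io

# Made-up stand-in for sweep_out.csv; a real file contains many more columns.
example_text = (
    "run_id,input_run_id,failed_flag,phi\n"
    "0,good,0,12.5\n"
    "1,crazy,1,1.0e+30\n"
)


def failed_runs(handle, flag_column=2):
    """Return the first-column run ids of rows whose flag column is nonzero."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return [row[0] for row in reader if int(row[flag_column]) != 0]


print(failed_runs(io.StringIO(example_text)))
```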

In [None]:
# rewrite the sweep input file with more reasonable parameter values
with open(os.path.join(working_dir,"my_first_sweep.csv"),'w') as f:
    f.write("run_id,rch_0,rch_1,hk,porosity\n")
    f.write("0,0.9,0.8,3.14159,0.01\n")
pst.control_data.noptmax = 0
pst.write(os.path.join(working_dir,pst_name))
# run sweep from python, inside the working directory
pyemu.os_utils.run("pestpp-swp {0}".format(pst_name),cwd=working_dir)

In [None]:
df = pd.read_csv(os.path.join(working_dir,"sweep_out.csv"))
df

### What do you see?  ``sweep`` is calculating the composite (total) objective function (``phi``), as well as the ``phi`` for each objective function component.  Then, past the ``phi`` info, are the simulated values for each observation listed in the control file.  Cool!

### ``sweep`` is totally general and flexible, which lets you do all kinds of cool things with your PEST datasets, such as Monte Carlo or design of experiments.  All ``sweep`` does is run each row in the input CSV through the model and write the resulting simulated outputs to a new CSV.  That's it...
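
Conceptually, that row-in, row-out loop looks like the sketch below. The `toy_model` function is a hypothetical stand-in for a real forward model run, and ``sweep`` itself is a separate executable, so this only illustrates the idea and is not how PEST++ is implemented.

```python
import csv
import io


def toy_model(pars):
    """Hypothetical stand-in for a model run: maps parameters to 'observations'."""
    return {"obs_sum": sum(pars.values()), "obs_max": max(pars.values())}


def sweep(in_handle, out_handle, model):
    """Run every row of the input CSV through `model` and write the outputs."""
    reader = csv.DictReader(in_handle)
    writer = None
    for row in reader:
        run_id = row.pop("run_id")
        outputs = model({name: float(value) for name, value in row.items()})
        if writer is None:  # the output header is known only after the first run
            fields = ["run_id"] + list(outputs)
            writer = csv.DictWriter(out_handle, fields, lineterminator="\n")
            writer.writeheader()
        writer.writerow(dict(outputs, run_id=run_id))


sweep_in = io.StringIO("run_id,hk,rch_0\n0,5.0,1.0\n1,2.5,0.5\n")
sweep_out = io.StringIO()
sweep(sweep_in, sweep_out, toy_model)
print(sweep_out.getvalue())
```

Because each row is handled independently, the same loop serves Monte Carlo, design of experiments, or any other scheme that can be expressed as a table of parameter values.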

### pyEMU interfaces nicely with ``sweep`` through the ``pandas.DataFrame`` object to let you easily generate and evaluate lots of parameter value combinations - Monte Carlo.  For demonstration purposes, let's do that...

In [None]:
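# drop the custom input file name so sweep falls back to the default "sweep_in.csv"
pst.pestpp_options.pop("sweep_parameter_csv_file")
pst.write(os.path.join(working_dir,pst_name))
# set up a Monte Carlo analysis from the control file
mc = pyemu.MonteCarlo(pst=pst)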

In [None]:
mc.draw(50)

In [None]:
mc.parensemble
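
To make the idea of a parameter ensemble concrete, here is a minimal sketch that draws 50 random parameter sets using only the standard library. It samples each parameter uniformly between bounds, which is a deliberately simplified substitute and not the sampling scheme pyemu uses. The `draw_uniform` helper is a hypothetical name, and the bounds are invented for illustration; they are not the values in the freyberg control file.

```python
import random


def draw_uniform(bounds, num_reals, seed=None):
    """Draw `num_reals` parameter sets, each value uniform within its bounds."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lower, upper) for name, (lower, upper) in bounds.items()}
        for _ in range(num_reals)
    ]


# Invented (lower, upper) bounds, for illustration only.
bounds = {
    "rch_0": (0.5, 1.5),
    "rch_1": (0.5, 1.5),
    "hk": (1.0, 10.0),
    "porosity": (0.005, 0.02),
}
ensemble = draw_uniform(bounds, 50, seed=0)
print(len(ensemble), "realizations; first one:", ensemble[0])
```

Each dictionary in `ensemble` corresponds to one row of a sweep input file.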

### Let's use ``pandas`` to write the CSV file we need

In [None]:
mc.parensemble.to_csv(os.path.join(working_dir,"sweep_in.csv"))

### When you execute this next block, go to the terminal where the notebook is running and watch the output...

In [None]:
pyemu.os_utils.run("pestpp-swp {0}".format(pst_name),cwd=working_dir)

### That was too slow...let's do it in parallel.  We are going to use a pyemu helper to start the parallel run manager that is built in to ``sweep``.  Don't worry about this for now, we will cover it in much greater detail later...after executing the helper code block, go back to the terminal and watch

### To better understand how pestpp runs things in parallel, let's redo the ``freyberg_k_and_r`` calibration, but this time we will start the master and slaves ourselves...










### ...Wow - that was painful...luckily, pyemu has a helper function to "help"

In [None]:
os.chdir(working_dir)
pyemu.helpers.start_slaves('.',"pestpp-swp",pst_name,num_slaves=15,master_dir='.')
os.chdir("..")
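
Running in parallel pays off because each parameter set is independent of the others, so the runs can be handed out to several workers at once. The sketch below shows that idea with Python's `concurrent.futures`. It is only an analogy, since PEST++ uses its own run manager, and `toy_model` is a hypothetical stand-in for a model run.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def toy_model(pars):
    """Hypothetical stand-in for a slow model run."""
    time.sleep(0.01)  # pretend the model takes a while
    return sum(pars)


# Twenty made-up parameter sets of two values each.
parameter_sets = [(float(i), 1.0) for i in range(20)]

# Five workers each take the next unfinished run; map() keeps the input order.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(toy_model, parameter_sets))

print(results[:3])
```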

### That was way faster...let's check out the results

In [None]:
df = pd.read_csv(os.path.join(working_dir,"sweep_out.csv"))
df

### Boom!  Monte Carlo done! In the next exercise, we will talk more about how this Monte Carlo is implemented and also what the results mean...

In [None]:
df.phi.hist(bins=10)