# VERSPM Model Interface

In [1]:
import emat
import os
import pandas as pd
import numpy as np
import gzip
from emat.util.show_dir import show_dir, show_file_contents

This notebook illustrates the use of TMIP-EMAT
with VisionEval's RSPM model.  It shows how to use
TMIP-EMAT and the model interface to run the model.

In this example notebook, we will activate some logging features.  The 
same logging utility is used in both EMAT and the
`emat_verspm.py` module, which gives us a view of what's happening
inside the code as it runs.

In [2]:
import logging
from emat.util.loggers import log_to_stderr
log = log_to_stderr(logging.INFO)

## Connecting to the Model

The interface for this model is located in the `emat_verspm.py`
module, which we will import into this notebook. 

In [3]:
import emat_verspm

Let's initialize a database file to store results.

In [4]:
database_path = os.path.expanduser("~/sandbox/ve-rspm-20200727.db")
initialize = not os.path.exists(database_path)
db = emat.SQLiteDB(database_path, initialize=initialize)

Within the `emat_verspm` module, you will find a definition for the 
`VERSPModel` class.  We initialize an instance of the model interface object,
connected to the database we just created.

In [5]:
fx = emat_verspm.VERSPModel(db=db)



## Single Run Operation for Development and Debugging

Before we take on the task of running this model in exploratory mode, we'll
want to make sure that our interface code is working correctly. To check each
of the components of the interface (setup, run, post-process, load-measures,
and archive), we can run each individually in sequence, and inspect the results
to make sure they are correct.

### setup

This method is the place where the core model *set up* takes place,
including creating or modifying files as necessary to prepare
for a core model run.  When running experiments, this method
is called once for each core model experiment, where each experiment
is defined by a set of particular values for both the exogenous
uncertainties and the policy levers.  These values are passed to
the experiment only here, and not in the `run` method itself.
This facilitates debugging, as the `setup` method can be used 
without the `run` method, as we do here. This allows us to manually
inspect the prepared files and ensure they are correct before
actually running a potentially expensive model.

Each input exogenous uncertainty or policy lever can potentially
be used to manipulate multiple different aspects of the underlying
core model.  For example, a policy lever that includes a number of
discrete future network "build" options might trigger the replacement
of multiple related network definition files.  Or, a single uncertainty
relating to the cost of fuel might scale both a parameter linked to
the modeled per-mile cost of operating an automobile and the
modeled total cost of fuel used by transit services.
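
As a minimal sketch of this one-to-many mapping, the hypothetical function below fans a single fuel cost input out to two core model settings. The names `apply_fuel_cost`, `auto_cost_per_mile`, and `transit_fuel_budget`, and all numeric values, are illustrative assumptions and are not taken from `emat_verspm.py`.

```python
def apply_fuel_cost(fuel_cost, base_fuel_cost, model_inputs):
    """Return a copy of model_inputs with fuel-dependent values rescaled.

    All keys here are hypothetical; a real interface would edit the
    core model's actual input files instead of a dictionary.
    """
    scale = fuel_cost / base_fuel_cost
    updated = dict(model_inputs)
    # One uncertainty touches two distinct core model inputs.
    updated["auto_cost_per_mile"] = model_inputs["auto_cost_per_mile"] * scale
    updated["transit_fuel_budget"] = model_inputs["transit_fuel_budget"] * scale
    return updated


# Hypothetical baseline values, chosen only for illustration.
baseline = {"auto_cost_per_mile": 0.20, "transit_fuel_budget": 1_000_000.0}
scenario = apply_fuel_cost(fuel_cost=5.0, base_fuel_cost=2.5, model_inputs=baseline)
print(scenario)
```

Returning a modified copy rather than editing `baseline` in place keeps the reference inputs intact between experiments.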

In our RSPM module's `setup`, parameters that are omitted are set to their
default values, so we can supply just a subset of parameters with non-default values
if we like.

In [6]:
params = {
    'ValueOfTime': 13,
    'Income': 46300,
    'Transit': 1.34,
    'ElectricCost': 0.14,
    'FuelCost': 4.25,
} 

fx.setup(params)

[00:04.13] MainProcess/INFO: VERSPM SETUP...
[00:04.20] MainProcess/INFO: VERSPM SETUP complete
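
The default-filling behavior described for `setup` can be pictured as a dictionary merge. This is only a sketch: `DEFAULTS` and `fill_defaults` are hypothetical names, the default values are placeholders, and the actual `VERSPModel.setup` may handle defaults differently.

```python
# Hypothetical placeholder defaults; the real values are defined by the model scope.
DEFAULTS = {
    "ValueOfTime": 15,
    "Income": 45000,
    "FuelCost": 4.0,
}


def fill_defaults(params, defaults=DEFAULTS):
    """Combine user-supplied params with defaults, rejecting unknown names."""
    unknown = set(params) - set(defaults)
    if unknown:
        raise KeyError(f"unrecognized parameters: {sorted(unknown)}")
    # Later keys win, so supplied values override the defaults.
    return {**defaults, **params}


full = fill_defaults({"FuelCost": 4.25})
print(full)
```

Rejecting unrecognized names is a design choice in this sketch; it catches misspelled parameter names before an expensive model run starts.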


### run

The `run` method is the place where the core model run takes place.
Note that this method takes no arguments; all the input
exogenous uncertainties and policy levers are delivered to the
core model in the `setup` method, which will be executed prior
to calling this method. This facilitates debugging, as the `setup`
method can be used without the `run` method as we did above, allowing
us to manually inspect the prepared files and ensure they
are correct before actually running a potentially expensive model.

In [7]:
fx.run()

[00:04.21] MainProcess/INFO: VERSPM RUN ...
[02:22.28] MainProcess/INFO: VERSPM RUN complete


The `VERSPModel` class includes a custom `last_run_logs` method,
which displays both the "stdout" and "stderr" logs generated by the 
model executable during the most recent call to the `run` method.
We can use this method for debugging purposes, to identify why the 
core model crashes (if it does crash).  In this first test it did not
crash, and the logs look good.

In [8]:
show_dir(os.path.join(fx.master_directory.name, 'VERSPM', 'output'))

output/
├── Azone.csv
├── Bzone.csv
├── Household.csv
├── Marea.csv
├── Region.csv
├── Vehicle.csv
└── Worker.csv


In [9]:
fx.last_run_logs()

=== STDOUT ===
run_model.R: script entered
run_model.R: library visioneval loaded
[1] "2020-07-27 15:50:33 -- Initializing Model. This may take a while."
[1] "2020-07-27 15:50:37 -- Model successfully initialized."
run_model.R: initializeModel completed
[1] "2020-07-27 15:50:37 -- Starting module 'CreateHouseholds' for year '2010'."
[1] "2020-07-27 15:50:39 -- Finish module 'CreateHouseholds' for year '2010'."
[1] "2020-07-27 15:50:39 -- Starting module 'PredictWorkers' for year '2010'."
[1] "2020-07-27 15:50:41 -- Finish module 'PredictWorkers' for year '2010'."
[1] "2020-07-27 15:50:41 -- Starting module 'AssignLifeCycle' for year '2010'."
[1] "2020-07-27 15:50:42 -- Finish module 'AssignLifeCycle' for year '2010'."
[1] "2020-07-27 15:50:42 -- Starting module 'PredictIncome' for year '2010'."
[1] "2020-07-27 15:50:45 -- Finish module 'PredictIncome' for year '2010'."
[1] "2020-07-27 15:50:45 -- Starting module 'PredictHousing' for year '2010'."
[1] "2020-07-27 15:50:47 -- Finish modu

In [10]:
os.path.join(fx.master_directory.name, 'VERSPM', 'output')

'/var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmp0phkbmg8/VERSPM/output'

### post-process

There is a `post_process` step that is separate from the `run` step.

For VERSPM, the post-processing replicates the calculations needed to
create the same summary performance measures as the `R` version of
VisionEval does when run with scenarios.

In [11]:
fx.post_process()

### load-measures

The `load_measures` method is the place to actually reach into
files in the core model's run results and extract performance
measures, returning a dictionary of key-value pairs for the 
various performance measures. It takes an optional list giving a 
subset of performance measures to load, and like the `post_process` 
method also can be pointed at an archive location instead of loading 
measures from the local working directory (which is the default).
The `load_measures` method should not do any post-processing
of results (i.e. it should read from but not write to the model
outputs directory).

In [12]:
fx.load_measures()

{'GHGReduction': 0,
 'DVMTPerCapita': 19.931893301649605,
 'WalkTravelPerCapita': 0.3472318360421489,
 'TruckDelay': 0,
 'AirPollutionEm': 824329.461495448,
 'FuelUse': 36263244.806371376,
 'VehicleCost': 3.969387886906534,
 'VehicleCostLow': 20.830175577270325}
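
A read-only measure loader along these lines might look like the following sketch. It is not the `FilesCoreModel` implementation: `load_measures_from_csv`, the file name `measures.csv`, and its two-column layout are assumptions made so the example is self-contained.

```python
import csv
import tempfile
from pathlib import Path


def load_measures_from_csv(output_dir, measure_names=None):
    """Read performance measures from a (hypothetical) measures.csv file.

    The file is opened for reading only; nothing is written to output_dir.
    """
    measures = {}
    with open(Path(output_dir) / "measures.csv", newline="") as f:
        for row in csv.DictReader(f):
            measures[row["measure"]] = float(row["value"])
    if measure_names is not None:
        # Return only the requested subset of measures.
        measures = {name: measures[name] for name in measure_names}
    return measures


# Build a tiny stand-in output directory to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "measures.csv").write_text(
        "measure,value\nDVMTPerCapita,19.9\nFuelUse,36263244.8\n"
    )
    all_measures = load_measures_from_csv(tmp)
    subset = load_measures_from_csv(tmp, ["FuelUse"])

print(all_measures)
print(subset)
```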

You may note that the implementation of `RoadTestFileModel` in the `core_files_demo` module
does not actually include a `load_measures` method itself, but instead inherits this method
from the `FilesCoreModel` superclass. The instructions on how to actually find the relevant
performance measures for this file are instead loaded into table parsers, which are defined
in the `RoadTestFileModel.__init__` constructor.  Details and illustrations
of how to write and use parsers are available in the [file parsing examples page of the TMIP-EMAT documentation](https://tmip-emat.github.io/source/emat.models/table_parse_example.html).

### archive

The `archive` method copies the relevant model output files to an archive location for 
longer term storage.  The particular archive location is based on the experiment id
for a particular experiment, and can be customized if desired by overriding the 
`get_experiment_archive_path` method.  This customization is not done in this demo,
so the default location is used.

In [13]:
fx.get_experiment_archive_path(parameters=params)

'/Users/jeffnewman/sandbox/VisionEval-Archive/scp_VERSPM/exp_1'

Actually running the `archive` method should copy any relevant output files
from the `model_path` of the current active model into a subdirectory of `archive_path`.

In [15]:
fx.archive(params)

[02:22.95] MainProcess/INFO: VERSPM ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmp0phkbmg8/VERSPM/output
   to: /Users/jeffnewman/sandbox/VisionEval-Archive/scp_VERSPM/exp_1.zip


In [16]:
show_dir(fx.local_directory)

tmp0phkbmg8/
├── VERSPM/
│   ├── .Rprofile
│   ├── Datastore/
│   │   ├── 2010/
│   │   │   ├── Azone/
│   │   │   │   ├── Age0to14.Rda
│   │   │   │   ├── Age15to19.Rda
│   │   │   │   ├── Age20to29.Rda
│   │   │   │   ├── Age30to54.Rda
│   │   │   │   ├── Age55to64.Rda
│   │   │   │   ├── Age65Plus.Rda
│   │   │   │   ├── AutoCarSvcSubProp.Rda
│   │   │   │   ├── AutoMeanAge.Rda
│   │   │   │   ├── AveCarSvcVehicleAge.Rda
│   │   │   │   ├── AveHhSize.Rda
│   │   │   │   ├── Azone.Rda
│   │   │   │   ├── ElectricityCI.Rda
│   │   │   │   ├── FuelCost.Rda
│   │   │   │   ├── FuelTax.Rda
│   │   │   │   ├── GQIncomePC.Rda
│   │   │   │   ├── GrpAge0to14.Rda
│   │   │   │   ├── GrpAge15to19.Rda
│   │   │   │   ├── GrpAge20to29.Rda
│   │   │   │   ├── GrpAge30to54.Rda
│   │   │   │   ├── GrpAge55to64.Rda
│   │   │   │   ├── GrpAge65Plus.Rda
│   │   │   │   ├── HHIncomePC.Rda
│   │   │   │   ├── HighCarSvcAccessTime.Rda
│   │   │   │   ├── HighCarSvcCost.Rda
│   │   │   │   ├── LowCarSvcA

It is permissible, but not required, to simply copy the entire contents of the 
active model's output directory to the archive location, as is done in this example.
However, if the current active model
directory has a lot of boilerplate files that don't change with the inputs, or
if it becomes full of intermediate or temporary files that definitely will never
be used to compute performance measures, it can be advisable to selectively copy
only relevant files. In that case, those files and whatever related sub-directory
tree structure exists in the current active model should be replicated within the
experiment's archive directory.
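
Selective archiving that preserves the sub-directory structure could be sketched as follows. The function `archive_selected`, the `*.csv` pattern, and the directory names are assumptions for illustration; the `archive` method in `emat_verspm.py` may work differently (the log output above shows it writing a `.zip` file).

```python
import shutil
import tempfile
from pathlib import Path


def archive_selected(model_dir, archive_dir, pattern="*.csv"):
    """Copy files matching pattern, keeping their relative sub-directories."""
    model_dir, archive_dir = Path(model_dir), Path(archive_dir)
    copied = []
    # sorted() materializes the file list before any copying begins.
    for src in sorted(model_dir.rglob(pattern)):
        if not src.is_file():
            continue
        rel = src.relative_to(model_dir)
        dest = archive_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(rel.as_posix())
    return copied


# Exercise the function on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp, "model")
    (model / "output").mkdir(parents=True)
    (model / "output" / "Household.csv").write_text("HhId\n1\n")
    (model / "scratch.tmp").write_text("intermediate data")
    copied = archive_selected(model, Path(tmp, "archive"))
    archived_ok = Path(tmp, "archive", "output", "Household.csv").exists()
    scratch_skipped = not Path(tmp, "archive", "scratch.tmp").exists()

print(copied)
```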

## Normal Operation for Running Multiple Experiments

For this demo, we'll create a design of experiments with only 3 experiments.
Real applications will typically use a larger number of experiments, but this small number
is sufficient to demonstrate the operation of the tools.
The `design_experiments` method of the `VERSPModel` object is not defined
in the custom code written for this model, but rather is a generic
method provided by the TMIP-EMAT main library.

In [20]:
design1 = fx.design_experiments(n_samples=3)
design1

| experiment | Bicycles | DemandManagement | ElectricCost | FuelCost | Income | LandUse | Parking | TechMix | Transit | ValueOfTime |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 0.097509 | 0.73284 | 0.112086 | 4.062616 | 47141 | growth | 0.105612 | 0.600716 | 0.755257 | 16.735413 |
| 11 | 0.151414 | 0.271721 | 0.088456 | 2.530513 | 44071 | base | 0.5227 | 0.047922 | 3.912346 | 12.629127 |
| 12 | 0.191199 | 0.386272 | 0.188169 | 6.475403 | 48623 | base | 0.956376 | 0.90142 | 2.182498 | 14.921822 |
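
The design name `lhs_a` used later in this notebook suggests Latin hypercube sampling. Assuming that is the sampler in use, the idea can be sketched with the standard library alone. `latin_hypercube` and the parameter ranges below are hypothetical and are not the TMIP-EMAT implementation.

```python
import random


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples points so each parameter's range is evenly stratified."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One draw inside each of n_samples equal-width strata.
        width = (high - low) / n_samples
        draws = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so strata are paired randomly across parameters.
        rng.shuffle(draws)
        columns[name] = draws
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Hypothetical ranges, chosen only for illustration.
design = latin_hypercube(
    {"FuelCost": (2.0, 7.0), "ValueOfTime": (10.0, 20.0)}, n_samples=3
)
for row in design:
    print(row)
```

With three samples, each parameter's range is cut into thirds and every third receives exactly one draw, which spreads even a small design across the full range of each input.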


The `run_experiments` command will automatically run the model once for each experiment in the named design.

In [21]:
fx.run_experiments(design1)

[05:50.26] MainProcess/INFO: performing 3 scenarios/policies * 1 model(s) = 3 experiments
[05:50.28] MainProcess/INFO: performing experiments sequentially
[05:50.28] MainProcess/INFO: VERSPM SETUP...
[05:50.34] MainProcess/INFO: VERSPM SETUP complete
[05:50.34] MainProcess/INFO: VERSPM RUN ...
[08:14.12] MainProcess/INFO: VERSPM RUN complete
[08:14.71] MainProcess/INFO: VERSPM ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmp0phkbmg8/VERSPM/output
   to: /Users/jeffnewman/sandbox/VisionEval-Archive/scp_VERSPM/exp_10.zip
[08:18.96] MainProcess/INFO: 1 cases completed
[08:18.97] MainProcess/INFO: VERSPM SETUP...
[08:19.02] MainProcess/INFO: VERSPM SETUP complete
[08:19.02] MainProcess/INFO: VERSPM RUN ...
[10:45.38] MainProcess/INFO: VERSPM RUN complete
[10:46.07] MainProcess/INFO: VERSPM ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmp0phkbmg8/VERSPM/output
   to: /Users/jeffnewman/sandbox/VisionEval-Archive/scp_VERSPM/exp_11.zip
[10:50.37] MainProce

| experiment | ValueOfTime | Income | LandUse | FuelCost | ElectricCost | TechMix | Bicycles | Transit | Parking | DemandManagement | GHGReduction | DVMTPerCapita | WalkTravelPerCapita | AirPollutionEm | FuelUse | TruckDelay | VehicleCost | VehicleCostLow |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 16.735413 | 47141 | growth | 4.062616 | 0.112086 | 0.600716 | 0.097509 | 0.755257 | 0.105612 | 0.73284 | 0.0 | 18.704606 | 0.347953 | 798335.419095 | 35154270.0 | 0.0 | 3.808172 | 23.013648 |
| 11 | 12.629127 | 44071 | base | 2.530513 | 0.088456 | 0.047922 | 0.151414 | 3.912346 | 0.5227 | 0.271721 | 0.0 | 19.121169 | 0.359865 | 794644.35321 | 35021540.0 | 0.0 | 3.987629 | 18.35477 |
| 12 | 14.921822 | 48623 | base | 6.475403 | 0.188169 | 0.90142 | 0.191199 | 2.182498 | 0.956376 | 0.386272 | 0.0 | 18.338426 | 0.363667 | 749499.133927 | 32991710.0 | 0.0 | 3.717578 | 20.236893 |


## Multiprocessing for Running Multiple Experiments

The examples above are all single-process demonstrations of using TMIP-EMAT to run core model
VERSPM experiments. 

This core model is single-threaded, and multiple independent instances of the model can run
side by side on the same machine, so you can benefit from a multiprocessing 
approach.  This can be accomplished by splitting a design of experiments over several
processes that you start manually, or by using an automatic multiprocessing library such as 
`dask.distributed`.
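
Manually splitting a design across processes amounts to partitioning its experiment ids into roughly equal chunks, one per process. A minimal sketch, where `split_design` is a hypothetical helper and the ids are illustrative:

```python
def split_design(experiment_ids, n_processes):
    """Partition experiment ids into n_processes contiguous, near-equal chunks."""
    ids = list(experiment_ids)
    base, extra = divmod(len(ids), n_processes)
    chunks, start = [], 0
    for i in range(n_processes):
        # The first `extra` chunks each take one additional experiment.
        size = base + (1 if i < extra else 0)
        chunks.append(ids[start:start + size])
        start += size
    return chunks


chunks = split_design(range(1, 8), n_processes=3)
print(chunks)  # [[1, 2, 3], [4, 5], [6, 7]]
```

Each chunk could then be handed to a separately started Python process that runs only those experiments.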

In [None]:
design3 = fx.design_experiments(design_name='lhs_a', random_seed=3, n_samples=6)
design3

The module is set up to facilitate distributed multiprocessing. During the `setup`
step, the code detects if it is being run in a distributed "worker" environment instead of
in a normal Python environment.  If the "worker" environment is detected, then a copy
of the entire VERSPM model is made into the worker's local workspace, and the model
is run there instead of in the master workspace.  This allows each worker to edit the files
independently and simultaneously, without disturbing other parallel workers.

To run the model with parallel subprocesses,
we simply import the `get_client` function, and use that for the `evaluator` argument
in the `run_experiments` method.

In [None]:
from emat.util.distributed import get_client # for multi-process operation

In [None]:
fx.run_experiments(design=design3, evaluator=get_client(n_workers=6))