# Getting Started with ActivitySim

This getting started guide is a [Jupyter notebook](https://jupyter.org/), an interactive Python 3 environment. It describes how to set up, run, and begin to analyze the results of ActivitySim modeling scenarios. Users are assumed to be familiar with the basic concepts of activity-based modeling.  This tutorial covers:

*   Installation and setup
*   Setting up and running a base model
*   Inputs and outputs
*   Setting up and running an alternative scenario
*   Comparing results
*   Next steps and further reading

This notebook depends on [Anaconda Python 3 64-bit](https://www.anaconda.com/distribution/).

# Install ActivitySim

The first step is to install ActivitySim from [PyPI](https://pypi.org/project/activitysim/) (the Python Package Index).  This also installs dependent packages such as [tables](https://pypi.org/project/tables/) for reading/writing HDF5 files, [openmatrix](https://pypi.org/project/OpenMatrix/) for reading/writing OMX matrices, and [pyyaml](https://pypi.org/project/PyYAML/) for YAML settings files.

In [None]:
!pip install activitysim
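
Once the install has finished, it can be useful to confirm which versions were picked up. The sketch below uses only the standard library's `importlib.metadata`; the distribution names are taken from the PyPI links above, and the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI (see the links above).
for name in ("activitysim", "tables", "OpenMatrix", "PyYAML"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

A `not installed` line for any of these suggests the `pip install` step did not complete in the active environment.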

# Creating an Example Setup

The example is included in the package and can be copied to a user-defined location using the package's command line interface.  The example includes all model steps.  The command below copies the example_mtc example to a new folder named example, then changes into that folder so the model can be run from there.

In [1]:
!activitysim create -e example_mtc -d example
%cd example

copying data ...
copying configs ...
copying configs_mp ...
copying output ...
copying README.MD ...
copied! new project files are in C:\projects\development\activitysim\activitysim\examples\example_mtc\notebooks\example
C:\projects\development\activitysim\activitysim\examples\example_mtc\notebooks\example


# Run the Example

The command below runs the example, which takes a few minutes.  The example consists of 100 synthetic households and the first 25 zones in the example model region.  The full example (**example_mtc_full**) can be created and downloaded from the [activitysim resources](https://github.com/RSGInc/activitysim_resources) repository using the same `activitysim create` command shown above.  As the model runs, it logs information to the screen.

To run the example, use activitysim's built-in run command.  As shown in the script help, the default settings assume a configs, data, and output folder in the current directory.
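
The same flags can also be assembled from Python, which is convenient when a script launches several runs. This is a minimal sketch: the `-c`, `-d`, and `-o` flags are the ones used in this notebook, while `build_run_command` is a hypothetical helper name, not part of ActivitySim.

```python
import shlex


def build_run_command(configs, data="data", output="output"):
    """Build the argument list for `activitysim run`.

    `configs` is a list of config folders; each one gets its own -c flag,
    in the order given.
    """
    cmd = ["activitysim", "run"]
    for config_dir in configs:
        cmd += ["-c", config_dir]
    cmd += ["-d", data, "-o", output]
    return cmd


cmd = build_run_command(["configs"])
print(shlex.join(cmd))
# To actually launch the run from a script:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell quoting issues if a folder name contains spaces.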

In [2]:
!activitysim run -c configs -d data -o output

Configured logging using basicConfig
INFO:activitysim:Configured logging using basicConfig
INFO - Read logging configuration from: configs\logging.yaml
INFO - SETTING configs_dir: ['configs']
INFO - SETTING settings_file_name: settings.yaml
INFO - SETTING data_dir: ['data']
INFO - SETTING output_dir: output
INFO - SETTING households_sample_size: 100
INFO - SETTING chunk_size: 0
INFO - SETTING chunk_method: hybrid_uss
INFO - SETTING chunk_training_mode: training
INFO - SETTING multiprocess: None
INFO - SETTING num_processes: None
INFO - SETTING resume_after: None
INFO - SETTING trace_hh_id: 982875
INFO - ENV MKL_NUM_THREADS: None
INFO - ENV OMP_NUM_THREADS: None
INFO - ENV OPENBLAS_NUM_THREADS: None
INFO - NUMPY blas_info libraries: ['cblas', 'blas', 'cblas', 'blas', 'cblas', 'blas']
INFO - NUMPY blas_opt_info libraries: ['cblas', 'blas', 'cblas', 'blas', 'cblas', 'blas']
INFO - NUMPY lapack_info libraries: ['lapack', 'blas', 'lapack', 'blas']
INFO - NUMPY lapack_opt_info libraries: ['l

15.316797    1
15.641096    1
13.574187    1
15.586471    1
15.430079    1
15.412180    1
14.011234    1
15.522393    1
15.725096    1
13.564067    1
Name: logsum, dtype: int64
INFO - #run_model running step auto_ownership_simulate
INFO - Running auto_ownership_simulate with 100 households
INFO - auto_ownership_simulate.simple_simulate Running adaptive_chunked_choosers with 100 choosers
INFO - Running chunk 1 of 1 with 100 of 100 choosers
INFO - auto_ownership top 10 value counts:
0    60
1    40
Name: auto_ownership, dtype: int64
INFO - #run_model running step free_parking
INFO - Running free_parking with 97 persons
INFO - free_parking.simple_simulate Running adaptive_chunked_choosers with 97 choosers
INFO - Running chunk 1 of 1 with 97 of 97 choosers
INFO - free_parking top 10 value counts:
False    163
True       4
Name: free_parking_at_work, dtype: int64
INFO - #run_model running step cdap_simulate
INFO - Pre-building cdap specs
INFO - Time to execute build_cdap_spec hh_size 2 : 0.

INFO - Running non_mandatory_tour_destination.shopping.logsums with 366 rows
INFO - non_mandatory_tour_destination.shopping.logsums.compute_logsums Running adaptive_chunked_choosers with 366 choosers
INFO - Running chunk 1 of 1 with 366 of 366 choosers
INFO - Running tour_destination_simulate with 30 persons
INFO - non_mandatory_tour_destination.shopping.simulate.interaction_sample_simulate Running adaptive_chunked_choosers_and_alts with 30 choosers and 366 alternatives
INFO - Running chunk 1 of 1 with 30 of 30 choosers
INFO - Running eval_interaction_utilities on 366 rows
INFO - running non_mandatory_tour_destination.othmaint.sample with 20 tours
INFO - non_mandatory_tour_destination.othmaint.sample.interaction_sample Running adaptive_chunked_choosers with 20 choosers
INFO - Running chunk 1 of 1 with 20 of 20 choosers
INFO - Running eval_interaction_utilities on 500 rows
INFO - Running non_mandatory_tour_destination.othmaint.logsums with 286 rows
INFO - non_mandatory_tour_destination.

# Inputs and Outputs Overview

An ActivitySim model requires:

*  Configs: settings, model step expressions files, etc.
  * settings.yaml - main settings file for running the model
  * network_los.yaml - network level-of-service (skims) settings file
  * [model].yaml - configuration file for the model step (such as auto ownership)
  * [model].csv - expressions file for the model step
*  Data: input data tables and skims
  * land_use.csv - zone data file
  * households.csv - synthetic households
  * persons.csv - synthetic persons
  * skims.omx - all skims in one open matrix file
*  Output: output data tables, tracing info, etc.
  * pipeline.h5 - data pipeline database file (all tables at each model step)
  * final_[table].csv - final household, person, tour, trip CSV tables
  * activitysim.log - console log file
  * trace.[model].csv - trace calculations for select households
*  simulation.py: main script to run the model
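
Before a run, a quick existence check on the files named above can catch a misplaced folder early. The sketch below is illustrative only: the file list is copied from the bullets above, a real model will need more files (one `[model].yaml` and `[model].csv` per model step), and `missing_inputs` is a made-up helper name.

```python
from pathlib import Path

# Files named in the list above, relative to the example folder.
EXPECTED_FILES = [
    "configs/settings.yaml",
    "configs/network_los.yaml",
    "data/land_use.csv",
    "data/households.csv",
    "data/persons.csv",
    "data/skims.omx",
]


def missing_inputs(project_dir=".", expected=EXPECTED_FILES):
    """Return the expected files that are not present under project_dir."""
    root = Path(project_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


print(missing_inputs("."))
```

An empty list means every listed file was found under the given folder.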

Run the command below to list the example folder contents.

In [3]:
import os
for root, dirs, files in os.walk(".", topdown=False):
    for name in files:
        print(os.path.join(root, name))
    for name in dirs:
        print(os.path.join(root, name))

.\configs\accessibility.csv
.\configs\accessibility.yaml
.\configs\annotate_households.csv
.\configs\annotate_households_cdap.csv
.\configs\annotate_households_workplace.csv
.\configs\annotate_landuse.csv
.\configs\annotate_persons.csv
.\configs\annotate_persons_after_hh.csv
.\configs\annotate_persons_cdap.csv
.\configs\annotate_persons_jtp.csv
.\configs\annotate_persons_mtf.csv
.\configs\annotate_persons_nmtf.csv
.\configs\annotate_persons_school.csv
.\configs\annotate_persons_workplace.csv
.\configs\atwork_subtour_destination.csv
.\configs\atwork_subtour_destination.yaml
.\configs\atwork_subtour_destination_coefficients.csv
.\configs\atwork_subtour_destination_sample.csv
.\configs\atwork_subtour_frequency.csv
.\configs\atwork_subtour_frequency.yaml
.\configs\atwork_subtour_frequency_alternatives.csv
.\configs\atwork_subtour_frequency_annotate_tours_preprocessor.csv
.\configs\atwork_subtour_frequency_coefficients.csv
.\configs\auto_ownership.csv
.\configs\auto_ownership.yaml
.\configs

# Inputs

Run the commands below to: 
* Load required Python libraries for reading data
* Display the settings.yaml, including the list of `models` to run
* Display the land_use, households, and persons tables
* Display the skims

In [4]:
print("Load libraries.")
import pandas as pd
import openmatrix as omx
import yaml
import glob

Load libraries.


In [5]:
print("Display the settings file.\n")

with open('configs/settings.yaml') as file:
    # safe_load is sufficient for plain settings files and avoids constructing arbitrary objects
    file_contents = yaml.safe_load(file)
    print(yaml.dump(file_contents))

Display the settings file.

cbd_threshold: 2
check_for_variability: false
checkpoints: true
chunk_method: hybrid_uss
chunk_size: 0
chunk_training_mode: training
default_initial_rows_per_chunk: 500
distributed_vot_mu: 0.684
distributed_vot_sigma: 0.85
household_median_value_of_time:
  1: 6.01
  2: 8.81
  3: 10.44
  4: 12.86
households_sample_size: 100
input_table_list:
- filename: households.csv
  index_col: household_id
  keep_columns:
  - home_zone_id
  - income
  - hhsize
  - HHT
  - auto_ownership
  - num_workers
  rename_columns:
    HHID: household_id
    PERSONS: hhsize
    TAZ: home_zone_id
    VEHICL: auto_ownership
    workers: num_workers
  tablename: households
- filename: persons.csv
  index_col: person_id
  keep_columns:
  - household_id
  - age
  - PNUM
  - sex
  - pemploy
  - pstudent
  - ptype
  rename_columns:
    PERID: person_id
  tablename: persons
- filename: land_use.csv
  index_col: zone_id
  keep_columns:
  - DISTRICT
  - SD
  - county_id
  - TOTHH
  - TOTPOP
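
The `rename_columns` and `keep_columns` entries above can be read as a two-step filter on each input table. The sketch below applies them to one household record from the example data. It assumes renaming happens before the keep filter (the `keep_columns` entries use the renamed names, which suggests this order) and that the index column is always retained; both are assumptions, and `apply_table_settings` is a hypothetical name, not ActivitySim's API.

```python
def apply_table_settings(row, rename_columns=None, keep_columns=None, index_col=None):
    """Rename the keys of one record, then keep only the requested columns."""
    rename_columns = rename_columns or {}
    renamed = {rename_columns.get(key, key): value for key, value in row.items()}
    if keep_columns is None:
        return renamed
    wanted = set(keep_columns)
    if index_col is not None:
        wanted.add(index_col)  # assumption: the index column is always retained
    return {key: value for key, value in renamed.items() if key in wanted}


# A subset of the first households.csv record shown later in this notebook.
raw = {"HHID": 2717868, "TAZ": 25, "SERIALNO": 2715386,
       "income": 361000, "PERSONS": 2, "HHT": 1}

household = apply_table_settings(
    raw,
    rename_columns={"HHID": "household_id", "PERSONS": "hhsize", "TAZ": "home_zone_id",
                    "VEHICL": "auto_ownership", "workers": "num_workers"},
    keep_columns=["home_zone_id", "income", "hhsize", "HHT",
                  "auto_ownership", "num_workers"],
    index_col="household_id",
)
print(household)
```

Columns such as `SERIALNO` that are neither renamed nor listed in `keep_columns` are dropped in this sketch.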
  

In [6]:
print("Display the network_los file.\n")

with open('configs/network_los.yaml') as file:
    # safe_load is sufficient for plain settings files and avoids constructing arbitrary objects
    file_contents = yaml.safe_load(file)
    print(yaml.dump(file_contents))

Display the network_los file.

read_skim_cache: false
skim_time_periods:
  labels:
  - EA
  - EA
  - AM
  - MD
  - PM
  - EV
  period_minutes: 60
  periods:
  - 0
  - 3
  - 5
  - 9
  - 14
  - 18
  - 24
  time_window: 1440
taz_skims: skims.omx
write_skim_cache: true
zone_system: 1
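
The `periods` breakpoints and `labels` above pair up as consecutive bins: seven breakpoints define six bins, one per label. The sketch below looks up the label for a time period value. How values exactly on a breakpoint are treated is an assumption here (each bin is taken to exclude its lower edge and include its upper edge), so check the ActivitySim documentation before relying on it; `period_label` is a made-up helper name.

```python
from bisect import bisect_left

# Values copied from the network_los.yaml output above.
PERIODS = [0, 3, 5, 9, 14, 18, 24]
LABELS = ["EA", "EA", "AM", "MD", "PM", "EV"]


def period_label(time_period, periods=PERIODS, labels=LABELS):
    """Return the skim time period label for a time period value.

    Assumption: bins are open on the left and closed on the right,
    so a value of 5 falls in the bin ending at 5, not the one starting at 5.
    """
    if not periods[0] < time_period <= periods[-1]:
        raise ValueError(f"time period {time_period} is outside the configured range")
    return labels[bisect_left(periods, time_period) - 1]


for hour in (3, 5, 6, 14, 15, 24):
    print(hour, period_label(hour))
```

The label returned here is the suffix used in skim names such as `DRV_COM_WLK_BOARDS__AM`.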



In [7]:
print("Input land_use.  Primary key: TAZ.  Required additional fields depend on the downstream submodels (and expression files).")
pd.read_csv("data/land_use.csv")

Input land_use.  Primary key: TAZ.  Required additional fields depend on the downstream submodels (and expression files).


Unnamed: 0,TAZ,DISTRICT,SD,COUNTY,TOTHH,HHPOP,TOTPOP,EMPRES,SFDU,MFDU,...,area_type,HSENROLL,COLLFTE,COLLPTE,TOPOLOGY,TERMINAL,ZERO,hhlds,sftaz,gqpop
0,1,1,1,1,46,74,82,37,1,60,...,0,0.0,0.0,0.0,3,5.89564,0,46,1,8
1,2,1,1,1,134,214,240,107,5,147,...,0,0.0,0.0,0.0,1,5.84871,0,134,2,26
2,3,1,1,1,267,427,476,214,9,285,...,0,0.0,0.0,0.0,1,5.53231,0,267,3,49
3,4,1,1,1,151,239,253,117,6,210,...,0,0.0,0.0,0.0,2,5.6433,0,151,4,14
4,5,1,1,1,611,974,1069,476,22,671,...,0,0.0,72.14684,0.0,1,5.52555,0,611,5,95
5,6,1,1,1,2240,3311,3963,2052,0,2406,...,0,0.0,0.0,0.0,1,5.00004,0,2240,6,652
6,7,1,1,1,3762,5561,6032,3375,0,4174,...,0,0.0,0.0,0.0,1,5.35435,0,3762,7,471
7,8,1,1,1,4582,7565,9907,3594,19,4898,...,0,0.0,0.0,0.0,2,4.64648,0,4582,8,2342
8,9,1,1,1,5545,9494,10171,4672,35,6032,...,0,26.92893,2035.58118,20.60887,2,5.22542,0,5545,9,677
9,10,1,1,1,5344,9205,9308,5137,5,5663,...,0,0.0,690.54974,0.0,3,4.73802,0,5344,10,103


In [8]:
print("Input households.  Primary key: HHID.  Foreign key: TAZ.  Required additional fields depend on the downstream submodels (and expression files).")
pd.read_csv("data/households.csv")

Input households.  Primary key: HHID.  Foreign key: TAZ.  Required additional fields depend on the downstream submodels (and expression files).


Unnamed: 0,HHID,TAZ,SERIALNO,PUMA5,income,PERSONS,HHT,UNITTYPE,NOC,BLDGSZ,...,hschpred,hschdriv,htypdwel,hownrent,hadnwst,hadwpst,hadkids,bucketBin,originalPUMA,hmultiunit
0,2717868,25,2715386,2202,361000,2,1,0,0,9,...,0,0,2,1,0,0,0,3,2202,1
1,763899,6,5360279,2203,59220,1,4,0,0,9,...,0,0,2,2,0,0,0,4,2203,1
2,2222791,9,77132,2203,197000,2,2,0,0,9,...,0,0,2,1,0,0,1,5,2203,1
3,112477,17,3286812,2203,2200,1,6,0,0,8,...,0,0,2,2,0,0,0,7,2203,1
4,370491,21,6887183,2203,16500,3,1,0,1,8,...,1,0,2,2,0,0,0,7,2203,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4995,109218,10,3592966,2203,15000,1,4,0,0,8,...,0,0,2,2,0,0,0,7,2203,1
4996,570708,23,2418140,2202,13100,1,6,0,0,9,...,0,0,2,2,0,0,0,6,2202,1
4997,2762199,21,4016973,2203,0,1,0,2,0,0,...,0,0,2,2,0,0,0,2,2203,1
4998,2049372,18,965334,2203,103000,1,4,0,0,9,...,0,0,2,2,0,0,0,9,2203,1


In [9]:
print("Input persons.  Primary key: PERID.  Foreign key: household_id.  Required additional fields depend on the downstream submodels (and expression files).")
pd.read_csv("data/persons.csv")

Input persons.  Primary key: PERID.  Foreign key: household_id.  Required additional fields depend on the downstream submodels (and expression files).


Unnamed: 0,PERID,household_id,age,RELATE,ESR,GRADE,PNUM,PAUG,DDP,sex,WEEKS,HOURS,MSP,POVERTY,EARNS,pagecat,pemploy,pstudent,ptype,padkid
0,25671,25671,47,1,6,0,1,0,0,1,0,0,6,39,0,6,3,3,4,2
1,25675,25675,27,1,6,7,1,0,0,2,52,40,2,84,7200,5,3,2,3,2
2,25678,25678,30,1,6,0,1,0,0,2,0,0,6,84,0,5,3,3,4,2
3,25683,25683,23,1,6,0,1,0,0,1,0,0,6,1,0,4,3,3,4,2
4,25684,25684,52,1,6,0,1,0,0,1,0,0,6,94,0,7,3,3,4,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8207,7554848,2863513,68,22,6,0,1,0,0,1,0,0,4,0,0,8,3,3,5,2
8208,7554855,2863520,68,22,6,0,1,0,0,1,0,0,4,0,0,8,3,3,5,2
8209,7554859,2863524,93,22,6,0,1,0,0,2,0,0,2,0,0,9,3,3,5,2
8210,7554887,2863552,76,22,6,0,1,0,0,2,0,0,2,0,0,8,3,3,5,2
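
The primary and foreign keys noted in the cells above can be checked before a run, since a household pointing at a missing zone (or a person pointing at a missing household) may only surface as an error in a later model step. This is a standard-library sketch under the column names shown above; `column_values` and `orphan_keys` are hypothetical helper names.

```python
import csv


def column_values(path, column):
    """Return the set of values found in one column of a CSV file."""
    with open(path, newline="") as f:
        return {row[column] for row in csv.DictReader(f)}


def orphan_keys(child_path, child_col, parent_path, parent_col):
    """Return child foreign-key values that have no matching parent key."""
    children = column_values(child_path, child_col)
    parents = column_values(parent_path, parent_col)
    return sorted(children - parents)


# Example calls for the tables above (values are compared as strings):
#   orphan_keys("data/households.csv", "TAZ", "data/land_use.csv", "TAZ")
#   orphan_keys("data/persons.csv", "household_id", "data/households.csv", "HHID")
```

An empty list from both calls means every household has a zone and every person has a household.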


In [10]:
print("Skims.  All skims are input via one OMX file.  Required skims depend on the downstream submodels (and expression files).\n")
print(omx.open_file("data/skims.omx"))

Skims.  All skims are input via one OMX file.  Required skims depend on the downstream submodels (and expression files).

data/skims.omx (File) ''
Last modif.: 'Thu Apr 22 08:35:45 2021'
Object Tree: 
/ (RootGroup) ''
/data (Group) ''
/data/DIST (CArray(25, 25), shuffle, zlib(1)) ''
/data/DISTBIKE (CArray(25, 25), shuffle, zlib(1)) ''
/data/DISTWALK (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_BOARDS__AM (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_BOARDS__EA (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_BOARDS__EV (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_BOARDS__MD (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_BOARDS__PM (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_DDIST__AM (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_DDIST__EA (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_DDIST__EV (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_COM_WLK_DDIST__MD (CArray(25, 25), shuffle, zlib(1)) ''
/data/DRV_C

# Outputs

Run the commands below to: 
* Display the output household and person tables
* Display the output tour and trip tables

In [11]:
print("The output pipeline contains the state of each table after each model step.")
pipeline = pd.HDFStore('output/pipeline.h5', mode='r')  # open read-only; call pipeline.close() when finished
pipeline.keys()

The output pipeline contains the state of each table after each model step.


['/checkpoints',
 '/workplace_modeled_size/workplace_location',
 '/workplace_destination_size/initialize_households',
 '/trips/stop_frequency',
 '/trips/trip_destination',
 '/trips/trip_mode_choice',
 '/trips/trip_purpose',
 '/trips/trip_purpose_and_destination',
 '/trips/trip_scheduling',
 '/tours/atwork_subtour_destination',
 '/tours/atwork_subtour_frequency',
 '/tours/atwork_subtour_mode_choice',
 '/tours/atwork_subtour_scheduling',
 '/tours/joint_tour_composition',
 '/tours/joint_tour_destination',
 '/tours/joint_tour_frequency',
 '/tours/joint_tour_participation',
 '/tours/joint_tour_scheduling',
 '/tours/mandatory_tour_frequency',
 '/tours/mandatory_tour_scheduling',
 '/tours/non_mandatory_tour_destination',
 '/tours/non_mandatory_tour_frequency',
 '/tours/non_mandatory_tour_scheduling',
 '/tours/stop_frequency',
 '/tours/tour_mode_choice_simulate',
 '/school_modeled_size/school_location',
 '/school_destination_size/initialize_households',
 '/persons/cdap_simulate',
 '/persons/fr
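
The keys above appear to follow a `/table/step` pattern, consistent with the pipeline storing each table after each model step. Under that reading, the sketch below groups keys by table so it is easy to see which steps wrote a given table. The sample keys are copied from the output above; `group_pipeline_keys` is a made-up helper, and keys without two parts (such as `/checkpoints`) are skipped.

```python
from collections import defaultdict


def group_pipeline_keys(keys):
    """Group '/table/step' keys into {table: [step, ...]}."""
    tables = defaultdict(list)
    for key in keys:
        parts = key.strip("/").split("/")
        if len(parts) == 2:  # skip keys such as '/checkpoints'
            table, step = parts
            tables[table].append(step)
    return dict(tables)


sample_keys = [
    "/checkpoints",
    "/trips/stop_frequency",
    "/trips/trip_destination",
    "/tours/stop_frequency",
    "/persons/cdap_simulate",
]
for table, steps in group_pipeline_keys(sample_keys).items():
    print(table, steps)
```

In the notebook, passing `pipeline.keys()` instead of `sample_keys` would list the steps available for every table.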

In [12]:
print("Households table after joint tour frequency, which contains several calculated fields.")
pipeline['/households/joint_tour_frequency'] #watch out for key changes if not running all models

Households table after joint tour frequency, which contains several calculated fields.


Unnamed: 0_level_0,home_zone_id,income,hhsize,HHT,auto_ownership,num_workers,sample_rate,income_in_thousands,income_segment,median_value_of_time,...,hh_work_auto_savings_ratio,num_under16_not_at_school,num_travel_active,num_travel_active_adults,num_travel_active_preschoolers,num_travel_active_children,num_travel_active_non_preschoolers,participates_in_jtf_model,joint_tour_frequency,num_hh_joint_tours
household_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
982875,16,30900,2,5,1,2,0.02,30.90,2,8.81,...,0.399721,0,2,2,0,0,2,True,0_tours,0
1810015,16,99700,9,2,1,4,0.02,99.70,3,10.44,...,0.711955,0,7,6,1,1,6,True,0_tours,0
1099626,20,58160,3,1,1,1,0.02,58.16,2,8.81,...,0.264600,0,3,2,1,1,2,True,0_tours,0
763879,6,59220,1,4,1,0,0.02,59.22,2,8.81,...,0.000000,0,1,1,0,0,1,False,0_tours,0
824207,18,51000,1,4,0,1,0.02,51.00,2,8.81,...,0.187061,0,1,1,0,0,1,False,0_tours,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
287819,9,6500,2,3,1,0,0.02,6.50,1,6.01,...,0.000000,0,2,1,0,1,2,True,0_tours,0
2832313,10,0,1,0,0,0,0.02,0.00,1,6.01,...,0.000000,0,1,1,0,0,1,False,0_tours,0
2222549,7,112500,2,5,0,2,0.02,112.50,4,12.86,...,0.243357,0,2,2,0,0,2,True,0_tours,0
2048809,11,145450,1,4,1,1,0.02,145.45,4,12.86,...,0.138205,0,1,1,0,0,1,False,0_tours,0


In [13]:
print("Final output households table written to CSV, which is the same as the table in the pipeline.")
pd.read_csv("output/final_households.csv")

Final output households table written to CSV, which is the same as the table in the pipeline.


Unnamed: 0,household_id,home_zone_id,income,hhsize,HHT,auto_ownership,num_workers,sample_rate,income_in_thousands,income_segment,...,hh_work_auto_savings_ratio,num_under16_not_at_school,num_travel_active,num_travel_active_adults,num_travel_active_preschoolers,num_travel_active_children,num_travel_active_non_preschoolers,participates_in_jtf_model,joint_tour_frequency,num_hh_joint_tours
0,982875,16,30900,2,5,1,2,0.02,30.90,2,...,0.399721,0,2,2,0,0,2,True,0_tours,0
1,1810015,16,99700,9,2,1,4,0.02,99.70,3,...,0.711955,0,7,6,1,1,6,True,0_tours,0
2,1099626,20,58160,3,1,1,1,0.02,58.16,2,...,0.264600,0,3,2,1,1,2,True,0_tours,0
3,763879,6,59220,1,4,1,0,0.02,59.22,2,...,0.000000,0,1,1,0,0,1,False,0_tours,0
4,824207,18,51000,1,4,0,1,0.02,51.00,2,...,0.187061,0,1,1,0,0,1,False,0_tours,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,287819,9,6500,2,3,1,0,0.02,6.50,1,...,0.000000,0,2,1,0,1,2,True,0_tours,0
96,2832313,10,0,1,0,0,0,0.02,0.00,1,...,0.000000,0,1,1,0,0,1,False,0_tours,0
97,2222549,7,112500,2,5,0,2,0.02,112.50,4,...,0.243358,0,2,2,0,0,2,True,0_tours,0
98,2048809,11,145450,1,4,1,1,0.02,145.45,4,...,0.138205,0,1,1,0,0,1,False,0_tours,0


In [14]:
print("Final output persons table written to CSV, which is the same as the table in the pipeline.")
pd.read_csv("output/final_persons.csv")

Final output persons table written to CSV, which is the same as the table in the pipeline.


Unnamed: 0,person_id,household_id,age,PNUM,sex,pemploy,pstudent,ptype,age_16_to_19,age_16_p,...,num_joint_tours,non_mandatory_tour_frequency,num_non_mand,num_escort_tours,num_eatout_tours,num_shop_tours,num_maint_tours,num_discr_tours,num_social_tours,num_non_escort_tours
0,26478,26478,46,1,1,3,3,4,False,True,...,0,12,2,0,1,0,1,0,0,2
1,26686,26686,39,1,1,3,3,4,False,True,...,0,12,2,0,1,0,1,0,0,2
2,26844,26844,51,1,1,3,3,4,False,True,...,0,2,1,0,0,0,0,0,1,1
3,27726,27726,52,1,1,3,3,4,False,True,...,0,1,1,0,0,0,0,1,0,1
4,27748,27748,57,1,2,3,3,4,False,True,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
162,7523517,2832182,87,1,2,3,3,5,False,True,...,0,1,1,0,0,0,0,1,0,1
163,7523648,2832313,78,1,2,3,3,5,False,True,...,0,1,1,0,0,0,0,1,0,1
164,7523764,2832429,93,1,2,3,3,5,False,True,...,0,0,0,0,0,0,0,0,0,0
165,7539466,2848131,38,1,2,3,2,3,False,True,...,0,16,1,0,0,1,0,0,0,1


In [15]:
print("Final output tours table written to CSV, which is the same as the table in the pipeline.  Joint tours are stored as one record.")
pd.read_csv("output/final_tours.csv")

Final output tours table written to CSV, which is the same as the table in the pipeline.  Joint tours are stored as one record.


Unnamed: 0,tour_id,person_id,tour_type,tour_type_count,tour_type_num,tour_num,tour_count,tour_category,number_of_participants,destination,...,end,duration,composition,destination_logsum,tour_mode,mode_choice_logsum,atwork_subtour_frequency,parent_tour_id,stop_frequency,primary_purpose
0,10828426,264107,work,1,1,1,1,mandatory,1,24.0,...,19.0,12.0,,,WALK_LRF,5.706465,no_subtours,,0out_0in,work
1,10834207,264248,work,1,1,1,1,mandatory,1,22.0,...,19.0,12.0,,,WALK_LRF,5.740181,no_subtours,,3out_0in,work
2,13271288,323689,work,1,1,1,1,mandatory,1,1.0,...,18.0,7.0,,,WALK_LRF,5.762750,no_subtours,,0out_0in,work
3,13286130,324051,work,1,1,1,1,mandatory,1,13.0,...,18.0,12.0,,,WALK,1.894204,no_subtours,,0out_0in,work
4,13286171,324052,work,1,1,1,1,mandatory,1,2.0,...,17.0,10.0,,,WALK_LOC,2.055649,eat,,0out_0in,work
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
197,143309067,3495343,eat,1,1,1,1,atwork,1,16.0,...,14.0,0.0,,15.544104,WALK,4.189554,,143309102.0,3out_0in,atwork
198,171036547,4171623,eat,1,1,1,1,atwork,1,7.0,...,10.0,0.0,,12.963491,WALK,-0.212087,,171036582.0,0out_1in,atwork
199,220897896,5387753,maint,1,1,1,1,atwork,1,15.0,...,14.0,1.0,,15.821604,WALK,5.934528,,220897912.0,0out_0in,atwork
200,220958270,5389226,eat,1,1,1,1,atwork,1,2.0,...,13.0,0.0,,15.712982,WALK,6.339068,,220958305.0,0out_0in,atwork


In [16]:
print("Final output trips table written to CSV, which is the same as the table in the pipeline.  Joint trips are stored as one record.")
pd.read_csv("output/final_trips.csv")

Final output trips table written to CSV, which is the same as the table in the pipeline.  Joint trips are stored as one record.


Unnamed: 0,trip_id,person_id,household_id,tour_id,primary_purpose,trip_num,outbound,trip_count,purpose,destination,origin,destination_logsum,depart,trip_mode,mode_choice_logsum
0,8684833,26478,26478,1085604,eatout,1,True,1,eatout,13,8,,11.0,WALK,-1.171760
1,8684837,26478,26478,1085604,eatout,1,False,1,Home,8,13,,11.0,WALK,-1.238719
2,8685009,26478,26478,1085626,othmaint,1,True,1,othmaint,10,8,,12.0,BIKE,6.198626
3,8685013,26478,26478,1085626,othmaint,1,False,1,Home,8,10,,13.0,BIKE,6.175681
4,8753057,26686,26686,1094132,eatout,1,True,1,eatout,5,8,,19.0,WALK,4.457539
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
477,2472945113,7539466,2848131,309118139,shopping,1,True,1,shopping,8,3,,18.0,WALK_LOC,12.537675
478,2472945117,7539466,2848131,309118139,shopping,1,False,2,shopping,25,8,56.842247,21.0,WALK_LOC,11.880804
479,2472945118,7539466,2848131,309118139,shopping,2,False,2,Home,3,25,,22.0,WALK,13.710030
480,2473024473,7539708,2848373,309128059,univ,1,True,1,univ,13,18,,16.0,WALK_LOC,-0.530696


# Other Notable Outputs

In [17]:
print("Final output accessibility table written to CSV.")
pd.read_csv("output/final_accessibility.csv")

Final output accessibility table written to CSV.


Unnamed: 0,zone_id,auPkRetail,auPkTotal,auOpRetail,auOpTotal,trPkRetail,trPkTotal,trOpRetail,trOpTotal,nmRetail,nmTotal
0,1,9.316494,12.615176,9.307437,12.607849,7.764264,11.145248,7.693086,11.037286,8.137361,11.726242
1,2,9.316898,12.613461,9.304627,12.604209,7.511301,10.950046,7.42706,10.763102,8.142717,11.724186
2,3,9.293217,12.580014,9.286242,12.574902,7.340975,10.787608,7.252678,10.574954,8.050369,11.478913
3,4,9.357349,12.630894,9.348249,12.623586,7.873327,11.224171,7.814365,11.135416,8.371197,11.775231
4,5,9.343551,12.585069,9.333262,12.574554,7.589356,11.08255,7.549557,11.027965,8.318059,11.431764
5,6,9.27135,12.523449,9.265762,12.519698,7.313872,10.504311,7.068341,10.25179,7.838241,11.023738
6,7,9.293194,12.528401,9.286373,12.520416,7.64191,10.805003,7.607878,10.75251,8.016915,11.108805
7,8,9.267844,12.497146,9.262133,12.489886,7.546934,10.834136,7.501424,10.77932,7.981951,11.052153
8,9,9.189503,12.426036,9.184035,12.41546,7.188751,10.303186,7.149057,10.26061,7.41563,10.758663
9,10,9.186004,12.40389,9.180762,12.396344,7.379336,10.548675,7.306522,10.495922,7.567826,10.694411


In [18]:
print("Joint tour participants table, which contains the person ids of joint tour participants.")
pipeline['joint_tour_participants/joint_tour_participation']

Joint tour participants table, which contains the person ids of joint tour participants.


Unnamed: 0_level_0,tour_id,household_id,person_id,participant_num
participant_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
22095827901,220958279,2223759,5389226,1
22095827902,220958279,2223759,5389227,2
10079851903,100798519,1173905,2458502,1
10079851904,100798519,1173905,2458503,2
13072777702,130727777,1402945,3188483,1
13072777703,130727777,1402945,3188484,2
13072777704,130727777,1402945,3188485,3


In [19]:
print("Destination choice sample logsums table for school location if want_dest_choice_sample_tables=True.")
if '/school_location_sample/school_location' in pipeline:
    # a bare expression inside an if block is not rendered, so display it explicitly
    display(pipeline['/school_location_sample/school_location'])

Destination choice sample logsums table for school location if want_dest_choice_sample_tables=True.
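
The `pipeline` store opened above stays open until it is closed explicitly. An open handle can keep `output/pipeline.h5` locked on Windows, which is one plausible reading of the `WinError 32` message in the multiprocessor log further down. The sketch below is a small guard, assuming the object exposes the `is_open` attribute and `close()` method that pandas' `HDFStore` provides; `close_if_open` is a name made up here.

```python
def close_if_open(store):
    """Close an HDFStore-like object if it reports itself as open.

    Returns True when a close was performed, False otherwise.
    """
    if getattr(store, "is_open", False):
        store.close()
        return True
    return False


# In this notebook the call would be:
#   close_if_open(pipeline)
```

Calling it once all pipeline tables have been inspected releases the file before the next model run.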


# Trip Matrices

A **write_trip_matrices** step at the end of the model adds boolean indicator columns to the trip table to assign each trip to a trip matrix, then aggregates the trip counts and writes origin-destination (OD) matrices to OMX (open matrix) files.  The coding of trips into trip matrices is done via annotation expressions.
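
The two stages described above (flag each trip with indicator columns, then aggregate by origin and destination) can be sketched in plain Python. The trips, zone list, and indicator rules below are made up for illustration, and the helper names are hypothetical; this is not ActivitySim's implementation.

```python
from collections import Counter

# Illustrative trips only; zone ids and periods are made up for this sketch.
trips = [
    {"origin": 8, "destination": 13, "period": "MD"},
    {"origin": 13, "destination": 8, "period": "MD"},
    {"origin": 8, "destination": 13, "period": "MD"},
    {"origin": 8, "destination": 5, "period": "EV"},
]

# Hypothetical annotation rules: one boolean indicator per output matrix.
indicators = {
    "trips_md": lambda trip: trip["period"] == "MD",
    "trips_ev": lambda trip: trip["period"] == "EV",
}


def annotate(trips, indicators):
    """Add one boolean column per indicator rule to every trip."""
    return [{**trip, **{name: rule(trip) for name, rule in indicators.items()}}
            for trip in trips]


def od_matrix(annotated, indicator, zones):
    """Count flagged trips by (origin, destination) and lay them out as rows."""
    counts = Counter((t["origin"], t["destination"]) for t in annotated if t[indicator])
    return [[counts[(o, d)] for d in zones] for o in zones]


zones = [5, 8, 13]
annotated = annotate(trips, indicators)
md = od_matrix(annotated, "trips_md", zones)
ev = od_matrix(annotated, "trips_ev", zones)
print(md)
print(ev)
```

Each resulting matrix has one row per origin zone and one column per destination zone, which is the shape written to a file such as `trips_md.omx`.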

In [20]:
print("trip matrices by time of day for assignment")
output_files = os.listdir("output")
for output_file in output_files:
    if "omx" in output_file:
        print(output_file)

trip matrices by time of day for assignment
trips_am.omx
trips_ea.omx
trips_ev.omx
trips_md.omx
trips_pm.omx


# Tracing Calculations

Tracing calculations is an important part of model setup and debugging.  Data issues, such as missing values in input data or incorrect submodel expression files, often do not reveal themselves until a downstream submodel fails.  There are two types of tracing in ActivitySim: household and origin-destination (OD) pair. If a household trace ID is specified via `trace_hh_id`, then ActivitySim will output a comprehensive set of trace files for all calculations for all household members.  These trace files are listed and explained below.

In [21]:
print("All trace files.\n")
glob.glob("output/trace/*.csv")


All trace files.



['output/trace\\atwork_subtour_destination.csv',
 'output/trace\\atwork_subtour_frequency.atwork_subtour_frequency_annotate_tours_preprocessor.csv',
 'output/trace\\atwork_subtour_frequency.atwork_subtour_frequency_annotate_tours_preprocessor_locals.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.choices.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.choosers.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.eval_utils.expression_values.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.eval_utils.expression_value_business1.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.eval_utils.expression_value_business2.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.eval_utils.expression_value_eat.csv',
 'output/trace\\atwork_subtour_frequency.simple_simulate.eval_mnl.eval_utils.expression_value_eat_business.csv',
 'output/trace\\atwork_subtour_frequency.simp

In [22]:
print("Trace files for auto ownership.\n")
glob.glob("output/trace/auto_ownership*.csv")

Trace files for auto ownership.



['output/trace\\auto_ownership.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.choices.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.choosers.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_values.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_value_cars0.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_value_cars1.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_value_cars2.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_value_cars3.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_value_cars4.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.probs.csv',
 'output/trace\\auto_ownership_simulate.simple_simulate.eval_mnl.rands.csv',
 'output/trace\\auto_owne

In [23]:
print("Trace chooser data for auto ownership.\n")
pd.read_csv("output/trace/auto_ownership_simulate.simple_simulate.eval_mnl.choosers.csv")

Trace chooser data for auto ownership.



Unnamed: 0,label,value
0,household_id,982875
1,home_zone_id,16
2,income,30900
3,hhsize,2
4,HHT,5
...,...,...
59,TERMINAL,4.75017
60,household_density,71.8980801556283
61,employment_density,273.02374467923295
62,density_index,56.91110757846511


In [24]:
print("Trace utility expression values for auto ownership.\n")
pd.read_csv("output/trace/auto_ownership_simulate.simple_simulate.eval_mnl.eval_utils.expression_values.csv")

Trace utility expression values for auto ownership.



Unnamed: 0,Expression,Label,0
0,num_drivers==2,util_drivers_2,1.0
1,num_drivers==3,util_drivers_3,0.0
2,num_drivers>3,util_drivers_4_up,0.0
3,num_children_16_to_17,util_persons_16_17,0.0
4,num_college_age,util_persons_18_24,0.0
5,num_young_adults,util_persons_25_34,2.0
6,num_young_children>0,util_presence_children_0_4,0.0
7,(num_children_5_to_15+num_children_16_to_17)>0,util_presence_children_5_17,0.0
8,@df.num_workers.clip(upper=3),util_num_workers_clip_3,2.0
9,"@df.income_in_thousands.clip(0, 30)",util_hh_income_0_30k,30.0


In [25]:
print("Trace alternative total utilities for auto ownership.\n")
pd.read_csv("output/trace/auto_ownership_simulate.simple_simulate.eval_mnl.utilities.csv")

Trace alternative total utilities for auto ownership.



Unnamed: 0,alternative,utility
0,household_id,982875.0
1,cars0,0.0
2,cars1,-0.150164
3,cars2,-5.680333
4,cars3,-12.77977
5,cars4,-14.672508


In [26]:
print("Trace alternative probabilities for auto ownership.\n")
pd.read_csv("output/trace/auto_ownership_simulate.simple_simulate.eval_mnl.probs.csv")

Trace alternative probabilities for auto ownership.



Unnamed: 0,alternative,probability
0,household_id,982875.0
1,cars0,0.5364857
2,cars1,0.4616819
3,cars2,0.001830715
4,cars3,1.511383e-06
5,cars4,2.277031e-07


In [27]:
print("Trace random number for auto ownership.\n")
pd.read_csv("output/trace/auto_ownership_simulate.simple_simulate.eval_mnl.rands.csv")

Trace random number for auto ownership.



Unnamed: 0,household_id,rand
0,982875,0.746306


In [28]:
print("Trace choice for auto ownership.\n")
pd.read_csv("output/trace/auto_ownership_simulate.simple_simulate.eval_mnl.choices.csv")

Trace choice for auto ownership.



Unnamed: 0,household_id,auto_ownership
0,982875,1
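
The trace files above can be tied together by hand. The sketch below recomputes the probabilities from the traced utilities with the multinomial logit formula, then selects an alternative by comparing the traced random number against cumulative probabilities. The cumulative-draw selection rule and the mapping of `cars1` to the choice value `1` are assumptions made for this illustration; the function names are made up.

```python
import math

# Values copied from the trace output above for household 982875.
utilities = {
    "cars0": 0.0,
    "cars1": -0.150164,
    "cars2": -5.680333,
    "cars3": -12.77977,
    "cars4": -14.672508,
}
rand = 0.746306


def mnl_probabilities(utils):
    """Multinomial logit: exp(u_i) / sum_j exp(u_j)."""
    exp_utils = {alt: math.exp(u) for alt, u in utils.items()}
    total = sum(exp_utils.values())
    return {alt: e / total for alt, e in exp_utils.items()}


def choose(probs, draw):
    """Pick the first alternative whose cumulative probability exceeds the draw."""
    cumulative = 0.0
    for alt, p in probs.items():
        cumulative += p
        if draw < cumulative:
            return alt
    return alt  # guard against floating-point rounding at the upper end


probs = mnl_probabilities(utilities)
for alt, p in probs.items():
    print(f"{alt}: {p:.7g}")
print("choice:", choose(probs, rand))
```

Under these assumptions the recomputed probabilities should agree with the probs trace, and the draw of 0.746306 lands in the `cars1` band (cumulative range of roughly 0.536 to 0.998), consistent with the traced choice of 1.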


# Run the Multiprocessor Example

The command below runs the multiprocessor example, which takes a few minutes.  It uses settings inheritance to override settings in the configs folder with settings in the configs_mp folder.  This allows expression files and settings files to be reused across the single-process and multiprocess setups.  The multiprocess example uses the following additional settings:

```
num_processes: 2
chunk_size: 0

multiprocess_steps:
  - name: mp_initialize
    begin: initialize_landuse
  - name: mp_households
    begin: school_location
    slice:
      tables:
        - households
        - persons
  - name: mp_summarize
    begin: write_data_dictionary

```

In brief, `num_processes` specifies the number of processors to use and a `chunk_size` of `0` means ActivitySim is free to use all the available RAM if needed.  The `multiprocess_steps` specifies the beginning, middle, and end steps in multiprocessing.  The `mp_initialize` step is single processed because there is no `slice` setting.  It starts with the `initialize_landuse` submodel and runs until the submodel identified by the next multiprocess submodel starting point, `school_location`.  The `mp_households` step is multiprocessed and the households and persons tables are sliced and allocated to processes using the chunking settings.  The rest of the submodels are run multiprocessed until the final multiprocess step.  The `mp_summarize` step is single processed because there is no `slice` setting and it writes outputs.  See [multiprocessing](https://activitysim.github.io/activitysim/core.html#multiprocessing) and [chunk_size](https://activitysim.github.io/activitysim/core.html#chunk) for more information.  
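
The override behavior of settings inheritance can be pictured as a merge in which folders listed earlier on the command line win. The sketch below assumes a simple shallow merge of top-level keys, which may be simpler than what ActivitySim actually does; the dictionaries hold illustrative values, and `merge_settings` is a hypothetical name.

```python
def merge_settings(settings_by_priority):
    """Merge settings dicts; entries earlier in the list override later ones."""
    merged = {}
    for settings in reversed(settings_by_priority):
        merged.update(settings)
    return merged


# Illustrative values only, standing in for the two settings.yaml files.
configs_mp = {"multiprocess": True, "num_processes": 2, "chunk_size": 0}
configs = {"multiprocess": False, "households_sample_size": 100, "chunk_size": 0}

# Same order as `-c configs_mp -c configs` on the command line.
effective = merge_settings([configs_mp, configs])
print(effective)
```

Keys present only in the base folder (here `households_sample_size`) carry through unchanged, which is what lets the two setups share most of their configuration.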

In [29]:
!activitysim run -c configs_mp -c configs -d data -o output

Configured logging using basicConfig
INFO:activitysim:Configured logging using basicConfig
[WinError 32] The process cannot access the file because it is being used by another process: 'output\\pipeline.h5'
INFO - activitysim - Read logging configuration from: configs_mp\logging.yaml
INFO - activitysim.cli.run - SETTING configs_dir: ['configs_mp', 'configs']
INFO - activitysim.cli.run - SETTING settings_file_name: settings.yaml
INFO - activitysim.cli.run - SETTING data_dir: ['data']
INFO - activitysim.cli.run - SETTING output_dir: output
INFO - activitysim.cli.run - SETTING households_sample_size: 100
INFO - activitysim.cli.run - SETTING chunk_size: 0
INFO - activitysim.cli.run - SETTING chunk_method: hybrid_uss
INFO - activitysim.cli.run - SETTING chunk_training_mode: training
INFO - activitysim.cli.run - SETTING multiprocess: True
INFO - activitysim.cli.run - SETTING num_processes: 2
INFO - activitysim.cli.run - SETTING resume_after: None
INFO - activitysim.cli.run - SETTING trace_hh

INFO - activitysim.core.mp_tasks - mp_households_0 joint_tour_frequency : 1.845 seconds (0.0 minutes)
INFO - activitysim.core.mp_tasks - mp_households_0 joint_tour_composition : 1.016 seconds (0.0 minutes)
INFO - activitysim.core.mp_tasks - mp_households_0 joint_tour_participation : 1.752 seconds (0.0 minutes)
INFO - activitysim.core.mp_tasks - mp_households_1 joint_tour_destination : 9.291 seconds (0.2 minutes)
INFO - activitysim.core.mp_tasks - mp_households_0 joint_tour_destination : 4.38 seconds (0.1 minutes)
INFO - activitysim.core.mp_tasks - mp_households_1 joint_tour_scheduling : 3.463 seconds (0.1 minutes)
INFO - activitysim.core.mp_tasks - mp_households_0 joint_tour_scheduling : 3.254 seconds (0.1 minutes)
INFO - activitysim.core.mp_tasks - mp_households_1 non_mandatory_tour_frequency : 13.382 seconds (0.2 minutes)
INFO - activitysim.core.mp_tasks - mp_households_0 non_mandatory_tour_frequency : 12.86 seconds (0.2 minutes)
INFO - activitysim.core.mp_tasks - mp_households_0 non

# Next Steps and Further Reading

For further information on the software, management consortium, and activity-based models in general, see the resources below.

* ActivitySim
  * [User Documentation](https://activitysim.github.io/activitysim/)
  * [GitHub Repository](https://github.com/ActivitySim/activitysim)
  * [Project Wiki](https://github.com/ActivitySim/activitysim/wiki)
* [Activity-Based Travel Demand Models: A Primer](http://www.trb.org/Publications/Blurbs/170963.aspx)