# Estimating Tour Mode Choice

This notebook illustrates how to re-estimate tour and subtour mode choice for ActivitySim. The process
starts by running ActivitySim in estimation mode, which reads household travel survey files and writes out
the estimation data bundles (EDBs) used in this notebook. To review how to do so, please visit the other
notebooks in this directory.

# Load libraries

In [1]:
import os
import larch  # !conda install larch -c conda-forge # for estimation
import pandas as pd
import activitysim


In [None]:
from larch import excel

We'll work in our `test` directory, where ActivitySim has saved the estimation data bundles.

In [3]:
os.chdir('test')

In [4]:
data = pd.read_csv("output/estimation_data_bundle/tour_mode_choice/tour_mode_choice_values_combined.csv")

# Load data and prep model for estimation

In [5]:
modelname = "tour_mode_choice"

from activitysim.estimation.larch import component_model
model, data = component_model(modelname, return_data=True)

# Review data loaded from the EDB

The next step is to review what was read from the EDB, including the coefficients, model settings, utility specification, and chooser and alternative data.

### Coefficients

In [6]:
data.coefficients

Unnamed: 0_level_0,value,constrain
coefficient_name,Unnamed: 1_level_1,Unnamed: 2_level_1
coef_one,1.0000,F
coef_nest_root,1.0000,F
coef_nest_AUTO_mandatory,0.1700,F
coef_nest_NONMOTORIZED_mandatory,0.1700,F
coef_wait_mandatory,-0.0300,F
...,...,...
SHARED3_coef_school_nonmandatory,0.0000,F
WALK_coef_school_nonmandatory,0.0000,F
SHARED2_coef_cargo_nonmandatory,0.0729,F
SHARED3_coef_cargo_nonmandatory,0.0819,F


### Utility specification

In [7]:
data.spec

Unnamed: 0,Label,Description,Expression,DRIVEALONE,SHARED2,SHARED3,WALK
0,util_auto_wait,Auto Wait Time,(1 - sentri_crossing) * std_wait,coef_wait,coef_wait,coef_wait,
1,util_auto_wait_sentri,Auto Wait Time - Sentri,sentri_crossing * sentri_wait,coef_wait,coef_wait,coef_wait,
2,util_ped_wait,Walk - Wait Time,ped_wait,,,,coef_wait
3,util_trip_logsum_tour_da,Drive alone - Trip Logsum,logsum_DRIVEALONE_outbound + logsum_DRIVEALONE...,coef_trip_logsum,,,
4,util_trip_logsum_tour_s2,Shared Ride 2 - Trip Logsum,logsum_SHARED2_outbound + logsum_SHARED2_inbound,,coef_trip_logsum,,
5,util_trip_logsum_tour_s3,Shared Ride 3 - Trip Logsum,logsum_SHARED3_outbound + logsum_SHARED3_inbound,,,coef_trip_logsum,
6,util_trip_logsum_tour_walk,Walk - Trip Logsum,logsum_WALK_outbound + logsum_WALK_inbound,,,,coef_trip_logsum
7,util_ASC_s2,Shared Ride 2 - ASC,1,,SHARED2_asc,,
8,util_ASC_s3,Shared Ride 3 - ASC,1,,,SHARED3_asc,
9,util_ASC_walk,Walk - ASC,1,,,,WALK_asc
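
To make the specification table easier to read: each row contributes `coefficient × expression value` to the utility of every alternative whose column names a coefficient, and blank cells contribute nothing. The sketch below reproduces that bookkeeping in plain Python for three of the rows above. The coefficient values and the chooser values are hypothetical placeholders chosen for illustration, not estimates from this model, and the helper `systematic_utility` is not part of ActivitySim or Larch.

```python
# Each spec row: (label of the evaluated expression, {alternative: coefficient name}).
# The rows mirror util_auto_wait, util_ped_wait and util_ASC_walk from the table above.
SPEC = [
    ("util_auto_wait", {"DRIVEALONE": "coef_wait", "SHARED2": "coef_wait", "SHARED3": "coef_wait"}),
    ("util_ped_wait", {"WALK": "coef_wait"}),
    ("util_ASC_walk", {"WALK": "WALK_asc"}),
]

# Hypothetical coefficient values, for illustration only.
COEFFICIENTS = {"coef_wait": -0.03, "WALK_asc": -1.0}


def systematic_utility(chooser_row, spec, coefficients):
    """Sum coefficient * expression value over the spec rows, per alternative."""
    alternatives = sorted({alt for _, cells in spec for alt in cells})
    utility = {alt: 0.0 for alt in alternatives}
    for label, cells in spec:
        for alt, coef_name in cells.items():
            utility[alt] += coefficients[coef_name] * chooser_row[label]
    return utility


# Hypothetical chooser: 50 minutes of auto wait, 10 minutes of pedestrian wait.
chooser = {"util_auto_wait": 50.0, "util_ped_wait": 10.0, "util_ASC_walk": 1.0}
utility = systematic_utility(chooser, SPEC, COEFFICIENTS)
print(utility)
```

With these placeholder numbers the three auto alternatives share the same wait-time disutility, while WALK combines its own wait term with its alternative-specific constant.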


### Chooser data

In [8]:
data.chooser_data

Unnamed: 0_level_0,model_choice,override_choice,util_auto_wait,util_auto_wait_sentri,util_trip_logsum_tour_da,util_trip_logsum_tour_s2,util_trip_logsum_tour_s3,util_ped_wait,util_trip_logsum_tour_walk,util_ASC_s2,...,logsum_SHARED3_inbound,logsum_WALK_outbound,logsum_WALK_inbound,in_period,out_period,std_wait,sentri_wait,ped_wait,tour_id.1,override_choice_code
tour_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
246,WALK,WALK,0.0,12.9,-7.625506,-5.545335,-3.951179,38.8,-2.589052,1,...,-1.985876,-1.291829,-1.297223,EV,AM,56.7,12.9,38.8,246,4
618,SHARED2,SHARED2,0.0,12.9,-2.211632,-1.755930,-1.318645,38.8,-0.470712,1,...,-0.666771,-0.368134,-0.102578,PM,AM,56.7,12.9,38.8,618,2
756,SHARED2,DRIVEALONE,0.0,6.3,-1.355597,-0.109633,0.291086,12.8,-1.026423,1,...,0.138701,-0.506372,-0.520051,EV,MD,43.8,6.3,12.8,756,1
758,WALK,WALK,45.3,0.0,-6.438561,-4.499242,-3.604986,10.8,-5.380926,1,...,-1.813668,-2.677271,-2.703656,PM,MD,45.3,7.3,10.8,758,4
950,DRIVEALONE,DRIVEALONE,52.8,0.0,-5.532719,-4.551411,-3.753119,17.1,-4.497290,1,...,-1.931748,-2.192776,-2.304514,EV,EA,52.8,7.0,17.1,950,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
113224,SHARED2,SHARED2,47.9,0.0,-9.671152,-8.686475,-7.907415,12.9,-9.008587,1,...,-3.940172,-4.647307,-4.361280,EV,MD,47.9,5.7,12.9,113224,2
113541,DRIVEALONE,SHARED2,0.0,7.0,-3.958021,-3.497978,-2.976279,7.7,-1.707410,1,...,-1.479184,-0.882006,-0.825404,EV,PM,42.7,7.0,7.7,113541,2
113781,WALK,WALK,0.0,12.9,-1.269312,-0.041744,0.428863,38.8,0.118229,1,...,0.190405,0.063580,0.054649,PM,AM,56.7,12.9,38.8,113781,4
113857,DRIVEALONE,DRIVEALONE,52.6,0.0,-0.598727,1.609812,2.790274,16.8,1.152831,1,...,1.380744,0.582162,0.570668,PM,EA,52.6,12.5,16.8,113857,1


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients. Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution. Larch has built-in estimation methods, including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters. BHHH is the default and typically runs faster, but does not follow constraints on parameters. The estimation call below uses SLSQP.
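
As a reminder of what "maximizing the log likelihood" means, here is a minimal, self-contained sketch that does not use Larch. It fits a single alternative-specific constant in a binary logit using Newton's method, which is a different and much simpler optimizer than BHHH or SLSQP. The data (75 of 100 choosers picking the second alternative) are made up for illustration. In this special case the estimate has a closed form, log(75/25), which makes the result easy to check.

```python
import math


def loglike(asc, n_chosen, n_total):
    """Binary-logit log likelihood with a single constant on alternative 2."""
    p = 1.0 / (1.0 + math.exp(-asc))  # probability of choosing alternative 2
    return n_chosen * math.log(p) + (n_total - n_chosen) * math.log(1.0 - p)


def fit_binary_asc(n_chosen, n_total, tol=1e-10, max_iter=50):
    """Maximize the log likelihood over the constant with Newton's method."""
    asc = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-asc))
        gradient = n_chosen - n_total * p      # first derivative of loglike
        hessian = -n_total * p * (1.0 - p)     # second derivative, negative here
        step = gradient / hessian
        asc -= step                            # Newton update
        if abs(step) < tol:
            break
    return asc


# Made-up data: 75 of 100 choosers pick alternative 2.
asc_hat = fit_binary_asc(n_chosen=75, n_total=100)
print(asc_hat, math.log(75 / 25))
```

If every chooser in the data picked the same alternative, this likelihood would have no finite maximizer, which is one small-scale example of the kind of instability described above; the routine here is not written to handle that case.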

In [9]:
model.load_data()
model.doctor(repair_ch_av="-")

req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>
req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>
req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>
req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>
req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>
req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>
problem: low-variance-data-co (1 issues)
problem: low-variance-data-co (1 issues)
problem: low-variance-data-co (1 issues)
problem: low-variance-data-co (1 issues)
problem: low-variance-data-co (1 issues)
problem: low-variance-data-co (1 issues

[(<larch.Model (GEV) "shop">,
  ┣ low_variance_data_co:                   n                             example cols
  ┃                       low_variance_co  24  util_ASC_s2, util_ASC_s3, util_ASC_walk),
 (<larch.Model (GEV) "work">,
  ┣ low_variance_data_co:                   n                             example cols
  ┃                       low_variance_co  24  util_ASC_s2, util_ASC_s3, util_ASC_walk),
 (<larch.Model (GEV) "other">,
  ┣ low_variance_data_co:                   n                             example cols
  ┃                       low_variance_co  24  util_ASC_s2, util_ASC_s3, util_ASC_walk),
 (<larch.Model (GEV) "visit">,
  ┣ low_variance_data_co:                   n                             example cols
  ┃                       low_variance_co  24  util_ASC_s2, util_ASC_s3, util_ASC_walk),
 (<larch.Model (GEV) "cargo">,
  ┣ low_variance_data_co:                   n                             example cols
  ┃                       low_variance_co  24  util_ASC_

In [10]:
model.maximize_loglike(method="SLSQP", options={"maxiter": 1000})

Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
SHARED2_asc_nonmandatory,-0.665827,-1.08,0.0,,,0,,-0.665827
SHARED2_coef_calib_adj_nonmandatory,0.0,0.0,0.0,,,1,,0.0
SHARED2_coef_cargo_nonmandatory,-0.002668,0.0729,0.0,,,0,,-0.002668
SHARED2_coef_other_nonmandatory,0.144895,0.0723,0.0,,,0,,0.144895
SHARED2_coef_school_nonmandatory,0.0,0.0,0.0,,,0,,0.0
SHARED2_coef_sentri_nonmandatory,0.254594,0.421,0.0,,,0,,0.254594
SHARED2_coef_shop_nonmandatory,0.279895,-0.0608,0.0,,,0,,0.279895
SHARED2_coef_visit_nonmandatory,0.151851,0.0754,0.0,,,0,,0.151851
SHARED2_coef_work_nonmandatory,0.0,0.0,0.0,,,0,,0.0
SHARED3_asc_nonmandatory,-0.918745,-1.352,0.0,,,0,,-0.918745


Unnamed: 0_level_0,0
Unnamed: 0_level_1,0
SHARED2_asc_nonmandatory,-0.665827
SHARED2_coef_calib_adj_nonmandatory,0.000000
SHARED2_coef_cargo_nonmandatory,-0.002668
SHARED2_coef_other_nonmandatory,0.144895
SHARED2_coef_school_nonmandatory,0.000000
SHARED2_coef_sentri_nonmandatory,0.254594
SHARED2_coef_shop_nonmandatory,0.279895
SHARED2_coef_visit_nonmandatory,0.151851
SHARED2_coef_work_nonmandatory,0.000000
SHARED3_asc_nonmandatory,-0.918745

Unnamed: 0,0
SHARED2_asc_nonmandatory,-0.665827
SHARED2_coef_calib_adj_nonmandatory,0.0
SHARED2_coef_cargo_nonmandatory,-0.002668
SHARED2_coef_other_nonmandatory,0.144895
SHARED2_coef_school_nonmandatory,0.0
SHARED2_coef_sentri_nonmandatory,0.254594
SHARED2_coef_shop_nonmandatory,0.279895
SHARED2_coef_visit_nonmandatory,0.151851
SHARED2_coef_work_nonmandatory,0.0
SHARED3_asc_nonmandatory,-0.918745

Unnamed: 0,0
SHARED2_asc_nonmandatory,0.007694
SHARED2_coef_calib_adj_nonmandatory,0.0
SHARED2_coef_cargo_nonmandatory,0.001093
SHARED2_coef_other_nonmandatory,0.007205
SHARED2_coef_school_nonmandatory,0.0
SHARED2_coef_sentri_nonmandatory,0.003409
SHARED2_coef_shop_nonmandatory,-0.001503
SHARED2_coef_visit_nonmandatory,0.000899
SHARED2_coef_work_nonmandatory,0.0
SHARED3_asc_nonmandatory,-0.005265


### Estimated coefficients

In [11]:
model.parameter_summary()

Unnamed: 0,Value,Null Value
SHARED2_asc_nonmandatory,-0.666,0.0
SHARED2_coef_calib_adj_nonmandatory,0.0,0.0
SHARED2_coef_cargo_nonmandatory,-0.00267,0.0
SHARED2_coef_other_nonmandatory,0.145,0.0
SHARED2_coef_school_nonmandatory,0.0,0.0
SHARED2_coef_sentri_nonmandatory,0.255,0.0
SHARED2_coef_shop_nonmandatory,0.28,0.0
SHARED2_coef_visit_nonmandatory,0.152,0.0
SHARED2_coef_work_nonmandatory,0.0,0.0
SHARED3_asc_nonmandatory,-0.919,0.0
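
One quick check on results like these is to look for free parameters (`holdfast` equal to 0) whose estimate never left its starting value, since that can suggest the parameter does not enter any utility in the estimation sample. The sketch below applies that check to the ten rows shown in the `maximize_loglike` output above. The rows are copied by hand, the helper name is hypothetical, and the interpretation is a heuristic rather than a Larch diagnostic.

```python
# (name, value, initvalue, holdfast) copied from the rows shown in the
# maximize_loglike output above.
ROWS = [
    ("SHARED2_asc_nonmandatory", -0.665827, -1.08, 0),
    ("SHARED2_coef_calib_adj_nonmandatory", 0.0, 0.0, 1),
    ("SHARED2_coef_cargo_nonmandatory", -0.002668, 0.0729, 0),
    ("SHARED2_coef_other_nonmandatory", 0.144895, 0.0723, 0),
    ("SHARED2_coef_school_nonmandatory", 0.0, 0.0, 0),
    ("SHARED2_coef_sentri_nonmandatory", 0.254594, 0.421, 0),
    ("SHARED2_coef_shop_nonmandatory", 0.279895, -0.0608, 0),
    ("SHARED2_coef_visit_nonmandatory", 0.151851, 0.0754, 0),
    ("SHARED2_coef_work_nonmandatory", 0.0, 0.0, 0),
    ("SHARED3_asc_nonmandatory", -0.918745, -1.352, 0),
]


def unmoved_free_parameters(rows, tol=1e-12):
    """Names of parameters that are not held fixed yet equal their initial value."""
    return [
        name
        for name, value, initvalue, holdfast in rows
        if not holdfast and abs(value - initvalue) <= tol
    ]


suspects = unmoved_free_parameters(ROWS)
print(suspects)
```

Parameters flagged this way are worth a second look before the revised coefficients are written out; `SHARED2_coef_calib_adj_nonmandatory` is skipped because it is held fixed on purpose.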


# Output Estimation Results

In [12]:
from activitysim.estimation.larch import update_coefficients


In [13]:
result_dir = data.edb_directory/"estimated"


In [14]:
update_coefficients(
    model, data, result_dir,
    output_file=f"{modelname}_coefficients_revised.csv",
);

### Write the model estimation report, including coefficient t-statistics and log likelihood

In [17]:
for i,model_i in enumerate(model):
    model_i.to_xlsx(
        result_dir/f"{modelname}_{i}_model_estimation.xlsx", 
        data_statistics=False,
    )



# Next Steps

The final step is to copy the `*_coefficients_revised.csv` file to the configs folder (manually or with a script), rename it to `*_coefficients.csv`, and run ActivitySim in simulation mode.

In [None]:
pd.read_csv(result_dir/f"{modelname}_coefficients_revised.csv")
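
The copy-and-rename step described above can be scripted. The following is a minimal sketch using only the standard library; the helper name `install_revised_coefficients`, the backup behavior, and the directory names in the demonstration are assumptions for illustration, and the target file name should be whatever your configs actually expect. The demonstration runs in a temporary directory so it does not touch a real configs folder.

```python
import shutil
import tempfile
from pathlib import Path


def install_revised_coefficients(result_dir, configs_dir, modelname):
    """Copy <modelname>_coefficients_revised.csv into configs as <modelname>_coefficients.csv."""
    source = Path(result_dir) / f"{modelname}_coefficients_revised.csv"
    target = Path(configs_dir) / f"{modelname}_coefficients.csv"
    if target.exists():
        # Keep the previous coefficients next to the new file as a backup.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(source, target)
    return target


# Demonstration with throwaway directories and a stand-in coefficients file.
with tempfile.TemporaryDirectory() as tmp:
    demo_results = Path(tmp) / "estimated"
    demo_configs = Path(tmp) / "configs"
    demo_results.mkdir()
    demo_configs.mkdir()
    (demo_results / "tour_mode_choice_coefficients_revised.csv").write_text(
        "coefficient_name,value,constrain\ncoef_one,1.0,F\n"
    )
    installed = install_revised_coefficients(demo_results, demo_configs, "tour_mode_choice")
    installed_name = installed.name
    installed_text = installed.read_text()

print(installed_name)
```

In the notebook itself, the call would take `result_dir` and `modelname` as defined earlier, plus the path to your own configs folder.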