# Estimating Joint and Non-Mandatory Tour Destination Choice

This notebook illustrates how to re-estimate both the joint tour destination and
non-mandatory tour destination choice models.  These models share parameters and need
to be re-estimated together.  The process includes running ActivitySim in estimation
mode to read household travel survey files and write out the estimation data bundles
(EDBs) used in this notebook.  To review how to do so, please visit the other
notebooks in this directory.

# Load libraries

In [1]:
import larch  # !conda install larch #for estimation
import pandas as pd
import numpy as np
import yaml 
import larch.util.excel
import os

We'll work in our `test` directory, where ActivitySim has saved the estimation data bundles.

In [2]:
os.chdir('test')

# Load data and prep model for estimation

In [3]:
modelnames = ("non_mandatory_tour_destination", "joint_tour_destination")

In [4]:
from activitysim.estimation.larch import component_model
model, data = component_model(modelnames, return_data=True)

The resulting model is actually a larch `ModelGroup`, which exposes an API similar
to that of a single model and allows for joint estimation of all parameters across
all models in the group.

In [5]:
type(model)

larch.model.model_group.ModelGroup
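
To make the idea of joint estimation concrete, here is a minimal sketch in plain Python. It is not how larch implements `ModelGroup`. It only shows the principle that when two models share a parameter, the group log likelihood is the sum of the per-model log likelihoods evaluated at the same parameter value. The toy observations, the single coefficient `beta`, and the grid search are all assumptions made for illustration.

```python
import math

def binary_logit_loglike(beta, observations):
    """Log likelihood of a one-coefficient binary logit over (x, chosen) pairs."""
    total = 0.0
    for x, chosen in observations:
        p = 1.0 / (1.0 + math.exp(-beta * x))
        total += math.log(p if chosen else 1.0 - p)
    return total

# Hypothetical toy data for two models that share the same coefficient.
model_a_obs = [(1.0, True), (2.0, True), (0.5, False)]
model_b_obs = [(1.5, True), (1.0, False)]

def group_loglike(beta):
    """Joint log likelihood: the sum over both models at a common beta."""
    return (binary_logit_loglike(beta, model_a_obs)
            + binary_logit_loglike(beta, model_b_obs))

# Crude grid search standing in for a real optimizer.
candidates = [i / 100 for i in range(-300, 301)]
best_beta = max(candidates, key=group_loglike)
print(best_beta, group_loglike(best_beta))
```

Because both data sets contribute to the same objective, the shared coefficient is pulled toward the value that fits the pooled observations best, which is the reason the two destination models are estimated together here.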

# Review data loaded from EDB

Next we can review what was read from the EDB, including the coefficients, model settings, utilities specification, and chooser and alternative data.

## coefficients

In [6]:
data[0].coefficients

Unnamed: 0_level_0,value,constrain
coefficient_name,Unnamed: 1_level_1,Unnamed: 2_level_1
coef_mode_logsum,0.6755,F
coef_escort_dist_0_2,-0.1499,F
coef_eatout_dist_0_2,-0.5609,F
coef_eatout_social_0_2,-0.5609,F
coef_othdiscr_dist_0_2,-0.1677,F
coef_escort_dist_2_5,-0.8671,F
coef_shopping_dist_2_5,-0.5655,F
coef_eatout_dist_2_5,-0.3192,F
coef_othmaint_dist_2_5,-0.6055,F
coef_social_dist_2_5,-0.3485,F


## alt_values

In [7]:
data[0].alt_values

Unnamed: 0,tour_id,variable,1,2,3,4,5,6,7,8,...,181,182,183,184,185,186,187,188,189,190
0,6812,variable_label0001,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
1,6812,variable_label0004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,6812,variable_label0002,1.3199999332427979,1.0299999713897705,0.8499999046325684,1.190000057220459,1.0299999713897705,0.7799999713897705,0.8399999141693115,0.9900000095367432,...,3.0,3.0,3.0,2.7800002098083496,2.25,3.0,3.0,3.0,3.0,3.0
3,6812,variable_label0003,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.17999982833862305,1.0900001525878906,0.619999885559082,0.0,0.0,0.36999988555908203,0.7699999809265137,0.880000114440918,1.8299999237060547,1.9099998474121094
4,6812,variable_label0005,6.577240859271166,7.294423493874009,5.752909090156044,7.026094604325162,8.012728626482545,6.442966827027993,7.224737382332969,6.539973163879868,...,4.866333729879064,5.1845549872093715,5.945708525269778,6.047381635266343,6.06715626447492,5.987948372605412,5.730242629913322,7.343549119396827,3.6375335267623483,5.304200029706192
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22405,309796969,variable_label0005,6.920922213193693,7.5031529068855916,5.943363520428754,7.177433500369317,7.744893706655902,7.132142856175178,7.776358418630959,7.492867258990462,...,6.295404368250281,6.542860933838223,7.009064302237333,6.995735936068004,7.245322332115837,6.883649044438551,6.242536445981009,7.341552978800834,6.826019054785318,6.176073070133304
22406,309796969,variable_label0006,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
22407,309796969,variable_label0008,5.933573903741054,5.344185216853384,6.89562358280515,5.647237616473043,5.057110403289843,5.668668226461269,5.005364682150095,5.194710813754951,...,6.636636760302007,6.2913541739318335,5.78578181451264,5.731482098740245,5.204415731979751,5.650324057176968,6.038731661211121,4.657280162420779,5.539483971790338,6.352944918087955
22408,309796969,variable_label0000,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [8]:
data[1].alt_values

Unnamed: 0,tour_id,variable,1,2,3,4,5,6,7,8,...,181,182,183,184,185,186,187,188,189,190
0,7785298,variable_label0001,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
1,7785298,variable_label0004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,7785298,variable_label0002,3.0,3.0,3.0,3.0,3.0,3.0,3.0,2.8600001335144043,...,3.0,3.0,2.8299999237060547,2.739999771118164,2.179999828338623,2.3499999046325684,1.4000000953674316,0.8399999141693115,1.5699999332427979,1.9100000858306885
3,7785298,variable_label0003,0.6700000762939453,0.6100001335144043,0.5399999618530273,0.4200000762939453,0.2199997901916504,0.2199997901916504,0.059999942779541016,0.0,...,0.940000057220459,0.07000017166137695,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,7785298,variable_label0005,6.577240859271166,7.294423493874009,5.752909090156044,7.026094604325162,8.012728626482545,6.442966827027993,7.224737382332969,6.539973163879868,...,4.866333729879064,5.1845549872093715,5.945708525269778,6.047381635266343,6.06715626447492,5.987948372605412,5.730242629913322,7.343549119396827,3.6375335267623483,5.304200029706192
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
814,301810980,variable_label0005,7.1032677847373105,7.823256899589877,6.287721004233,7.524478114358913,8.260705863875177,7.010517622311502,7.824700996306483,6.949116008358706,...,4.686971346676491,4.924888571639326,6.1286361810072245,6.008528132533862,6.327743907987008,6.145720935633594,5.563462376541104,7.432773706720803,3.829901471825824,5.955267258796094
815,301810980,variable_label0006,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
816,301810980,variable_label0008,5.435061452848071,4.708514344404673,6.2363992236454076,4.98543309636194,4.2284383398225796,5.478626581386255,4.646955223240626,5.500680232862738,...,5.932115933542424,6.178598653123427,4.696321042115209,4.610559147685053,4.291343372231907,4.552081408216856,5.61268490695845,3.809978548664433,7.98807577917251,5.735554926665741
817,301810980,variable_label0000,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,0.9399999976158142,1.0,1.0,1.0,1.0,1.0


## chooser_data

In [9]:
data[0].chooser_data

Unnamed: 0,tour_id,model_choice,override_choice,person_id,tour_type,home_zone_id
0,6812,72,72.0,166,eatout,71
1,8110,62,47.0,197,shopping,80
2,11013,33,32.0,268,othdiscr,91
3,11016,71,71.0,268,othmaint,91
4,15403,67,67.0,375,othmaint,105
...,...,...,...,...,...,...
2485,309760814,154,154.0,7555141,shopping,85
2486,309760815,41,41.0,7555141,shopping,85
2487,309790009,36,36.0,7555853,social,115
2488,309796968,94,94.0,7556023,othdiscr,136


In [10]:
data[1].chooser_data

Unnamed: 0,tour_id,model_choice,override_choice,person_id,tour_type,home_zone_id
0,7785298,113,103.0,189885,eatout,135
1,8708454,103,106.0,212401,eatout,8
2,9715006,188,188.0,236951,othdiscr,183
3,10831112,105,105.0,264173,shopping,10
4,20334787,157,157.0,495970,othmaint,140
...,...,...,...,...,...,...
86,283676518,115,115.0,6918939,shopping,121
87,295260168,7,7.0,7201469,social,114
88,297646485,90,89.0,7259670,othdiscr,25
89,298814741,147,147.0,7288164,othmaint,74


## landuse

In [11]:
data[0].landuse

Unnamed: 0_level_0,DISTRICT,SD,county_id,TOTHH,TOTPOP,TOTACRE,RESACRE,CIACRE,TOTEMP,AGE0519,...,area_type,HSENROLL,COLLFTE,COLLPTE,TOPOLOGY,TERMINAL,household_density,employment_density,density_index,is_cbd
zone_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,1,1,1,46,82,20.3,1.0,15.00000,27318,7,...,0,0.00000,0.00000,0.00000,3,5.89564,2.875000,1707.375000,2.870167,False
2,1,1,1,134,240,31.1,1.0,24.79297,42078,19,...,0,0.00000,0.00000,0.00000,1,5.84871,5.195214,1631.374751,5.178722,False
3,1,1,1,267,476,14.7,1.0,2.31799,2445,38,...,0,0.00000,0.00000,0.00000,1,5.53231,80.470405,736.891913,72.547987,False
4,1,1,1,151,253,19.3,1.0,18.00000,22434,20,...,0,0.00000,0.00000,0.00000,2,5.64330,7.947368,1180.736842,7.894233,False
5,1,1,1,611,1069,52.7,1.0,15.00000,15662,86,...,0,0.00000,72.14684,0.00000,1,5.52555,38.187500,978.875000,36.753679,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
186,4,4,1,2779,8062,376.0,172.0,15.00000,1760,1178,...,3,0.00000,0.00000,0.00000,1,2.04173,14.860963,9.411765,5.762347,False
187,4,4,1,1492,4139,214.0,116.0,10.00000,808,603,...,3,0.00000,0.00000,0.00000,2,1.73676,11.841270,6.412698,4.159890,False
188,4,4,1,753,4072,232.0,11.0,178.00000,4502,1117,...,2,3961.04761,17397.79102,11152.93652,1,2.28992,3.984127,23.820106,3.413233,False
189,4,4,1,3546,8476,201.0,72.0,6.00000,226,1057,...,2,0.00000,0.00000,0.00000,1,2.88773,45.461538,2.897436,2.723836,False


## spec

In [12]:
data[0].spec

Unnamed: 0,Label,Expression,Description,escort,escortkids,escortnokids,shopping,eatout,othmaint,social,othdiscr
0,variable_label0000,"@skims['DIST'].clip(0,1)","Distance, piecewise linear from 0 to 1 miles",coef_escort_dist_0_2,coef_escort_dist_0_2,coef_escort_dist_0_2,0,coef_eatout_dist_0_2,0,coef_eatout_dist_0_2,coef_othdiscr_dist_0_2
1,variable_label0001,"@(skims['DIST']-1).clip(0,1)","Distance, piecewise linear from 1 to 2 miles",coef_escort_dist_0_2,coef_escort_dist_0_2,coef_escort_dist_0_2,0,coef_eatout_dist_0_2,0,coef_eatout_dist_0_2,coef_othdiscr_dist_0_2
2,variable_label0002,"@(skims['DIST']-2).clip(0,3)","Distance, piecewise linear from 2 to 5 miles",coef_escort_dist_2_5,coef_escort_dist_2_5,coef_escort_dist_2_5,coef_shopping_dist_2_5,coef_eatout_dist_2_5,coef_othmaint_dist_2_5,coef_social_dist_2_5,coef_othdiscr_dist_2_5
3,variable_label0003,"@(skims['DIST']-5).clip(0,10)","Distance, piecewise linear from 5 to 15 miles",coef_escort_dist_5_plus,coef_escort_dist_5_plus,coef_escort_dist_5_plus,coef_shopping_dist_5_plus,coef_eatout_dist_5_plus,coef_othmaint_dist_5_plus,coef_social_dist_5_plus,coef_othdiscr_dist_5_plus
4,variable_label0004,@(skims['DIST']-15.0).clip(0),"Distance, piecewise linear for 15+ miles",coef_escort_dist_5_plus,coef_escort_dist_5_plus,coef_escort_dist_5_plus,coef_shopping_dist_5_plus,coef_eatout_dist_5_plus,coef_othmaint_dist_5_plus,coef_social_dist_5_plus,coef_othdiscr_dist_5_plus
5,variable_label0006,@df['size_term']==0,No attractions,-999,-999,-999,-999,-999,-999,-999,-999
6,variable_label0007,mode_choice_logsum,Mode choice logsum,coef_mode_logsum,coef_mode_logsum,coef_mode_logsum,coef_mode_logsum,coef_mode_logsum,coef_mode_logsum,coef_mode_logsum,coef_mode_logsum
7,variable_label0008,"@np.minimum(np.log(df.pick_count/df.prob), 60)",Sample of alternatives correction factor,1,1,1,1,1,1,1,1
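
The expressions in the spec above can be evaluated outside ActivitySim to see what they produce. The sketch below applies the five distance `clip` expressions and the sample-of-alternatives correction to plain NumPy arrays. The distances and probabilities are made-up values, and the helper names `distance_pieces` and `sampling_correction` are hypothetical. Reading `pick_count` as the number of times an alternative was drawn and `prob` as its sampling probability is also an assumption.

```python
import numpy as np

def distance_pieces(dist):
    """Split distance into the piecewise-linear segments used in the spec."""
    dist = np.asarray(dist, dtype=float)
    return np.stack([
        dist.clip(0, 1),          # 0 to 1 miles
        (dist - 1).clip(0, 1),    # 1 to 2 miles
        (dist - 2).clip(0, 3),    # 2 to 5 miles
        (dist - 5).clip(0, 10),   # 5 to 15 miles
        (dist - 15.0).clip(0),    # 15+ miles
    ])

def sampling_correction(pick_count, prob):
    """Sample-of-alternatives correction, capped at 60 as in the spec."""
    ratio = np.asarray(pick_count, dtype=float) / np.asarray(prob, dtype=float)
    return np.minimum(np.log(ratio), 60)

dist = np.array([0.5, 1.5, 4.0, 7.5, 20.0])  # hypothetical distances in miles
pieces = distance_pieces(dist)
print(pieces[:, 3])                     # segments for the 7.5 mile trip
print(sampling_correction(1, 0.05))     # log(1 / 0.05)
```

For any non-negative distance the five segments add back up to the original distance, so each coefficient acts as the slope of utility over its own distance band.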


## size_spec

In [13]:
data[0].size_spec

Unnamed: 0_level_0,TOTHH,RETEMPN,HEREMPN,OTHEMPN,AGE0519,HSENROLL
segment,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
escort,0.0,0.225,0.144,0.0,0.465,0.166
shopping,0.0,1.0,0.0,0.0,0.0,0.0
eatout,0.0,0.742,0.258,0.0,0.0,0.0
othmaint,0.0,0.482,0.518,0.0,0.0,0.0
social,0.0,0.522,0.478,0.0,0.0,0.0
othdiscr,0.252,0.212,0.272,0.165,0.0,0.098
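
As a rough illustration of how a row of this table is used, the sketch below computes a size term as the weighted sum of land-use variables for one segment. The `eatout` weights are copied from the table above. The two zones and their employment values are hypothetical, and the helper `size_term` is not part of ActivitySim.

```python
# Size-term weights for the eatout segment, copied from the size_spec table.
eatout_weights = {"RETEMPN": 0.742, "HEREMPN": 0.258}

# Hypothetical land-use values for two zones (not taken from the EDB).
zones = {
    "zone_a": {"RETEMPN": 100.0, "HEREMPN": 50.0},
    "zone_b": {"RETEMPN": 0.0, "HEREMPN": 0.0},
}

def size_term(weights, landuse_row):
    """Weighted sum of the land-use variables named in `weights`."""
    return sum(w * landuse_row.get(col, 0.0) for col, w in weights.items())

sizes = {zone: size_term(eatout_weights, row) for zone, row in zones.items()}
for zone, size in sizes.items():
    # A zero size term corresponds to the "No attractions" row in the spec.
    print(zone, size, "no attractions" if size == 0 else "available")
```

A zone with a size term of zero, like `zone_b` here, is the case that the `@df['size_term']==0` expression in the spec penalizes with -999.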


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients.  Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution.  Larch has built-in estimation methods including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters.  BHHH is the default and typically runs faster, but does not follow constraints on parameters.
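
To show what "allows for bounds" means in practice, the following sketch calls `scipy.optimize.minimize` directly with `method="SLSQP"` on a toy binary logit, once without bounds and once with the coefficient restricted to be non-positive. This is not the larch estimation code. The data, the single coefficient, and the bound values are assumptions chosen so that the unconstrained optimum is positive.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: one explanatory variable and a binary choice.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0, 1, 1, 1])
sign = 2 * y - 1  # +1 where the alternative was chosen, -1 otherwise

def neg_loglike(beta):
    """Negative log likelihood of a binary logit, written in a stable form."""
    return np.sum(np.logaddexp(0.0, -sign * beta[0] * x))

# Unconstrained fit: the coefficient is free to take any value.
free = minimize(neg_loglike, x0=[-1.0], method="SLSQP")

# Bounded fit: the coefficient must stay within [-5, 0].
bounded = minimize(neg_loglike, x0=[-1.0], method="SLSQP",
                   bounds=[(-5.0, 0.0)])

print("free:", free.x[0], "bounded:", bounded.x[0])
```

The bounded run stops at the upper bound of zero and accepts a worse likelihood. An optimizer that ignores bounds, which the text above says is the case for BHHH, would instead return the positive value.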

In [14]:
model.estimate(method='SLSQP')

req_data does not request avail_ca or avail_co but it is set and being provided
req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
-999,-999.0,-999.0,-999.0,-999.0,-999.0,1,,-999.0
0,0.0,0.0,0.0,0.0,0.0,1,,0.0
1,1.0,1.0,1.0,1.0,1.0,1,,1.0
coef_eatout_dist_0_2,-0.767979,-0.5609,0.0,,,0,,-0.767979
coef_eatout_dist_2_5,-0.226448,-0.3192,0.0,,,0,,-0.226448
coef_eatout_dist_5_plus,-0.188137,-0.1238,0.0,,,0,,-0.188137
coef_escort_dist_0_2,0.226691,-0.1499,0.0,,,0,,0.226691
coef_escort_dist_2_5,-0.813494,-0.8671,0.0,,,0,,-0.813494
coef_escort_dist_5_plus,-0.255056,-0.2137,0.0,,,0,,-0.255056
coef_mode_logsum,0.662937,0.6755,0.0,,,0,,0.662937


Unnamed: 0_level_0,0
Unnamed: 0_level_1,0
-999,-999.000000
0,0.000000
1,1.000000
coef_eatout_dist_0_2,-0.767979
coef_eatout_dist_2_5,-0.226448
coef_eatout_dist_5_plus,-0.188137
coef_escort_dist_0_2,0.226691
coef_escort_dist_2_5,-0.813494
coef_escort_dist_5_plus,-0.255056
coef_mode_logsum,0.662937

Unnamed: 0,0
-999,-999.0
0,0.0
1,1.0
coef_eatout_dist_0_2,-0.767979
coef_eatout_dist_2_5,-0.226448
coef_eatout_dist_5_plus,-0.188137
coef_escort_dist_0_2,0.226691
coef_escort_dist_2_5,-0.813494
coef_escort_dist_5_plus,-0.255056
coef_mode_logsum,0.662937

Unnamed: 0,0
-999,0.0
0,0.0
1,0.0
coef_eatout_dist_0_2,0.000334
coef_eatout_dist_2_5,0.000606
coef_eatout_dist_5_plus,-0.001254
coef_escort_dist_0_2,4e-06
coef_escort_dist_2_5,-0.001135
coef_escort_dist_5_plus,-0.000519
coef_mode_logsum,0.000531


## Estimated coefficients

In [15]:
model.parameter_summary()

Unnamed: 0,Value,Std Err,t Stat,Signif,Null Value
-999,-999.0,0.0,,,-999.0
0,0.0,0.0,,,0.0
1,1.0,0.0,,,1.0
coef_eatout_dist_0_2,-0.768,0.138,-5.58,***,0.0
coef_eatout_dist_2_5,-0.226,0.0689,-3.29,**,0.0
coef_eatout_dist_5_plus,-0.188,0.105,-1.8,,0.0
coef_escort_dist_0_2,0.227,0.197,1.15,,0.0
coef_escort_dist_2_5,-0.813,0.06,-13.57,***,0.0
coef_escort_dist_5_plus,-0.255,0.0751,-3.39,***,0.0
coef_mode_logsum,0.663,0.0562,11.8,***,0.0
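
The t statistics in this table follow from the other columns: each is the estimate minus its null value, divided by the standard error. The short sketch below recomputes two of them from the rounded values shown above. The helper `t_stat` is a hypothetical name, and small differences from the table are expected because the table is computed from unrounded values.

```python
def t_stat(value, std_err, null_value=0.0):
    """t statistic of an estimate against its null value."""
    return (value - null_value) / std_err

# (value, std err) pairs copied from the parameter summary above.
rows = {
    "coef_mode_logsum": (0.663, 0.0562),
    "coef_escort_dist_0_2": (0.227, 0.197),
}

for name, (value, std_err) in rows.items():
    t = t_stat(value, std_err)
    # |t| below about 1.96 is not distinguishable from the null at the 5% level.
    flag = "significant" if abs(t) > 1.96 else "not significant"
    print(f"{name}: t = {t:.2f} ({flag})")
```

Read this way, `coef_escort_dist_0_2` has a positive sign but cannot be distinguished from zero in this sample, which is worth keeping in mind before carrying it into the simulation configs.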


# Output Estimation Results

In [16]:
from activitysim.estimation.larch import update_coefficients, update_size_spec
result_dir = data[0].edb_directory/"estimated"

## Write updated utility coefficients

The revised coefficients are written out as one file.  For the MTC
example model, these coefficients are written into the non-mandatory
tour model, and are re-used by the joint tour model.

In [17]:
modelname = modelnames[0]

In [18]:
update_coefficients(
    model, data[0], result_dir,
    output_file=f"{modelname}_coefficients_revised.csv",
);

## Write updated size coefficients

In [19]:
update_size_spec(
    model, data[0], result_dir, 
    output_file=f"{modelname}_size_terms.csv",
)

Unnamed: 0,segment,model_selector,TOTHH,RETEMPN,FPSEMPN,HEREMPN,OTHEMPN,AGREMPN,MWTEMPN,AGE0519,HSENROLL,COLLFTE,COLLPTE
0,work_low,workplace,0.0,0.129129,0.193193,0.383383,0.12012,0.01001,0.164164,0.0,0.0,0.0,0.0
1,work_med,workplace,0.0,0.12012,0.197197,0.325325,0.139139,0.008008,0.21021,0.0,0.0,0.0,0.0
2,work_high,workplace,0.0,0.11,0.207,0.284,0.154,0.006,0.239,0.0,0.0,0.0,0.0
3,work_veryhigh,workplace,0.0,0.093,0.27,0.241,0.146,0.004,0.246,0.0,0.0,0.0,0.0
4,university,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.592,0.408
5,gradeschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
6,highschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
7,escort,non_mandatory,0.0,0.34195,0.0,0.13114,0.0,0.0,0.0,0.471104,0.055806,0.0,0.0
8,shopping,non_mandatory,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,eatout,non_mandatory,0.0,0.656666,0.0,0.343334,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Write the model estimation report, including coefficient t-statistic and log likelihood

Each model in the ModelGroup can have its own separate estimation report, as the
common ModelGroup class does not yet implement a common report output.

In [20]:
model[0].to_xlsx(
    result_dir/f"{modelnames[0]}_model_estimation.xlsx", 
    data_statistics=False,
)
model[1].to_xlsx(
    result_dir/f"{modelnames[1]}_model_estimation.xlsx", 
    data_statistics=False,
);

# Next Steps

The final step is to copy the `*_coefficients_revised.csv` file and `*_size_terms.csv` file to the configs folder, either manually or with a script, rename them to `*_coefficients.csv` and `destination_choice_size_terms.csv`, and run ActivitySim in simulation mode.  Note that all the location
and destination choice models share the same `destination_choice_size_terms.csv` input file, so if you
are updating all these models, you'll need to ensure that the updated sections of this file for each model
are joined together correctly.
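
One way to join the updated sections, sketched below with pandas, is to keep every row of the existing size terms file except those belonging to the re-estimated `model_selector`, then append the revised rows for that selector. The helper `merge_size_terms` is hypothetical and not part of ActivitySim. The small in-memory tables stand in for the real CSV files, using a few values from the tables in this notebook.

```python
import pandas as pd

def merge_size_terms(existing, revised, model_selector):
    """Replace the rows for one model_selector with their revised versions."""
    keep = existing[existing["model_selector"] != model_selector]
    updated = revised[revised["model_selector"] == model_selector]
    return pd.concat([keep, updated], ignore_index=True)

# Stand-in for the size terms file currently in the configs folder.
existing = pd.DataFrame({
    "segment": ["work_low", "escort", "shopping"],
    "model_selector": ["workplace", "non_mandatory", "non_mandatory"],
    "RETEMPN": [0.129129, 0.225, 1.0],
    "HEREMPN": [0.383383, 0.144, 0.0],
})

# Stand-in for the revised file written by this notebook.
revised = existing.copy()
revised.loc[revised["segment"] == "escort", ["RETEMPN", "HEREMPN"]] = [0.34195, 0.13114]

merged = merge_size_terms(existing, revised, "non_mandatory")
print(merged)
```

Repeating the call once per re-estimated model, each with its own selector, builds up a single table that can then be saved as `destination_choice_size_terms.csv`.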

In [21]:
pd.read_csv(result_dir/f"{modelname}_coefficients_revised.csv")

Unnamed: 0,coefficient_name,value,constrain
0,coef_mode_logsum,0.662937,F
1,coef_escort_dist_0_2,0.226691,F
2,coef_eatout_dist_0_2,-0.767979,F
3,coef_eatout_social_0_2,-0.5609,F
4,coef_othdiscr_dist_0_2,-0.203187,F
5,coef_escort_dist_2_5,-0.813494,F
6,coef_shopping_dist_2_5,-0.629792,F
7,coef_eatout_dist_2_5,-0.226448,F
8,coef_othmaint_dist_2_5,-0.579932,F
9,coef_social_dist_2_5,-0.245175,F


In [22]:
pd.read_csv(result_dir/f"{modelname}_size_terms.csv")

Unnamed: 0,index,segment,model_selector,TOTHH,RETEMPN,FPSEMPN,HEREMPN,OTHEMPN,AGREMPN,MWTEMPN,AGE0519,HSENROLL,COLLFTE,COLLPTE
0,0,work_low,workplace,0.0,0.129129,0.193193,0.383383,0.12012,0.01001,0.164164,0.0,0.0,0.0,0.0
1,1,work_med,workplace,0.0,0.12012,0.197197,0.325325,0.139139,0.008008,0.21021,0.0,0.0,0.0,0.0
2,2,work_high,workplace,0.0,0.11,0.207,0.284,0.154,0.006,0.239,0.0,0.0,0.0,0.0
3,3,work_veryhigh,workplace,0.0,0.093,0.27,0.241,0.146,0.004,0.246,0.0,0.0,0.0,0.0
4,4,university,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.592,0.408
5,5,gradeschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
6,6,highschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
7,7,escort,non_mandatory,0.0,0.34195,0.0,0.13114,0.0,0.0,0.0,0.471104,0.055806,0.0,0.0
8,8,shopping,non_mandatory,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,9,eatout,non_mandatory,0.0,0.656666,0.0,0.343334,0.0,0.0,0.0,0.0,0.0,0.0,0.0
