# Estimating Joint Tour Participation

This notebook illustrates how to re-estimate a single model component for ActivitySim. This process
includes running ActivitySim in estimation mode to read household travel survey files and write out
the estimation data bundles (EDBs) used in this notebook. For details on that step, see the other
notebooks in this directory.

# Load libraries

In [1]:
import os
import larch  # !conda install larch -c conda-forge # for estimation
import pandas as pd

We'll work in our `test` directory, where ActivitySim has saved the estimation data bundles.

In [2]:
os.chdir('test')

# Load data and prep model for estimation

In [3]:
modelname = "joint_tour_participation"

from activitysim.estimation.larch import component_model
model, data = component_model(modelname, return_data=True)

# Review data loaded from the EDB

The next step is to review the contents of the EDB, including the coefficients, model settings, utilities specification, and chooser and alternative data.

## Coefficients

In [4]:
data.coefficients

coefficient_name,value,constrain
coef_unavailable,-999.0,T
coef_full_time_worker_mixed_party,-3.566,F
coef_full_time_worker_mixed_party_not,0.5,T
coef_part_time_worker_adults_only_party,-3.566,F
coef_part_time_worker_adults_only_party_not,0.5,T
coef_part_time_worker_mixed_party,-0.3655,F
coef_university_student_mixed_party,-3.041,F
coef_non_worker_adults_only_party,-3.164,F
coef_non_worker_mixed_party,0.7152,F
coef_child_too_young_for_school_children_only_party,-2.786,F
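
The `constrain` column appears to flag coefficients that are held fixed during estimation (`T`) versus those that are free to be re-estimated (`F`). A minimal sketch, assuming that reading of the column and using only the rows displayed above, splits the names into the two groups with the standard library. The helper name `split_by_constraint` is hypothetical and not part of ActivitySim or Larch.

```python
import csv
import io

# Rows copied from the data.coefficients display above (only the rows shown).
COEFFICIENTS_CSV = """coefficient_name,value,constrain
coef_unavailable,-999.0,T
coef_full_time_worker_mixed_party,-3.566,F
coef_full_time_worker_mixed_party_not,0.5,T
coef_part_time_worker_adults_only_party,-3.566,F
coef_part_time_worker_adults_only_party_not,0.5,T
coef_part_time_worker_mixed_party,-0.3655,F
coef_university_student_mixed_party,-3.041,F
coef_non_worker_adults_only_party,-3.164,F
coef_non_worker_mixed_party,0.7152,F
coef_child_too_young_for_school_children_only_party,-2.786,F
"""


def split_by_constraint(csv_text):
    """Return (free, fixed) lists of coefficient names from CSV text."""
    free, fixed = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: "T" marks a coefficient held at its listed value.
        target = fixed if row["constrain"].strip() == "T" else free
        target.append(row["coefficient_name"])
    return free, fixed


free, fixed = split_by_constraint(COEFFICIENTS_CSV)
print(f"{len(free)} free, {len(fixed)} fixed")
for name in fixed:
    print("fixed:", name)
```

In the notebook itself, the same split could be made directly on `data.coefficients` by filtering on the `constrain` column.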


## Utility specification

In [5]:
data.spec

Unnamed: 0,Label,Description,Expression,participate,not_participate
0,util_full_time_worker_mixed_party,"Full-Time Worker, mixed party",person_is_full & tour_composition_is_mixed,coef_full_time_worker_mixed_party,coef_full_time_worker_mixed_party_not
1,util_part_time_worker_adults_only_party,"Part-Time Worker, adults-only party",person_is_part & tour_composition_is_adults,coef_part_time_worker_adults_only_party,coef_part_time_worker_adults_only_party_not
2,util_part_time_worker_mixed_party,"Part-Time Worker, mixed party",person_is_part & tour_composition_is_mixed,coef_part_time_worker_mixed_party,
3,util_university_student_mixed_party,"University Student, mixed party",person_is_univ & tour_composition_is_mixed,coef_university_student_mixed_party,
4,util_non_worker_adults_only_party,"Non-Worker, adults-only party",person_is_nonwork & tour_composition_is_adults,coef_non_worker_adults_only_party,
...,...,...,...,...,...
56,util_persons_with_home_activity_patterns_are_p...,Persons with Home activity patterns are prohib...,~travel_active,coef_unavailable,
57,util_if_only_two_available_adults_both_must_pa...,"If only two available adults, both must partic...",adult & travel_active & tour_composition_is_ad...,,coef_unavailable
58,util_if_only_one_available_adult_traveler_must...,"If only one available adult, traveler must par...",adult & travel_active & tour_composition_is_mi...,,coef_unavailable
59,util_if_only_two_available_children_both_must_...,"If only two available children, both must part...",~adult & travel_active & tour_composition_is_c...,,coef_unavailable


## Chooser data

In [6]:
data.chooser_data

participant_id,model_choice,override_choice,util_full_time_worker_mixed_party,util_part_time_worker_adults_only_party,util_part_time_worker_mixed_party,util_university_student_mixed_party,util_non_worker_adults_only_party,util_non_worker_mixed_party,util_child_too_young_for_school_children_only_party,util_child_too_young_for_school_mixed_party,...,person_is_preschool,tour_type_is_eat,tour_type_is_disc,tour_composition_is_adults,tour_composition_is_children,tour_composition_is_mixed,home_is_suburban,high_income,more_cars_than_workers,override_choice_code
778529801,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,False,False,1
778529802,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,False,False,1
870845401,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,False,False,1
870845402,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,False,False,1
971500601,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,False,False,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
29881474102,0,0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,True,False,1
30181098001,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,True,False,2
30181098002,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,True,False,1
30181098003,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,False,False,False,True,False,False,False,True,False,1


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients. Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution. Larch has built-in estimation methods, including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters. BHHH is the default and typically runs faster, but does not follow constraints on parameters.
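
To see why an over-specified model has no stable solution, consider a toy one-coefficient binary logit (not the ActivitySim model, and not Larch's optimizer). When a variable perfectly predicts the choice in a small sample, the log likelihood keeps improving as the coefficient grows, so the estimate never settles. The sketch below uses plain gradient ascent and made-up data; the function name `fit_binary_logit` is hypothetical.

```python
import math


def fit_binary_logit(xs, ys, steps, rate=0.5):
    """Gradient ascent on a one-coefficient binary logit log likelihood."""
    beta = 0.0
    for _ in range(steps):
        # Gradient of the log likelihood: sum of (observed - predicted) * x.
        grad = sum(
            (y - 1.0 / (1.0 + math.exp(-beta * x))) * x
            for x, y in zip(xs, ys)
        )
        beta += rate * grad
    return beta


# Made-up data: x is 1 for every chooser in both samples.
identified = ([1, 1, 1, 1], [1, 1, 1, 0])  # 3 of 4 participate
separated = ([1, 1, 1], [1, 1, 1])         # everyone participates

print(fit_binary_logit(*identified, steps=200))   # settles near log(3)
print(fit_binary_logit(*separated, steps=100))    # still growing
print(fit_binary_logit(*separated, steps=1000))   # larger again
```

In the first sample the estimate converges because both outcomes are observed. In the second, every additional iteration pushes the coefficient further out, which is the kind of behavior that very large estimated magnitudes can signal.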

In [7]:
model.estimate()

req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_adult_log_of_max_window_overlap_with_a_child_mixed,4.595737,2.189,0.0,,,0,,4.595737
coef_adult_log_of_max_window_overlap_with_an_adult_adult_only_party,1.692145,0.8436,0.0,,,0,,1.692145
coef_adult_more_automobiles_than_workers_adult_only_party,0.311671,-0.2133,0.0,,,0,,0.311671
coef_adult_more_automobiles_than_workers_mixed_party,-1.82774,-0.6031,0.0,,,0,,-1.82774
coef_adult_number_of_joint_tours_adult_only,-2.134054,-0.3242,0.0,,,0,,-2.134054
coef_adult_number_of_joint_tours_mixed,-28.57222,-0.3584,0.0,,,0,,-28.57222
coef_adult_number_of_other_adults_in_the_household_adults_only_party,0.0,0.0,0.0,,,1,,0.0
coef_adult_number_of_other_adults_in_the_household_mixed_party,0.0,0.0,0.0,,,1,,0.0
coef_child_log_of_max_window_overlap_with_a_child_child,1.275585,1.296,0.0,,,0,,1.275585
coef_child_log_of_max_window_overlap_with_an_adult_mixed,1.539267,1.538,0.0,,,0,,1.539267




Unnamed: 0_level_0,0
Unnamed: 0_level_1,0
coef_adult_log_of_max_window_overlap_with_a_child_mixed,4.595737e+00
coef_adult_log_of_max_window_overlap_with_an_adult_adult_only_party,1.692145e+00
coef_adult_more_automobiles_than_workers_adult_only_party,3.116710e-01
coef_adult_more_automobiles_than_workers_mixed_party,-1.827740e+00
coef_adult_number_of_joint_tours_adult_only,-2.134054e+00
coef_adult_number_of_joint_tours_mixed,-2.857222e+01
coef_adult_number_of_other_adults_in_the_household_adults_only_party,-1.648956e-17
coef_adult_number_of_other_adults_in_the_household_mixed_party,7.224396e-18
coef_child_log_of_max_window_overlap_with_a_child_child,1.275585e+00
coef_child_log_of_max_window_overlap_with_an_adult_mixed,1.539267e+00

Unnamed: 0,0
coef_adult_log_of_max_window_overlap_with_a_child_mixed,4.595737
coef_adult_log_of_max_window_overlap_with_an_adult_adult_only_party,1.692145
coef_adult_more_automobiles_than_workers_adult_only_party,0.311671
coef_adult_more_automobiles_than_workers_mixed_party,-1.82774
coef_adult_number_of_joint_tours_adult_only,-2.134054
coef_adult_number_of_joint_tours_mixed,-28.57222
coef_adult_number_of_other_adults_in_the_household_adults_only_party,-1.648956e-17
coef_adult_number_of_other_adults_in_the_household_mixed_party,7.224396e-18
coef_child_log_of_max_window_overlap_with_a_child_child,1.275585
coef_child_log_of_max_window_overlap_with_an_adult_mixed,1.539267

Unnamed: 0,0
coef_adult_log_of_max_window_overlap_with_a_child_mixed,-0.0004096171
coef_adult_log_of_max_window_overlap_with_an_adult_adult_only_party,2.436129e-05
coef_adult_more_automobiles_than_workers_adult_only_party,5.345149e-05
coef_adult_more_automobiles_than_workers_mixed_party,0.0001722466
coef_adult_number_of_joint_tours_adult_only,0.0001769263
coef_adult_number_of_joint_tours_mixed,0.0003181344
coef_adult_number_of_other_adults_in_the_household_adults_only_party,0.0
coef_adult_number_of_other_adults_in_the_household_mixed_party,0.0
coef_child_log_of_max_window_overlap_with_a_child_child,0.0
coef_child_log_of_max_window_overlap_with_an_adult_mixed,0.0


In [8]:
model.dataframes.choice_avail_summary()

Unnamed: 0,name,chosen,available
1,participate,228.0,304.0
2,not_participate,76.0,304.0
< Total All Alternatives >,,304.0,
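
As a quick check on the summary above, the observed shares follow directly from the `chosen` counts. A minimal sketch, with the counts copied by hand from the table rather than read from the model object:

```python
# Counts copied from the choice_avail_summary() output above.
chosen = {"participate": 228.0, "not_participate": 76.0}

total = sum(chosen.values())
shares = {name: count / total for name, count in chosen.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

With only 304 observed choices in this test sample, it is worth keeping the sample-size caution from the Estimate section in mind when reading the results.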


### Estimated coefficients

In [9]:
model.parameter_summary()

Unnamed: 0,Value,Std Err,t Stat,Signif,Like Ratio,Null Value,Constrained
coef_adult_log_of_max_window_overlap_with_a_child_mixed,4.6,1.53,3.00,**,,0.0,
coef_adult_log_of_max_window_overlap_with_an_adult_adult_only_party,1.69,0.833,2.03,*,,0.0,
coef_adult_more_automobiles_than_workers_adult_only_party,0.312,1.08,0.29,,,0.0,
coef_adult_more_automobiles_than_workers_mixed_party,-1.83,0.672,-2.72,**,,0.0,
coef_adult_number_of_joint_tours_adult_only,-2.13,2.0,-1.07,,,0.0,
coef_adult_number_of_joint_tours_mixed,-28.6,,,[***],499.96,0.0,
coef_adult_number_of_other_adults_in_the_household_adults_only_party,0.0,,,,,0.0,fixed value
coef_adult_number_of_other_adults_in_the_household_mixed_party,0.0,,,,,0.0,fixed value
coef_child_log_of_max_window_overlap_with_a_child_child,1.28,0.0045,283.43,***,,0.0,
coef_child_log_of_max_window_overlap_with_an_adult_mixed,1.54,0.00635,242.24,***,,0.0,
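
The `t Stat` column is the estimate's distance from its null value measured in standard errors. A minimal sketch that recomputes it for a few rows, using the rounded values displayed above (so the results match the table only approximately) and a conventional 1.96 cutoff for a two-sided 5% test, which is an assumption here rather than a Larch setting:

```python
# (value, std_err, null_value) copied from the rounded summary above.
rows = {
    "coef_adult_log_of_max_window_overlap_with_a_child_mixed": (4.6, 1.53, 0.0),
    "coef_adult_more_automobiles_than_workers_mixed_party": (-1.83, 0.672, 0.0),
    "coef_adult_number_of_joint_tours_adult_only": (-2.13, 2.0, 0.0),
}


def t_stat(value, std_err, null_value=0.0):
    """Distance of the estimate from its null value, in standard errors."""
    return (value - null_value) / std_err


CUTOFF = 1.96  # assumed two-sided 5% threshold

for name, (value, std_err, null_value) in rows.items():
    t = t_stat(value, std_err, null_value)
    verdict = "significant" if abs(t) > CUTOFF else "not significant"
    print(f"{name}: t = {t:.2f} ({verdict})")
```

Rows with no reported standard error, such as `coef_adult_number_of_joint_tours_mixed`, cannot be checked this way and deserve a closer look before the coefficients are used.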


# Output Estimation Results

In [10]:
from activitysim.estimation.larch import update_coefficients
result_dir = data.edb_directory/"estimated"
update_coefficients(
    model, data, result_dir,
    output_file=f"{modelname}_coefficients_revised.csv",
);

### Write the model estimation report, including coefficient t-statistics and log likelihood

In [11]:
model.to_xlsx(
    result_dir/f"{modelname}_model_estimation.xlsx", 
    data_statistics=False,
)

<larch.util.excel.ExcelWriter at 0x7fd49865faf0>

# Next Steps

The final step is to either manually or automatically copy the `*_coefficients_revised.csv` file to the configs folder, rename it to `*_coefficients.csv`, and run ActivitySim in simulation mode.
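
The copy-and-rename step can be scripted. The sketch below is one possible approach, not an ActivitySim utility: the function name `install_revised_coefficients` and the `configs_dir` argument are hypothetical, and the location of your configs folder depends on your model setup. It keeps a `.bak` copy of any existing coefficients file before overwriting it.

```python
import shutil
from pathlib import Path


def install_revised_coefficients(result_dir, configs_dir, modelname):
    """Copy <modelname>_coefficients_revised.csv into configs_dir as
    <modelname>_coefficients.csv, backing up any existing file first."""
    src = Path(result_dir) / f"{modelname}_coefficients_revised.csv"
    dst = Path(configs_dir) / f"{modelname}_coefficients.csv"
    if dst.exists():
        # Keep the previous coefficients alongside the new file.
        shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
    shutil.copy2(src, dst)
    return dst


# Example call (the configs path is a placeholder for your own setup):
# install_revised_coefficients(result_dir, "path/to/configs", modelname)
```

Review the revised values before installing them, since the copy replaces the coefficients that simulation mode will read.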

In [12]:
pd.read_csv(result_dir/f"{modelname}_coefficients_revised.csv")

Unnamed: 0,coefficient_name,value,constrain
0,coef_unavailable,-999.0,T
1,coef_full_time_worker_mixed_party,-957.202749,F
2,coef_full_time_worker_mixed_party_not,0.5,T
3,coef_part_time_worker_adults_only_party,-3.248302,F
4,coef_part_time_worker_adults_only_party_not,0.5,T
5,coef_part_time_worker_mixed_party,2254.660944,F
6,coef_university_student_mixed_party,-958.612314,F
7,coef_non_worker_adults_only_party,-4.080384,F
8,coef_non_worker_mixed_party,-125.288137,F
9,coef_child_too_young_for_school_children_only_...,-2.785631,F
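
Several of the revised values above are orders of magnitude larger than their starting values, which is consistent with the over-specification caution in the Estimate section. A minimal sketch that flags such coefficients by comparing the initial and revised values shown in this notebook. The threshold of 100 is an arbitrary assumption chosen for illustration, and only rows whose full names appear in both tables are included.

```python
# (initial, revised) values copied from the two tables in this notebook.
values = {
    "coef_full_time_worker_mixed_party": (-3.566, -957.202749),
    "coef_part_time_worker_adults_only_party": (-3.566, -3.248302),
    "coef_part_time_worker_mixed_party": (-0.3655, 2254.660944),
    "coef_university_student_mixed_party": (-3.041, -958.612314),
    "coef_non_worker_adults_only_party": (-3.164, -4.080384),
    "coef_non_worker_mixed_party": (0.7152, -125.288137),
}

MAX_PLAUSIBLE = 100.0  # assumed cutoff, not an ActivitySim or Larch default

flagged = sorted(
    name
    for name, (_initial, revised) in values.items()
    if abs(revised) > MAX_PLAUSIBLE
)

for name in flagged:
    initial, revised = values[name]
    print(f"{name}: {initial} -> {revised}")
```

Flagged coefficients are candidates for re-estimation with a larger sample, or for being held fixed, before the file is copied into the configs folder.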
