# Estimating Non-Mandatory Tour Frequency

This notebook illustrates how to re-estimate a single model component for ActivitySim.  This process 
includes running ActivitySim in estimation mode to read household travel survey files and write out
the estimation data bundles used in this notebook.  To review how to do so, please visit the other
notebooks in this directory.

# Load libraries

In [1]:
import os
import larch  # !conda install larch -c conda-forge # for estimation
import pandas as pd

We'll work in our `test` directory, where ActivitySim has saved the estimation data bundles.

In [2]:
os.chdir('test')

# Load data and prep model for estimation

In [3]:
modelname = "nonmand_tour_freq"

from activitysim.estimation.larch import component_model
model, data = component_model(modelname, return_data=True)

This component actually has a distinct choice model for each person type, so
instead of a single model there's a `dict` of models.

In [4]:
type(model)

dict

In [5]:
model.keys()

dict_keys(['PTYPE_FULL', 'PTYPE_PART', 'PTYPE_UNIVERSITY', 'PTYPE_NONWORK', 'PTYPE_RETIRED', 'PTYPE_DRIVING', 'PTYPE_SCHOOL', 'PTYPE_PRESCHOOL'])

# Review data loaded from the EDB

We can also review the data that was loaded. As with the models, there is separate data
for each person type.

## Coefficients

In [6]:
data.coefficients['PTYPE_FULL']

coefficient_name,value,constrain
coef_escorting_tour,0.0000,T
coef_discretionary_tour,0.0000,T
coef_shopping_tour,0.0000,T
coef_maintenance_tour,0.0000,T
coef_visiting_or_social_tour,0.0000,T
...,...,...
coef_1_plus_maintenance_tours_constant,0.1202,F
coef_1_plus_eating_out_tours_constant,0.0097,F
coef_1_plus_visting_tours_constant,0.0522,F
coef_1_plus_other_discretionary_tours_constant,0.7412,F


## Utility specification

In [7]:
data.spec['PTYPE_FULL']

0                   coef_urban_and_discretionary_tour
1                   coef_urban_and_discretionary_tour
2                   coef_urban_and_discretionary_tour
3                   coef_urban_and_discretionary_tour
4                   coef_urban_and_discretionary_tour
                            ...                      
205            coef_1_plus_maintenance_tours_constant
206             coef_1_plus_eating_out_tours_constant
207                coef_1_plus_visting_tours_constant
208    coef_1_plus_other_discretionary_tours_constant
209          coef_0_auto_household_and_escorting_tour
Name: PTYPE_FULL, Length: 210, dtype: object

## Chooser data

In [8]:
data.chooser_data['PTYPE_FULL']

Unnamed: 0,person_id,model_choice,override_choice,household_id,age,PNUM,sex,pemploy,pstudent,ptype,...,high_income,no_cars,car_sufficiency,num_hh_joint_shop_tours,num_hh_joint_eatout_tours,num_hh_joint_maint_tours,num_hh_joint_social_tours,num_hh_joint_othdiscr_tours,has_mandatory_tour,has_joint_tour
0,72241,0,0,72241,56,1,1,1,3,1,...,False,False,0,0,0,0,0,0,1,0
1,72441,0,0,72441,49,1,1,1,3,1,...,False,False,0,0,0,0,0,0,1,0
2,73144,0,0,73144,31,1,2,1,3,1,...,False,True,-1,0,0,0,0,0,1,0
3,73493,0,0,73493,31,1,2,1,3,1,...,False,False,0,0,0,0,0,0,1,0
4,73706,0,0,73706,26,1,1,1,3,1,...,False,False,0,0,0,0,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1759,7512288,5,5,2820953,28,1,1,1,3,1,...,False,True,-1,0,0,0,0,0,0,0
1760,7512469,1,1,2821134,34,1,2,1,3,1,...,False,True,-1,0,0,0,0,0,1,0
1761,7513117,0,0,2821782,16,1,1,1,3,1,...,False,True,-1,0,0,0,0,0,1,0
1762,7513996,0,0,2822661,24,1,1,1,3,1,...,False,False,0,0,0,0,0,0,1,0


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients.  Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution.  Larch has built-in estimation methods, including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters.  BHHH is the default and typically runs faster, but it does not respect bounds or constraints on parameters.
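To make the role of bounds concrete, here is a minimal sketch that calls `scipy.optimize.minimize` with `method="SLSQP"` directly on a toy binary logit. It is not Larch's internal code: the synthetic data, the `neg_log_likelihood` function, and the cap of 1.0 on the slope are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data (illustrative only): one explanatory variable, binary choice.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
true_const, true_slope = -0.5, 2.0
p = 1.0 / (1.0 + np.exp(-(true_const + true_slope * x)))
y = (rng.random(500) < p).astype(float)


def neg_log_likelihood(beta):
    """Negative log likelihood of a binary logit with a constant and a slope."""
    v = beta[0] + beta[1] * x
    # log(1 + exp(v)) is computed stably with logaddexp.
    return np.sum(np.logaddexp(0.0, v) - y * v)


# Unbounded fit, for reference.
free = minimize(neg_log_likelihood, x0=[0.0, 0.0], method="SLSQP")

# Bounded fit: the slope is capped at 1.0 (a hypothetical bound).
bounded = minimize(
    neg_log_likelihood,
    x0=[0.0, 0.0],
    method="SLSQP",
    bounds=[(None, None), (None, 1.0)],
)

print("free:   ", free.x)
print("bounded:", bounded.x)
```

In this sketch the bounded slope stops at the cap while the unbounded one does not, and the bounded fit has a likelihood that is no better. That is the trade-off an optimizer that honors bounds accepts.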

In [9]:
for k, m in model.items():
    m.estimate(method='SLSQP')

req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.000000,-2.0000,0.0,,,1,,-2.000000
coef_1_escort_tour_constant,0.319037,0.0298,0.0,,,0,,0.319037
coef_1_plus_eating_out_tours_constant,-1.012856,0.0097,0.0,,,0,,-1.012856
coef_1_plus_maintenance_tours_constant,-2.842643,0.1202,0.0,,,0,,-2.842643
coef_1_plus_other_discretionary_tours_constant,10.543980,0.7412,0.0,,,0,,10.543980
...,...,...,...,...,...,...,...,...
coef_walk_access_to_retail_and_discretionary,0.160520,0.0567,0.0,,,0,,0.160520
coef_walk_access_to_retail_and_eating_out,0.211289,0.1450,0.0,,,0,,0.211289
coef_walk_access_to_retail_and_escorting,-0.105285,0.0451,0.0,,,0,,-0.105285
coef_walk_access_to_retail_and_shopping,0.030182,0.0330,0.0,,,0,,0.030182


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.0,-2.0,0.0,,,1,,-2.0
coef_1_escort_tour_constant,0.777927,0.5272,0.0,,,0,,0.777927
coef_1_plus_eating_out_tours_constant,1.665075,0.6914,0.0,,,0,,1.665075
coef_1_plus_maintenance_tours_constant,1.214734,0.5533,0.0,,,0,,1.214734
coef_1_plus_other_discretionary_tours_constant,1.783871,0.7989,0.0,,,0,,1.783871
coef_1_plus_shopping_tours_constant,1.521613,0.7569,0.0,,,0,,1.521613
coef_1_plus_visting_tours_constant,1.045482,0.1405,0.0,,,0,,1.045482
coef_2_plus_escort_tours_constant,1.882661,1.5987,0.0,,,0,,1.882661
coef_car_shortage_vs_workers_and_tour_frequency_is_5_plus,0.049232,-0.5498,0.0,,,0,,0.049232
coef_female_and_discretionary_tour,0.439252,0.3072,0.0,,,0,,0.439252


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.000000,-2.0000,0.0,,,1,,-2.000000
coef_1_escort_tour_constant,1.851506,1.7028,0.0,,,0,,1.851506
coef_1_plus_eating_out_tours_constant,-13.970901,2.0723,0.0,,,0,,-13.970901
coef_1_plus_maintenance_tours_constant,-0.338031,0.3348,0.0,,,0,,-0.338031
coef_1_plus_other_discretionary_tours_constant,-1.473992,1.3389,0.0,,,0,,-1.473992
...,...,...,...,...,...,...,...,...
coef_urban_and_shopping_tour,0.711573,0.5330,0.0,,,0,,0.711573
coef_urban_and_tour_frequency_is_1,-3.537340,-1.1648,0.0,,,0,,-3.537340
coef_urban_and_tour_frequency_is_2,-4.729195,-2.3177,0.0,,,0,,-4.729195
coef_urban_and_tour_frequency_is_5_plus,-6.020888,-2.5027,0.0,,,0,,-6.020888


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.000000,-2.0000,0.0,,,1,,-2.000000
coef_1_escort_tour_constant,-4.317861,-0.0629,0.0,,,0,,-4.317861
coef_1_plus_eating_out_tours_constant,-4.459266,-0.1429,0.0,,,0,,-4.459266
coef_1_plus_maintenance_tours_constant,-9.609589,-0.0653,0.0,,,0,,-9.609589
coef_1_plus_other_discretionary_tours_constant,-4.488177,0.3334,0.0,,,0,,-4.488177
...,...,...,...,...,...,...,...,...
coef_walk_access_to_retail_and_discretionary,0.214072,0.0772,0.0,,,0,,0.214072
coef_walk_access_to_retail_and_shopping,0.039849,0.0598,0.0,,,0,,0.039849
coef_walk_access_to_retail_and_tour_frequency_is_1,-0.507092,0.0713,0.0,,,0,,-0.507092
coef_walk_access_to_retail_and_tour_frequency_is_2,-0.403984,0.1256,0.0,,,0,,-0.403984


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.0,-2.0,0.0,,,1,,-2.0
coef_1_escort_tour_constant,-2.730596,-0.3992,0.0,,,0,,-2.730596
coef_1_plus_eating_out_tours_constant,-2.583305,0.0245,0.0,,,0,,-2.583305
coef_1_plus_maintenance_tours_constant,-2.299219,0.1046,0.0,,,0,,-2.299219
coef_1_plus_other_discretionary_tours_constant,-2.243196,0.4282,0.0,,,0,,-2.243196
coef_1_plus_shopping_tours_constant,-1.916643,0.5947,0.0,,,0,,-1.916643
coef_1_plus_visting_tours_constant,-2.344018,0.2789,0.0,,,0,,-2.344018
coef_2_plus_escort_tours_constant,-4.935936,0.5175,0.0,,,0,,-4.935936
coef_car_surplus_vs_workers_and_tour_frequency_is_1,3.094754,0.7965,0.0,,,0,,3.094754
coef_car_surplus_vs_workers_and_tour_frequency_is_5_plus,3.475854,2.1302,0.0,,,0,,3.475854


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-289.822903,-2.0,0.0,,,0,,-289.822903
coef_1_escort_tour_constant,-6435.039018,-0.4934,0.0,,,0,,-6435.039018
coef_1_plus_eating_out_tours_constant,-3885.458878,-0.0242,0.0,,,0,,-3885.458878
coef_1_plus_maintenance_tours_constant,-1195.820538,-0.4344,0.0,,,0,,-1195.820538
coef_1_plus_other_discretionary_tours_constant,-2406.835103,-0.2602,0.0,,,0,,-2406.835103
coef_1_plus_shopping_tours_constant,-2389.332994,0.532,0.0,,,0,,-2389.332994
coef_1_plus_visting_tours_constant,-2389.333045,0.2367,0.0,,,0,,-2389.333045
coef_2_plus_escort_tours_constant,-5667.753696,1.4155,0.0,,,0,,-5667.753696
coef_auto_access_to_retail_and_tour_frequency_is_5_plus,-44.188317,0.1004,0.0,,,0,,-44.188317
coef_car_shortage_vs_workers_and_tour_frequency_is_5_plus,-17233.03819,-0.6369,0.0,,,0,,-17233.03819


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.0,-2.0,0.0,,,1,,-2.0
coef_1_escort_tour_constant,-1.572081,-0.7551,0.0,,,0,,-1.572081
coef_1_plus_eating_out_tours_constant,-2.22221,1.1145,0.0,,,0,,-2.22221
coef_1_plus_maintenance_tours_constant,-1.497135,-0.506,0.0,,,0,,-1.497135
coef_1_plus_other_discretionary_tours_constant,0.090818,0.4634,0.0,,,0,,0.090818
coef_1_plus_shopping_tours_constant,1.474688,0.4783,0.0,,,0,,1.474688
coef_1_plus_visting_tours_constant,-1.792608,-0.4006,0.0,,,0,,-1.792608
coef_2_plus_escort_tours_constant,-2.059623,-0.0086,0.0,,,0,,-2.059623
coef_auto_access_to_retail_and_escorting,0.555108,0.0629,0.0,,,0,,0.555108
coef_high_income_group_and_eating_out_tour,-1.36198,-0.701,0.0,,,0,,-1.36198


req_data does not request avail_ca or avail_co but it is set and being provided


Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto_household_and_escorting_tour,-2.0,-2.0,0.0,,,1,,-2.0
coef_1_escort_tour_constant,0.179651,0.3622,0.0,,,0,,0.179651
coef_1_plus_eating_out_tours_constant,0.204827,0.9612,0.0,,,0,,0.204827
coef_1_plus_maintenance_tours_constant,0.073169,0.6788,0.0,,,0,,0.073169
coef_1_plus_other_discretionary_tours_constant,1.133662,1.4935,0.0,,,0,,1.133662
coef_1_plus_shopping_tours_constant,0.785367,1.6919,0.0,,,0,,0.785367
coef_1_plus_visting_tours_constant,-0.011286,0.4424,0.0,,,0,,-0.011286
coef_2_plus_escort_tours_constant,1.744429,2.2219,0.0,,,0,,1.744429
coef_discretionary_tour,0.543162,0.903,0.0,,,0,,0.543162
coef_escorting_tour,1.353509,2.491,0.0,,,0,,1.353509




### Estimated coefficients

In [10]:
model['PTYPE_FULL'].parameter_summary()

Unnamed: 0,Value,Std Err,t Stat,Signif,Like Ratio,Null Value,Constrained
coef_0_auto_household_and_escorting_tour,-2.0,,,,,0.0,fixed value
coef_1_escort_tour_constant,0.319,362.0,0.0,,,0.0,
coef_1_plus_eating_out_tours_constant,-1.01,143.0,-0.01,,,0.0,
coef_1_plus_maintenance_tours_constant,-2.84,143.0,-0.02,,,0.0,
coef_1_plus_other_discretionary_tours_constant,10.5,144.0,0.07,,,0.0,
coef_1_plus_shopping_tours_constant,7.18,143.0,0.05,,,0.0,
coef_1_plus_visting_tours_constant,-0.321,143.0,-0.0,,,0.0,
coef_2_plus_escort_tours_constant,0.607,725.0,0.0,,,0.0,
coef_at_home_pre_driving_school_kid_and_escorting_tour,-0.926,1.05,-0.89,,,0.0,
coef_at_home_pre_school_kid_and_discretionary_tour,-0.656,0.743,-0.88,,,0.0,


# Output Estimation Results

In [11]:
from activitysim.estimation.larch import update_coefficients
for k, m in model.items():
    result_dir = data.edb_directory/k/"estimated"
    update_coefficients(
        m, data.coefficients[k], result_dir,
        output_file=f"{modelname}_{k}_coefficients_revised.csv",
    );

### Write the model estimation report, including coefficient t-statistics and log-likelihood

In [12]:
for k, m in model.items():
    result_dir = data.edb_directory/k/"estimated"
    m.to_xlsx(
        result_dir/f"{modelname}_{k}_model_estimation.xlsx", 
        data_statistics=False,
    )

# Next Steps

The final step is to copy each `*_coefficients_revised.csv` file to the configs folder (by hand or with a script), rename it to `*_coefficients.csv`, and run ActivitySim in simulation mode.

In [15]:
result_dir = data.edb_directory/'PTYPE_FULL'/"estimated"
pd.read_csv(result_dir/f"{modelname}_PTYPE_FULL_coefficients_revised.csv")

Unnamed: 0,coefficient_name,value,constrain
0,coef_escorting_tour,0.000000,T
1,coef_discretionary_tour,0.000000,T
2,coef_shopping_tour,0.000000,T
3,coef_maintenance_tour,0.000000,T
4,coef_visiting_or_social_tour,0.000000,T
...,...,...,...
205,coef_1_plus_maintenance_tours_constant,-2.842643,F
206,coef_1_plus_eating_out_tours_constant,-1.012856,F
207,coef_1_plus_visting_tours_constant,-0.320820,F
208,coef_1_plus_other_discretionary_tours_constant,10.543980,F
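The copy-and-rename step described under Next Steps could be scripted roughly as follows. This is a sketch under stated assumptions: the helper `install_revised_coefficients` is hypothetical, the directories are throwaway stand-ins created for the demonstration, and the simple rename follows the `*_coefficients_revised.csv` to `*_coefficients.csv` pattern only. The file name your configs actually expect is set in the model settings and may differ.

```python
import shutil
import tempfile
from pathlib import Path


def install_revised_coefficients(estimated_dir, configs_dir):
    """Copy each *_coefficients_revised.csv into configs_dir as *_coefficients.csv."""
    copied = []
    for src in sorted(Path(estimated_dir).glob("*_coefficients_revised.csv")):
        new_name = src.name.replace("_coefficients_revised.csv", "_coefficients.csv")
        dst = Path(configs_dir) / new_name
        shutil.copyfile(src, dst)
        copied.append(dst)
    return copied


# Demonstration in a temporary directory with a stand-in file (not real results).
with tempfile.TemporaryDirectory() as tmp:
    estimated = Path(tmp) / "estimated"
    configs = Path(tmp) / "configs"
    estimated.mkdir()
    configs.mkdir()
    (estimated / "nonmand_tour_freq_PTYPE_FULL_coefficients_revised.csv").write_text(
        "coefficient_name,value,constrain\ncoef_escorting_tour,0.0,T\n"
    )
    copied_names = [p.name for p in install_revised_coefficients(estimated, configs)]
    print(copied_names)
```

Because this component writes one revised file per person type, the helper would need to be run once per `estimated` directory. Back up the existing coefficient files before overwriting them.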
