# Estimating Work from Home Model

This notebook illustrates how to re-estimate a single model component for ActivitySim.  This process 
includes running ActivitySim in estimation mode to read household travel survey files and write out
the estimation data bundles used in this notebook.  To review how to do so, please visit the other
notebooks in this directory.

# Load libraries

In [1]:
import os
import larch  # !conda install larch -c conda-forge # for estimation
import pandas as pd
import numpy as np
from larch import P, X
import matplotlib.pyplot as plt

# Load data and prep model for estimation

In [2]:
os.chdir('/projects/SANDAG/2017 On-Call Modeling Services/Area B/TO 05 - ABM3/estimation')
modelname = "work_from_home"

from activitysim.estimation.larch import component_model
model, data = component_model(modelname, return_data=True)

# Review data loaded from the EDB

The next step is to review what was read from the estimation data bundle (EDB), including the coefficients, model settings, utilities specification, and chooser and alternative data.

### Coefficients

In [3]:
data.coefficients

Unnamed: 0_level_0,value,constrain
coefficient_name,Unnamed: 1_level_1,Unnamed: 2_level_1
coef_work_from_home_constant,0.0,F
coef_part_time_worker,0.0,F
coef_access_to_workplaces,0.1,T
coef_non_working_adult_in_hh,0.0,T
coef_age_35_to_44,0.0,F
coef_age_45_to_54,0.0,F
coef_age_55_to_64,0.0,F
coef_2016,0.0,F
coef_age_65_79,0.0,F
coef_age_80_plus,0.0,F


### Utility specification

In [4]:
data.spec

Unnamed: 0,Label,Description,Expression,work_at_home,work_away_from_home
0,util_work_from_home_constant,Constant for Working from home,1,coef_work_from_home_constant,
1,util_part_time_worker,Part time worker,@df.util_part_time_worker==1,coef_part_time_worker,
2,util_access_to_workplaces_2022,Accessibility to workplaces of the home mgra,@df.workplace_location_accessibility,,coef_access_to_workplaces
3,util_access_to_workplaces_2016,Accessibility to workplaces of the home mgra,@df.workplace_location_accessibility,,coef_access_to_workplaces
4,util_non_working_adult_in_hh,Presence of Non Working Adult in the Household,"@other_than(df.household_id, df.ptype == PTYPE...",coef_non_working_adult_in_hh,
5,util_2016,Year 2016 survey,PRE_COVID=1,coef_2016,
6,util_age_35_to_44,Age Group - 35 yrs to 44 yrs,"@df.age.between(35, 44)",coef_age_35_to_44,
7,util_age_45_to_54,Age Group - 45 yrs to 54 yrs,"@df.age.between(45, 54)",coef_age_45_to_54,
8,util_age_55_to_64,Age Group - 55 yrs to 64 yrs,"@df.age.between(55, 64)",coef_age_55_to_64,
9,util_age_65_79,Age 65-79,"@df.age.between(65, 79)",coef_age_65_79,
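
To make the layout of the spec concrete, the sketch below shows how rows of this kind combine into a utility for each alternative and then into a logit probability. The coefficient values and the example person are made up for illustration and are not taken from the EDB. The helper names `utilities` and `logit_probabilities` are hypothetical, and ActivitySim's own evaluation of the expressions is more involved than this.

```python
import math

# Hypothetical coefficient values, for illustration only (not estimates).
coefficients = {
    "coef_work_from_home_constant": -2.0,
    "coef_part_time_worker": 0.5,
    "coef_access_to_workplaces": 0.1,
}

# Each row: (value of the expression for one person, coefficient name per alternative).
# An alternative that is absent from a row gets no contribution from that row,
# which mirrors the empty cells in the spec table.
example_rows = [
    (1.0, {"work_at_home": "coef_work_from_home_constant"}),
    (1.0, {"work_at_home": "coef_part_time_worker"}),  # a part-time worker
    (11.4, {"work_away_from_home": "coef_access_to_workplaces"}),
]


def utilities(rows, coefs, alternatives):
    """Sum coefficient * expression value over the rows, per alternative."""
    totals = {alt: 0.0 for alt in alternatives}
    for value, by_alt in rows:
        for alt, coef_name in by_alt.items():
            totals[alt] += coefs[coef_name] * value
    return totals


def logit_probabilities(utils):
    """Logit probabilities, shifted by the largest utility for numerical stability."""
    top = max(utils.values())
    exp_utils = {alt: math.exp(u - top) for alt, u in utils.items()}
    total = sum(exp_utils.values())
    return {alt: e / total for alt, e in exp_utils.items()}


alternatives = ["work_at_home", "work_away_from_home"]
utils = utilities(example_rows, coefficients, alternatives)
probs = logit_probabilities(utils)
print(utils)
print(probs)
```

Only differences in utility between the two alternatives matter for the probabilities, which is why a term can be placed on either column of the spec with the opposite sign.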


### Chooser data

In [5]:
pd.crosstab(data.chooser_data.util_2016,data.chooser_data.override_choice,margins=True)

override_choice,False,True,All
util_2016,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
0,2243,560,2803
1,5802,415,6217
All,8045,975,9020
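
The counts in this crosstab are easier to compare as shares. The sketch below recomputes the row shares from the counts shown above, assuming that `override_choice == True` marks a surveyed person who works from home and that `util_2016 == 1` marks a record from the 2016 survey. The dictionary layout and variable names are illustrative only.

```python
# Counts copied from the crosstab above: {util_2016: {override_choice: count}}.
counts = {
    0: {False: 2243, True: 560},
    1: {False: 5802, True: 415},
}

# Share of records with override_choice == True within each util_2016 group.
shares = {}
for survey_flag, by_choice in counts.items():
    group_total = by_choice[False] + by_choice[True]
    shares[survey_flag] = by_choice[True] / group_total

# Overall share across both groups.
total_true = sum(by_choice[True] for by_choice in counts.values())
total_all = sum(sum(by_choice.values()) for by_choice in counts.values())
overall_share = total_true / total_all

for survey_flag, share in shares.items():
    print(f"util_2016={survey_flag}: {share:.1%}")
print(f"all records: {overall_share:.1%}")
```

With pandas available, `pd.crosstab(..., normalize="index")` gives the same row shares directly from the chooser data.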


In [6]:
data.chooser_data

Unnamed: 0_level_0,unique_id,person_id,model_choice,override_choice,util_work_from_home_constant,util_full_time_worker,util_part_time_worker,util_female_worker,util_female_worker_preschool_child,util_access_to_workplaces,...,terminal_time,household_density,population_density,employment_density,density_index,is_cbd,tot_collegeenroll,preschool_target,is_parking_zone,override_choice_code
household_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
4,2.020000e+13,8,True,False,1,0,1,0,0,11.417169,...,0,7.701544,19.129643,0.662498,0.610023,False,0,239,True,2
4,2.020000e+13,9,False,False,1,1,0,1,0,11.417169,...,0,7.701544,19.129643,0.662498,0.610023,False,0,239,True,2
7,2.020000e+13,14,False,False,1,1,0,1,0,11.226526,...,0,0.000000,0.000000,0.000000,0.000000,False,0,0,True,2
7,2.020000e+13,15,False,False,1,0,1,0,0,11.226526,...,0,0.000000,0.000000,0.000000,0.000000,False,0,0,True,2
10,2.020000e+13,21,False,False,1,0,1,0,0,10.983580,...,0,5.636361,12.890566,0.260943,0.249396,False,0,252,True,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
49202,2.020000e+12,95841,False,False,1,1,0,0,0,11.033125,...,0,0.000000,0.000000,0.000000,0.000000,False,0,0,True,2
49594,2.020000e+12,96564,False,False,1,1,0,0,0,10.549933,...,0,2.888253,10.205162,0.866476,0.666520,False,0,115,True,2
49594,2.020000e+12,96565,True,False,1,1,0,1,0,10.549933,...,0,2.888253,10.205162,0.866476,0.666520,False,0,115,True,2
49618,2.020000e+12,96615,False,False,1,0,1,1,0,10.607123,...,0,13.499831,46.852356,0.000000,0.000000,False,0,236,True,2


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients.  Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution.  Larch has built-in estimation methods, including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters.  BHHH is the default and typically runs faster, but does not follow constraints on parameters.
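
As a rough illustration of what maximizing the likelihood with a held-fixed parameter means, the sketch below fits a two-parameter binary logit to the counts from the `util_2016` crosstab shown earlier. It uses plain gradient ascent, not BHHH or SLSQP, and it is not how Larch is implemented. The function names and the `holdfast` argument are hypothetical stand-ins for the idea of keeping a coefficient at its initial value.

```python
import math

# Grouped observations: (dummy value, count choosing True, count choosing False),
# taken from the crosstab of util_2016 by override_choice shown earlier.
groups = [
    (0.0, 560, 2243),
    (1.0, 415, 5802),
]


def avg_loglike_and_gradient(params):
    """Average log likelihood and gradient of a binary logit whose utility
    for the True outcome is V = constant + slope * dummy."""
    constant, slope = params
    n_total = sum(n_true + n_false for _, n_true, n_false in groups)
    loglike = 0.0
    grad = [0.0, 0.0]
    for dummy, n_true, n_false in groups:
        v = constant + slope * dummy
        p_true = 1.0 / (1.0 + math.exp(-v))
        loglike += n_true * math.log(p_true) + n_false * math.log(1.0 - p_true)
        # Derivative with respect to V: observed minus expected count of True.
        resid = n_true - (n_true + n_false) * p_true
        grad[0] += resid
        grad[1] += resid * dummy
    return loglike / n_total, [g / n_total for g in grad]


def maximize(start, holdfast=(False, False), step=2.0, iterations=5000):
    """Gradient ascent; parameters flagged in holdfast never move."""
    params = list(start)
    for _ in range(iterations):
        _, grad = avg_loglike_and_gradient(params)
        for i, fixed in enumerate(holdfast):
            if not fixed:
                params[i] += step * grad[i]
    return params


free_fit = maximize([0.0, 0.0])
fixed_fit = maximize([0.0, 0.0], holdfast=(False, True))
print("both free:        ", free_fit)
print("slope held at 0.0:", fixed_fit)
```

Holding the slope at zero forces the constant to absorb the pooled share of both groups, so the two fits give different constants. A fixed-step update like this one has no way to respect general bounds, which is the kind of limitation that motivates a constrained optimizer such as SLSQP.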

In [7]:
model.load_data()

req_data does not request avail_ca or avail_co but it is set and being provided
converting data_co to <class 'numpy.float64'>


In [8]:
model.maximize_loglike(method="SLSQP")

Unnamed: 0,value,initvalue,nullvalue,minimum,maximum,holdfast,note,best
coef_0_auto,0.0,0.0,0.0,0.0,0.0,1,,0.0
coef_2016,-1.062949,0.0,0.0,,,0,,-1.062949
coef_access_to_workplaces,0.1,0.1,0.0,0.1,0.1,1,,0.1
coef_age_35_to_44,0.35353,0.0,0.0,,,0,,0.35353
coef_age_45_to_54,0.468134,0.0,0.0,,,0,,0.468134
coef_age_55_to_64,0.495172,0.0,0.0,,,0,,0.495172
coef_age_65_79,1.243579,0.0,0.0,,,0,,1.243579
coef_age_80_plus,0.770722,0.0,0.0,,,0,,0.770722
coef_auto_ge_adults,0.0,0.0,0.0,0.0,0.0,1,,0.0
coef_auto_lt_adults,0.0,0.0,0.0,0.0,0.0,1,,0.0


Unnamed: 0_level_0,0
Unnamed: 0_level_1,0
coef_0_auto,0.000000
coef_2016,-1.062949
coef_access_to_workplaces,0.100000
coef_age_35_to_44,0.353530
coef_age_45_to_54,0.468134
coef_age_55_to_64,0.495172
coef_age_65_79,1.243579
coef_age_80_plus,0.770722
coef_auto_ge_adults,0.000000
coef_auto_lt_adults,0.000000

Unnamed: 0,0
coef_0_auto,0.0
coef_2016,-1.062949
coef_access_to_workplaces,0.1
coef_age_35_to_44,0.35353
coef_age_45_to_54,0.468134
coef_age_55_to_64,0.495172
coef_age_65_79,1.243579
coef_age_80_plus,0.770722
coef_auto_ge_adults,0.0
coef_auto_lt_adults,0.0

Unnamed: 0,0
coef_0_auto,0.0
coef_2016,-0.001184
coef_access_to_workplaces,0.0
coef_age_35_to_44,0.000487
coef_age_45_to_54,-0.00019
coef_age_55_to_64,-0.00079
coef_age_65_79,-0.000723
coef_age_80_plus,6.6e-05
coef_auto_ge_adults,0.0
coef_auto_lt_adults,0.0


### Estimated coefficients

In [9]:
model.parameter_summary()

Unnamed: 0,Value,Null Value
coef_0_auto,0.0,0.0
coef_2016,-1.06,0.0
coef_access_to_workplaces,0.1,0.0
coef_age_35_to_44,0.354,0.0
coef_age_45_to_54,0.468,0.0
coef_age_55_to_64,0.495,0.0
coef_age_65_79,1.24,0.0
coef_age_80_plus,0.771,0.0
coef_auto_ge_adults,0.0,0.0
coef_auto_lt_adults,0.0,0.0
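
One way to read logit coefficients is as odds ratios: `exp(coefficient)` is the multiplicative change in the odds of an alternative when the attached variable goes from 0 to 1, all else equal. The sketch below applies this to the estimates above for the coefficients that the utility specification shows on the `work_at_home` alternative. The values are copied from the estimation output and the variable names are illustrative.

```python
import math

# Estimated values copied from the maximize_loglike output above; per the
# utility specification, each of these enters the work_at_home utility.
estimates = {
    "coef_2016": -1.062949,
    "coef_age_35_to_44": 0.35353,
    "coef_age_45_to_54": 0.468134,
    "coef_age_55_to_64": 0.495172,
    "coef_age_65_79": 1.243579,
}

# exp(coefficient) = factor applied to the odds of working at home versus
# working away from home when the indicator switches from 0 to 1.
odds_ratios = {name: math.exp(value) for name, value in estimates.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:<20s} {ratio:6.3f}")
```

A ratio below 1 (as for `coef_2016`) lowers the odds of working from home, and a ratio above 1 (as for the age groups) raises them relative to the omitted base group.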


# Output Estimation Results

In [10]:
from activitysim.estimation.larch import update_coefficients
result_dir = data.edb_directory/"estimated"
update_coefficients(
    model, data, result_dir,
    output_file=f"{modelname}_coefficients_revised.csv",
);

### Write the model estimation report, including coefficient t-statistics and log likelihood

In [11]:
model.calculate_parameter_covariance()
result_dir='/projects/SANDAG/2017 On-Call Modeling Services/Area B/TO 05 - ABM3/estimation/'
model.to_xlsx(
    result_dir+"work_from_home_10.xlsx", 
    data_statistics=True,
)

  xl = ExcelWriter(filename, engine='xlsxwriter_larch', model=model, **kwargs)


<larch.util.excel.ExcelWriter at 0x19d1970f0d0>

# Next Steps

The final step is to either manually or automatically copy the `*_coefficients_revised.csv` file to the configs folder, rename it to `*_coefficients.csv`, and run ActivitySim in simulation mode.
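
The copy-and-rename step can be scripted. The sketch below is one possible way to do it with the standard library. The function name `install_revised_coefficients`, the `.bak` backup convention, and the directory names are assumptions for illustration and are not part of ActivitySim. The demonstration runs in a temporary directory so it does not touch a real configs folder.

```python
import shutil
import tempfile
from pathlib import Path


def install_revised_coefficients(estimated_dir, configs_dir, model_name):
    """Copy <model_name>_coefficients_revised.csv from estimated_dir into
    configs_dir as <model_name>_coefficients.csv and return the new path."""
    source = Path(estimated_dir) / f"{model_name}_coefficients_revised.csv"
    target = Path(configs_dir) / f"{model_name}_coefficients.csv"
    if target.exists():
        # Keep the previous coefficients so the change can be undone.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(source, target)
    return target


# Demonstration with throwaway folders and placeholder file contents.
with tempfile.TemporaryDirectory() as tmp:
    estimated = Path(tmp) / "estimated"
    configs = Path(tmp) / "configs"
    estimated.mkdir()
    configs.mkdir()
    (estimated / "work_from_home_coefficients_revised.csv").write_text("revised\n")
    (configs / "work_from_home_coefficients.csv").write_text("original\n")

    installed = install_revised_coefficients(estimated, configs, "work_from_home")
    installed_name = installed.name
    installed_text = installed.read_text()
    backup_text = (configs / "work_from_home_coefficients.csv.bak").read_text()

print(installed_name, installed_text.strip(), backup_text.strip())
```

Keeping a backup of the previous coefficients file makes it easy to revert if the re-estimated values behave unexpectedly in simulation mode.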

In [12]:
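# result_dir was reassigned to a plain string in the previous cell, so rebuild
# the path to the "estimated" folder from the EDB directory before using "/".
pd.read_csv(data.edb_directory/"estimated"/f"{modelname}_coefficients_revised.csv")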