# Estimating Auto Ownership

This notebook illustrates how to re-estimate a single model component for ActivitySim.  This process 
includes running ActivitySim in estimation mode to read household travel survey files and write out
the estimation data bundles used in this notebook.  To review how to do so, please visit the other
notebooks in this directory.

# Load libraries

In [1]:
import larch as lx
import pandas as pd

lx.versions()

JAX not found. Some functionality will be unavailable.


{'larch': '6.0.32',
 'sharrow': '2.13.0',
 'numpy': '1.26.4',
 'pandas': '1.5.3',
 'xarray': '2024.3.0',
 'numba': '0.60.0'}

For this demo, we will assume that you have already run ActivitySim in estimation
mode, and saved the required estimation data bundles (EDBs) to disk.  See
the [first notebook](./01_estimation_mode.ipynb) for details.  The following module
will run a script to set everything up if the example data is not already available.

In [2]:
from est_mode_setup import prepare, backup
prepare()

EDB directory already populated.


PosixPath('test-estimation-data/activitysim-prototype-mtc-extended')

In this demo notebook, we will (later) edit the model spec file.  But for demo purposes, we want to
make sure we are starting from the "original" spec file, so we'll check that now.  For actual 
applications, this step would not be necessary.

In [3]:
backup("output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_SPEC.csv")

# Load data and prep model for estimation

In [4]:
modelname = "auto_ownership"

from activitysim.estimation.larch import component_model
model, data = component_model(
    modelname, 
    edb_directory=f"output-est-mode/estimation_data_bundle/{modelname}/",
    return_data=True,
)

loading from output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_coefficients.csv
loading spec from output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_SPEC.csv
loading from output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_values_combined.parquet


# Review data loaded from the EDB

The next step is to read the EDB, including the coefficients, model settings, utilities specification, and chooser and alternative data.

### Coefficients

In [5]:
data.coefficients

coefficient_name,value,constrain
coef_cars1_drivers_2,0.0000,T
coef_cars1_drivers_3,0.0000,T
coef_cars1_persons_16_17,0.0000,T
coef_cars234_asc_marin,0.0000,T
coef_cars1_persons_25_34,0.0000,T
...,...,...
coef_cars4_drivers_3,5.2080,F
coef_cars3_drivers_3,5.5131,F
coef_cars2_drivers_4_up,6.3662,F
coef_cars3_drivers_4_up,8.5148,F


### Utility specification

In [6]:
data.spec

Unnamed: 0,Label,Description,Expression,cars0,cars1,cars2,cars3,cars4
0,util_drivers_2,2 Adults (age 16+),num_drivers==2,,coef_cars1_drivers_2,coef_cars2_drivers_2,coef_cars3_drivers_2,coef_cars4_drivers_2
1,util_drivers_3,3 Adults (age 16+),num_drivers==3,,coef_cars1_drivers_3,coef_cars2_drivers_3,coef_cars3_drivers_3,coef_cars4_drivers_3
2,util_drivers_4_up,4+ Adults (age 16+),num_drivers>3,,coef_cars1_drivers_4_up,coef_cars2_drivers_4_up,coef_cars3_drivers_4_up,coef_cars4_drivers_4_up
3,util_persons_16_17,Persons age 16-17,num_children_16_to_17,,coef_cars1_persons_16_17,coef_cars2_persons_16_17,coef_cars34_persons_16_17,coef_cars34_persons_16_17
4,util_persons_18_24,Persons age 18-24,num_college_age,,coef_cars1_persons_18_24,coef_cars2_persons_18_24,coef_cars34_persons_18_24,coef_cars34_persons_18_24
5,util_persons_25_34,Persons age 35-34,num_young_adults,,coef_cars1_persons_25_34,coef_cars2_persons_25_34,coef_cars34_persons_25_34,coef_cars34_persons_25_34
6,util_presence_children_0_4,Presence of children age 0-4,num_young_children>0,,coef_cars1_presence_children_0_4,coef_cars234_presence_children_0_4,coef_cars234_presence_children_0_4,coef_cars234_presence_children_0_4
7,util_presence_children_5_17,Presence of children age 5-17,(num_children_5_to_15+num_children_16_to_17)>0,,coef_cars1_presence_children_5_17,coef_cars2_presence_children_5_17,coef_cars34_presence_children_5_17,coef_cars34_presence_children_5_17
8,util_num_workers_clip_3,"Number of workers, capped at 3",@df.num_workers.clip(upper=3),,coef_cars1_num_workers_clip_3,coef_cars2_num_workers_clip_3,coef_cars3_num_workers_clip_3,coef_cars4_num_workers_clip_3
9,util_hh_income_0_30k,"Piecewise Linear household income, $0-30k","@df.income_in_thousands.clip(0, 30)",,coef_cars1_hh_income_0_30k,coef_cars2_hh_income_0_30k,coef_cars3_hh_income_0_30k,coef_cars4_hh_income_0_30k


### Chooser data

In [7]:
data.chooser_data

household_id,model_choice,override_choice,util_drivers_2,util_drivers_3,util_drivers_4_up,util_persons_16_17,util_persons_18_24,util_persons_25_34,util_presence_children_0_4,util_presence_children_5_17,...,auPkTotal,auOpRetail,auOpTotal,trPkRetail,trPkTotal,trOpRetail,trOpTotal,nmRetail,nmTotal,override_choice_code
45,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,12.513805,9.924660,12.562639,4.193237,6.875144,3.952128,6.590585,2.194792,6.359507,2
499,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,12.823009,10.284673,12.868645,6.639963,9.364105,6.531079,9.259002,5.955868,7.795004,2
659,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,12.663406,10.247505,12.762286,6.001466,8.409169,5.786652,8.279842,5.798886,7.900061,2
948,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,12.710919,10.150335,12.777635,5.172974,7.850360,4.893929,7.571579,4.895220,7.409345,2
1276,0,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,12.661307,10.258471,12.759529,6.039019,8.348963,5.778785,8.070525,6.073537,7.851667,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2874468,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,10.036845,8.113608,10.265845,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2
2874567,1,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,8.811126,6.560015,8.886403,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,1
2874576,0,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,8.811126,6.560015,8.886403,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2
2874826,1,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,11.356335,9.298380,11.721935,1.052528,2.925968,0.494776,2.006432,3.782008,6.208875,2


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients.  Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution.  Larch has built-in estimation methods including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters.  BHHH is the default and typically runs faster, but does not follow constraints on parameters.
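To illustrate what "bounds on parameters" means in practice, here is a minimal sketch that calls `scipy.optimize.minimize` directly on a toy one-parameter binary logit. It does not use Larch's API; the data (75 of 100 cases choosing the first alternative) and the function name `neg_log_likelihood` are assumptions made up for this illustration.

```python
import math

from scipy.optimize import minimize


def neg_log_likelihood(params):
    """Negative log likelihood of a toy binary logit with a single constant.

    Hypothetical data: 75 of 100 cases choose alternative 1.
    """
    beta = float(params[0])
    p = 1.0 / (1.0 + math.exp(-beta))  # probability of alternative 1
    return -(75 * math.log(p) + 25 * math.log(1.0 - p))


# Without bounds, the optimizer is free to approach the MLE, ln(3).
unbounded = minimize(neg_log_likelihood, x0=[0.0], method="SLSQP")

# With bounds, SLSQP keeps the parameter inside [-1, 1].
bounded = minimize(
    neg_log_likelihood, x0=[0.0], method="SLSQP", bounds=[(-1.0, 1.0)]
)

print("unbounded estimate:", unbounded.x[0])
print("bounded estimate:  ", bounded.x[0])
```

The unbounded run settles near the unconstrained maximum, while the bounded run stops at the upper limit. An optimizer that ignores bounds, as described above for BHHH, would behave like the first call.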

In [8]:
model.estimate()

param_name,value,best,initvalue,minimum,maximum,nullvalue,holdfast
coef_cars1_asc,1.337433,1.337433,1.1865,-50.00,50.00,0.0,0
coef_cars1_asc_county,-0.655949,-0.655949,-0.5660,-50.00,50.00,0.0,0
coef_cars1_asc_marin,-0.168475,-0.168475,-0.2434,-50.00,50.00,0.0,0
coef_cars1_asc_san_francisco,0.324519,0.324519,0.4259,-50.00,50.00,0.0,0
coef_cars1_auto_time_saving_per_worker,0.394451,0.394451,0.4707,-50.00,50.00,0.0,0
...,...,...,...,...,...,...,...
coef_retail_auto_no_workers,0.039844,0.039844,0.0626,-50.00,50.00,0.0,0
coef_retail_auto_workers,0.155792,0.155792,0.1646,-50.00,50.00,0.0,0
coef_retail_non_motor,-0.030000,-0.030000,-0.0300,-0.03,-0.03,0.0,1
coef_retail_transit_no_workers,-0.307701,-0.307701,-0.3053,-50.00,50.00,0.0,0


Unnamed: 0_level_0,0
Unnamed: 0_level_1,0
coef_cars1_asc,1.337433
coef_cars1_asc_county,-0.655949
coef_cars1_asc_marin,-0.168475
coef_cars1_asc_san_francisco,0.324519
coef_cars1_auto_time_saving_per_worker,0.394451
coef_cars1_density_0_10_no_workers,0.000000
coef_cars1_density_10_up_no_workers,-0.014457
coef_cars1_density_10_up_workers,-0.018280
coef_cars1_drivers_2,0.000000
coef_cars1_drivers_3,0.000000

Unnamed: 0,0
coef_cars1_asc,1.337433
coef_cars1_asc_county,-0.655949
coef_cars1_asc_marin,-0.168475
coef_cars1_asc_san_francisco,0.324519
coef_cars1_auto_time_saving_per_worker,0.394451
coef_cars1_density_0_10_no_workers,0.0
coef_cars1_density_10_up_no_workers,-0.014457
coef_cars1_density_10_up_workers,-0.01828
coef_cars1_drivers_2,0.0
coef_cars1_drivers_3,0.0

Unnamed: 0,0
coef_cars1_asc,5.272591e-05
coef_cars1_asc_county,-8.675234e-06
coef_cars1_asc_marin,-6.400057e-05
coef_cars1_asc_san_francisco,-0.0001587832
coef_cars1_auto_time_saving_per_worker,0.000140329
coef_cars1_density_0_10_no_workers,0.0
coef_cars1_density_10_up_no_workers,6.620783e-06
coef_cars1_density_10_up_workers,-2.503582e-06
coef_cars1_drivers_2,0.0
coef_cars1_drivers_3,0.0


### Estimated coefficients

In [9]:
model.parameter_summary()

Parameter,Value,Std Err,t Stat,Signif,Null Value,Constrained
coef_cars1_asc,1.34,0.941,1.42,,0.0,
coef_cars1_asc_county,-0.656,0.158,-4.14,***,0.0,
coef_cars1_asc_marin,-0.168,0.106,-1.58,,0.0,
coef_cars1_asc_san_francisco,0.325,0.0979,3.31,***,0.0,
coef_cars1_auto_time_saving_per_worker,0.394,0.188,2.1,*,0.0,
coef_cars1_density_0_10_no_workers,0.0,0.0,,,0.0,fixed value
coef_cars1_density_10_up_no_workers,-0.0145,0.00339,-4.27,***,0.0,
coef_cars1_density_10_up_workers,-0.0183,0.00271,-6.75,***,0.0,
coef_cars1_drivers_2,0.0,0.0,,,0.0,fixed value
coef_cars1_drivers_3,0.0,0.0,,,0.0,fixed value


# Output Estimation Results

In [10]:
from activitysim.estimation.larch import update_coefficients
result_dir = data.edb_directory/"estimated"
update_coefficients(
    model, data, result_dir,
    output_file=f"{modelname}_coefficients_revised.csv",
);

### Write the model estimation report, including coefficient t-statistic and log likelihood

In [11]:
model.to_xlsx(
    result_dir/f"{modelname}_model_estimation.xlsx", 
    data_statistics=False,
)

# Next Steps

The final step is to either manually or automatically copy the `*_coefficients_revised.csv` file to the configs folder, rename it to `*_coefficients.csv`, and run ActivitySim in simulation mode.
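If you prefer to automate the copy-and-rename step, it could look roughly like the sketch below. The helper name `install_revised_coefficients` and the location of the configs folder are assumptions for illustration; the actual configs path depends on how your ActivitySim model is laid out.

```python
import shutil
from pathlib import Path


def install_revised_coefficients(result_dir, configs_dir, modelname):
    """Copy `<modelname>_coefficients_revised.csv` into the configs folder
    as `<modelname>_coefficients.csv`, backing up any existing file first.

    Hypothetical helper, not part of ActivitySim or Larch.
    """
    source = Path(result_dir) / f"{modelname}_coefficients_revised.csv"
    target = Path(configs_dir) / f"{modelname}_coefficients.csv"
    if not source.exists():
        raise FileNotFoundError(source)
    if target.exists():
        # Keep the previous coefficients so the change can be undone.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(source, target)
    return target
```

A call such as `install_revised_coefficients(result_dir, "path/to/configs", modelname)` would then replace the coefficients file used in simulation mode, where `"path/to/configs"` stands in for your own configs directory.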

In [12]:
pd.read_csv(result_dir/f"{modelname}_coefficients_revised.csv")

Unnamed: 0,coefficient_name,value,constrain
0,coef_cars1_drivers_2,0.000000,T
1,coef_cars1_drivers_3,0.000000,T
2,coef_cars1_persons_16_17,0.000000,T
3,coef_cars234_asc_marin,0.000000,T
4,coef_cars1_persons_25_34,0.000000,T
...,...,...,...
62,coef_cars4_drivers_3,5.329905,F
63,coef_cars3_drivers_3,5.702014,F
64,coef_cars2_drivers_4_up,6.182053,F
65,coef_cars3_drivers_4_up,8.643466,F


# Modify Spec

Here, we will demonstrate the process of re-estimating the model with a modified
SPEC file.  This does *not* require re-running ActivitySim, it just requires
changing the SPEC file and re-running the Larch estimation only.

The `backup` command we ran earlier made a backup copy of the
original spec file in the EDB directory.
This was not strictly necessary, but since we're about to modify it and
we may want to undo our changes, it can be handy to keep a copy of the
original spec file around. Since we already have a backup copy, we'll make some 
changes directly in the SPEC file.  As an example here, we're going
to re-write the household income section of the file, to change the piecewise 
linear utility from 3 segments to 4.  We'll move the breakpoints and rename some
coefficients to accommodate the change.  As above, for this demo we are editing 
the SPEC file using Python code to make the changes, but a user does not need
to change the file using Python; any CSV editor (e.g. Excel) can be used. 
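As a rough illustration of how a piecewise linear specification works, the sketch below splits a household income (in thousands of dollars) into the four segment variables used in this demo: breakpoints at 25, 50 and 75, with the top segment capped at 150. The `clip` helper is a pure-Python stand-in for the pandas `clip` method used in the SPEC expressions, and `income_segments` is a hypothetical name, not an ActivitySim function.

```python
def clip(value, lower, upper):
    """Limit `value` to the closed interval [lower, upper]."""
    return max(lower, min(value, upper))


def income_segments(income_in_thousands):
    """Decompose income into the four piecewise-linear segment variables."""
    x = income_in_thousands
    return {
        "util_hh_income_0_25k": clip(x, 0, 25),
        "util_hh_income_25_50k": clip(x - 25, 0, 25),
        "util_hh_income_50_75k": clip(x - 50, 0, 25),
        "util_hh_income_75k_150k": clip(x - 75, 0, 75),
    }


for income in (10, 40, 60, 200):
    print(income, income_segments(income))
```

The segment values always add up to the income itself, limited to the range 0 to 150, so each segment's coefficient acts as the slope of utility within its own income bracket.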

In [13]:
with open(data.edb_directory / "auto_ownership_SPEC.csv") as f:
    raw_spec = f.read()

orig_lines = """util_hh_income_0_30k,"Piecewise Linear household income, $0-30k","@df.income_in_thousands.clip(0, 30)",,coef_cars1_hh_income_0_30k,coef_cars2_hh_income_0_30k,coef_cars3_hh_income_0_30k,coef_cars4_hh_income_0_30k
util_hh_income_30_75k,"Piecewise Linear household income, $30-75k","@(df.income_in_thousands-30).clip(0, 45)",,coef_cars1_hh_income_30_up,coef_cars2_hh_income_30_up,coef_cars3_hh_income_30_up,coef_cars4_hh_income_30_up
util_hh_income_75k_up,"Piecewise Linear household income, $75k+, capped at $125k","@(df.income_in_thousands-75).clip(0, 50)",,coef_cars1_hh_income_30_up,coef_cars2_hh_income_30_up,coef_cars3_hh_income_30_up,coef_cars4_hh_income_30_up"""

repl_lines = """util_hh_income_0_25k,"Piecewise Linear household income, $0-25k","@df.income_in_thousands.clip(0, 25)",,coef_cars1_hh_income_0_25k,coef_cars2_hh_income_0_25k,coef_cars3_hh_income_0_25k,coef_cars4_hh_income_0_25k
util_hh_income_25_50k,"Piecewise Linear household income, $25-50k","@(df.income_in_thousands-25).clip(0, 25)",,coef_cars1_hh_income_25_50,coef_cars2_hh_income_25_50,coef_cars3_hh_income_25_50,coef_cars4_hh_income_25_50
util_hh_income_50_75k,"Piecewise Linear household income, $50-75k","@(df.income_in_thousands-50).clip(0, 25)",,coef_cars1_hh_income_50_75,coef_cars2_hh_income_50_75,coef_cars3_hh_income_50_75,coef_cars4_hh_income_50_75
util_hh_income_75k_150k,"Piecewise Linear household income, $75k+, capped at $150k","@(df.income_in_thousands-75).clip(0, 75)",,coef_cars1_hh_income_75_up,coef_cars2_hh_income_75_up,coef_cars3_hh_income_75_up,coef_cars4_hh_income_75_up"""

raw_spec = raw_spec.replace(orig_lines, repl_lines)

with open(data.edb_directory / "auto_ownership_SPEC.csv", mode="w") as f:
    f.write(raw_spec)
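One caveat when editing the file this way: `str.replace` returns the text unchanged if the target is not found, for example when the SPEC file has already been modified or uses different line endings. A stricter variant could look like the following sketch, where `replace_exactly_once` is a hypothetical helper name.

```python
def replace_exactly_once(text, old, new):
    """Replace `old` with `new`, failing loudly unless there is one match."""
    count = text.count(old)
    if count != 1:
        raise ValueError(
            f"expected exactly one match for the target text, found {count}"
        )
    return text.replace(old, new)
```

Using `raw_spec = replace_exactly_once(raw_spec, orig_lines, repl_lines)` in place of the plain `replace` call would raise an error rather than silently writing back an unmodified SPEC file.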


### WARNING

The re-estimation process will use the variable in the estimation data bundle (EDB) given by the "Label" 
column of the SPEC, if a variable with that exact name exists in the EDB.  Otherwise, it will attempt to
re-evaluate the contents of the "Expression" column using Sharrow, and the other data in the EDB.  Thus,
the expression must only reference other data that is available explicitly in the EDB; to use 
variables that ActivitySim could access but which have not been written to the EDB, it will be necessary
to go back to ActivitySim and re-run in estimation mode.

Also, the estimation functions do not inherently know what the "original" spec file contained, and rely
entirely on the presence or absence of an exact match on the "Label" column to find pre-evaluated expressions.
It is incumbent on the user to ensure that any material changes to the Expression column are also reflected
by a new unique name in the "Label" column.
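One way to guard against this mistake is to compare the modified SPEC against the backup of the original and flag any row whose Expression changed while its Label did not. The sketch below does this with the standard `csv` module; `stale_labels` is a hypothetical helper, and the two small SPEC snippets are simplified stand-ins with fewer columns than the real file.

```python
import csv
import io


def stale_labels(original_spec_csv, modified_spec_csv):
    """Return labels whose Expression changed while the Label stayed the same."""

    def expressions(text):
        reader = csv.DictReader(io.StringIO(text))
        return {row["Label"]: row["Expression"] for row in reader}

    before = expressions(original_spec_csv)
    after = expressions(modified_spec_csv)
    return sorted(
        label
        for label, expr in after.items()
        if label in before and before[label] != expr
    )


# Simplified example: the income expression changes but keeps its old label.
original = '''Label,Description,Expression,cars1
util_drivers_2,2 Adults (age 16+),num_drivers==2,coef_cars1_drivers_2
util_hh_income_0_30k,"Income, $0-30k","@df.income_in_thousands.clip(0, 30)",coef_cars1_hh_income_0_30k
'''
modified = '''Label,Description,Expression,cars1
util_drivers_2,2 Adults (age 16+),num_drivers==2,coef_cars1_drivers_2
util_hh_income_0_30k,"Income, $0-25k","@df.income_in_thousands.clip(0, 25)",coef_cars1_hh_income_0_25k
'''
print(stale_labels(original, modified))
```

Any label reported here would cause the re-estimation to reuse the old pre-computed values from the EDB instead of evaluating the new expression, so it should be renamed before re-estimating.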

Now to re-estimate the model, we just re-run the same steps as the original estimation above.

In [14]:
model2, data2 = component_model(modelname, edb_directory=data.edb_directory, return_data=True)

loading from output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_coefficients.csv
loading spec from output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_SPEC.csv
loading from output-est-mode/estimation_data_bundle/auto_ownership/auto_ownership_values_combined.parquet


You may notice that in the utility functions shown below, all of the unmodified lines of the 
spec file correspond to utility terms that are simple `X.label` data items, which are 
stored as pre-computed data variables in the EDB.  Our modified lines, however, are now
showing the complete expression that will be freshly evaluated by Larch using Sharrow.

In [15]:
model2.utility_co

alt,formula
1,<Empty LinearFunction>
2,"P.coef_cars1_drivers_2 * X.util_drivers_2  + P.coef_cars1_drivers_3 * X.util_drivers_3  + P.coef_cars1_drivers_4_up * X.util_drivers_4_up  + P.coef_cars1_persons_16_17 * X.util_persons_16_17  + P.coef_cars1_persons_18_24 * X.util_persons_18_24  + P.coef_cars1_persons_25_34 * X.util_persons_25_34  + P.coef_cars1_presence_children_0_4 * X.util_presence_children_0_4  + P.coef_cars1_presence_children_5_17 * X.util_presence_children_5_17  + P.coef_cars1_num_workers_clip_3 * X.util_num_workers_clip_3  + P.coef_cars1_hh_income_0_25k * X('df.income_in_thousands.clip(0, 25)')  + P.coef_cars1_hh_income_25_50 * X('(df.income_in_thousands-25).clip(0, 25)')  + P.coef_cars1_hh_income_50_75 * X('(df.income_in_thousands-50).clip(0, 25)')  + P.coef_cars1_hh_income_75_up * X('(df.income_in_thousands-75).clip(0, 75)')  + P.coef_cars1_density_0_10_no_workers * X.util_density_0_10_no_workers  + P.coef_cars1_density_10_up_no_workers * X.util_density_10_up_no_workers  + P.coef_cars1_density_0_10_no_workers * X.util_density_0_10_workers  + P.coef_cars1_density_10_up_workers * X.util_density_10_up_workers  + P.coef_cars1_asc * X.util_asc  + P.coef_cars1_asc_san_francisco * X.util_asc_san_francisco  + P.coef_cars1_asc_county * X.util_asc_solano  + P.coef_cars1_asc_county * X.util_asc_napa  + P.coef_cars1_asc_county * X.util_asc_sonoma  + P.coef_cars1_asc_marin * X.util_asc_marin  + P.coef_retail_auto_no_workers * X.util_retail_auto_no_workers  + P.coef_retail_auto_workers * X.util_retail_auto_workers  + P.coef_retail_transit_no_workers * X.util_retail_transit_no_workers  + P.coef_retail_transit_workers * X.util_retail_transit_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_no_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_workers  + P.coef_cars1_auto_time_saving_per_worker * X.util_auto_time_saving_per_worker"
3,"P.coef_cars2_drivers_2 * X.util_drivers_2  + P.coef_cars2_drivers_3 * X.util_drivers_3  + P.coef_cars2_drivers_4_up * X.util_drivers_4_up  + P.coef_cars2_persons_16_17 * X.util_persons_16_17  + P.coef_cars2_persons_18_24 * X.util_persons_18_24  + P.coef_cars2_persons_25_34 * X.util_persons_25_34  + P.coef_cars234_presence_children_0_4 * X.util_presence_children_0_4  + P.coef_cars2_presence_children_5_17 * X.util_presence_children_5_17  + P.coef_cars2_num_workers_clip_3 * X.util_num_workers_clip_3  + P.coef_cars2_hh_income_0_25k * X('df.income_in_thousands.clip(0, 25)')  + P.coef_cars2_hh_income_25_50 * X('(df.income_in_thousands-25).clip(0, 25)')  + P.coef_cars2_hh_income_50_75 * X('(df.income_in_thousands-50).clip(0, 25)')  + P.coef_cars2_hh_income_75_up * X('(df.income_in_thousands-75).clip(0, 75)')  + P.coef_cars2_density_0_10_no_workers * X.util_density_0_10_no_workers  + P.coef_cars2_density_10_up_no_workers * X.util_density_10_up_no_workers  + P.coef_cars2_density_0_10_no_workers * X.util_density_0_10_workers  + P.coef_cars2_density_10_up_no_workers * X.util_density_10_up_workers  + P.coef_cars2_asc * X.util_asc  + P.coef_cars2_asc_san_francisco * X.util_asc_san_francisco  + P.coef_cars2_asc_county * X.util_asc_solano  + P.coef_cars2_asc_county * X.util_asc_napa  + P.coef_cars2_asc_county * X.util_asc_sonoma  + P.coef_cars234_asc_marin * X.util_asc_marin  + P.coef_retail_auto_no_workers * X.util_retail_auto_no_workers  + P.coef_retail_auto_workers * X.util_retail_auto_workers  + P.coef_retail_transit_no_workers * X.util_retail_transit_no_workers  + P.coef_retail_transit_workers * X.util_retail_transit_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_no_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_workers  + P.coef_cars2_auto_time_saving_per_worker * X.util_auto_time_saving_per_worker"
4,"P.coef_cars3_drivers_2 * X.util_drivers_2  + P.coef_cars3_drivers_3 * X.util_drivers_3  + P.coef_cars3_drivers_4_up * X.util_drivers_4_up  + P.coef_cars34_persons_16_17 * X.util_persons_16_17  + P.coef_cars34_persons_18_24 * X.util_persons_18_24  + P.coef_cars34_persons_25_34 * X.util_persons_25_34  + P.coef_cars234_presence_children_0_4 * X.util_presence_children_0_4  + P.coef_cars34_presence_children_5_17 * X.util_presence_children_5_17  + P.coef_cars3_num_workers_clip_3 * X.util_num_workers_clip_3  + P.coef_cars3_hh_income_0_25k * X('df.income_in_thousands.clip(0, 25)')  + P.coef_cars3_hh_income_25_50 * X('(df.income_in_thousands-25).clip(0, 25)')  + P.coef_cars3_hh_income_50_75 * X('(df.income_in_thousands-50).clip(0, 25)')  + P.coef_cars3_hh_income_75_up * X('(df.income_in_thousands-75).clip(0, 75)')  + P.coef_cars34_density_0_10_no_workers * X.util_density_0_10_no_workers  + P.coef_cars34_density_10_up_no_workers * X.util_density_10_up_no_workers  + P.coef_cars34_density_0_10_no_workers * X.util_density_0_10_workers  + P.coef_cars34_density_10_up_no_workers * X.util_density_10_up_workers  + P.coef_cars3_asc * X.util_asc  + P.coef_cars34_asc_san_francisco * X.util_asc_san_francisco  + P.coef_cars34_asc_county * X.util_asc_solano  + P.coef_cars34_asc_county * X.util_asc_napa  + P.coef_cars34_asc_county * X.util_asc_sonoma  + P.coef_cars234_asc_marin * X.util_asc_marin  + P.coef_retail_auto_no_workers * X.util_retail_auto_no_workers  + P.coef_retail_auto_workers * X.util_retail_auto_workers  + P.coef_retail_transit_no_workers * X.util_retail_transit_no_workers  + P.coef_retail_transit_workers * X.util_retail_transit_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_no_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_workers  + P.coef_cars3_auto_time_saving_per_worker * X.util_auto_time_saving_per_worker"
5,"P.coef_cars4_drivers_2 * X.util_drivers_2  + P.coef_cars4_drivers_3 * X.util_drivers_3  + P.coef_cars4_drivers_4_up * X.util_drivers_4_up  + P.coef_cars34_persons_16_17 * X.util_persons_16_17  + P.coef_cars34_persons_18_24 * X.util_persons_18_24  + P.coef_cars34_persons_25_34 * X.util_persons_25_34  + P.coef_cars234_presence_children_0_4 * X.util_presence_children_0_4  + P.coef_cars34_presence_children_5_17 * X.util_presence_children_5_17  + P.coef_cars4_num_workers_clip_3 * X.util_num_workers_clip_3  + P.coef_cars4_hh_income_0_25k * X('df.income_in_thousands.clip(0, 25)')  + P.coef_cars4_hh_income_25_50 * X('(df.income_in_thousands-25).clip(0, 25)')  + P.coef_cars4_hh_income_50_75 * X('(df.income_in_thousands-50).clip(0, 25)')  + P.coef_cars4_hh_income_75_up * X('(df.income_in_thousands-75).clip(0, 75)')  + P.coef_cars34_density_0_10_no_workers * X.util_density_0_10_no_workers  + P.coef_cars34_density_10_up_no_workers * X.util_density_10_up_no_workers  + P.coef_cars34_density_0_10_no_workers * X.util_density_0_10_workers  + P.coef_cars34_density_10_up_no_workers * X.util_density_10_up_workers  + P.coef_cars4_asc * X.util_asc  + P.coef_cars34_asc_san_francisco * X.util_asc_san_francisco  + P.coef_cars34_asc_county * X.util_asc_solano  + P.coef_cars34_asc_county * X.util_asc_napa  + P.coef_cars34_asc_county * X.util_asc_sonoma  + P.coef_cars234_asc_marin * X.util_asc_marin  + P.coef_retail_auto_no_workers * X.util_retail_auto_no_workers  + P.coef_retail_auto_workers * X.util_retail_auto_workers  + P.coef_retail_transit_no_workers * X.util_retail_transit_no_workers  + P.coef_retail_transit_workers * X.util_retail_transit_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_no_workers  + P.coef_retail_non_motor * X.util_retail_non_motor_workers  + P.coef_cars4_auto_time_saving_per_worker * X.util_auto_time_saving_per_worker"


In [16]:
model2.estimate(maxiter=200)

param_name,value,best,initvalue,minimum,maximum,nullvalue,holdfast
coef_cars1_asc,1.451930,1.451930,1.1865,-50.00,50.00,0.0,0
coef_cars1_asc_county,-0.669928,-0.669928,-0.5660,-50.00,50.00,0.0,0
coef_cars1_asc_marin,-0.170886,-0.170886,-0.2434,-50.00,50.00,0.0,0
coef_cars1_asc_san_francisco,0.317228,0.317228,0.4259,-50.00,50.00,0.0,0
coef_cars1_auto_time_saving_per_worker,0.380289,0.380289,0.4707,-50.00,50.00,0.0,0
...,...,...,...,...,...,...,...
coef_retail_auto_no_workers,0.027853,0.027853,0.0626,-50.00,50.00,0.0,0
coef_retail_auto_workers,0.143032,0.143032,0.1646,-50.00,50.00,0.0,0
coef_retail_non_motor,-0.030000,-0.030000,-0.0300,-0.03,-0.03,0.0,1
coef_retail_transit_no_workers,-0.307531,-0.307531,-0.3053,-50.00,50.00,0.0,0


Unnamed: 0_level_0,0
Unnamed: 0_level_1,0
coef_cars1_asc,1.451930
coef_cars1_asc_county,-0.669928
coef_cars1_asc_marin,-0.170886
coef_cars1_asc_san_francisco,0.317228
coef_cars1_auto_time_saving_per_worker,0.380289
coef_cars1_density_0_10_no_workers,0.000000
coef_cars1_density_10_up_no_workers,-0.014252
coef_cars1_density_10_up_workers,-0.018450
coef_cars1_drivers_2,0.000000
coef_cars1_drivers_3,0.000000

Unnamed: 0,0
coef_cars1_asc,1.45193
coef_cars1_asc_county,-0.669928
coef_cars1_asc_marin,-0.170886
coef_cars1_asc_san_francisco,0.317228
coef_cars1_auto_time_saving_per_worker,0.380289
coef_cars1_density_0_10_no_workers,0.0
coef_cars1_density_10_up_no_workers,-0.014252
coef_cars1_density_10_up_workers,-0.01845
coef_cars1_drivers_2,0.0
coef_cars1_drivers_3,0.0

Unnamed: 0,0
coef_cars1_asc,0.000104
coef_cars1_asc_county,-3.2e-05
coef_cars1_asc_marin,-5.1e-05
coef_cars1_asc_san_francisco,-9.7e-05
coef_cars1_auto_time_saving_per_worker,5.6e-05
coef_cars1_density_0_10_no_workers,0.0
coef_cars1_density_10_up_no_workers,-2e-05
coef_cars1_density_10_up_workers,7e-06
coef_cars1_drivers_2,0.0
coef_cars1_drivers_3,0.0


We can easily review the parameter estimates from the original and
revised models side by side to see what changed.

In [17]:
with pd.option_context('display.max_rows', 999):
    display(pd.concat({
        "model": model.parameter_summary().data,
        "model2": model2.parameter_summary().data,
    }, axis=1).fillna(""))

Unnamed: 0_level_0,model,model,model,model,model,model,model2,model2,model2,model2,model2,model2
Unnamed: 0_level_1,Value,Std Err,t Stat,Signif,Null Value,Constrained,Value,Std Err,t Stat,Signif,Null Value,Constrained
Parameter,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2
coef_cars1_asc,1.34,0.941,1.42,,0.0,,1.45,0.946,1.54,,0.0,
coef_cars1_asc_county,-0.656,0.158,-4.14,***,0.0,,-0.67,0.159,-4.22,***,0.0,
coef_cars1_asc_marin,-0.168,0.106,-1.58,,0.0,,-0.171,0.106,-1.61,,0.0,
coef_cars1_asc_san_francisco,0.325,0.0979,3.31,***,0.0,,0.317,0.0982,3.23,**,0.0,
coef_cars1_auto_time_saving_per_worker,0.394,0.188,2.1,*,0.0,,0.38,0.189,2.01,*,0.0,
coef_cars1_density_0_10_no_workers,0.0,0.0,,,0.0,fixed value,0.0,0.0,,,0.0,fixed value
coef_cars1_density_10_up_no_workers,-0.0145,0.00339,-4.27,***,0.0,,-0.0143,0.00339,-4.21,***,0.0,
coef_cars1_density_10_up_workers,-0.0183,0.00271,-6.75,***,0.0,,-0.0184,0.00271,-6.8,***,0.0,
coef_cars1_drivers_2,0.0,0.0,,,0.0,fixed value,0.0,0.0,,,0.0,fixed value
coef_cars1_drivers_3,0.0,0.0,,,0.0,fixed value,0.0,0.0,,,0.0,fixed value


In [18]:
with pd.option_context('display.max_rows', 999):
    display(pd.concat({
        "model": model.estimation_statistics_raw(),
        "model2": model2.estimation_statistics_raw(),
    }, axis=1).fillna(""))

Unnamed: 0,Unnamed: 1,model,model2
Number of Cases,Aggregate,20000.0,20000.0
Log Likelihood at Convergence,Aggregate,-18485.495027,-18480.692008
Log Likelihood at Convergence,Per Case,-0.924275,-0.924035
Log Likelihood at Null Parameters,Aggregate,-32431.882743,-32431.882743
Log Likelihood at Null Parameters,Per Case,-1.621594,-1.621594
Rho Squared w.r.t. Null Parameters,Aggregate,0.430021,0.430169
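The derived rows in this table can be reproduced from the aggregate log likelihoods. The sketch below assumes the usual definition of rho squared, one minus the ratio of the log likelihood at convergence to the log likelihood at null parameters, and copies the input figures from the table above; `rho_squared` is a name chosen for this illustration.

```python
def rho_squared(ll_converged, ll_null):
    """McFadden-style rho squared relative to the null-parameter model."""
    return 1.0 - ll_converged / ll_null


n_cases = 20000
ll_null = -32431.882743
ll_converged = {"model": -18485.495027, "model2": -18480.692008}

for name, ll in ll_converged.items():
    per_case = ll / n_cases
    print(name, round(per_case, 6), round(rho_squared(ll, ll_null), 6))
```

By these figures the revised specification improves the log likelihood by roughly 4.8 units, a small gain given that it introduces additional income coefficients.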
