# Estimating Workplace Location Choice

This notebook illustrates how to re-estimate a single model component for ActivitySim.  This process 
includes running ActivitySim in estimation mode to read household travel survey files and write out
the estimation data bundles used in this notebook.  To review how to do so, please visit the other
notebooks in this directory.

# Load libraries

In [1]:
import larch as lx  # !conda install larch #for estimation
import pandas as pd

lx.versions()

JAX not found. Some functionality will be unavailable.


{'larch': '6.0.33',
 'sharrow': '2.13.0',
 'numpy': '1.26.4',
 'pandas': '1.5.3',
 'xarray': '2024.3.0',
 'numba': '0.60.0'}

For this demo, we will assume that you have already run ActivitySim in estimation
mode, and saved the required estimation data bundles (EDBs) to disk.  See
the [first notebook](./01_estimation_mode.ipynb) for details.  The following module
will run a script to set everything up if the example data is not already available.

In [2]:
from est_mode_setup import prepare, backup

prepare()

EDB directory already populated.


PosixPath('test-estimation-data/activitysim-prototype-mtc-extended')

In this demo notebook, we will (later) edit the model spec file.  But for demo purposes, we want to
make sure we are starting from the "original" spec file, so we'll check that now.  For actual 
applications, this step would not be necessary.

In [3]:
backup("output-est-mode/estimation_data_bundle/workplace_location/workplace_location_SPEC.csv")

# Load data and prep model for estimation

In [4]:
modelname = "workplace_location"

In [5]:
from activitysim.estimation.larch import component_model

model, data = component_model(
    modelname,
    edb_directory=f"output-est-mode/estimation_data_bundle/{modelname}/",
    return_data=True,
)

loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_coefficients.csv
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_SPEC.csv
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_alternatives_combined.parquet
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_choosers_combined.parquet
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_landuse.csv
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_size_terms.csv


# Review data loaded from EDB

Next we can review what was read from the EDB, including the coefficients, model settings, utilities specification, and chooser and alternative data.

## coefficients

In [6]:
data.coefficients

Unnamed: 0_level_0,value,constrain
coefficient_name,Unnamed: 1_level_1,Unnamed: 2_level_1
coef_dist_0_1,-0.8428,F
coef_dist_1_2,-0.3104,F
coef_dist_2_5,-0.3783,F
coef_dist_5_15,-0.1285,F
coef_dist_15_up,-0.0917,F
coef_dist_0_5_high,0.15,F
coef_dist_5_up_high,0.02,F
coef_mode_logsum,0.3,F


## alt_values

In [7]:
data.alt_values

Unnamed: 0,person_id,alt_dest,prob,pick_count,mode_choice_logsum,size_term,shadow_price_size_term_adjustment,shadow_price_utility_adjustment,util_dist_0_1,util_dist_1_2,util_dist_2_5,util_dist_5_15,util_dist_15_up,util_dist_0_5_high,util_dist_15_up_high,util_size_variable,util_utility_adjustment,util_no_attractions,util_mode_logsum,util_sample_of_corrections_factor
0,72355,2,0.033018,4,-1.930058,8679.220,1,0,1.0,1.0,3.0,1.01,0.000000,0.0,0.0,9.068802,0,False,-1.930058,4.796986
1,72355,5,0.015149,1,-1.727929,3811.166,1,0,1.0,1.0,3.0,0.67,0.000000,0.0,0.0,8.245953,0,False,-1.727929,4.189851
2,72355,6,0.004282,1,-1.599405,1050.528,1,0,1.0,1.0,3.0,0.48,0.000000,0.0,0.0,6.958000,0,False,-1.599405,5.453389
3,72355,12,0.013401,1,-1.859328,3508.389,1,0,1.0,1.0,3.0,0.98,0.000000,0.0,0.0,8.163197,0,False,-1.859328,4.312441
4,72355,14,0.020545,2,-2.014588,5612.293,1,0,1.0,1.0,3.0,1.31,0.000000,0.0,0.0,8.632893,0,False,-2.014588,4.578298
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
766219,7539317,1392,0.012884,1,-1.286404,153.501,1,0,1.0,1.0,3.0,3.60,0.000000,0.0,0.0,5.040201,0,False,-1.286404,4.351731
766220,7539317,1397,0.004689,1,-3.658543,303.136,1,0,1.0,1.0,3.0,10.00,9.440001,0.0,0.0,5.717475,0,False,-3.658543,5.362505
766221,7539317,1401,0.005169,1,-4.566126,559.322,1,0,1.0,1.0,3.0,10.00,15.040001,0.0,0.0,6.328512,0,False,-4.566126,5.264988
766222,7539317,1405,0.001902,1,-6.448663,763.324,1,0,1.0,1.0,3.0,10.00,29.330002,0.0,0.0,6.638992,0,False,-6.448663,6.264901


## chooser_data

In [8]:
data.chooser_data

Unnamed: 0,person_id,model_choice,override_choice,income_segment,home_zone_id
0,72355,14,17,1,55
1,72384,1,16,1,59
2,72407,70,70,1,59
3,72459,193,30,1,61
4,72529,16,57,1,69
...,...,...,...,...,...
28276,7539071,1006,1019,1,1006
28277,7539203,1059,1152,1,1159
28278,7539217,940,991,1,1159
28279,7539270,1162,1145,1,1161


## landuse

In [9]:
data.landuse

Unnamed: 0_level_0,DISTRICT,SD,county_id,TOTHH,TOTPOP,TOTACRE,RESACRE,CIACRE,TOTEMP,AGE0519,...,area_type,HSENROLL,COLLFTE,COLLPTE,TOPOLOGY,TERMINAL,household_density,employment_density,density_index,is_cbd
zone_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,1,1,1,46,82,20.3,1.0,15.00000,27318,7,...,0,0.0,0.00000,0.0,3,5.89564,2.875000,1707.375000,2.870167,False
2,1,1,1,134,240,31.1,1.0,24.79297,42078,19,...,0,0.0,0.00000,0.0,1,5.84871,5.195214,1631.374751,5.178722,False
3,1,1,1,267,476,14.7,1.0,2.31799,2445,38,...,0,0.0,0.00000,0.0,1,5.53231,80.470405,736.891913,72.547987,False
4,1,1,1,151,253,19.3,1.0,18.00000,22434,20,...,0,0.0,0.00000,0.0,2,5.64330,7.947368,1180.736842,7.894233,False
5,1,1,1,611,1069,52.7,1.0,15.00000,15662,86,...,0,0.0,72.14684,0.0,1,5.52555,38.187500,978.875000,36.753679,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1450,34,34,9,2724,6493,1320.0,630.0,69.00000,1046,1013,...,4,0.0,0.00000,0.0,1,1.12116,3.896996,1.496423,1.081235,False
1451,34,34,9,2016,4835,664.0,379.0,43.00000,757,757,...,4,0.0,0.00000,0.0,1,1.17116,4.777251,1.793839,1.304140,False
1452,34,34,9,2178,5055,1068.0,602.0,35.00000,2110,789,...,4,0.0,0.00000,0.0,1,1.17587,3.419152,3.312402,1.682465,False
1453,34,34,9,298,779,14195.0,429.0,4.00000,922,88,...,5,0.0,0.00000,0.0,1,1.01972,0.688222,2.129330,0.520115,False


## spec

In [10]:
data.spec

Unnamed: 0,Label,Description,Expression,coefficient
0,util_dist_0_1,"Distance, piecewise linear from 0 to 1 miles","@_DIST.clip(0,1)",coef_dist_0_1
1,util_dist_1_2,"Distance, piecewise linear from 1 to 2 miles","@(_DIST-1).clip(0,1)",coef_dist_1_2
2,util_dist_2_5,"Distance, piecewise linear from 2 to 5 miles","@(_DIST-2).clip(0,3)",coef_dist_2_5
3,util_dist_5_15,"Distance, piecewise linear from 5 to 15 miles","@(_DIST-5).clip(0,10)",coef_dist_5_15
4,util_dist_15_up,"Distance, piecewise linear for 15+ miles",@(_DIST-15.0).clip(0),coef_dist_15_up
5,util_dist_0_5_high,"Distance 0 to 5 mi, high and very high income",@(df['income_segment']>=WORK_HIGH_SEGMENT_ID) ...,coef_dist_0_5_high
6,util_dist_15_up_high,"Distance 5+ mi, high and very high income",@(df['income_segment']>=WORK_HIGH_SEGMENT_ID) ...,coef_dist_5_up_high
7,util_no_attractions,No attractions,@df['size_term']==0,-999
8,util_mode_logsum,Mode choice logsum,mode_choice_logsum,coef_mode_logsum
9,util_sample_of_corrections_factor,Sample of alternatives correction factor,"@np.minimum(np.log(df.pick_count/df.prob), 60)",1
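
As a quick sanity check on how these expressions map onto the values stored in the EDB, the sketch below recomputes the sample-of-alternatives correction factor in plain Python, using the `pick_count` and `prob` values from the first row of the `alt_values` table shown earlier.  The helper name `sample_correction` is hypothetical and is not part of ActivitySim or Larch.  The inputs are rounded as displayed, so the result should only match the stored value approximately.

```python
import math


def sample_correction(pick_count, prob, cap=60.0):
    """Scalar version of the spec expression
    np.minimum(np.log(pick_count / prob), 60)."""
    return min(math.log(pick_count / prob), cap)


# First row of alt_values: pick_count=4, prob=0.033018 (rounded for display).
factor = sample_correction(4, 0.033018)

# The EDB stores 4.796986 for this row; allow for the rounding in `prob`.
assert abs(factor - 4.796986) < 1e-3
print(round(factor, 3))
```

The cap at 60 only matters for alternatives with extremely small sampling probabilities, where the logarithm would otherwise become very large.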


## size_spec

In [11]:
data.size_spec

Unnamed: 0_level_0,RETEMPN,FPSEMPN,HEREMPN,OTHEMPN,AGREMPN,MWTEMPN
segment,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
work_low,0.129,0.193,0.383,0.12,0.01,0.164
work_med,0.12,0.197,0.325,0.139,0.008,0.21
work_high,0.11,0.207,0.284,0.154,0.006,0.239
work_veryhigh,0.093,0.27,0.241,0.146,0.004,0.246


# Estimate

With the model set up for estimation, the next step is to estimate the model coefficients.  Make sure to use a sufficiently large household sample and set of zones to avoid an over-specified model, which does not have a numerically stable likelihood-maximizing solution.  Larch has built-in estimation methods including BHHH, and also offers access to more advanced general-purpose non-linear optimizers in the `scipy` package, including SLSQP, which allows for bounds and constraints on parameters.  BHHH is the default and typically runs faster, but does not follow constraints on parameters.

In [12]:
model.estimate(method="BHHH", options={"maxiter": 1000})

Unnamed: 0_level_0,value,best,initvalue,minimum,maximum,nullvalue,holdfast
param_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
-999,-999.0,-999.0,-999.0,-999.0,-999.0,0.0,1
1,1.0,1.0,1.0,1.0,1.0,0.0,1
coef_dist_0_1,-1.035381,-1.035381,-0.8428,-25.0,25.0,0.0,0
coef_dist_0_5_high,0.1469662,0.1469662,0.15,-25.0,25.0,0.0,0
coef_dist_15_up,-0.1170512,-0.1170512,-0.0917,-25.0,25.0,0.0,0
coef_dist_1_2,-0.4652491,-0.4652491,-0.3104,-25.0,25.0,0.0,0
coef_dist_2_5,-0.4177477,-0.4177477,-0.3783,-25.0,25.0,0.0,0
coef_dist_5_15,-0.1558734,-0.1558734,-0.1285,-25.0,25.0,0.0,0
coef_dist_5_up_high,0.02506728,0.02506728,0.02,-25.0,25.0,0.0,0
coef_mode_logsum,0.07423654,0.07423654,0.3,-25.0,25.0,0.0,0


/Users/jpn/Git/est-mode/larch/src/larch/model/jaxmodel.py:1156: PossibleOverspecification: Model is possibly over-specified (hessian is nearly singular).
  self.calculate_parameter_covariance()


Unnamed: 0,0
-999,-9.990000e+02
1,1.000000e+00
coef_dist_0_1,-1.035381e+00
coef_dist_0_5_high,1.469662e-01
coef_dist_15_up,-1.170512e-01
coef_dist_1_2,-4.652491e-01
coef_dist_2_5,-4.177477e-01
coef_dist_5_15,-1.558734e-01
coef_dist_5_up_high,2.506728e-02
coef_mode_logsum,7.423654e-02

Unnamed: 0,0
-999,-999.0
1,1.0
coef_dist_0_1,-1.035381
coef_dist_0_5_high,0.1469662
coef_dist_15_up,-0.1170512
coef_dist_1_2,-0.4652491
coef_dist_2_5,-0.4177477
coef_dist_5_15,-0.1558734
coef_dist_5_up_high,0.02506728
coef_mode_logsum,0.07423654


### Estimated coefficients

In [13]:
model.parameter_summary()

Unnamed: 0_level_0,Value,Std Err,t Stat,Signif,Null Value,Constrained
Parameter,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
-999,-999.0,0.0,,,0.0,fixed value
1,1.0,0.0,,,0.0,fixed value
coef_dist_0_1,-1.04,0.105,-9.89,***,0.0,
coef_dist_0_5_high,0.147,0.0114,12.86,***,0.0,
coef_dist_15_up,-0.117,0.00211,-55.56,***,0.0,
coef_dist_1_2,-0.465,0.0428,-10.86,***,0.0,
coef_dist_2_5,-0.418,0.0116,-36.15,***,0.0,
coef_dist_5_15,-0.156,0.00276,-56.40,***,0.0,
coef_dist_5_up_high,0.0251,0.00176,14.22,***,0.0,
coef_mode_logsum,0.0742,0.00837,8.87,***,0.0,


# Output Estimation Results

In [14]:
from activitysim.estimation.larch import update_coefficients, update_size_spec

result_dir = data.edb_directory / "estimated"

## Write updated utility coefficients

In [15]:
update_coefficients(
    model,
    data,
    result_dir,
    output_file=f"{modelname}_coefficients_revised.csv",
);

## Write updated size coefficients

In [16]:
update_size_spec(
    model,
    data,
    result_dir,
    output_file=f"{modelname}_size_terms.csv",
)

Unnamed: 0,segment,model_selector,TOTHH,RETEMPN,FPSEMPN,HEREMPN,OTHEMPN,AGREMPN,MWTEMPN,AGE0519,HSENROLL,COLLFTE,COLLPTE
0,work_low,workplace,0.0,0.121509,0.208506,0.41524,0.12477,6.285395e-32,0.129975,0.0,0.0,0.0,0.0
1,work_med,workplace,0.0,0.17635,0.17096,0.294919,0.167444,0.0,0.190327,0.0,0.0,0.0,0.0
2,work_high,workplace,0.0,0.125802,0.171788,0.20929,0.128317,0.1787012,0.186102,0.0,0.0,0.0,0.0
3,work_veryhigh,workplace,0.0,0.144466,0.247167,0.211891,0.126365,0.003692074,0.266419,0.0,0.0,0.0,0.0
4,university,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.592,0.408
5,gradeschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
6,highschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
7,escort,non_mandatory,0.0,0.225,0.0,0.144,0.0,0.0,0.0,0.465,0.166,0.0,0.0
8,shopping,non_mandatory,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,eatout,non_mandatory,0.0,0.742,0.0,0.258,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Write the model estimation report, including coefficient t-statistic and log likelihood

In [17]:
model.to_xlsx(
    result_dir / f"{modelname}_model_estimation.xlsx",
    data_statistics=False,
);

# Next Steps

The final step is to either manually or automatically copy the `*_coefficients_revised.csv` file and `*_size_terms.csv` file to the configs folder, rename them to `*_coefficients.csv` and `destination_choice_size_terms.csv`, and run ActivitySim in simulation mode.  Note that all the location
and destination choice models share the same `destination_choice_size_terms.csv` input file, so if you
are updating all these models, you'll need to ensure that updated sections of this file for each model
are joined together correctly.
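
One way to join the updated sections is sketched below, using only the Python standard library.  The function name `merge_size_terms` is hypothetical, and the sketch assumes both files carry `segment` and `model_selector` columns as in the size-terms output shown above.  It replaces only the rows belonging to one `model_selector` and leaves all others untouched.  The inline CSV text is toy data, abbreviated to two size columns, and the 0.999 value is made up purely to show that rows for other models are not overwritten.

```python
import csv
import io


def merge_size_terms(existing_csv, estimated_csv, model_selector):
    """Replace rows for one model_selector in the existing size-terms text
    with the matching rows from the newly estimated text."""
    existing = list(csv.DictReader(io.StringIO(existing_csv)))
    estimated = list(csv.DictReader(io.StringIO(estimated_csv)))
    fields = list(existing[0].keys())
    # Index the newly estimated rows for the selected model by segment name.
    updates = {
        row["segment"]: row
        for row in estimated
        if row["model_selector"] == model_selector
    }
    merged = []
    for row in existing:
        if row["model_selector"] == model_selector and row["segment"] in updates:
            new_row = updates[row["segment"]]
            # Keep only the columns of the existing file (drops extra index columns).
            merged.append({name: new_row[name] for name in fields})
        else:
            merged.append(row)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(merged)
    return out.getvalue()


# Toy inputs (abbreviated, partly made-up values; see the note above).
existing = (
    "segment,model_selector,RETEMPN,COLLFTE\n"
    "work_low,workplace,0.129,0\n"
    "university,school,0,0.592\n"
)
estimated = (
    "index,segment,model_selector,RETEMPN,COLLFTE\n"
    "0,work_low,workplace,0.121509,0.0\n"
    "1,university,school,0.0,0.999\n"
)
merged = merge_size_terms(existing, estimated, "workplace")
print(merged)
```

Running the same function once per estimated model, each time with that model's selector, builds up a combined file without one model's output clobbering another's.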

In [18]:
pd.read_csv(result_dir / f"{modelname}_coefficients_revised.csv")

Unnamed: 0,coefficient_name,value,constrain
0,coef_dist_0_1,-1.035381,F
1,coef_dist_1_2,-0.465249,F
2,coef_dist_2_5,-0.417748,F
3,coef_dist_5_15,-0.155873,F
4,coef_dist_15_up,-0.117051,F
5,coef_dist_0_5_high,0.146966,F
6,coef_dist_5_up_high,0.025067,F
7,coef_mode_logsum,0.074237,F


In [19]:
pd.read_csv(result_dir / f"{modelname}_size_terms.csv")

Unnamed: 0,index,segment,model_selector,TOTHH,RETEMPN,FPSEMPN,HEREMPN,OTHEMPN,AGREMPN,MWTEMPN,AGE0519,HSENROLL,COLLFTE,COLLPTE
0,0,work_low,workplace,0.0,0.121509,0.208506,0.41524,0.12477,6.285395e-32,0.129975,0.0,0.0,0.0,0.0
1,1,work_med,workplace,0.0,0.17635,0.17096,0.294919,0.167444,0.0,0.190327,0.0,0.0,0.0,0.0
2,2,work_high,workplace,0.0,0.125802,0.171788,0.20929,0.128317,0.1787012,0.186102,0.0,0.0,0.0,0.0
3,3,work_veryhigh,workplace,0.0,0.144466,0.247167,0.211891,0.126365,0.003692074,0.266419,0.0,0.0,0.0,0.0
4,4,university,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.592,0.408
5,5,gradeschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
6,6,highschool,school,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
7,7,escort,non_mandatory,0.0,0.225,0.0,0.144,0.0,0.0,0.0,0.465,0.166,0.0,0.0
8,8,shopping,non_mandatory,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,9,eatout,non_mandatory,0.0,0.742,0.0,0.258,0.0,0.0,0.0,0.0,0.0,0.0,0.0


# Modify Spec

Here, we will demonstrate the process of re-estimating the model with a modified
SPEC file.  This does *not* require re-running ActivitySim; it only requires
changing the SPEC file and re-running the Larch estimation.

The `backup` command we ran earlier made a backup copy of the
original spec file in the EDB directory.
This was not strictly necessary, but since we're about to modify it and
we may want to undo our changes, it can be handy to keep a copy of the
original spec file around. Since we already have a backup copy, we'll make some
changes directly in the SPEC file.  As an example here, we're going
to change one of the breakpoints on the piecewise distance function in the
utility calculation. For this demo we are editing
the SPEC file using Python code, but a user does not need
to change the file using Python; any CSV editor (e.g. Excel) can be used.

The raw contents of the SPEC file can be loaded and viewed in Python like this:

In [20]:
with open(data.edb_directory / "workplace_location_SPEC.csv") as f:
    raw_spec = f.read()

print(raw_spec)

Label,Description,Expression,coefficient
local_dist,,_DIST@skims['DIST'],1
util_dist_0_1,"Distance, piecewise linear from 0 to 1 miles","@_DIST.clip(0,1)",coef_dist_0_1
util_dist_1_2,"Distance, piecewise linear from 1 to 2 miles","@(_DIST-1).clip(0,1)",coef_dist_1_2
util_dist_2_5,"Distance, piecewise linear from 2 to 5 miles","@(_DIST-2).clip(0,3)",coef_dist_2_5
util_dist_5_15,"Distance, piecewise linear from 5 to 15 miles","@(_DIST-5).clip(0,10)",coef_dist_5_15
util_dist_15_up,"Distance, piecewise linear for 15+ miles",@(_DIST-15.0).clip(0),coef_dist_15_up
util_dist_0_5_high,"Distance 0 to 5 mi, high and very high income",@(df['income_segment']>=WORK_HIGH_SEGMENT_ID) * _DIST.clip(upper=5),coef_dist_0_5_high
util_dist_15_up_high,"Distance 5+ mi, high and very high income",@(df['income_segment']>=WORK_HIGH_SEGMENT_ID) * (_DIST-5).clip(0),coef_dist_5_up_high
util_size_variable,Size variable,@(df['size_term'] * df['shadow_price_size_term_adjustment']).apply(np.log1p),1
util_utility_adjust

Let's move the 2 mile breakpoint to 3 miles.  To do so, we will need to change two lines
of the SPEC file. As we change the file, we will edit all four columns: label, description, 
expression, and coefficient. 
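
Before editing, it may help to confirm that moving the breakpoint only redistributes mileage between the two affected pieces.  The sketch below uses a scalar `clip` helper as a stand-in for the pandas/NumPy `.clip` calls in the SPEC; the helper and function names are illustrative only.  It compares the two original expressions against the two replacement expressions used in the next cell.

```python
def clip(x, lower, upper):
    """Scalar stand-in for the .clip(lower, upper) calls used in the SPEC."""
    return max(lower, min(x, upper))


def original_pieces(dist):
    # util_dist_1_2 and util_dist_2_5 from the original SPEC
    return clip(dist - 1, 0, 1), clip(dist - 2, 0, 3)


def revised_pieces(dist):
    # util_dist_1_3 and util_dist_3_5 after moving the breakpoint to 3 miles
    return clip(dist - 1, 0, 2), clip(dist - 3, 0, 2)


for dist in (0.5, 1.5, 2.5, 3.5, 6.0):
    # Both versions account for the same total mileage between 1 and 5 miles.
    assert abs(sum(original_pieces(dist)) - sum(revised_pieces(dist))) < 1e-12
```

At 2.5 miles, for example, the original split assigns 1 mile to the first piece and 0.5 miles to the second, while the revised split assigns all 1.5 miles to the 1-to-3 mile piece.  The totals agree, but each piece gets its own coefficient, so the fitted utility curve can still differ.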

## WARNING

**It is particularly important to make changes to the label when 
changing the expression.**  The estimation tools will prefer the pre-calculated variables computed
and stored with the given label, so if the label is not changed the expression changes will be ignored.

After we make the changes, we'll write the modified SPEC file back to disk, overwriting the original.

In [21]:
orig_lines = """util_dist_1_2,"Distance, piecewise linear from 1 to 2 miles","@(_DIST-1).clip(0,1)",coef_dist_1_2
util_dist_2_5,"Distance, piecewise linear from 2 to 5 miles","@(_DIST-2).clip(0,3)",coef_dist_2_5"""

repl_lines = """util_dist_1_3,"Distance, piecewise linear from 1 to 3 miles","@(_DIST-1).clip(0,2)",coef_dist_1_3
util_dist_3_5,"Distance, piecewise linear from 3 to 5 miles","@(_DIST-3).clip(0,2)",coef_dist_3_5"""

raw_spec = raw_spec.replace(orig_lines, repl_lines)

with open(data.edb_directory / "workplace_location_SPEC.csv", mode="w") as f:
    f.write(raw_spec)
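
Given the warning above, a simple guard is to compare the edited SPEC against the original and flag any label whose expression changed while the label stayed the same.  The following is a minimal sketch using only the standard library; the function name is hypothetical, and in practice the two text inputs would come from the backup copy and the edited file rather than the inline strings used here.

```python
import csv
import io


def unchanged_labels_with_new_expressions(original_csv, modified_csv):
    """Return labels present in both SPEC texts whose expressions differ."""

    def expressions_by_label(text):
        return {
            row["Label"]: row["Expression"]
            for row in csv.DictReader(io.StringIO(text))
        }

    before = expressions_by_label(original_csv)
    after = expressions_by_label(modified_csv)
    return sorted(
        label
        for label in after
        if label in before and before[label] != after[label]
    )


header = "Label,Description,Expression,coefficient\n"
original = header + (
    'util_dist_1_2,"Distance, piecewise linear from 1 to 2 miles",'
    '"@(_DIST-1).clip(0,1)",coef_dist_1_2\n'
)
# A risky edit: the expression changes but the label is kept.
risky = header + (
    'util_dist_1_2,"Distance, piecewise linear from 1 to 3 miles",'
    '"@(_DIST-1).clip(0,2)",coef_dist_1_3\n'
)
# The edit made in this notebook: label and expression change together.
safe = header + (
    'util_dist_1_3,"Distance, piecewise linear from 1 to 3 miles",'
    '"@(_DIST-1).clip(0,2)",coef_dist_1_3\n'
)

print(unchanged_labels_with_new_expressions(original, risky))
print(unchanged_labels_with_new_expressions(original, safe))
```

An empty list for the edited file suggests that every changed expression also received a new label, so the estimation tools should not fall back on stale pre-computed values.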


Reloading the model and getting set for re-estimation can be done using the same commands as above.

In [22]:
model2, data2 = component_model(
    modelname,
    edb_directory=f"output-est-mode/estimation_data_bundle/{modelname}/",
    return_data=True,
)

loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_coefficients.csv
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_SPEC.csv
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_alternatives_combined.parquet
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_choosers_combined.parquet
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_landuse.csv
loading from output-est-mode/estimation_data_bundle/workplace_location/workplace_location_size_terms.csv


You may notice in the utility functions shown below that all of the unmodified lines of the
spec file correspond to utility terms that are simple `X.label` data items, which are
stored as pre-computed data variables in the EDB.  Our modified lines, however, now
show the complete expression that will be freshly evaluated by Larch using Sharrow.

In [23]:
model2.utility_ca

We are almost ready to go with this model.  But if we attempt to estimate the model now, we will get an error:

In [24]:
from larch.util.shush import shush

with shush(stderr=True):
    try:
        model2.estimate()
    except NameError as e:
        print(repr(e))

NameError("name '_DIST' is not defined")


The error arises because the model SPEC includes a temporary `_DIST` variable, which was not stored in the EDB data files.
There are several ways to solve this problem. The most robust way would be to return to ActivitySim and edit the model specification
and configurations so that this variable is not a temporary value (i.e. add it to a preprocessor or annotator).  This process
may take some time and effort, but it should give the user access to any variable from ActivitySim.

Alternatively, it may be possible to reconstruct the value of the temporary variable, or a suitable proxy, by processing 
values that did get stored in the EDB.  We can do so here, by reconstituting the distance value as the sum of the piecewise
parts that did get stored in the EDB.  Rather than returning to ActivitySim, we can simply compute the temporary variable
directly in the model's data object, giving it the same name as from the specification file, and otherwise proceed as normal.

In [25]:
model2.data["_DIST"] = (
    model2.data.util_dist_0_1
    + model2.data.util_dist_1_2
    + model2.data.util_dist_2_5
    + model2.data.util_dist_5_15
    + model2.data.util_dist_15_up
)
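
The reconstruction works because the five stored pieces tile the distance axis without gaps or overlaps, so for any non-negative distance they sum back to the distance itself.  The sketch below checks this with a scalar stand-in for `.clip`; the helper names are illustrative, and the test distances include 6.01 and 24.44, which correspond to rows visible in the `alt_values` table above.

```python
def clip(x, lower, upper=float("inf")):
    """Scalar stand-in for the .clip(lower, upper) calls in the SPEC."""
    return max(lower, min(x, upper))


def piecewise_parts(dist):
    """The five stored distance pieces from the original SPEC, for one distance."""
    return [
        clip(dist, 0, 1),       # util_dist_0_1
        clip(dist - 1, 0, 1),   # util_dist_1_2
        clip(dist - 2, 0, 3),   # util_dist_2_5
        clip(dist - 5, 0, 10),  # util_dist_5_15
        clip(dist - 15, 0),     # util_dist_15_up
    ]


# The pieces tile the distance axis, so their sum recovers the distance.
for dist in (0.4, 1.7, 6.01, 24.44):
    assert abs(sum(piecewise_parts(dist)) - dist) < 1e-9
```

This approach only works when the stored pieces jointly cover the whole range of the temporary variable; a spec that stored only some of the pieces would not allow the distance to be recovered this way.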

In [26]:
with shush(stderr=True):
    # swallow a bunch of error logging related to recovery from our previous estimation failure
    result2 = model2.estimate(method="BHHH", options={"maxiter": 1000})

result2

Unnamed: 0_level_0,value,best,initvalue,minimum,maximum,nullvalue,holdfast
param_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
-999,-999.0,-999.0,-999.0,-999.0,-999.0,0.0,1
1,1.0,1.0,1.0,1.0,1.0,0.0,1
coef_dist_0_1,-1.072002,-1.072002,-0.8428,-25.0,25.0,0.0,0
coef_dist_0_5_high,0.147163,0.147163,0.15,-25.0,25.0,0.0,0
coef_dist_15_up,-0.117017,-0.117017,-0.0917,-25.0,25.0,0.0,0
coef_dist_1_3,-0.433865,-0.433865,0.0,-inf,inf,0.0,0
coef_dist_3_5,-0.417069,-0.417069,0.0,-inf,inf,0.0,0
coef_dist_5_15,-0.15576,-0.15576,-0.1285,-25.0,25.0,0.0,0
coef_dist_5_up_high,0.02505,0.02505,0.02,-25.0,25.0,0.0,0
coef_mode_logsum,0.07453,0.07453,0.3,-25.0,25.0,0.0,0


Unnamed: 0,0
-999,-999.000000
1,1.000000
coef_dist_0_1,-1.072002
coef_dist_0_5_high,0.147163
coef_dist_15_up,-0.117017
coef_dist_1_3,-0.433865
coef_dist_3_5,-0.417069
coef_dist_5_15,-0.155760
coef_dist_5_up_high,0.025050
coef_mode_logsum,0.074530

Unnamed: 0,0
-999,-999.0
1,1.0
coef_dist_0_1,-1.072002
coef_dist_0_5_high,0.147163
coef_dist_15_up,-0.117017
coef_dist_1_3,-0.433865
coef_dist_3_5,-0.417069
coef_dist_5_15,-0.15576
coef_dist_5_up_high,0.02505
coef_mode_logsum,0.07453


We can then review the original and revised models side-by-side to see the differences.

In [27]:
with pd.option_context('display.max_rows', 999):
    display(pd.concat({
        "model": model.estimation_statistics_raw(),
        "model2": model2.estimation_statistics_raw(),
    }, axis=1).fillna(""))

Unnamed: 0,Unnamed: 1,model,model2
Number of Cases,Aggregate,28281.0,28281.0
Log Likelihood at Convergence,Aggregate,-88683.850386,-88684.181771
Log Likelihood at Convergence,Per Case,-3.13581,-3.135822
Log Likelihood at Null Parameters,Aggregate,-124263.519724,-124263.519724
Log Likelihood at Null Parameters,Per Case,-4.393887,-4.393887
Rho Squared w.r.t. Null Parameters,Aggregate,0.286324,0.286322
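
The rho-squared row can be reproduced from the two log likelihood rows as one minus the ratio of the converged log likelihood to the null log likelihood.  The short sketch below does this arithmetic in plain Python using the values printed in the table; the function name is illustrative and is not a Larch API.

```python
def rho_squared(ll_converged, ll_null):
    """Rho-squared with respect to the null-parameter log likelihood."""
    return 1.0 - ll_converged / ll_null


ll_null = -124263.519724
rho_model = rho_squared(-88683.850386, ll_null)
rho_model2 = rho_squared(-88684.181771, ll_null)

print(round(rho_model, 6), round(rho_model2, 6))
```

The two log likelihoods differ by only about 0.33 across 28,281 cases, so moving the breakpoint from 2 to 3 miles makes essentially no difference to the fit on this data.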


In [28]:
with pd.option_context('display.max_rows', 999):
    display(pd.concat({
        "model": model.parameter_summary().data,
        "model2": model2.parameter_summary().data,
    }, axis=1).fillna(""))

Unnamed: 0_level_0,model,model,model,model,model,model,model2,model2,model2,model2,model2,model2
Unnamed: 0_level_1,Value,Std Err,t Stat,Signif,Null Value,Constrained,Value,Std Err,t Stat,Signif,Null Value,Constrained
Parameter,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2
-999,-999.0,0.0,,,0.0,fixed value,-999.0,0.0,,,0.0,fixed value
1,1.0,0.0,,,0.0,fixed value,1.0,0.0,,,0.0,fixed value
coef_dist_0_1,-1.04,0.105,-9.89,***,0.0,,-1.07,0.0946,-11.33,***,0.0,
coef_dist_0_5_high,0.147,0.0114,12.86,***,0.0,,0.147,0.0114,12.86,***,0.0,
coef_dist_15_up,-0.117,0.00211,-55.56,***,0.0,,-0.117,0.00211,-55.55,***,0.0,
coef_dist_1_2,-0.465,0.0428,-10.86,***,0.0,,,,,,,
coef_dist_2_5,-0.418,0.0116,-36.15,***,0.0,,,,,,,
coef_dist_5_15,-0.156,0.00276,-56.40,***,0.0,,-0.156,0.00283,-55.09,***,0.0,
coef_dist_5_up_high,0.0251,0.00176,14.22,***,0.0,,0.025,0.00176,14.2,***,0.0,
coef_mode_logsum,0.0742,0.00837,8.87,***,0.0,,0.0745,0.00836,8.91,***,0.0,
