# 1: MTC MNL Mode Choice

In [None]:
import pandas as pd
import larch as lx

In [None]:
# TEST
pd.set_option("display.max_columns", 999)
pd.set_option('expand_frame_repr', False)
pd.set_option('display.precision', 3)
from pytest import approx
from larch.util.testing import assert_same_text

This example is a mode choice model built using the MTC example dataset.
First we create the Dataset and Model objects:

In [None]:
d = lx.examples.MTC(format='dataset')
d

In [None]:
m = lx.Model(d)

Then we can build up the utility function.  We'll use some :ref:`idco` data first, using
the `Model.utility_co` attribute.  This attribute is a dict-like object, to which
we can assign a :class:`LinearFunction` object for each alternative code.

In [None]:
from larch import P, X, PX
m.utility_co[2] = P("ASC_SR2")  + P("hhinc#2") * X("hhinc")
m.utility_co[3] = P("ASC_SR3P") + P("hhinc#3") * X("hhinc")
m.utility_co[4] = P("ASC_TRAN") + P("hhinc#4") * X("hhinc")
m.utility_co[5] = P("ASC_BIKE") + P("hhinc#5") * X("hhinc")
m.utility_co[6] = P("ASC_WALK") + P("hhinc#6") * X("hhinc")

Next we'll use some :ref:`idca` data, with the `utility_ca` attribute. This attribute
is a single :class:`LinearFunction` that is applied across all alternatives.
Because the data itself varies across alternatives, the parameters (and thus the
structure of the :class:`LinearFunction`) do not need to vary across alternatives.

In [None]:
m.utility_ca = PX("tottime") + PX("totcost")
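
To make the split between the two kinds of utility terms concrete, here is a minimal plain-Python sketch that does not use Larch at all. The parameter values, the data values, and the helper name `systematic_utility` are all hypothetical and chosen only for illustration; they are not estimates from this model.

```python
# Hypothetical parameter values, for illustration only (not estimates).
params = {"ASC_SR2": -2.0, "hhinc#2": -0.002, "tottime": -0.05, "totcost": -0.005}

# idco terms per alternative: (parameter name, data column); None means a constant.
# Alternative 1 has no entry here, which makes it the reference alternative.
utility_co = {2: [("ASC_SR2", None), ("hhinc#2", "hhinc")]}

# idca terms shared by all alternatives, mirroring PX("tottime") + PX("totcost").
utility_ca = [("tottime", "tottime"), ("totcost", "totcost")]


def systematic_utility(alt, co_row, ca_row):
    """Add the alternative-specific idco part and the generic idca part."""
    v = 0.0
    for pname, column in utility_co.get(alt, []):
        v += params[pname] * (1.0 if column is None else co_row[column])
    for pname, column in utility_ca:
        v += params[pname] * ca_row[column]
    return v


# One hypothetical case: household income is shared, time and cost vary by alternative.
co_row = {"hhinc": 50.0}
ca_rows = {
    1: {"tottime": 20.0, "totcost": 100.0},
    2: {"tottime": 25.0, "totcost": 50.0},
}
utilities = {alt: systematic_utility(alt, co_row, ca_rows[alt]) for alt in ca_rows}
print(utilities)
```

The idco parameters need a distinct name per alternative because `hhinc` is the same number for every alternative, while `tottime` and `totcost` can share one parameter each because the data already differs by alternative.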

Lastly, we need to identify :ref:`idca` data that gives the availability for each
alternative, as well as the number of times each alternative is chosen. (In traditional
discrete choice analysis, this is often 0 or 1, but it need not be binary, or even integral.)

In [None]:
m.availability_ca_var = 'avail'
m.choice_ca_var = 'chose'
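
As a rough illustration of how availability and choice values enter a multinomial logit log likelihood, the following plain-Python sketch computes the contribution of a single case. It is not Larch's implementation; the function names and the toy inputs are assumptions made for this example.

```python
import math


def mnl_probabilities(utilities, available):
    """MNL probabilities over the available alternatives only."""
    # Subtract the largest available utility before exponentiating, for stability.
    vmax = max(v for v, a in zip(utilities, available) if a)
    expv = [math.exp(v - vmax) if a else 0.0 for v, a in zip(utilities, available)]
    total = sum(expv)
    return [e / total for e in expv]


def case_loglike(utilities, available, chosen):
    """Log likelihood for one case; `chosen` holds weights, not necessarily 0 or 1."""
    probs = mnl_probabilities(utilities, available)
    return sum(w * math.log(p) for w, p in zip(chosen, probs) if w > 0)


# Toy case: four alternatives, the third one unavailable, all utilities zero.
utilities = [0.0, 0.0, 0.0, 0.0]
available = [1, 1, 0, 1]

ll_binary = case_loglike(utilities, available, [1, 0, 0, 0])
ll_fractional = case_loglike(utilities, available, [0.5, 0.5, 0, 0])
print(ll_binary, ll_fractional)
```

Unavailable alternatives get zero probability and drop out of the denominator, and a fractional choice vector simply weights the log probabilities of the alternatives it covers.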

And let's give our model a descriptive title.

In [None]:
m.title = "MTC Example 1 (Simple MNL)"

We can view a summary of the choices and alternative
availabilities to make sure the model is set up
correctly.

In [None]:
m.choice_avail_summary()

In [None]:
# TEST
s = '''            name  chosen available
altid                                    
1                    DA    3637      4755
2                   SR2     517      5029
3                  SR3+     161      5029
4               Transit     498      4003
5                  Bike      50      1738
6                  Walk     166      1479
< Total All Alternatives > 5029          
'''
import re
mash = lambda x: re.sub(r'\s+', ' ', x).strip()
assert mash(s) == mash(str(m.choice_avail_summary()))

We'll set a parameter cap (bound) at +/- 20, which helps improve the 
numerical stability of the optimization algorithm used in estimation.

In [None]:
m.set_cap(20)
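
One way to see that a bound of this size is unlikely to restrict a well-behaved model is to look at how saturated a logit probability already is at the cap. The sketch below is plain Python and reflects an assumption about the motivation for capping, not a description of Larch's internals; `binary_logit` is a hypothetical helper.

```python
import math


def binary_logit(beta, x):
    """Probability of choosing an alternative whose utility advantage is beta * x."""
    return 1.0 / (1.0 + math.exp(-beta * x))


cap = 20.0
# A parameter sitting at the cap, applied to a one-unit difference in the data.
p_at_cap = binary_logit(cap, 1.0)
gap = 1.0 - p_at_cap
print(p_at_cap, gap)
```

At a parameter value of 20 the probability is already within about 2e-9 of 1, so values beyond the cap would change the likelihood very little while making the search space much larger.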

Having created this model, we can then estimate it:

In [None]:
# TEST
assert dict(m.required_data()) == {
    'ca': ['totcost', 'tottime'],
    'co': ['hhinc'],
    'choice_ca': 'chose',
    'avail_ca': 'avail',
}
assert m.loglike() == approx(-7309.600971749634)

In [None]:
assert m.compute_engine == 'jax'

In [None]:
result = m.maximize_loglike(stderr=True)

In [None]:
# TEST
assert result.loglike == approx(-3626.18625551293)
assert result.logloss == approx(0.7210551313408093)
assert result.message == 'Optimization terminated successfully'
assert m.total_weight() == 5029.0

In [None]:
m.parameter_summary()

It is a little tough to read this report because the parameters show up
in alphabetical order.
We can set the `ordering` attribute to fix this and group them systematically:

In [None]:
m.ordering = (
    ("LOS", "totcost", "tottime", ),
    ("ASCs", "ASC.*", ),
    ("Income", "hhinc.*", ),
)
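
Each entry in the ordering is a category label followed by one or more regular expressions. The plain-Python sketch below shows how such patterns could sort this model's parameter names into groups. It assumes each pattern is matched against the whole parameter name; Larch's exact matching rule may differ, and `group_parameters` is a hypothetical helper, not part of the library.

```python
import re


def group_parameters(names, ordering):
    """Assign each name to the first category whose pattern matches it."""
    groups = {}
    remaining = list(names)
    for category, *patterns in ordering:
        members = []
        for pattern in patterns:
            # Iterate over a copy so matched names can be removed safely.
            for name in list(remaining):
                if re.fullmatch(pattern, name):
                    members.append(name)
                    remaining.remove(name)
        groups[category] = members
    if remaining:
        # Anything unmatched is collected at the end.
        groups["Other"] = remaining
    return groups


# Parameter names used in this model, in alphabetical order.
names = [
    "ASC_BIKE", "ASC_SR2", "ASC_SR3P", "ASC_TRAN", "ASC_WALK",
    "hhinc#2", "hhinc#3", "hhinc#4", "hhinc#5", "hhinc#6",
    "totcost", "tottime",
]
ordering = (
    ("LOS", "totcost", "tottime"),
    ("ASCs", "ASC.*"),
    ("Income", "hhinc.*"),
)
groups = group_parameters(names, ordering)
print(groups)
```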

In [None]:
m.parameter_summary()

In [None]:
m.estimation_statistics()

In [None]:
# TEST
es = m.estimation_statistics()
assert es[0][1][0][1].text == '5029'
assert "|".join(i.text for i in es[0][1][0]) == 'Number of Cases|5029'
assert "|".join(i.text for i in es[0][1][1]) == 'Log Likelihood at Convergence|-3626.19|-0.72'
assert "|".join(i.text for i in es[0][1][2]) == 'Log Likelihood at Null Parameters|-7309.60|-1.45'
assert "|".join(i.text for i in es[0][1][3]) == 'Rho Squared w.r.t. Null Parameters|0.504'
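
The figures in the estimation statistics are tied together by simple arithmetic. As a check, the sketch below recomputes them in plain Python from the log likelihood values and case count asserted earlier in this example; the variable names are chosen here for illustration.

```python
# Values taken from the test assertions earlier in this example.
ll_converged = -3626.18625551293
ll_null = -7309.600971749634
n_cases = 5029

# Rho squared w.r.t. null parameters: one minus the ratio of log likelihoods.
rho_sq_null = 1.0 - ll_converged / ll_null

# Per-case values, shown in the last column of the statistics table.
ll_per_case = ll_converged / n_cases
ll_null_per_case = ll_null / n_cases

print(f"Rho squared w.r.t. null parameters: {rho_sq_null:.3f}")
print(f"Log likelihood per case at convergence: {ll_per_case:.2f}")
print(f"Log likelihood per case at null parameters: {ll_null_per_case:.2f}")
```

The negative of the per-case log likelihood at convergence is the same quantity reported as `result.logloss` above.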