# The Nested Logit Model


The Nested Logit model groups alternatives that are close substitutes for one another into subsets called 'nests'. For example, a customer might first choose a transportation mode, public transport or private car, and then, having chosen public transport, decide between the train and the bus.\
The classical Conditional Logit does not account for such a decision process, which motivates the Nested Logit. More detailed information is available [here](https://cran.r-project.org/web/packages/mlogit/vignettes/c4.relaxiid.html#:~:text=The%20nested%20logit%20model&text=It%20is%20a%20generalization%20of,different%20nests%20are%20still%20uncorrelated.).
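
To make the two-step decision concrete, here is a minimal NumPy sketch of nested logit choice probabilities. It assumes one common parameterization, in which each nest has a scale $\gamma_{nest}$ that divides the utilities inside the nest; Choice-Learn or Biogeme may normalize differently. The function name `nested_logit_probabilities` and the numbers are illustrative only.

```python
import numpy as np


def nested_logit_probabilities(utilities, items_nests, gammas):
    """Choice probabilities for one choice situation (illustrative parameterization)."""
    utilities = np.asarray(utilities, dtype=float)
    probabilities = np.zeros_like(utilities)
    inclusive_values = []
    conditionals = []
    for nest, gamma in zip(items_nests, gammas):
        scaled = utilities[nest] / gamma
        shift = scaled.max()  # subtract the max for numerical stability
        exp_scaled = np.exp(scaled - shift)
        # Inclusive value: log-sum-exp of the scaled utilities in the nest
        inclusive_values.append(shift + np.log(exp_scaled.sum()))
        # Probability of each item conditional on its nest being chosen
        conditionals.append(exp_scaled / exp_scaled.sum())
    # Probability of each nest: softmax of gamma * inclusive value
    nest_scores = np.asarray(gammas, dtype=float) * np.asarray(inclusive_values)
    nest_probas = np.exp(nest_scores - nest_scores.max())
    nest_probas /= nest_probas.sum()
    for nest, conditional, nest_proba in zip(items_nests, conditionals, nest_probas):
        probabilities[nest] = nest_proba * conditional
    return probabilities


# Made-up utilities for three items, nested as [[0, 2], [1]]
example_utilities = np.array([0.5, 1.0, -0.2])
example_probas = nested_logit_probabilities(example_utilities, [[0, 2], [1]], [0.5, 1.0])
```

Under this parameterization, setting every $\gamma_{nest}$ to 1 gives back the softmax probabilities of the Conditional Logit.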


In this notebook, we reproduce results from other packages to show how to specify a Nested Logit model with Choice-Learn and to check that we obtain the same results.

## Summary

In [None]:
import os

os.environ["CUDA_VISIBLE_DEVICES"] = ""

import sys

sys.path.append("../../")

import numpy as np
import pandas as pd

### Import the Nested Logit from Choice-Learn!

In [None]:
from choice_learn.models.nested_logit import NestedLogit

## 1- Nested Logit on the SwissMetro dataset

We reproduce the results from [Biogeme](https://biogeme.epfl.ch/sphinx/auto_examples/swissmetro/plot_b09nested.html), which are also reproduced in [PyLogit](https://github.com/timothyb0912/pylogit/blob/master/examples/notebooks/Nested%20Logit%20Example--Python%20Biogeme%20benchmark--09NestedLogit.ipynb).\
This example uses the SwissMetro dataset, described further in the [data introduction](../introduction/2_data_handling.ipynb).



In [None]:
from choice_learn.datasets import load_swissmetro
swiss_dataset = load_swissmetro(preprocessing="biogeme_nested")
print(swiss_dataset.summary())

The model specified in Biogeme defines two nests:
- The existing modes nest, with the train and the car *(item indexes 0 and 2)*
- The future modes nest, with the swissmetro *(item index 1)*

The utility has the following form:\
&nbsp; &nbsp; &nbsp; $U(i) = \beta^{inter}_i + \beta^{tt} \cdot TT(i) + \beta^{co} \cdot CO(i)$\
with:
- $TT(i)$ the travel time of alternative $i$
- $CO(i)$ the cost of alternative $i$
- $\beta^{inter}_{sm} = 0$

We therefore have 4 weights to estimate in the utility function (the train and car intercepts, $\beta^{tt}$ and $\beta^{co}$), plus the $\gamma_{nest}$ values.
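
As a quick illustration of the utility form, the sketch below evaluates $U(i)$ for one choice situation with NumPy. All coefficient and feature values are hypothetical placeholders chosen for readability; they are not estimates and do not come from the SwissMetro data.

```python
import numpy as np

# Hypothetical coefficients (not estimates), ordered as train, swissmetro, car
beta_inter = np.array([-0.5, 0.0, 0.2])  # the swissmetro intercept is fixed to 0
beta_tt = -1.0  # travel time coefficient, shared by all alternatives
beta_co = -0.8  # cost coefficient, shared by all alternatives

# Hypothetical features of one choice situation, same ordering
travel_time = np.array([1.2, 0.6, 1.0])
cost = np.array([0.5, 0.7, 0.9])

# U(i) = beta_inter_i + beta_tt * TT(i) + beta_co * CO(i)
utilities = beta_inter + beta_tt * travel_time + beta_co * cost

# Free weights: two intercepts (train, car) plus beta_tt and beta_co
n_free_weights = int(np.count_nonzero(beta_inter)) + 2
```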

With Choice-Learn, the Nested Logit model specification is similar to the [Conditional Logit specification](./../introduction/3_model_clogit.ipynb). The few differences are:
- When the model is instantiated, the nests need to be specified as a list of nests, each containing the indexes of its items. In the example, we specify `items_nests=[[0, 2], [1]]`, meaning that the first nest contains the items of indexes 0 (train) and 2 (car) and the second nest contains the item of index 1 (swissmetro).
- The "fast" dict-based specification offers an additional option, `coefficients={feature_name: "nest"}`, which creates one coefficient per nest for the feature `feature_name`; this coefficient is shared by all alternatives of the nest.
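
Before instantiating the model, it can help to sanity-check the nest specification. The helper below is a hypothetical utility, not part of Choice-Learn; it assumes that every item should belong to exactly one nest, as in the example above.

```python
def check_items_nests(items_nests, n_items):
    """Return True if each item index in range(n_items) appears in exactly one nest."""
    flat_indexes = [index for nest in items_nests for index in nest]
    return sorted(flat_indexes) == list(range(n_items))


def item_to_nest(items_nests):
    """Map each item index to the position of its nest in items_nests."""
    return {index: nest_id for nest_id, nest in enumerate(items_nests) for index in nest}


swiss_nests = [[0, 2], [1]]
nests_are_valid = check_items_nests(swiss_nests, n_items=3)
nest_of_item = item_to_nest(swiss_nests)
```

Here `nest_of_item` maps train (0) and car (2) to the first nest and swissmetro (1) to the second.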

In [None]:
# Initialization of the model
swiss_model = NestedLogit(optimizer="lbfgs", items_nests=[[0, 2], [1]], batch_size=-1, lr=0.002, epochs=100)

# Intercepts for train & car (the swissmetro intercept is fixed to 0)
swiss_model.add_coefficients(feature_name="intercept", items_indexes=[0, 2])

# betas TT and CO shared by train, sm and car
swiss_model.add_shared_coefficient(feature_name="travel_time",
                                   items_indexes=[0, 1, 2])
swiss_model.add_shared_coefficient(feature_name="cost",
                                   items_indexes=[0, 1, 2])


In [None]:
# Estimation of the model
history = swiss_model.fit(swiss_dataset, get_report=False, verbose=2)

In [None]:
# Looking at the weights
swiss_model.trainable_weights

In [None]:
# Computing the total (summed) negative log-likelihood
swiss_model.evaluate(swiss_dataset) * len(swiss_dataset)
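
The multiplication by `len(swiss_dataset)` suggests that `evaluate` returns a negative log-likelihood averaged over choices; this is an assumption drawn from the cell above. Under that assumption, the NumPy sketch below shows the relation between the averaged and summed values, using made-up probabilities and choices.

```python
import numpy as np

# Hypothetical predicted probabilities: 3 choice situations, 3 alternatives
toy_probas = np.array(
    [
        [0.2, 0.5, 0.3],
        [0.6, 0.3, 0.1],
        [0.1, 0.2, 0.7],
    ]
)
# Hypothetical index of the chosen alternative in each situation
toy_choices = np.array([1, 0, 2])

# Probability the model assigns to each observed choice
chosen_probas = toy_probas[np.arange(len(toy_choices)), toy_choices]

# Average negative log-likelihood per choice, then the summed value
mean_nll = -np.log(chosen_probas).mean()
summed_nll = mean_nll * len(toy_choices)
```

The summed value is the one usually reported by estimation packages, so it is the natural quantity to compare across tools.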

In [None]:
# Probabilities can be easily computed:
probas = swiss_model.predict_probas(swiss_dataset)

### Interpretation and comparison with Biogeme results