# AI for Transportation | Recitation 2: Discrete Choice Modeling

In this recitation, we study **Discrete Choice Modeling (DCM)** using both classical baselines and a modern, theory-informed neural approach: the **Alternative-Specific Utility Deep Neural Network (ASU-DNN)**. Our goal is to understand how utility-theoretic structure can guide neural architectures to achieve **better prediction**, **clearer interpretation**, and **stronger generalization** than vanilla fully connected models.

We will work with the [1987 Netherlands **Train**](https://open.rijkswaterstaat.nl/@120703/the-netherlands-value-time-study-final/) stated-preference dataset, in which each traveler chooses between two hypothetical train alternatives, via the R [**mlogit**](https://cran.r-project.org/web/packages/mlogit/index.html) package. You can find information on the dataset in the mlogit [documentation](https://www.rdocumentation.org/packages/mlogit/versions/0.2-4/topics/Train). Although the original survey microdata are not distributed directly, the `Train` dataset is accessible programmatically from CRAN; in Colab we will fetch it using **rpy2** and convert it to Python (pandas) for modeling.

We will proceed in four parts:

1. **Data Access & Preparation**  
   - Install and load **mlogit** in an R runtime within Colab using **rpy2**.  
   - Import the `Train` dataset and convert it to long/choice-format pandas DataFrames.  
   - Split into train/validation/test sets, grouped by traveler so that all of a traveler's choice situations fall in the same subset.

2. **Classical Baselines (Utility Models)**  
   - Fit **Multinomial Logit (MNL)** with alternative-specific constants and key attributes (time, cost, etc.).  
   - Report **in-sample fit**, plus **out-of-sample accuracy** and log-likelihood.  
   - Extract **elasticities**, **value of time (VOT)**, and **substitution patterns** for interpretability.

3. **ASU-DNN (Theory-Guided Architecture)**  
   - Build a sparse, **alternative-specific** network in which each alternative’s attributes feed only its own utility head, followed by a softmax over the resulting utilities.  
   - Train with cross-entropy; compare against a **fully connected DNN** of similar capacity.  
   - Evaluate **accuracy**, **calibration**, and **interpretability** (e.g., cost/time response curves per mode).

4. **Comparison & Reflection**  
   - Side-by-side results for **MNL vs. FC-DNN vs. ASU-DNN**.  
   - Discuss the trade-off between **approximation power** and **estimation stability**, and how ASU-DNN acts as a **domain-knowledge regularizer**.  
   - Optional: examine **robustness** (small cost/time perturbations) and **IIA-like behaviors** under different architectures.

By the end, you will:
- Load and prepare a **choice-format** dataset from R inside a Python notebook.  
- Fit and interpret a **classical MNL** model.  
- Implement and train a **theory-informed ASU-DNN**, and compare it with a **vanilla DNN**.  
- Relate architectural choices to **economic interpretability** and **generalization** in discrete choice tasks.

**References:**  
- CRAN `mlogit` package — Train dataset documentation.  
- Lecture materials on **ASU-DNN** and **theory-based neural architectures** for choice modeling.

In [None]:
# Imports
# Standard library
import os
import pickle

# Core scientific stack
import numpy as np
import pandas as pd
import scipy.stats as ss
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf

# Scikit-learn
from sklearn.model_selection import StratifiedKFold, train_test_split, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

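# Custom utilities (util_nn_mlarch is not bundled with this notebook and must be on the Python path)
try:
    import util_nn_mlarch as util
except ModuleNotFoundError:
    util = None  # helpers from util_nn_mlarch are unavailable in this session
    print("util_nn_mlarch not found: add it to the working directory before using `util`.")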

# Reproducibility and plotting style
RANDOM_STATE = 100
np.random.seed(RANDOM_STATE)
sns.set(context="notebook", style="whitegrid")

In [None]:
# Load the wide-format Train data from an uploaded file. We support either:
# 1) mlogit_Train_wide.csv (simplest), or
# 2) mlogit_choice_data.pickle with a 'Train_wide' key.

def load_train_wide(uploaded_names):
    """Return (DataFrame, source) for the first matching uploaded file; CSV takes priority."""
    csv_name = next((n for n in uploaded_names if n.endswith("mlogit_Train_wide.csv")), None)
    pkl_name = next((n for n in uploaded_names if n.endswith("mlogit_choice_data.pickle")), None)
    if csv_name is not None:
        df = pd.read_csv(csv_name)
        source = "csv"
    elif pkl_name is not None:
        # Only unpickle files you trust: pickle can execute arbitrary code on load.
        with open(pkl_name, "rb") as f:
            data_dic = pickle.load(f)
        if "Train_wide" not in data_dic:
            raise KeyError("Train_wide not found in the pickle. Available keys: "
                           + ", ".join(map(str, data_dic.keys())))
        df = data_dic["Train_wide"].copy()
        source = "pickle"
    else:
        raise FileNotFoundError("Please upload mlogit_Train_wide.csv or mlogit_choice_data.pickle")
    return df, source

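# `uploaded_names` is assumed to come from an earlier upload step that is not shown here;
# as a fallback, look for the files in the current working directory.
if "uploaded_names" not in globals():
    uploaded_names = os.listdir(".")

df_raw, source = load_train_wide(uploaded_names)
print(f"Loaded Train dataset from {source}. Shape: {df_raw.shape}")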
display(df_raw.head(3))

## Data Loading
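
The traveler-level split described in the outline could look roughly like the sketch below. It is a minimal NumPy illustration, not the notebook's actual split: the function name `group_split`, the 70/15/15 fractions, and the toy identifiers are assumptions made for this example. With the real data, the identifiers would come from whichever column marks the traveler (assumed here to be an `id` column).

```python
import numpy as np

def group_split(ids, frac=(0.7, 0.15, 0.15), seed=100):
    """Label each row 'train', 'val', or 'test' so that all rows sharing an
    id receive the same label (hypothetical helper for illustration)."""
    ids = np.asarray(ids)
    rng = np.random.default_rng(seed)
    unique_ids = rng.permutation(np.unique(ids))   # shuffle travelers, not rows
    n_train = int(round(frac[0] * len(unique_ids)))
    n_val = int(round(frac[1] * len(unique_ids)))
    train_ids = set(unique_ids[:n_train].tolist())
    val_ids = set(unique_ids[n_train:n_train + n_val].tolist())
    return np.array([
        "train" if i in train_ids else "val" if i in val_ids else "test"
        for i in ids.tolist()
    ])

# Toy example: 10 hypothetical travelers with 4 choice situations each.
toy_ids = np.repeat(np.arange(10), 4)
split = group_split(toy_ids)
```

Splitting by traveler rather than by row keeps repeated choices from the same person out of both the training and evaluation sets, which would otherwise make out-of-sample metrics look better than they are.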

## Baseline Model Definitions
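
As a reference point for the MNL baseline in the outline, the sketch below computes choice probabilities, the log-likelihood, and a value-of-time ratio for a linear-in-parameters utility with alternative-specific constants. The coefficient values and attribute numbers are made up for illustration and are not estimates from the `Train` data; the function names are hypothetical, and the sketch evaluates a given parameter vector rather than estimating one.

```python
import numpy as np

def mnl_probabilities(X, beta, asc):
    """X: (n_obs, n_alt, n_attr) attributes; beta: (n_attr,) shared taste
    coefficients; asc: (n_alt,) alternative-specific constants."""
    V = X @ beta + asc                       # systematic utilities, (n_obs, n_alt)
    V = V - V.max(axis=1, keepdims=True)     # shift for numerical stability
    expV = np.exp(V)
    return expV / expV.sum(axis=1, keepdims=True)

def log_likelihood(P, y):
    """Sum of log-probabilities of the chosen alternatives y (integer indices)."""
    return float(np.log(P[np.arange(len(y)), y]).sum())

# Hypothetical inputs: 2 observations, 2 alternatives, attributes = [cost, time].
X = np.array([[[30.0, 2.0], [20.0, 3.0]],
              [[25.0, 1.5], [25.0, 1.0]]])
beta = np.array([-0.05, -1.0])   # assumed [cost, time] coefficients
asc = np.array([0.0, 0.1])       # first alternative is the reference
y = np.array([0, 1])             # chosen alternative per observation

P = mnl_probabilities(X, beta, asc)
ll = log_likelihood(P, y)
vot = beta[1] / beta[0]          # value of time: cost units per unit of time
```

The value of time is the ratio of the time coefficient to the cost coefficient, so it is only meaningful when both coefficients are negative and the cost coefficient is estimated with reasonable precision.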

## ASU-DNN Model Definition
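
The alternative-specific structure from the outline can be illustrated with a forward pass in plain NumPy. This is a simplified sketch under stated assumptions, not the lecture's implementation: it uses one hidden ReLU layer per alternative, random untrained weights, no training loop, and hypothetical names (`init_head`, `head_utility`, `asu_utilities`). It only shows the connectivity pattern, in which each alternative's attributes reach only that alternative's utility.

```python
import numpy as np

def init_head(n_in, n_hidden, rng):
    """Random parameters for one alternative's utility sub-network."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.1, size=n_hidden),
        "b2": 0.0,
    }

def head_utility(x, p):
    """Map one alternative's attributes x (n_obs, n_in) to utilities (n_obs,)."""
    h = np.maximum(x @ p["W1"] + p["b1"], 0.0)   # ReLU hidden layer
    return h @ p["w2"] + p["b2"]

def asu_utilities(X, heads):
    """Utility of alternative j depends only on X[:, j, :] and heads[j]."""
    return np.stack(
        [head_utility(X[:, j, :], p) for j, p in enumerate(heads)], axis=1
    )

def softmax(V):
    V = V - V.max(axis=1, keepdims=True)         # shift for numerical stability
    expV = np.exp(V)
    return expV / expV.sum(axis=1, keepdims=True)

rng = np.random.default_rng(100)
n_obs, n_alt, n_attr = 5, 2, 4                   # toy sizes, chosen for illustration
X = rng.normal(size=(n_obs, n_alt, n_attr))
heads = [init_head(n_attr, 8, rng) for _ in range(n_alt)]

V = asu_utilities(X, heads)
P = softmax(V)

# Structural check: changing alternative 1's attributes leaves utility 0 untouched.
X_pert = X.copy()
X_pert[:, 1, :] += 1.0
V_pert = asu_utilities(X_pert, heads)
```

In a fully connected network, all attributes would be concatenated into one input vector, so every utility could depend on every alternative's attributes. The block structure above removes those cross-alternative connections, which is what lets the architecture act as a regularizer.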

## Benchmark Inference

## ASU-DNN Inference

## Analysis

## Results and Insights

## Where to Go from Here?