# Discrete Choice Modeling for Travel Behavior Analysis: From Multinomial Logit to More Advanced Forms

## C2SMARTER Student Learning Hub Series

### Xiyuan Ren
### April 11, 2025

#### ---------------------

In [82]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cvxpy as cp
import xlogit
import warnings
warnings.filterwarnings('ignore')
import time
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import log_loss
from math import radians, cos, sin, asin, sqrt



## Example 1: MNL, MXL, AMXL for Commute Choice

Ren, X., & Chow, J. Y. (2022). A random-utility-consistent machine learning method to estimate agents’ joint activity scheduling choice from a ubiquitous data set. Transportation Research Part B: Methodological, 166, 396-418.

<img src="image/commute_choice.jpg" style="width:90%">

### 1.Data Structure

In [2]:
Commuting_choice = pd.read_csv("Commuting_choice_0507.csv")

In [3]:
Commuting_choice.head()

Unnamed: 0,iid,hw_od,chosen,ln_dwork,SDE_work,SDL_work,PL_work,ln_dlunch,SDE_lunch,SDL_lunch,K_lunch1,K_lunch2,ln_dafterwork,ln_dwork*ln_afterwork,t_commute,c_commute,M_commute2,t_worklunch,alternative
0,1,"(121.34179988186844, 30.714540897435004, 121.5...",True,2.235376,126,0,0,0.0,45,0,0,0,1.7492,3.91012,69,20.0,0,0,"6:30-7:00,Driving"
1,1,"(121.34179988186844, 30.714540897435004, 121.5...",False,2.122262,66,0,0,0.0,45,0,0,0,1.7492,3.71226,129,8.0,1,0,"6:30-7:00,Transit"
2,1,"(121.34179988186844, 30.714540897435004, 121.5...",False,2.180417,96,0,0,0.0,45,0,0,0,1.7492,3.813986,69,20.0,0,0,"7:00-7:30,Driving"
3,1,"(121.34179988186844, 30.714540897435004, 121.5...",False,2.060514,36,0,0,0.0,45,0,0,0,1.7492,3.60425,129,8.0,1,0,"7:00-7:30,Transit"
4,1,"(121.34179988186844, 30.714540897435004, 121.5...",False,2.012678,14,0,0,0.0,45,0,0,0,1.7492,3.520577,121,20.0,0,0,"7:30-8:00,Driving"


In [4]:
print('Number of rows:',len(Commuting_choice))
print('Number of individuals:',len(Commuting_choice['iid'].unique()))
print('Number of choice observations:',int(len(Commuting_choice)/len(Commuting_choice['alternative'].unique())))
print('Number of alternatives:',len(Commuting_choice['alternative'].unique()))

Number of rows: 366086
Number of individuals: 26149
Number of choice observations: 26149
Number of alternatives: 14
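
The counts above reflect a long-format layout: one row per individual and alternative (26149 × 14 = 366086 rows), with one row per individual flagged as chosen. A minimal sketch of how that layout could be checked, using a small made-up frame (the `toy` data and its values are illustrative and not taken from the data set):

```python
import pandas as pd

# Toy long-format data: 2 individuals x 3 alternatives (values are made up).
toy = pd.DataFrame({
    "iid":         [1, 1, 1, 2, 2, 2],
    "alternative": ["A", "B", "C", "A", "B", "C"],
    "chosen":      [True, False, False, False, False, True],
})

n_alt = toy["alternative"].nunique()
rows_per_person = toy.groupby("iid").size()
chosen_per_person = toy.groupby("iid")["chosen"].sum()

# Every individual should face the full choice set and pick exactly one alternative.
balanced = bool((rows_per_person == n_alt).all())
one_choice_each = bool((chosen_per_person == 1).all())
print(balanced, one_choice_each)
```

The same two checks should carry over to `Commuting_choice` by swapping in the real frame, assuming its `chosen` column is boolean as in the preview.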


In [5]:
print(Commuting_choice['alternative'].unique())

['6:30-7:00,Driving' '6:30-7:00,Transit' '7:00-7:30,Driving'
 '7:00-7:30,Transit' '7:30-8:00,Driving' '7:30-8:00,Transit'
 '8:00-8:30,Driving' '8:00-8:30,Transit' '8:30-9:00,Driving'
 '8:30-9:00,Transit' '9:00-9:30,Driving' '9:00-9:30,Transit'
 '9:30-10:00,Driving' '9:30-10:00,Transit']
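
The 14 alternatives are the combinations of seven half-hour departure windows and two commuting modes. A short sketch of how such a joint choice set could be enumerated (the `windows` and `modes` lists are retyped from the labels printed above, not read from the data file):

```python
from itertools import product

# Seven half-hour departure windows from 6:30 to 10:00 and two modes,
# mirroring the labels printed above.
windows = ["6:30-7:00", "7:00-7:30", "7:30-8:00", "8:00-8:30",
           "8:30-9:00", "9:00-9:30", "9:30-10:00"]
modes = ["Driving", "Transit"]

# Joint alternatives: one label per (window, mode) pair.
alternatives = [f"{w},{m}" for w, m in product(windows, modes)]
print(len(alternatives))
```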


### 2.Utility function
$$U_{ij}=\theta_{time}\,time_{commute}+\theta_{cost}\,cost_{commute}+\theta_{mode}\,mode_{commute}+\theta_{SDE}\,SDE+\theta_{SDL}\,SDL+\theta_{PL}\,PL+\theta_{duration}\,Dur_{work}+\epsilon_{ij}$$

where $i$ indexes individuals, $j$ indexes alternatives, and:

$time_{commute}$: commuting travel time (varies across $i$ and $j$)

$cost_{commute}$: commuting travel cost (varies across $i$ and $j$)

$mode_{commute}$: commuting mode constant (varies across $j$)

$SDE$: schedule deviation, arriving earlier than the regular workplace arrival time (varies across $i$ and $j$)

$SDL$: schedule deviation, arriving later than the regular workplace arrival time (varies across $i$ and $j$)

$PL$: additional penalty for being late for work (varies across $i$ and $j$)

$Dur_{work}$: total work duration (varies across $i$ and $j$)

$\epsilon_{ij}$: random disturbance following a Gumbel distribution (varies across $i$ for each $j$)
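
The Gumbel assumption on the disturbance is what gives the logit choice probabilities their closed form. A small simulation sketch illustrates this: picking the alternative with the highest utility under i.i.d. Gumbel(0, 1) noise reproduces the softmax probabilities. The utilities in `V` are made-up values, not estimates from this data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up systematic utilities V_j for three alternatives (illustrative values).
V = np.array([1.0, 0.5, -0.2])

# Closed-form logit probabilities: P_j = exp(V_j) / sum_k exp(V_k).
P_logit = np.exp(V - V.max())
P_logit /= P_logit.sum()

# Simulate utility maximization with i.i.d. Gumbel(0, 1) disturbances.
n_draws = 200_000
eps = rng.gumbel(loc=0.0, scale=1.0, size=(n_draws, V.size))
choices = (V + eps).argmax(axis=1)
P_sim = np.bincount(choices, minlength=V.size) / n_draws

print(np.round(P_logit, 3))
print(np.round(P_sim, 3))
```

The two printed vectors should agree up to simulation noise, which shrinks as `n_draws` grows.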

### 3.Estimate MNL and MXL using xlogit

Arteaga, C., Park, J., Beeramoole, P. B., & Paz, A. (2022). xlogit: An open-source Python package for GPU-accelerated estimation of Mixed Logit models. Journal of Choice Modelling, 42, 100339.

In [6]:
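from sklearn.preprocessing import MinMaxScaler
# Rescale each column to the [0, 1] range on a copy, leaving the raw data untouched.
ms = MinMaxScaler()
Commuting_choice_ms = Commuting_choice.copy(deep=True)
# Columns 3:-1 run from 'chosen' through 't_worklunch'; the identifier columns
# and the 'alternative' label are not scaled.
Commuting_choice_ms.iloc[:,3:-1] = ms.fit_transform(Commuting_choice_ms.iloc[:,3:-1].values)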
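
For reference, min-max scaling maps each column onto [0, 1] with x' = (x - min) / (max - min). A hand-rolled sketch on a made-up travel-time column (the values are illustrative):

```python
import numpy as np

# Made-up travel times in minutes (illustrative values).
t = np.array([69.0, 129.0, 121.0, 45.0])

# Min-max scaling: x' = (x - min) / (max - min), which maps the column onto [0, 1].
t_scaled = (t - t.min()) / (t.max() - t.min())
print(t_scaled)
```

Because the transformation is linear, a coefficient estimated on a scaled attribute equals the coefficient on the raw attribute multiplied by that attribute's range (max - min), so the estimates reported below are per unit of scaled range rather than per minute or per unit of cost.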

In [7]:
from xlogit import MultinomialLogit, MixedLogit

#### In MNL, all parameters ($\theta$) are assumed to be fixed values shared by all individuals

In [8]:
varnames = ['t_commute','c_commute','M_commute2','SDE_work','SDL_work','PL_work','ln_dwork']

MNL = MultinomialLogit()
MNL.fit(X=Commuting_choice_ms[varnames], y=Commuting_choice_ms['chosen'], varnames=varnames,
        ids=Commuting_choice_ms['iid'],alts=Commuting_choice_ms['alternative'])

MNL.summary()

Optimization terminated successfully.
    Message: The gradients are close to zero
    Iterations: 13
    Function evaluations: 14
Estimation time= 1.1 seconds
---------------------------------------------------------------------------
Coefficient              Estimate      Std.Err.         z-val         P>|z|
---------------------------------------------------------------------------
t_commute              -8.8774619     0.1099954   -80.7075656             0 ***
c_commute              -1.0049461     0.1853316    -5.4224207      5.93e-08 ***
M_commute2              0.3743507     0.1563890     2.3937141        0.0167 *  
SDE_work               -6.9333252     0.1654255   -41.9120701             0 ***
SDL_work               -5.3241934     0.2386781   -22.3070085      3.3e-109 ***
PL_work                 0.3348956     0.4548364     0.7362992         0.462    
ln_dwork               22.3646455     0.6324974    35.3592691     1.44e-267 ***
----------------------------------------------------

$$\mathrm{Loglikelihood} \;=\; \sum_{i=1}^{N} \sum_{j=1}^{J} y_{ij}\,\ln P_{ij}$$
$$\text{McFadden } R^2 \;=\; 1 \;-\; \frac{LL(\hat{\theta})}{LL(0)}$$
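
One step worth making explicit is the closed form of the null log-likelihood $LL(0)$. Assuming every individual faces the same $J$ alternatives and the null model sets all parameters to zero, each alternative is equally likely:

$$P_{ij}\big|_{\theta=0} \;=\; \frac{e^{0}}{\sum_{k=1}^{J} e^{0}} \;=\; \frac{1}{J} \quad\Longrightarrow\quad LL(0) \;=\; \sum_{i=1}^{N}\sum_{j=1}^{J} y_{ij}\,\ln\frac{1}{J} \;=\; N\,\ln\frac{1}{J}$$

With $N = 26149$ choice observations and $J = 14$ alternatives, this gives $LL(0) \approx -69008.7$.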

In [9]:
LL_MNL = MNL.loglikelihood
num_alt = len(Commuting_choice['alternative'].unique())
num_observation = int(len(Commuting_choice)/len(Commuting_choice['alternative'].unique()))
LL_0 = np.log(1/num_alt) * num_observation
print('McFadden R Square of MNL:',1-LL_MNL/LL_0)

McFadden R Square of MNL: 0.11339410257313098


#### In MXL, parameters are assumed to follow a parametric distribution (e.g., normal, uniform, or triangular)

In [10]:
varnames = ['t_commute','c_commute','M_commute2','SDE_work','SDL_work','PL_work','ln_dwork']

MXL = MixedLogit()
MXL.fit(X=Commuting_choice_ms[varnames], y=Commuting_choice_ms['chosen'], varnames=varnames,
        ids=Commuting_choice_ms['iid'],alts=Commuting_choice_ms['alternative'],
        randvars={'t_commute':'n','c_commute':'n','M_commute2':'n','SDE_work':'n','SDL_work':'n','ln_dwork':'n'},
        n_draws=100)

MXL.summary()

Optimization terminated successfully.
    Message: The gradients are close to zero
    Iterations: 23
    Function evaluations: 54
Estimation time= 39.0 seconds
---------------------------------------------------------------------------
Coefficient              Estimate      Std.Err.         z-val         P>|z|
---------------------------------------------------------------------------
t_commute              -9.5313549     0.1381162   -69.0096741             0 ***
c_commute              -1.4454435     0.2010687    -7.1888058      6.71e-13 ***
M_commute2              0.0882339     0.1659243     0.5317718         0.595    
SDE_work               -7.2840633     0.1709194   -42.6169487             0 ***
SDL_work               -8.3119172     0.4460536   -18.6343450      5.34e-77 ***
PL_work                -1.7446962     0.5169959    -3.3746809       0.00074 ***
ln_dwork               23.2591849     0.6398418    36.3514641     2.64e-282 ***
sd.t_commute            4.0906957     0.4048290    

In [11]:
LL_MXL = MXL.loglikelihood
print('McFadden R Square of MXL:',1-LL_MXL/LL_0)

McFadden R Square of MXL: 0.11424358507354293
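
With random coefficients, the choice probability no longer has a closed form and is approximated by averaging logit probabilities over draws from the coefficient distribution, which is the role of `n_draws` above. A minimal sketch of that simulation step for one individual follows. The attributes, means, and standard deviations are made-up values, and plain pseudo-random draws are used here as an assumption; how xlogit generates its draws internally is not shown in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up attributes for one individual: 3 alternatives x 2 attributes.
X = np.array([[0.2, 0.5],
              [0.6, 0.1],
              [0.9, 0.0]])
mean = np.array([-2.0, -1.0])   # hypothetical coefficient means
sd = np.array([1.0, 0.5])       # hypothetical coefficient standard deviations

def logit_probs(V):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    expV = np.exp(V - V.max(axis=-1, keepdims=True))
    return expV / expV.sum(axis=-1, keepdims=True)

# Draw R coefficient vectors theta_r ~ Normal(mean, sd^2), one per simulation draw.
R = 100
theta = mean + sd * rng.standard_normal((R, mean.size))   # shape (R, 2)

# Logit probabilities conditional on each draw, then average over the draws.
V = theta @ X.T                 # shape (R, 3)
P_sim = logit_probs(V).mean(axis=0)
print(P_sim)
```

Estimation then searches for the means and standard deviations that maximize the log of these simulated probabilities at the observed choices, which is why MXL takes noticeably longer to fit than MNL.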


### 4.Estimate AMXL

#### In AMXL, each agent (an individual or a group of individuals) has a unique set of parameters

<img src="image/AMXL.jpg" style="width:40%">

In [29]:
import AMXL_functions
import importlib
importlib.reload(AMXL_functions)
from AMXL_functions import solve_agent_commuting,One_iteration_AMXL

In [13]:
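# Number of alternatives per individual (averaged over individuals)
alter_num_c = int(Commuting_choice_ms.groupby('iid').agg({'hw_od':'count'}).mean().iloc[0])
n_individuals = Commuting_choice_ms['iid'].nunique()
np.random.seed(8521)
# One Gumbel(0, 1) disturbance per individual and alternative
epsilon_c = np.random.gumbel(0,1,n_individuals*alter_num_c).reshape(n_individuals,alter_num_c)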

print('Individual 1')
iid = 560
aa = Commuting_choice_ms[Commuting_choice_ms['iid']==iid]
variable,Z = solve_agent_commuting(aa,[0,0,0,0,0,0,0],epsilon_c,iid=iid,safe_boundary=0.5)
print(pd.DataFrame(variable[None,:],columns=varnames))
print('------------------')

print('Individual 2')
iid = 132
aa = Commuting_choice_ms[Commuting_choice_ms['iid']==iid]
variable,Z = solve_agent_commuting(aa,[0,0,0,0,0,0,0],epsilon_c,iid=iid,safe_boundary=0.5)
print(pd.DataFrame(variable[None,:],columns=varnames))

Individual 1
   t_commute  c_commute  M_commute2  SDE_work  SDL_work   PL_work  ln_dwork
0  -0.333369   0.401597   -0.635861  0.827855 -0.227997 -0.555283  0.298294
------------------
Individual 2
   t_commute  c_commute  M_commute2  SDE_work  SDL_work  PL_work  ln_dwork
0        0.0        0.0         0.0  4.283428       0.0      0.0  1.244082
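
The internals of `solve_agent_commuting` are not shown in this notebook. Assuming it follows the inverse-optimization idea of the cited Ren & Chow (2022) paper, where an agent's parameters must make the observed choice at least `safe_boundary` better in total utility than every other alternative, the feasibility condition can be sketched as below. The function `min_margin`, the toy attributes, and the exact form of the constraint are assumptions for illustration, not the code in `AMXL_functions`.

```python
import numpy as np

def min_margin(X, chosen, theta, eps):
    """Smallest utility gap between the chosen alternative and any other one.

    X: (J, K) attributes, chosen: index of the observed choice,
    theta: (K,) agent-specific parameters, eps: (J,) random disturbances.
    """
    U = X @ theta + eps                  # total utility of each alternative
    gaps = U[chosen] - np.delete(U, chosen)
    return gaps.min()

# Made-up agent with 3 alternatives and 2 attributes (illustrative values).
X = np.array([[0.1, 0.8],
              [0.5, 0.4],
              [0.9, 0.2]])
eps = np.zeros(3)                        # disturbances switched off for clarity
chosen = 0

safe_boundary = 0.5
theta_zero = np.zeros(2)                 # all-zero starting point
theta_fit = np.array([-1.0, 1.0])        # hypothetical candidate parameters

print(min_margin(X, chosen, theta_zero, eps) >= safe_boundary)
print(min_margin(X, chosen, theta_fit, eps) >= safe_boundary)
```

Under this reading, an all-zero parameter vector cannot rationalize the observed choice by the required margin, so the solver has to move away from the starting point. A solver that changes as few parameters as it needs to would be one possible explanation for the many zeros in the second individual's solution.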


### 5.Let's try a sample of 500 individuals and compare MNL, MXL, and AMXL

In [84]:
sample_size = 500
data_sample = Commuting_choice_ms.iloc[:sample_size*num_alt]

shuffle = range(1,26150)
theta_0 = [0,0,0,0,0,0,0]
start_time = time.time()
theta_0, theta_i, sb_c = One_iteration_AMXL(data_sample, shuffle, epsilon_c, theta_0, 
                                           sample_size=sample_size,bound=30,boundary_max=3,boundary_min=1,step=0.4)
end_time = time.time()
print('Estimation time of AMXL per iteration: %.1f seconds'%(end_time-start_time))

Estimation time of AMXL per iteration: 5.8 seconds


In [85]:
start_time = time.time()
MNL.fit(X=data_sample[varnames], y=data_sample['chosen'], varnames=varnames,
        ids=data_sample['iid'],alts=data_sample['alternative'])
end_time = time.time()
print('Estimation time of MNL: %.2f seconds'%(end_time-start_time))

Estimation time of MNL: 0.02 seconds


In [86]:
start_time = time.time()
MXL.fit(X=data_sample[varnames], y=data_sample['chosen'], varnames=varnames,
        ids=data_sample['iid'],alts=data_sample['alternative'],
        randvars={'t_commute':'n','c_commute':'n','M_commute2':'n','SDE_work':'n','SDL_work':'n','ln_dwork':'n'},
        n_draws=100)
end_time = time.time()
print('Estimation time of MXL: %.1f seconds'%(end_time-start_time))

Estimation time of MXL: 0.2 seconds


In [87]:
X = [Commuting_choice_ms[Commuting_choice_ms['iid']==iid][varnames].values for iid in range(1,sample_size+1)]
X = np.array(X)
X = np.transpose(X, (0, 2, 1)) # shape (sessions,attributes,alternatives)
Y = [Commuting_choice_ms[Commuting_choice_ms['iid']==iid]['chosen'].values for iid in range(1,sample_size+1)]
Y = np.array(Y)

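# Agent-specific systematic utility: V[i, j] = sum_k theta_i[i, k] * X[i, k, j]
V = (X * theta_i[:,:,None]).sum(axis=1)
# Subtract the row-wise maximum so np.exp cannot overflow (probabilities are unchanged)
V = V - V.max(axis=1)[:,None]
denom = np.exp(V).sum(axis=1).reshape(X.shape[0],1)
P = np.exp(V) / denom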
LL_0 = np.log(1/num_alt) * sample_size

LL_MNL = MNL.loglikelihood
print('McFadden R Square of MNL:',1-LL_MNL/LL_0)
LL_MXL = MXL.loglikelihood
print('McFadden R Square of MXL:',1-LL_MXL/LL_0)
LL_AMXL = -log_loss(Y, P, normalize=False)
print('McFadden R Square of AMXL:',(1 - LL_AMXL/LL_0))

McFadden R Square of MNL: 0.37663953031518227
McFadden R Square of MXL: 0.38226595860042545
McFadden R Square of AMXL: 0.7959300109727356
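
The AMXL log-likelihood above is obtained from the predicted probabilities `P` and the one-hot choices `Y`. The same quantity can be written directly from the log-likelihood formula, since only the chosen alternative contributes to the sum. A sketch on made-up probabilities (all values are illustrative):

```python
import numpy as np

# Made-up predicted probabilities for 3 choice observations x 4 alternatives.
P = np.array([[0.70, 0.10, 0.10, 0.10],
              [0.25, 0.25, 0.25, 0.25],
              [0.05, 0.05, 0.80, 0.10]])
# One-hot indicators of the observed choices.
Y = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]], dtype=bool)

n_obs, n_alt = P.shape

# LL = sum_i sum_j y_ij * ln P_ij: only the chosen alternative contributes.
LL = np.log(P[Y]).sum()
# Null model: every alternative equally likely.
LL_null = n_obs * np.log(1.0 / n_alt)
rho2 = 1.0 - LL / LL_null
print(LL, rho2)
```

Note that the second observation, predicted at exactly 1/J, adds the same amount to `LL` and `LL_null`, so it does nothing to raise the McFadden R Square.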


### 6.Overfitting Issues in AMXL

<img src="image/AMXL_out_of_sample_accuracy.jpg" style="width:100%">

## Example 2: Group-level AMXL for Statewide Mode Choice