# Conditional Logit: Choice-Specific Attributes

**Tutorial Series**: Discrete Choice Econometrics with PanelBox

**Notebook**: 05 - Conditional Logit

**Author**: PanelBox Contributors

**Date**: 2026-02-17

**Estimated Duration**: 90 minutes

**Difficulty Level**: Advanced

---

## Learning Objectives

By the end of this notebook, you will be able to:

1. Distinguish Conditional Logit from Binary and Multinomial Logit
2. Work with alternative-specific covariates ($Z_{ij}$) in "long" data format
3. Estimate Conditional Logit models using PanelBox
4. Calculate and interpret choice elasticities (own and cross)
5. Understand and test the IIA (Independence of Irrelevant Alternatives) property
6. Include individual-specific covariates via alternative interactions
7. Apply the model to transportation mode choice analysis

---

## Table of Contents

1. [Binary Logit vs Conditional Logit](#section1)
2. [Long Format Data Structure](#section2)
3. [Estimation with PanelBox](#section3)
4. [Predicted Choice Probabilities](#section4)
5. [Choice Elasticities](#section5)
6. [IIA Property and Testing](#section6)
7. [Individual-Specific Covariates](#section7)
8. [Application: Transportation Mode Choice](#section8)
9. [Exercises](#exercises)

## Setup

Import all required libraries and configure the environment.

In [None]:
# Standard library imports
import warnings
from pathlib import Path

# Data manipulation and numerical computing
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Statistical functions
from scipy.stats import norm, chi2
from scipy.special import logsumexp

# PanelBox models
from panelbox.models.discrete.multinomial import ConditionalLogit

# Configuration
warnings.filterwarnings('ignore')
np.random.seed(42)
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 4)

# Matplotlib configuration
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.labelsize'] = 12
plt.rcParams['axes.titlesize'] = 14
plt.rcParams['xtick.labelsize'] = 10
plt.rcParams['ytick.labelsize'] = 10
plt.rcParams['legend.fontsize'] = 10

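# Paths
DATA_DIR = Path("..") / "data"
OUTPUT_DIR = Path("..") / "outputs"
FIG_DIR = OUTPUT_DIR / "figures"
TABLE_DIR = OUTPUT_DIR / "tables"

# Create output directories so later savefig/to_csv calls do not fail
FIG_DIR.mkdir(parents=True, exist_ok=True)
TABLE_DIR.mkdir(parents=True, exist_ok=True)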

print("All libraries imported successfully")
print(f"Random seed set to: 42")
print(f"Working directory: {Path.cwd()}")

<a id='section1'></a>
## 1. Binary Logit vs Conditional Logit

### 1.1 Review: Binary and Multinomial Logit

In **Binary Logit** (Notebook 01) and **Multinomial Logit**, the covariates are **individual-specific**:

$$U_{ij} = X_i' \beta_j + \varepsilon_{ij}$$

Key features:
- Covariates $X_i$ (income, age, education) describe the **individual**, not the alternative
- Each alternative $j$ has its **own coefficient vector** $\beta_j$
- Number of parameters: $(J-1) \times K$, because one alternative serves as the base with its coefficients normalized to zero

### 1.2 The Need for Alternative-Specific Covariates

Consider transportation mode choice. What matters is not just *who* the person is, but the **attributes of each alternative**:

| Attribute | Car | Bus | Metro | Bike |
|-----------|-----|-----|-------|------|
| Cost (R\$) | 25 | 4.40 | 4.80 | 1.00 |
| Time (min) | 30 | 60 | 35 | 45 |
| Comfort | 4.5 | 2.5 | 3.5 | 2.0 |

These attributes vary **across alternatives for the same individual**. Standard Multinomial Logit cannot handle this directly.

### 1.3 Conditional Logit (McFadden, 1973)

**Conditional Logit** accommodates **alternative-specific** covariates $Z_{ij}$:

$$U_{ij} = Z_{ij}' \gamma + \varepsilon_{ij}$$

$$P(y_i = j) = \frac{\exp(Z_{ij}' \gamma)}{\sum_{k=1}^{J} \exp(Z_{ik}' \gamma)}$$

**Key difference**: $\gamma$ is a **single coefficient vector** common across all alternatives.

| Feature | Multinomial Logit | Conditional Logit |
|---------|------------------|------------------|
| Covariates | Individual-specific ($X_i$) | Alternative-specific ($Z_{ij}$) |
| Coefficients | Alternative-specific ($\beta_j$) | Common ($\gamma$) |
| Parameters | $(J-1) \times K$ | $K$ |
| Data format | Wide | Long |
| Example | Income affects mode choice | Cost of each mode affects choice |
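
To make the probability formula concrete, the sketch below evaluates it for the illustrative attribute table from Section 1.2. The values in `gamma` are made up for illustration and are not estimates from this notebook's data; the helper name `conditional_logit_probs` is likewise hypothetical.

```python
import numpy as np

# Attribute matrix Z for one individual: rows = alternatives, columns = attributes.
# Values come from the illustrative table in Section 1.2.
alternatives = ["car", "bus", "metro", "bike"]
Z = np.array([
    [25.00, 30.0, 4.5],   # car:   cost, time, comfort
    [ 4.40, 60.0, 2.5],   # bus
    [ 4.80, 35.0, 3.5],   # metro
    [ 1.00, 45.0, 2.0],   # bike
])

# Hypothetical common coefficient vector (assumed values, for illustration only)
gamma = np.array([-0.05, -0.03, 0.40])

def conditional_logit_probs(Z, gamma):
    """Return P(j) = exp(Z_j' gamma) / sum_k exp(Z_k' gamma) for one choice occasion."""
    v = Z @ gamma          # systematic utility of each alternative
    v = v - v.max()        # subtract the max for numerical stability
    expv = np.exp(v)
    return expv / expv.sum()

probs = conditional_logit_probs(Z, gamma)
for alt, p in zip(alternatives, probs):
    print(f"{alt:6s} P = {p:.3f}")
```

Note that a single `gamma` is applied to every row of `Z`; in Multinomial Logit each alternative would instead carry its own coefficient vector.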

### 1.4 Load the Data

In [None]:
# Load transportation mode choice data (long format)
data = pd.read_csv(DATA_DIR / "transportation_choice.csv")

print("Dataset loaded successfully!")
print(f"\nShape: {data.shape}")
print(f"Number of individuals: {data['id'].nunique()}")
print(f"Number of periods: {data['year'].nunique()}")
print(f"Number of alternatives: {data['mode'].nunique()}")
print(f"Modes: {data['mode'].unique()}")
print(f"\nRows per choice occasion: {data.shape[0] / (data['id'].nunique() * data['year'].nunique()):.0f}")
print(f"\nFirst 12 rows (3 choice occasions):")
data.head(12)

In [None]:
# Verify long format structure: exactly one choice=1 per (id, year)
choice_sums = data.groupby(['id', 'year'])['choice'].sum()
assert choice_sums.eq(1).all(), "Data integrity violated: not exactly one choice per occasion!"
print("Data integrity check PASSED")
print(f"  - {choice_sums.shape[0]} choice occasions")
print(f"  - Each occasion has exactly one chosen alternative")

# Choice distribution
print(f"\nChoice distribution:")
choice_shares = data[data['choice'] == 1]['mode'].value_counts(normalize=True).sort_index()
print(choice_shares)

<a id='section2'></a>
## 2. Long Format Data Structure

### 2.1 Understanding the Long Format

Conditional Logit requires data in **long format** where each row represents one (individual $\times$ time $\times$ alternative) combination:

```
id  year  mode   choice  cost   time   ...
1   1     car    1       25.3   30.2   ...
1   1     bus    0       4.40   58.5   ...
1   1     metro  0       4.80   35.1   ...
1   1     bike   0       1.00   42.0   ...
1   2     car    0       27.1   35.0   ...
1   2     bus    1       4.40   55.0   ...
...
```

- **N individuals** $\times$ **T periods** $\times$ **J alternatives** = total rows (for a balanced panel with a fixed choice set)
- `choice = 1` for the chosen alternative, `0` for all others
- Alternative-specific variables (`cost`, `time`) vary across rows within a choice occasion
- Individual-specific variables (`income`, `distance`) are constant within a choice occasion
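
The properties listed above can be checked programmatically before estimation. The following is a minimal sketch on a hypothetical toy frame (the values are invented, and only `cost` and `income` are included), not the notebook's dataset:

```python
import pandas as pd

# Hypothetical toy data: 1 individual, 2 periods, 4 alternatives -> 8 rows
toy = pd.DataFrame({
    "id":     [1] * 8,
    "year":   [1, 1, 1, 1, 2, 2, 2, 2],
    "mode":   ["car", "bus", "metro", "bike"] * 2,
    "choice": [1, 0, 0, 0, 0, 1, 0, 0],
    "cost":   [25.3, 4.4, 4.8, 1.0, 27.1, 4.4, 4.8, 1.0],
    "income": [5000.0] * 8,
})

occasion = toy.groupby(["id", "year"])

# Exactly one chosen alternative per choice occasion
one_choice = occasion["choice"].sum().eq(1).all()

# Alternative-specific variables take several values within an occasion...
cost_varies = occasion["cost"].nunique().gt(1).all()

# ...while individual-specific variables take exactly one
income_constant = occasion["income"].nunique().eq(1).all()

print(one_choice, cost_varies, income_constant)
```

A variable that fails the `nunique` check in the expected direction is a sign that it has been misclassified or that the reshape went wrong.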

### 2.2 Variable Types

In [None]:
# Demonstrate variable types
example = data[(data['id'] == 1) & (data['year'] == 1)]
print("=== Example: Individual 1, Year 1 ===")
print(example.to_string(index=False))

print("\n=== Variable Classification ===")
print("\nAlternative-specific (vary across modes for same person):")
for var in ['cost', 'time', 'reliability', 'comfort']:
    vals = example[var].values
    print(f"  {var:12s}: {vals}  (varies!)")

print("\nIndividual-specific (constant across modes for same person):")
for var in ['income', 'distance']:
    vals = example[var].values
    print(f"  {var:12s}: {vals}  (constant)")

### 2.3 Descriptive Statistics by Alternative

In [None]:
# Summary statistics by transport mode
alt_stats = data.groupby('mode')[['cost', 'time', 'reliability', 'comfort']].agg(
    ['mean', 'std', 'min', 'max']
).round(2)

print("=== Attribute Statistics by Transport Mode ===")
print(alt_stats)

# Simpler summary
print("\n=== Mean Attributes by Mode ===")
mean_attrs = data.groupby('mode')[['cost', 'time', 'reliability', 'comfort']].mean().round(2)
print(mean_attrs)

In [None]:
# Visualize attribute distributions by mode
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

variables = ['cost', 'time', 'reliability', 'comfort']
colors = {'car': '#e74c3c', 'bus': '#3498db', 'metro': '#2ecc71', 'bike': '#f39c12'}
titles = ['Cost (R$)', 'Travel Time (minutes)', 'Reliability (1-5)', 'Comfort (1-5)']

for ax, var, title in zip(axes.flatten(), variables, titles):
    mode_data = [data[data['mode'] == m][var] for m in ['car', 'bus', 'metro', 'bike']]
    bp = ax.boxplot(mode_data, labels=['Car', 'Bus', 'Metro', 'Bike'],
                    patch_artist=True, notch=True)
    
    for patch, mode in zip(bp['boxes'], ['car', 'bus', 'metro', 'bike']):
        patch.set_facecolor(colors[mode])
        patch.set_alpha(0.6)
    
    ax.set_title(title, fontweight='bold')
    ax.set_ylabel(var.capitalize())
    ax.grid(True, alpha=0.3)

plt.suptitle('Attribute Distributions by Transport Mode', fontsize=16, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig(FIG_DIR / '05_attribute_distributions.png', dpi=150, bbox_inches='tight')
plt.show()

print("Figure saved to outputs/figures/05_attribute_distributions.png")

### 2.4 Wide-to-Long Transformation

If your data starts in **wide format** (one row per choice occasion with columns like `cost_car`, `cost_bus`, etc.), you need to reshape it to long format.
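
One vectorized option for this reshape is `pd.wide_to_long`. The sketch below uses a hypothetical two-row wide frame with only two modes; the column names follow the `cost_car` / `cost_bus` pattern described above, and `long_df` is an illustrative name.

```python
import pandas as pd

# Hypothetical wide data: one row per choice occasion
wide = pd.DataFrame({
    "id": [1, 1],
    "year": [1, 2],
    "chosen_mode": ["car", "bus"],
    "cost_car": [25.3, 27.1], "cost_bus": [4.4, 4.4],
    "time_car": [30.2, 35.0], "time_bus": [58.5, 55.0],
})

# Stub names 'cost' and 'time' become columns; the suffix after '_' becomes 'mode'
long_df = pd.wide_to_long(
    wide,
    stubnames=["cost", "time"],
    i=["id", "year"],
    j="mode",
    sep="_",
    suffix=r"\w+",   # the default suffix only matches digits
).reset_index()

# Build the binary choice indicator from the chosen alternative
long_df["choice"] = (long_df["mode"] == long_df["chosen_mode"]).astype(int)
long_df = long_df.sort_values(["id", "year", "mode"]).reset_index(drop=True)

print(long_df[["id", "year", "mode", "choice", "cost", "time"]])
```

The `suffix` argument matters here: without it, string suffixes such as `car` and `bus` would not be matched.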

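The cell below converts a sample of the long data to wide format, then shows the reverse (wide-to-long) reshape: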

In [None]:
# Example: Convert our long data to wide and back (demonstrating both directions)

# Step 1: Long -> Wide (for illustration)
sample = data[data['id'] <= 3].copy()

# Pivot alternative-specific variables
wide_cost = sample.pivot_table(index=['id', 'year'], columns='mode', values='cost')
wide_cost.columns = [f'cost_{m}' for m in wide_cost.columns]

wide_time = sample.pivot_table(index=['id', 'year'], columns='mode', values='time')
wide_time.columns = [f'time_{m}' for m in wide_time.columns]

# Add choice column
wide_choice = sample[sample['choice'] == 1][['id', 'year', 'mode']].rename(columns={'mode': 'chosen_mode'})

# Combine
wide = wide_cost.join(wide_time).reset_index()
wide = wide.merge(wide_choice, on=['id', 'year'])

print("=== Wide Format (one row per choice occasion) ===")
print(wide.head())

# Step 2: Wide -> Long (common task)
print("\n=== Transformation: Wide -> Long ===")
print("Use pd.melt() or manual reshaping:")
print("""\n
long_rows = []
for _, row in wide.iterrows():
    for mode in ['car', 'bus', 'metro', 'bike']:
        long_rows.append({
            'id': row['id'],
            'year': row['year'],
            'mode': mode,
            'choice': 1 if mode == row['chosen_mode'] else 0,
            'cost': row[f'cost_{mode}'],
            'time': row[f'time_{mode}']
        })
long = pd.DataFrame(long_rows)
""")

<a id='section3'></a>
## 3. Estimation with PanelBox

### 3.1 Model Specification

The Conditional Logit model in PanelBox requires:
- `data`: DataFrame in long format
- `choice_col`: Column identifying each choice occasion
- `alt_col`: Column identifying alternatives
- `chosen_col`: Binary indicator (1 if chosen)
- `alt_varying_vars`: List of alternative-specific variable names

### 3.2 Prepare Data

In [None]:
# Create a unique choice occasion identifier
data['choice_id'] = data['id'].astype(str) + '_' + data['year'].astype(str)

print(f"Number of unique choice occasions: {data['choice_id'].nunique()}")
print(f"Number of alternatives per occasion: {data['mode'].nunique()}")
print(f"Total rows: {len(data)}")
print(f"\nVerification: {data['choice_id'].nunique()} x {data['mode'].nunique()} = {data['choice_id'].nunique() * data['mode'].nunique()} rows")

### 3.3 Estimate Base Model

In [None]:
# Estimate Conditional Logit with alternative-specific attributes
model = ConditionalLogit(
    data=data,
    choice_col='choice_id',
    alt_col='mode',
    chosen_col='choice',
    alt_varying_vars=['cost', 'time', 'reliability', 'comfort']
)

results = model.fit()

print("=" * 70)
print(" " * 15 + "CONDITIONAL LOGIT: BASE MODEL")
print("=" * 70)
print(results.summary())

### 3.4 Coefficient Interpretation

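Unlike Multinomial Logit, whose coefficients are measured relative to a base alternative, Conditional Logit coefficients have a **direct interpretation** as marginal utilities of each attribute: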

- $\gamma$ is **common across all alternatives** (one coefficient per attribute)
- Signs directly indicate the effect on utility:
  - $\gamma_{\text{cost}} < 0$: Higher cost reduces utility (as expected)
  - $\gamma_{\text{time}} < 0$: Longer travel time reduces utility
  - $\gamma_{\text{comfort}} > 0$: More comfort increases utility
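
Magnitudes can be read through odds: changing attribute $k$ of alternative $j$ by $\Delta Z$ multiplies the odds of $j$ against any other alternative by $\exp(\gamma_k \Delta Z)$. The sketch below checks this numerically; the coefficient and utilities are assumed values, not estimates from the model above.

```python
import numpy as np

def probs(v):
    """Conditional logit probabilities from a vector of systematic utilities."""
    e = np.exp(v - v.max())
    return e / e.sum()

gamma_cost = -0.05                     # hypothetical cost coefficient
v = np.array([0.2, -0.1, 0.4, 0.0])    # hypothetical utilities: car, bus, metro, bike
delta_cost = 2.0                       # raise the cost of alternative 0 (car) by R$2

p_before = probs(v)
v_after = v.copy()
v_after[0] += gamma_cost * delta_cost  # only car's utility changes
p_after = probs(v_after)

# Odds of car vs. bus before and after the cost increase
odds_before = p_before[0] / p_before[1]
odds_after = p_after[0] / p_after[1]

print(f"Odds ratio (after / before):  {odds_after / odds_before:.4f}")
print(f"exp(gamma_cost * delta_cost): {np.exp(gamma_cost * delta_cost):.4f}")
```

The two printed numbers coincide, and the same multiplier applies to the odds of car against metro or bike.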

In [None]:
print("=== Coefficient Interpretation ===")
print("\nConditional Logit: coefficients are COMMON across alternatives")
print("(unlike Multinomial Logit where each alternative has separate coefficients)\n")

var_names = model.alt_varying_vars
for i, var in enumerate(var_names):
    coef = results.params[i]
    se = results.bse[i]
    z = coef / se
    print(f"{var:12s}: coef = {coef:8.4f}  (SE = {se:.4f}, z = {z:.2f})")
    if coef < 0:
        print(f"             -> Higher {var} DECREASES utility (probability of choice)")
    else:
        print(f"             -> Higher {var} INCREASES utility (probability of choice)")
    print()

print("\nEconomic consistency check:")
print(f"  cost < 0:        {'PASS' if results.params[0] < 0 else 'FAIL'}  (higher cost -> lower choice prob)")
print(f"  time < 0:        {'PASS' if results.params[1] < 0 else 'FAIL'}  (more time -> lower choice prob)")
print(f"  reliability > 0: {'PASS' if results.params[2] > 0 else 'FAIL'}  (better reliability -> higher choice prob)")
print(f"  comfort > 0:     {'PASS' if results.params[3] > 0 else 'FAIL'}  (more comfort -> higher choice prob)")

In [None]:
# Value of Time (VOT) calculation
# VOT = gamma_time / gamma_cost (in R$/minute)
gamma_cost = results.params[0]  # cost coefficient
gamma_time = results.params[1]  # time coefficient

vot = gamma_time / gamma_cost  # R$ per minute
vot_hourly = vot * 60  # R$ per hour

print("=== Value of Time (VOT) ===")
print(f"\nVOT = gamma_time / gamma_cost = {gamma_time:.4f} / {gamma_cost:.4f}")
print(f"    = {vot:.2f} R$/minute")
print(f"    = {vot_hourly:.2f} R$/hour")
print(f"\nInterpretation: Individuals are willing to pay R${vot:.2f} per minute")
print(f"(or R${vot_hourly:.2f} per hour) to save travel time.")
print(f"\nThis is a key metric for transportation policy evaluation.")

<a id='section4'></a>
## 4. Predicted Choice Probabilities

### 4.1 Generate Predictions

In [None]:
# Predicted choice probabilities
pred_probs = results.predict_proba()

print(f"Predicted probabilities shape: {pred_probs.shape}")
print(f"  {pred_probs.shape[0]} choice occasions x {pred_probs.shape[1]} alternatives")

# Map alternatives to column names
alternatives = sorted(data['mode'].unique())
pred_df = pd.DataFrame(pred_probs, columns=alternatives)

print(f"\nAlternatives (column order): {alternatives}")
print(f"\nFirst 5 predicted probability vectors:")
print(pred_df.head())

# Verify probabilities sum to 1
print(f"\nSum of probabilities (should all = 1.0):")
print(f"  Min: {pred_probs.sum(axis=1).min():.6f}")
print(f"  Max: {pred_probs.sum(axis=1).max():.6f}")

### 4.2 Observed vs Predicted Choice Shares

In [None]:
# Compare observed vs predicted choice shares
observed_shares = data[data['choice'] == 1]['mode'].value_counts(normalize=True).sort_index()
predicted_shares = pred_df.mean()

comparison = pd.DataFrame({
    'Observed': observed_shares,
    'Predicted': predicted_shares,
    'Difference': predicted_shares - observed_shares
})

print("=== Observed vs Predicted Choice Shares ===")
print(comparison.round(4))
print(f"\nMean Absolute Difference: {comparison['Difference'].abs().mean():.4f}")

In [None]:
# Visualize observed vs predicted shares
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Bar chart comparison
x = np.arange(len(alternatives))
width = 0.35

bars1 = axes[0].bar(x - width/2, observed_shares.values, width, label='Observed',
                     color='#3498db', alpha=0.8, edgecolor='black')
bars2 = axes[0].bar(x + width/2, predicted_shares.values, width, label='Predicted',
                     color='#e74c3c', alpha=0.8, edgecolor='black')

axes[0].set_xlabel('Transport Mode')
axes[0].set_ylabel('Choice Share')
axes[0].set_title('Observed vs Predicted Choice Shares', fontweight='bold')
axes[0].set_xticks(x)
axes[0].set_xticklabels([m.capitalize() for m in alternatives])
axes[0].legend()
axes[0].grid(True, alpha=0.3, axis='y')

# Add value labels
for bar in bars1:
    h = bar.get_height()
    axes[0].text(bar.get_x() + bar.get_width()/2, h + 0.005, f'{h:.3f}',
                ha='center', va='bottom', fontsize=9)
for bar in bars2:
    h = bar.get_height()
    axes[0].text(bar.get_x() + bar.get_width()/2, h + 0.005, f'{h:.3f}',
                ha='center', va='bottom', fontsize=9)

# Distribution of predicted probabilities
for mode in alternatives:
    axes[1].hist(pred_df[mode], bins=40, alpha=0.5, label=mode.capitalize())

axes[1].set_xlabel('Predicted Probability')
axes[1].set_ylabel('Frequency')
axes[1].set_title('Distribution of Predicted Probabilities', fontweight='bold')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(FIG_DIR / '05_choice_shares.png', dpi=150, bbox_inches='tight')
plt.show()

print("Figure saved to outputs/figures/05_choice_shares.png")

### 4.3 Classification Accuracy

In [None]:
# Classification: predict most likely alternative
y_pred = np.argmax(pred_probs, axis=1)
y_true = np.array(model.y_list)

accuracy = np.mean(y_pred == y_true)
print(f"=== Classification Performance ===")
print(f"Overall accuracy: {accuracy:.4f} ({accuracy*100:.1f}%)")
print(f"Random baseline (1/J): {1/len(alternatives):.4f} ({100/len(alternatives):.1f}%)")
print(f"Improvement over random: {accuracy - 1/len(alternatives):.4f}")

# Confusion matrix
n_alts = len(alternatives)
conf_matrix = np.zeros((n_alts, n_alts), dtype=int)
for true, pred in zip(y_true, y_pred):
    conf_matrix[true, pred] += 1

conf_df = pd.DataFrame(conf_matrix, index=alternatives, columns=alternatives)
print(f"\nConfusion Matrix (rows=actual, columns=predicted):")
print(conf_df)

<a id='section5'></a>
## 5. Choice Elasticities

### 5.1 Theory

Elasticities measure the **percentage change in choice probability** resulting from a **1% change in an attribute**.

**Own elasticity** (effect of changing attribute $k$ of alternative $j$ on $P(j)$):

$$\eta_{jk}^{\text{own}} = \gamma_k \cdot Z_{jk} \cdot (1 - P_j)$$

**Cross elasticity** (effect of changing attribute $k$ of alternative $j$ on $P(m)$, $m \neq j$):

$$\eta_{mj,k}^{\text{cross}} = -\gamma_k \cdot Z_{jk} \cdot P_j$$

**Key IIA implication**: Cross elasticities are the same for ALL other alternatives $m \neq j$.
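
As a sanity check on these formulas, the analytical elasticities can be compared with numerical derivatives of $\ln P$ with respect to $\ln Z_{jk}$. The sketch below does this for one hypothetical choice occasion; the attribute values and coefficients are assumptions chosen for illustration.

```python
import numpy as np

def probs(Z, gamma):
    """Conditional logit probabilities for one choice occasion."""
    v = Z @ gamma
    e = np.exp(v - v.max())
    return e / e.sum()

# Hypothetical attributes (cost, time) for 3 alternatives and assumed coefficients
Z = np.array([[25.0, 30.0],
              [ 4.4, 60.0],
              [ 4.8, 35.0]])
gamma = np.array([-0.05, -0.03])

j, k = 1, 0            # perturb the cost (k=0) of alternative j=1
P = probs(Z, gamma)

# Analytical elasticities from the formulas above
own_analytic = gamma[k] * Z[j, k] * (1 - P[j])
cross_analytic = -gamma[k] * Z[j, k] * P[j]

# Numerical elasticities: d ln P / d ln Z via a small relative change in Z_jk
h = 1e-6
Z_up = Z.copy()
Z_up[j, k] *= (1 + h)
P_up = probs(Z_up, gamma)
numeric = (np.log(P_up) - np.log(P)) / np.log(1 + h)

print("own:  ", own_analytic, numeric[j])
print("cross:", cross_analytic, numeric[[0, 2]])
```

The numerical cross elasticities for alternatives 0 and 2 agree with each other as well as with the formula, which is the IIA implication stated above.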

### 5.2 Calculate Elasticities

In [None]:
# Compute average elasticities across all choice occasions
def compute_elasticities(results, data, alt_col, var_name, var_idx):
    """
    Compute own and cross elasticities for a given variable.
    
    Returns average elasticities across all choice occasions.
    """
    gamma_k = results.params[var_idx]
    probs = results.predict_proba()
    alts = sorted(data[alt_col].unique())
    n_alts = len(alts)
    n_choices = probs.shape[0]
    
    # Get variable values for each choice occasion
    # Reshape data to get Z values per (choice, alt)
    own_elasticities = np.zeros(n_alts)
    cross_elasticities = np.zeros((n_alts, n_alts))  # [i, j] = effect on P(i) of changing Z_j
    
    for idx_c in range(n_choices):
        X = results.model.X_list[idx_c]
        P = probs[idx_c]
        
        for j in range(n_alts):
            Z_jk = X[j, var_idx]
            # Own elasticity for alternative j
            own_elasticities[j] += gamma_k * Z_jk * (1 - P[j])
            # Cross elasticity: effect on P(m) of changing Z_j
            for m in range(n_alts):
                if m != j:
                    cross_elasticities[m, j] += -gamma_k * Z_jk * P[j]
    
    # Average over choice occasions
    own_elasticities /= n_choices
    cross_elasticities /= n_choices
    
    return own_elasticities, cross_elasticities, alts


# Compute elasticities for cost
own_cost, cross_cost, alts = compute_elasticities(results, data, 'mode', 'cost', 0)

print("=== Cost Elasticities ===")
print("\nOwn elasticities (diagonal):")
for j, alt in enumerate(alts):
    print(f"  {alt:6s}: {own_cost[j]:.4f}  (1% cost increase -> {own_cost[j]:.2f}% change in P({alt}))")

print("\nCross elasticities (off-diagonal):")
elast_df = pd.DataFrame(cross_cost, index=alts, columns=[f'{a} cost' for a in alts])
# Fill diagonal with own elasticities
for j, alt in enumerate(alts):
    elast_df.iloc[j, j] = own_cost[j]
print(elast_df.round(4))

In [None]:
# Also compute time elasticities
own_time, cross_time, _ = compute_elasticities(results, data, 'mode', 'time', 1)

# Build full elasticity matrix for both variables
print("=== Time Elasticities ===")
print("\nOwn elasticities:")
for j, alt in enumerate(alts):
    print(f"  {alt:6s}: {own_time[j]:.4f}")

# Save elasticity table
elast_table = pd.DataFrame({
    'Mode': alts,
    'Own_Cost_Elasticity': own_cost,
    'Own_Time_Elasticity': own_time,
})
elast_table.to_csv(TABLE_DIR / '05_elasticities.csv', index=False)
print(f"\nElasticity table saved to outputs/tables/05_elasticities.csv")

In [None]:
# Visualize cross-elasticity heatmap
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Cost elasticity heatmap
cost_matrix = pd.DataFrame(cross_cost, index=alts, columns=alts)
for j in range(len(alts)):
    cost_matrix.iloc[j, j] = own_cost[j]

sns.heatmap(cost_matrix, annot=True, fmt='.3f', cmap='RdBu_r', center=0,
            ax=axes[0], square=True, linewidths=1)
axes[0].set_title('Cost Elasticity Matrix', fontweight='bold')
axes[0].set_xlabel('Mode whose cost changes')
axes[0].set_ylabel('Mode whose P changes')

# Time elasticity heatmap
time_matrix = pd.DataFrame(cross_time, index=alts, columns=alts)
for j in range(len(alts)):
    time_matrix.iloc[j, j] = own_time[j]

sns.heatmap(time_matrix, annot=True, fmt='.3f', cmap='RdBu_r', center=0,
            ax=axes[1], square=True, linewidths=1)
axes[1].set_title('Time Elasticity Matrix', fontweight='bold')
axes[1].set_xlabel('Mode whose time changes')
axes[1].set_ylabel('Mode whose P changes')

plt.suptitle('Cross-Elasticity Heatmaps', fontsize=15, fontweight='bold', y=1.02)
plt.tight_layout()
plt.savefig(FIG_DIR / '05_elasticity_heatmap.png', dpi=150, bbox_inches='tight')
plt.show()

print("Figure saved to outputs/figures/05_elasticity_heatmap.png")

print("\nKey observations:")
print("  - Diagonal (own elasticities) are NEGATIVE: higher own cost -> lower own P")
print("  - Off-diagonal (cross elasticities) are POSITIVE: competitor's cost up -> own P up")
print("  - IIA property: cross elasticities in each column are EQUAL")
print("    (all other modes gain proportionally when one mode's cost increases)")

### 5.3 Policy Simulation: Bus Cost Reduction

**Scenario**: What happens if bus fares are reduced by 20%?

In [None]:
# Policy simulation: Reduce bus cost by 20%
data_policy = data.copy()
data_policy.loc[data_policy['mode'] == 'bus', 'cost'] *= 0.80

# Re-create design matrices with modified data
data_policy['choice_id'] = data_policy['id'].astype(str) + '_' + data_policy['year'].astype(str)

model_policy = ConditionalLogit(
    data=data_policy,
    choice_col='choice_id',
    alt_col='mode',
    chosen_col='choice',
    alt_varying_vars=['cost', 'time', 'reliability', 'comfort']
)

# Predict using ORIGINAL estimated parameters
policy_probs = np.zeros((model_policy.n_choices, model_policy.n_alts))
for i, X in enumerate(model_policy.X_list):
    utilities = X @ results.params
    exp_u = np.exp(utilities - utilities.max())
    policy_probs[i] = exp_u / exp_u.sum()

# Compare shares before and after
original_shares = pred_df.mean()
policy_shares = pd.Series(policy_probs.mean(axis=0), index=alternatives)

policy_comparison = pd.DataFrame({
    'Before': original_shares,
    'After (bus -20%)': policy_shares,
    'Change (pp)': (policy_shares - original_shares) * 100,
    'Change (%)': ((policy_shares - original_shares) / original_shares) * 100
})

print("=== Policy Simulation: 20% Bus Cost Reduction ===")
print(policy_comparison.round(4))

In [None]:
# Visualize policy simulation
fig, ax = plt.subplots(figsize=(10, 6))

x = np.arange(len(alternatives))
width = 0.35

bars1 = ax.bar(x - width/2, original_shares.values * 100, width,
               label='Current', color='#3498db', alpha=0.8, edgecolor='black')
bars2 = ax.bar(x + width/2, policy_shares.values * 100, width,
               label='After 20% bus cost reduction', color='#e74c3c', alpha=0.8, edgecolor='black')

ax.set_xlabel('Transport Mode')
ax.set_ylabel('Choice Share (%)')
ax.set_title('Policy Simulation: Effect of 20% Bus Cost Reduction\non Modal Choice Shares',
             fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels([m.capitalize() for m in alternatives])
ax.legend()
ax.grid(True, alpha=0.3, axis='y')

# Add value labels
for bar in bars1:
    h = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2, h + 0.3, f'{h:.1f}%',
            ha='center', va='bottom', fontsize=9)
for bar in bars2:
    h = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2, h + 0.3, f'{h:.1f}%',
            ha='center', va='bottom', fontsize=9)

plt.tight_layout()
plt.savefig(FIG_DIR / '05_policy_simulation.png', dpi=150, bbox_inches='tight')
plt.show()

print("Figure saved to outputs/figures/05_policy_simulation.png")

<a id='section6'></a>
## 6. IIA Property and Testing

### 6.1 Independence of Irrelevant Alternatives

The **IIA property** states that the ratio of probabilities between any two alternatives is independent of the other alternatives:

$$\frac{P(y_i = j)}{P(y_i = k)} = \frac{\exp(Z_{ij}' \gamma)}{\exp(Z_{ik}' \gamma)}$$

This ratio does **not depend** on the attributes, or even the presence, of any other alternative.

### 6.2 The Blue Bus / Red Bus Problem

Classic example (McFadden, 1973):

- Initial choice set: {Car, Blue Bus} with equal probabilities (0.5, 0.5)
- Add Red Bus (identical to Blue Bus)
- **IIA prediction**: P(Car) = 1/3, P(Blue Bus) = 1/3, P(Red Bus) = 1/3
- **Realistic outcome**: P(Car) = 0.5, P(Blue Bus) = 0.25, P(Red Bus) = 0.25

IIA implies **proportional substitution**: a new alternative draws from all existing alternatives in proportion to their current shares, which may be unrealistic when the new alternative is a close substitute for only some of them.
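
The IIA prediction in this example follows mechanically from the logit formula. A minimal sketch, assuming all alternatives have the same systematic utility (set to zero here for convenience):

```python
import numpy as np

def logit_probs(v):
    """Logit choice probabilities from a vector of systematic utilities."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

# Equal systematic utilities, as in the example
p_two = logit_probs(np.array([0.0, 0.0]))          # {car, blue bus}
p_three = logit_probs(np.array([0.0, 0.0, 0.0]))   # {car, blue bus, red bus}

print("Before:", p_two)     # car and blue bus split the market evenly
print("After: ", p_three)   # each alternative gets one third

# IIA: the car / blue-bus probability ratio is unchanged by the new alternative
print(p_two[0] / p_two[1], p_three[0] / p_three[1])
```

The car share falls from 1/2 to 1/3 even though the red bus is only a relabeled blue bus, which is the unrealistic part of the prediction.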

### 6.3 Hausman-McFadden Test

**Idea**: If IIA holds, removing an alternative should not significantly change the estimated coefficients.

**Procedure**:
1. Estimate full model with all alternatives
2. Remove one alternative and re-estimate
3. Compare coefficients using Hausman test statistic
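
Step 3 can be isolated in a small helper. The sketch below is one way to write it, assuming the coefficient vectors and covariance matrices are already available as NumPy arrays; the function name `hausman_mcfadden` and the numbers used to exercise it are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def hausman_mcfadden(beta_r, V_r, beta_f, V_f):
    """Hausman-McFadden statistic for coefficients common to both models.

    beta_r, V_r: estimates and covariance from the restricted choice set
    beta_f, V_f: estimates and covariance from the full choice set
    """
    diff = np.asarray(beta_r) - np.asarray(beta_f)
    V_diff = np.asarray(V_r) - np.asarray(V_f)
    H = float(diff @ np.linalg.solve(V_diff, diff))
    df = diff.size
    # In finite samples V_r - V_f need not be positive definite, so H can be
    # negative; a negative H is usually read as no evidence against IIA.
    p_value = chi2.sf(max(H, 0.0), df)
    return H, df, p_value

# Made-up numbers purely to exercise the function
beta_f = np.array([-0.050, -0.030])
beta_r = np.array([-0.055, -0.028])
V_f = np.diag([1.0e-5, 4.0e-6])
V_r = np.diag([3.0e-5, 9.0e-6])

H, df, p = hausman_mcfadden(beta_r, V_r, beta_f, V_f)
print(f"H = {H:.3f}, df = {df}, p = {p:.3f}")
```

Using `np.linalg.solve` avoids forming the explicit inverse; it still raises `LinAlgError` if the covariance difference is singular.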

In [None]:
# Hausman-McFadden IIA Test
print("=== Hausman-McFadden IIA Test ===")
print("\nH0: IIA holds (coefficients are stable when an alternative is removed)")
print("H1: IIA is violated\n")

# Full model coefficients and covariance
beta_full = results.params[:model.n_alt_varying]  # Only alt-varying params
vcov_full = results.vcov[:model.n_alt_varying, :model.n_alt_varying]

# Test by removing each alternative
iia_results = []

for mode_to_remove in alternatives:
    # Restrict data: remove one alternative
    data_restricted = data[data['mode'] != mode_to_remove].copy()
    
    # Remove choice occasions where the removed mode was chosen
    # (those observations are lost)
    chosen_removed = data_restricted.groupby('choice_id')['choice'].sum()
    valid_choices = chosen_removed[chosen_removed == 1].index
    data_restricted = data_restricted[data_restricted['choice_id'].isin(valid_choices)]
    
    if len(data_restricted) == 0:
        continue
    
    # Estimate restricted model
    model_r = ConditionalLogit(
        data=data_restricted,
        choice_col='choice_id',
        alt_col='mode',
        chosen_col='choice',
        alt_varying_vars=['cost', 'time', 'reliability', 'comfort']
    )
    results_r = model_r.fit()
    
    beta_r = results_r.params[:model_r.n_alt_varying]
    vcov_r = results_r.vcov[:model_r.n_alt_varying, :model_r.n_alt_varying]
    
    # Hausman test statistic: H = (beta_r - beta_f)' * (V_r - V_f)^{-1} * (beta_r - beta_f)
    diff = beta_r - beta_full
    vcov_diff = vcov_r - vcov_full
    
    try:
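        H = diff @ np.linalg.inv(vcov_diff) @ diff
        df = len(diff)
        # H can be negative when (V_r - V_f) is not positive definite; treat that
        # as no evidence against IIA rather than taking abs(H)
        p_value = chi2.sf(max(H, 0.0), df)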
        
        iia_results.append({
            'Removed': mode_to_remove,
            'N_choices': model_r.n_choices,
            'H_statistic': H,
            'df': df,
            'p_value': p_value,
            'Conclusion': 'Fail to reject IIA' if p_value > 0.05 else 'Reject IIA'
        })
    except np.linalg.LinAlgError:
        iia_results.append({
            'Removed': mode_to_remove,
            'N_choices': model_r.n_choices,
            'H_statistic': np.nan,
            'df': len(diff),
            'p_value': np.nan,
            'Conclusion': 'Could not compute (singular matrix)'
        })

iia_df = pd.DataFrame(iia_results)
print(iia_df.to_string(index=False))

print("\nInterpretation:")
print("  - p > 0.05: No evidence against IIA (coefficients stable)")
print("  - p < 0.05: Evidence against IIA (coefficients change significantly)")
print("\nNote: If IIA is rejected, consider Nested Logit or Mixed Logit models.")

### 6.4 Discussion

**If IIA is not rejected**: Conditional Logit is a reasonable choice, and the proportional substitution assumption is plausible for this choice set.

**If IIA is rejected**: Consider:
- **Nested Logit**: Groups similar alternatives into nests (e.g., {Car} vs {Bus, Metro} vs {Bike})
- **Mixed Logit**: Allows random coefficients, relaxing IIA
- **Probit**: Flexible correlation structure but computationally expensive

In transportation, bus and metro are often close substitutes, while car and bike are more distinct. When similar modes share unobserved attributes, their error terms are correlated, which violates IIA.

<a id='section7'></a>
## 7. Individual-Specific Covariates

### 7.1 The Problem

Variables like **income** and **distance** are individual-specific — they don't vary across alternatives.

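In Conditional Logit, these cannot enter directly with a common coefficient. Suppose income entered as

$$U_{ij} = Z_{ij}' \gamma + \delta \cdot \text{income}_i + \varepsilon_{ij}$$

Then $\exp(\delta \cdot \text{income}_i)$ appears in the numerator and in every term of the denominator, and cancels:

$$P(y_i = j) = \frac{\exp(Z_{ij}' \gamma + \delta \cdot \text{income}_i)}{\sum_{k=1}^{J} \exp(Z_{ik}' \gamma + \delta \cdot \text{income}_i)} = \frac{\exp(Z_{ij}' \gamma)}{\sum_{k=1}^{J} \exp(Z_{ik}' \gamma)}$$

The income term is the same for all alternatives, so it drops out of the choice probability and $\delta$ is not identified.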

### 7.2 Solution: Alternative-Specific Interactions

Interact individual-specific variables with **alternative dummies** (one alternative is the reference):

$$U_{ij} = Z_{ij}' \gamma + \text{income}_i \cdot \mathbb{1}[j = \text{car}] \cdot \delta_{\text{car}} + \text{income}_i \cdot \mathbb{1}[j = \text{metro}] \cdot \delta_{\text{metro}} + ...$$

**Interpretation**: $\delta_{\text{car}} > 0$ means higher-income individuals are more likely to choose car (relative to the reference alternative).
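
The difference between a common income coefficient and alternative-specific interactions can be seen numerically. The sketch below uses assumed utilities and coefficients for a single individual; none of the numbers come from the estimated model.

```python
import numpy as np

def probs(v):
    """Conditional logit probabilities from a vector of systematic utilities."""
    e = np.exp(v - v.max())
    return e / e.sum()

# Hypothetical systematic utilities from alternative attributes (Z_ij' gamma)
# Order: bike (reference), bus, car, metro
v_attr = np.array([0.1, -0.2, 0.3, 0.0])
income = 4.0                      # one individual's income in thousands (assumed)

# (a) Common income coefficient: the same constant is added to every alternative
delta_common = 0.5
p_common = probs(v_attr + delta_common * income)

# (b) Alternative-specific income coefficients, reference alternative fixed at 0
delta_alt = np.array([0.0, 0.05, 0.30, 0.10])   # assumed values
p_interact = probs(v_attr + delta_alt * income)

print("No income term:        ", probs(v_attr).round(4))
print("Common coefficient:    ", p_common.round(4))     # identical to the line above
print("Alt-specific interact.:", p_interact.round(4))   # probabilities shift
```

Only case (b) lets income change the predicted shares, which is why the interaction form is required.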

In [None]:
# PanelBox supports case-varying variables directly
# Scale income to thousands for numerical stability
data['income_k'] = data['income'] / 1000

model_ext = ConditionalLogit(
    data=data,
    choice_col='choice_id',
    alt_col='mode',
    chosen_col='choice',
    alt_varying_vars=['cost', 'time', 'reliability', 'comfort'],
    case_varying_vars=['income_k']
)

results_ext = model_ext.fit(maxiter=5000)

print("=" * 70)
print(" " * 10 + "EXTENDED MODEL: With Income Interactions")
print("=" * 70)
print(results_ext.summary())

print("\nNote: Income coefficients are relative to the first alternative (reference).")
print(f"Reference alternative: {model_ext.alternatives[0]}")

In [None]:
# Compare base vs extended model
print("=== Model Comparison ===")
print(f"{'Metric':<25} {'Base Model':<15} {'Extended Model':<15}")
print("-" * 55)
print(f"{'Log-likelihood':<25} {results.llf:<15.4f} {results_ext.llf:<15.4f}")
print(f"{'Pseudo R-squared':<25} {results.pseudo_r2:<15.4f} {results_ext.pseudo_r2:<15.4f}")
print(f"{'AIC':<25} {results.aic:<15.4f} {results_ext.aic:<15.4f}")
print(f"{'BIC':<25} {results.bic:<15.4f} {results_ext.bic:<15.4f}")
print(f"{'Accuracy':<25} {results.accuracy:<15.4f} {results_ext.accuracy:<15.4f}")
print(f"{'N parameters':<25} {len(results.params):<15} {len(results_ext.params):<15}")

# Likelihood ratio test
LR = 2 * (results_ext.llf - results.llf)
df_diff = len(results_ext.params) - len(results.params)
p_lr = 1 - chi2.cdf(LR, df_diff)

print(f"\nLikelihood Ratio Test:")
print(f"  LR statistic = {LR:.4f}")
print(f"  Degrees of freedom = {df_diff}")
print(f"  p-value = {p_lr:.4f}")
if p_lr < 0.05:
    print(f"  -> Income significantly improves the model (reject null at 5%)")
else:
    print(f"  -> Income does not significantly improve the model")

<a id='section8'></a>
## 8. Application: Transportation Mode Choice

### Research Question

**How do cost and time affect transportation mode choice? What happens if bus fares are reduced?**

### 8.1 Exploratory Analysis

In [None]:
# Comprehensive exploratory analysis
print("=" * 70)
print(" " * 15 + "TRANSPORTATION MODE CHOICE ANALYSIS")
print("=" * 70)

# 1. Choice distribution by year
print("\n1. Choice Distribution Over Time")
choice_by_year = data[data['choice'] == 1].groupby('year')['mode'].value_counts(
    normalize=True
).unstack(fill_value=0)
print(choice_by_year.round(3))

# 2. Income distribution by chosen mode
print("\n2. Mean Income by Chosen Mode")
chosen_data = data[data['choice'] == 1]
income_by_mode = chosen_data.groupby('mode')['income'].agg(['mean', 'median', 'std'])
print(income_by_mode.round(0))

# 3. Distance distribution by chosen mode
print("\n3. Mean Distance by Chosen Mode")
distance_by_mode = chosen_data.groupby('mode')['distance'].agg(['mean', 'median', 'std'])
print(distance_by_mode.round(1))

In [None]:
# Visualize choice patterns
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

colors = {'car': '#e74c3c', 'bus': '#3498db', 'metro': '#2ecc71', 'bike': '#f39c12'}

# 1. Choice shares by year
choice_by_year.plot(kind='bar', ax=axes[0, 0], color=[colors[m] for m in choice_by_year.columns],
                    alpha=0.8, edgecolor='black')
axes[0, 0].set_title('Choice Shares by Year', fontweight='bold')
axes[0, 0].set_xlabel('Year')
axes[0, 0].set_ylabel('Share')
axes[0, 0].legend(title='Mode')
axes[0, 0].tick_params(axis='x', rotation=0)

# 2. Income distribution by chosen mode
for mode in alternatives:
    mode_income = chosen_data[chosen_data['mode'] == mode]['income']
    axes[0, 1].hist(mode_income, bins=30, alpha=0.5, label=mode.capitalize(),
                    color=colors[mode])
axes[0, 1].set_title('Income Distribution by Chosen Mode', fontweight='bold')
axes[0, 1].set_xlabel('Income (R$)')
axes[0, 1].set_ylabel('Frequency')
axes[0, 1].legend()

# 3. Cost vs Time scatter by mode
for mode in alternatives:
    mode_data = data[data['mode'] == mode]
    axes[1, 0].scatter(mode_data['cost'], mode_data['time'], alpha=0.1, s=5,
                       color=colors[mode], label=mode.capitalize())
axes[1, 0].set_title('Cost vs Time by Mode', fontweight='bold')
axes[1, 0].set_xlabel('Cost (R$)')
axes[1, 0].set_ylabel('Time (minutes)')
axes[1, 0].legend(markerscale=5)

# 4. Choice share by distance quartile
chosen_data_q = chosen_data.copy()
chosen_data_q['dist_q'] = pd.qcut(chosen_data_q['distance'], 4, labels=['Q1\n(short)', 'Q2', 'Q3', 'Q4\n(long)'])
shares_by_dist = chosen_data_q.groupby('dist_q')['mode'].value_counts(
    normalize=True
).unstack(fill_value=0)
shares_by_dist.plot(kind='bar', stacked=True, ax=axes[1, 1],
                    color=[colors[m] for m in shares_by_dist.columns],
                    alpha=0.8, edgecolor='black')
axes[1, 1].set_title('Mode Choice by Distance Quartile', fontweight='bold')
axes[1, 1].set_xlabel('Distance Quartile')
axes[1, 1].set_ylabel('Share')
axes[1, 1].legend(title='Mode', loc='upper right')
axes[1, 1].tick_params(axis='x', rotation=0)

plt.suptitle('Exploratory Analysis: Transportation Mode Choice', fontsize=16, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig(FIG_DIR / '05_mode_choice_exploration.png', dpi=150, bbox_inches='tight')
plt.show()

print("Figure saved to outputs/figures/05_mode_choice_exploration.png")

### 8.2 Full Applied Model

In [None]:
# Full model with both alternative-specific and individual-specific variables
print("=" * 70)
print(" " * 15 + "FULL APPLICATION MODEL")
print("=" * 70)

# Compare the base model with the extended model (income interactions)
print("\nBase Model (alternative-specific attributes only):")
print(f"  Log-L = {results.llf:.2f}, Pseudo R2 = {results.pseudo_r2:.4f}, Acc = {results.accuracy:.4f}")

print(f"\nExtended Model (+ income interactions):")
print(f"  Log-L = {results_ext.llf:.2f}, Pseudo R2 = {results_ext.pseudo_r2:.4f}, Acc = {results_ext.accuracy:.4f}")

# Select the preferred model by AIC (lower is better)
use_extended = results_ext.aic < results.aic
best_results = results_ext if use_extended else results
best_model = model_ext if use_extended else model
best_label = "Extended" if use_extended else "Base"
print(f"\nBest model by AIC: {best_label}")
print(f"\n{best_results.summary()}")
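The selection above relies on AIC, which trades fit against the number of parameters. The sketch below shows the textbook definitions of AIC and BIC. The helper names and the numbers are hypothetical, and PanelBox's exact conventions (for example, which sample size enters BIC) are not shown in this notebook, so treat this as an illustration only.

```python
import numpy as np

def aic(llf, k):
    """Akaike information criterion: 2k - 2*logL (lower is better)."""
    return 2 * k - 2 * llf

def bic(llf, k, n):
    """Bayesian information criterion: k*ln(n) - 2*logL (lower is better)."""
    return k * np.log(n) - 2 * llf

# Hypothetical log-likelihoods: a base model with 4 parameters and an
# extended model with 7 parameters, both fitted to n = 2000 choice occasions
llf_base, k_base = -2100.0, 4
llf_ext, k_ext = -2090.0, 7
n = 2000

print(f"AIC: base = {aic(llf_base, k_base):.1f}, extended = {aic(llf_ext, k_ext):.1f}")
print(f"BIC: base = {bic(llf_base, k_base, n):.1f}, extended = {bic(llf_ext, k_ext, n):.1f}")
```

With these made-up numbers AIC favors the extended model while BIC favors the base model, because BIC penalizes each extra parameter by $\ln n$ rather than 2. Checking both criteria is a cheap safeguard when the two models fit similarly.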

### 8.3 Policy Simulation: 30% Bus Cost Reduction

In [None]:
# Policy simulation: bus cost reductions from 0% to 50% (30% is the headline scenario)
# Uses the base model's estimates (results.params) for cleaner interpretation

reductions = [0.0, 0.10, 0.20, 0.30, 0.40, 0.50]
scenario_results = []

for reduction in reductions:
    data_scen = data.copy()
    data_scen.loc[data_scen['mode'] == 'bus', 'cost'] *= (1 - reduction)
    data_scen['choice_id'] = data_scen['id'].astype(str) + '_' + data_scen['year'].astype(str)
    
    model_scen = ConditionalLogit(
        data=data_scen,
        choice_col='choice_id',
        alt_col='mode',
        chosen_col='choice',
        alt_varying_vars=['cost', 'time', 'reliability', 'comfort']
    )
    
    # Predict with original parameters
    scen_probs = np.zeros((model_scen.n_choices, model_scen.n_alts))
    for i, X in enumerate(model_scen.X_list):
        utilities = X @ results.params
        exp_u = np.exp(utilities - utilities.max())
        scen_probs[i] = exp_u / exp_u.sum()
    
    shares = scen_probs.mean(axis=0)
    row = {'Reduction': f"{int(reduction*100)}%"}
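    # Label shares with the model's own alternative ordering (assumed to match the rows of X)
    for j, alt in enumerate(model_scen.alternatives):
        row[alt] = shares[j]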
    scenario_results.append(row)

scenario_df = pd.DataFrame(scenario_results).set_index('Reduction')
print("=== Bus Cost Reduction Scenarios ===")
print("\nPredicted choice shares:")
print(scenario_df.round(4))

In [None]:
# Visualize scenario analysis
fig, ax = plt.subplots(figsize=(12, 7))

for mode in alternatives:
    ax.plot([r * 100 for r in reductions], scenario_df[mode].values * 100,
            marker='o', linewidth=2.5, markersize=8, label=mode.capitalize(),
            color=colors[mode])

ax.set_xlabel('Bus Cost Reduction (%)', fontsize=13)
ax.set_ylabel('Choice Share (%)', fontsize=13)
ax.set_title('Effect of Bus Cost Reduction on Modal Shares\n(Conditional Logit Counterfactual)',
             fontsize=14, fontweight='bold')
ax.legend(fontsize=12)
ax.grid(True, alpha=0.3)
ax.set_xticks([r * 100 for r in reductions])

plt.tight_layout()
plt.savefig(FIG_DIR / '05_mode_choice_scenarios.png', dpi=150, bbox_inches='tight')
plt.show()

print("Figure saved to outputs/figures/05_mode_choice_scenarios.png")

print("\nPolicy implications:")
baseline = scenario_df.loc['0%']
after30 = scenario_df.loc['30%']  # select by label rather than by position
print(f"  A 30% bus cost reduction would:")
for mode in alternatives:
    change = (after30[mode] - baseline[mode]) * 100
    print(f"    {mode:6s}: {change:+.2f} percentage points ({baseline[mode]*100:.1f}% -> {after30[mode]*100:.1f}%)")

### 8.4 Summary of Findings

In [None]:
print("=" * 70)
print(" " * 15 + "SUMMARY OF FINDINGS")
print("=" * 70)

print("\n1. MODEL FIT")
print(f"   Pseudo R2 = {results.pseudo_r2:.4f}")
print(f"   Prediction accuracy = {results.accuracy:.1%}")
print("   Expected signs: cost and time negative; reliability and comfort positive (check the summary table)")

print("\n2. KEY FINDINGS")
gamma_cost_val = results.params[0]
gamma_time_val = results.params[1]
vot_val = gamma_time_val / gamma_cost_val * 60
print(f"   Cost effect: gamma = {gamma_cost_val:.4f} (1 R$ increase -> utility changes by {gamma_cost_val:.4f})")
print(f"   Time effect: gamma = {gamma_time_val:.4f} (1 min increase -> utility changes by {gamma_time_val:.4f})")
print(f"   Value of Time: R${vot_val:.2f}/hour")

print("\n3. ELASTICITIES")
for j, alt in enumerate(alternatives):
    print(f"   {alt:6s}: own cost elasticity = {own_cost[j]:.4f}")

print("\n4. POLICY IMPLICATIONS")
print(f"   A 30% bus cost reduction changes the bus share by "
      f"{(after30['bus'] - baseline['bus'])*100:+.2f} percentage points")
print("   Under IIA, each traveler's non-bus probabilities fall by the same proportion")

print("\n" + "=" * 70)
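Proportional substitution under IIA holds exactly for each individual's probabilities. Averaged shares follow it only approximately when travelers differ. The sketch below illustrates this with two hypothetical travelers; the coefficient, the utilities in `base`, and the costs are made-up numbers, not values from the fitted model.

```python
import numpy as np

def softmax(v):
    """Row-wise logit probabilities from a (n_individuals, n_alts) utility array."""
    e = np.exp(v - v.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

gamma_cost = -0.2                       # hypothetical cost coefficient
# Two travelers, alternatives ordered [bus, car, metro];
# all non-cost utility terms are folded into `base`
base = np.array([[0.5, 1.5, 0.0],
                 [1.0, -1.0, 0.8]])
cost = np.array([[4.0, 10.0, 5.0],
                 [4.0, 10.0, 5.0]])

p0 = softmax(base + gamma_cost * cost)

cost_new = cost.copy()
cost_new[:, 0] *= 0.7                   # 30% bus fare cut
p1 = softmax(base + gamma_cost * cost_new)

# Individual level: car/metro odds are unchanged for each traveler (IIA)
print("car/metro odds before:", (p0[:, 1] / p0[:, 2]).round(4))
print("car/metro odds after: ", (p1[:, 1] / p1[:, 2]).round(4))

# Aggregate level: the ratio of average shares need not stay constant
s0, s1 = p0.mean(axis=0), p1.mean(axis=0)
print("aggregate car/metro before:", round(s0[1] / s0[2], 4))
print("aggregate car/metro after: ", round(s1[1] / s1[2], 4))
```

The individual odds are identical before and after the fare cut, while the ratio of aggregate shares shifts because the two travelers respond from different starting probabilities.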

<a id='exercises'></a>
## 9. Exercises

---

### Exercise 1: Wide-to-Long Transformation (Easy)

Transform a small wide-format dataset into long format suitable for Conditional Logit.

**Task**:
1. Create a small wide-format dataset with 5 individuals, 3 modes, and variables `cost` and `time`
2. Transform it to long format
3. Verify that each individual has exactly one row with `choice=1`
4. Print both formats side by side

In [None]:
# Exercise 1: Your solution here

# Step 1: Create wide-format data
np.random.seed(99)
n = 5
modes = ['car', 'bus', 'metro']

# TODO: Create a DataFrame with columns:
#   id, chosen_mode, cost_car, cost_bus, cost_metro, time_car, time_bus, time_metro

# Step 2: Transform to long format
# TODO: Create rows for each (id, mode) with choice indicator

# Step 3: Verify integrity
# TODO: assert each id has exactly one choice=1

# Step 4: Print both
# TODO: Display wide and long DataFrames

---

### Exercise 2: Elasticity Interpretation (Medium)

Calculate own and cross elasticities for a **15% metro fare increase**.

**Task**:
1. Simulate a 15% increase in metro cost
2. Compute new predicted choice shares
3. Which modes gain ridership? By how much?
4. Verify the IIA property: do all non-metro modes gain proportionally?

In [None]:
# Exercise 2: Your solution here

# Step 1: Simulate 15% metro cost increase
# TODO: Modify data, predict with original coefficients

# Step 2: Compare shares
# TODO: Before vs after

# Step 3: Which modes gain?
# TODO: Calculate changes

# Step 4: Verify IIA (proportional substitution)
# TODO: Check that non-metro modes gain in proportion to their original shares

---

### Exercise 3: IIA Test with Different Alternatives (Medium)

Perform the Hausman-McFadden test by omitting each alternative in turn.

**Task**:
1. Estimate the full model
2. Remove each alternative one at a time and re-estimate
3. Compare coefficients across specifications
4. Are results sensitive to which alternative is removed?
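Beyond comparing coefficients by eye, the comparison can be summarized with the Hausman-McFadden statistic $H = (\hat\beta_r - \hat\beta_f)'[\hat V_r - \hat V_f]^{-1}(\hat\beta_r - \hat\beta_f)$, where $r$ denotes the model with one alternative removed and $f$ the full model. The sketch below is a minimal implementation under stated assumptions: the function name and the input numbers are hypothetical, and how to extract the covariance matrix from a PanelBox results object is not shown here.

```python
import numpy as np
from scipy.stats import chi2

def hausman_mcfadden(b_restricted, V_restricted, b_full, V_full):
    """Hausman-McFadden statistic for coefficients common to both models.

    b_restricted, V_restricted: estimates and covariance from the model with
    one alternative removed; b_full, V_full: the same coefficients from the
    full model. Returns (statistic, degrees of freedom, p-value).
    """
    diff = np.asarray(b_restricted) - np.asarray(b_full)
    V_diff = np.asarray(V_restricted) - np.asarray(V_full)
    # pinv guards against a singular covariance difference in finite samples
    stat = float(diff @ np.linalg.pinv(V_diff) @ diff)
    df = int(np.linalg.matrix_rank(V_diff))
    p_value = chi2.sf(stat, df)
    return stat, df, p_value

# Hypothetical inputs for two coefficients (cost, time)
b_full = np.array([-0.100, -0.030])
V_full = np.array([[1.0e-4, 0.0], [0.0, 4.0e-6]])
b_restr = np.array([-0.110, -0.032])
V_restr = np.array([[2.0e-4, 0.0], [0.0, 8.0e-6]])

stat, df, p = hausman_mcfadden(b_restr, V_restr, b_full, V_full)
print(f"H = {stat:.3f}, df = {df}, p-value = {p:.3f}")
```

In finite samples the covariance difference may fail to be positive definite and the statistic can even turn out negative, which is usually read as no evidence against IIA rather than as a valid test result.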

In [None]:
# Exercise 3: Your solution here

# Step 1: Full model coefficients
# TODO: Use results.params from base model

# Step 2: Remove each alternative and re-estimate
# TODO: Loop over alternatives, re-estimate, store coefficients

# Step 3: Create comparison table
# TODO: Table with coefficients from each specification

# Step 4: Discussion
# TODO: Are coefficients stable? What does this imply for IIA?

---

### Exercise 4: Counterfactual — New Mode Introduction (Hard)

Simulate the introduction of a new transport mode: **e-scooter**.

**Task**:
1. Define e-scooter attributes (cost ~R$3, time ~25 min, reliability ~3.5, comfort ~2.5)
2. Using the estimated parameters, predict choice probabilities with 5 alternatives
3. Compare new equilibrium shares to original 4-mode shares
4. Which existing modes lose the most ridership?

**Hint**: Add a fifth row to each choice occasion's design matrix.

In [None]:
# Exercise 4: Your solution here

# Step 1: Define e-scooter attributes
# TODO: Set cost, time, reliability, comfort for e-scooter

# Step 2: Predict with 5 alternatives
# TODO: For each choice occasion, add e-scooter row to design matrix
#        and compute new probabilities using estimated gamma

# Step 3: Compare shares
# TODO: Create before/after comparison

# Step 4: Discussion
# TODO: Under IIA, which modes lose most? Is this realistic?

---

## Summary and Key Takeaways

### What We Learned

1. **Conditional Logit** handles **alternative-specific** covariates (cost, time, comfort)

2. **Long format** data: one row per (individual $\times$ time $\times$ alternative)

3. **Coefficients are common** across alternatives (unlike Multinomial Logit)

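4. **Value of Time** = $\gamma_{\text{time}} / \gamma_{\text{cost}}$ is a key policy metric (cost units per minute when time is in minutes; multiply by 60 for an hourly value)

5. For cost and time, whose coefficients are negative, **own elasticities** are negative and **cross elasticities** are positive

6. **IIA property**: proportional substitution across alternatives; it can be tested with the Hausman-McFadden test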

7. **Individual-specific variables** require interaction with alternative dummies

8. **Policy simulations** are straightforward: change attributes, recompute probabilities
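The value-of-time and elasticity takeaways can be sketched for a single choice occasion as follows. The coefficients and attribute values are hypothetical, chosen only to make the arithmetic easy to follow. The elasticity expressions are the standard conditional logit formulas evaluated at one individual's probabilities, not the averaged values reported earlier in the notebook.

```python
import numpy as np

# Hypothetical coefficients and attributes for one choice occasion
gamma_cost, gamma_time = -0.10, -0.03     # utility per R$ and per minute
cost = np.array([12.0, 4.0, 5.0])         # car, bus, metro (R$)
time = np.array([25.0, 45.0, 30.0])       # minutes

# Value of time: R$ per minute, converted to R$ per hour
vot_per_hour = gamma_time / gamma_cost * 60

# Choice probabilities
v = gamma_cost * cost + gamma_time * time
p = np.exp(v - v.max())
p /= p.sum()

# Own-cost elasticity of alternative j: gamma_cost * cost_j * (1 - P_j)
own = gamma_cost * cost * (1 - p)
# Cross-cost elasticity of every other alternative with respect to cost_k:
# -gamma_cost * cost_k * P_k (identical across the other alternatives, by IIA)
cross = -gamma_cost * cost * p

print(f"Value of time: R${vot_per_hour:.2f}/hour")
print("own cost elasticities:  ", own.round(3))
print("cross cost elasticities:", cross.round(3))
```

Because the cross elasticity with respect to alternative $k$'s cost depends only on $k$, every other alternative responds by the same percentage. This is the elasticity form of the IIA property.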

### Limitations

- IIA assumption may be unrealistic for similar alternatives
- Fixed coefficients: no random taste variation, so preference heterogeneity enters only through observed interactions

### Next Steps

- **Nested Logit**: Relaxes IIA by grouping similar alternatives
- **Mixed Logit**: Random coefficients for heterogeneous preferences
- **Multinomial Probit**: Flexible correlation structure across alternatives

---

## References

1. **McFadden, D. (1973)**. "Conditional logit analysis of qualitative choice behavior." In P. Zarembka (Ed.), *Frontiers in Econometrics*. New York: Academic Press.

2. **Train, K. (2009)**. *Discrete Choice Methods with Simulation*. Cambridge University Press.

3. **Ben-Akiva, M., & Lerman, S. (1985)**. *Discrete Choice Analysis: Theory and Application to Travel Demand*. MIT Press.

4. **Hausman, J., & McFadden, D. (1984)**. "Specification Tests for the Multinomial Logit Model." *Econometrica*, 52(5), 1219-1240.

---

**Thank you for completing this tutorial!**

Questions or feedback? Visit: https://github.com/panelbox/panelbox/issues