Multinomial Choice Models
In a typical linear regression model, we assume that the dependent variable is either continuous or binary. 
$$y_{it} = \beta_0 + \beta_1 x_{1it} + \beta_2 x_{2it} + \beta_3 x_{3it} + \lambda_t + \epsilon_{it}$$
An example of a continuous dependent variable might be the height of a tomato plant in inches. In this model, y_it is the height of the i-th tomato plant at time t, as a function of an intercept term (β_0), the amount of water (x_1), sunlight (x_2), and fertilizer (x_3) the plant receives, a time trend term (λ_t), and an error term (ϵ_it). We interpret each coefficient β as the expected change in y for a one-unit increase in its corresponding x, holding all other independent variables constant.
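To make the coefficient interpretation concrete, here is a minimal sketch that fits the tomato-plant regression by ordinary least squares. Everything in it is an assumption for illustration: the data are simulated, the "true" coefficients are arbitrary, and the time term λ_t is left out to keep the example short.

```python
import numpy as np

# Simulated data (hypothetical values, not from a real experiment)
rng = np.random.default_rng(0)
n = 500
water = rng.uniform(0, 1, n)
sunlight = rng.uniform(0, 1, n)
fertilizer = rng.uniform(0, 1, n)

# Arbitrary "true" coefficients [β_0, β_1, β_2, β_3] used to generate heights
true_beta = np.array([2.0, 1.5, 0.8, 0.5])

# Design matrix: a column of ones for the intercept, then the three covariates
X = np.column_stack([np.ones(n), water, sunlight, fertilizer])
height = X @ true_beta + rng.normal(0, 0.1, n)

# OLS estimate: the β that minimizes the sum of squared residuals
beta_hat, *_ = np.linalg.lstsq(X, height, rcond=None)
print("Estimated coefficients:", np.round(beta_hat, 2))
```

Each entry of `beta_hat` after the first estimates how much the height changes when the matching covariate rises by one unit while the others stay fixed.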
An example of a binary dependent variable might be whether an individual attends college. In this model, y_it=1 corresponds to attending college for at least one semester, and y_it=0 corresponds to never attending college. We’d include an intercept term, and we’d also consider including covariates (independent variables) like family income (x_1), whether a parent attended college (x_2), high school GPA (x_3), and a time trend. We interpret each coefficient β as the change in the probability of observing y = 1 associated with a one-unit increase in the corresponding x, holding the other covariates constant.
Both models are easy to estimate with ordinary least squares (OLS). But what happens when the choice faced by an individual doesn’t correspond to a binary outcome (for example, college/no college), but to several mutually exclusive options from a categorical variable (for example, Apple iPhone / Samsung Galaxy / Google Pixel)?
Multinomial choice models provide mathematical structure for these problems. As with linear regression models, our goal is to estimate the parameters (coefficients) β that correspond to the covariates of interest x that we expect will influence the outcome. The output of the model is a set of probabilities, one per option, giving the chance that a given agent chooses each option; the probabilities sum to one across the options.
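As a rough sketch of how covariates and coefficients turn into choice probabilities, the function below assumes the multinomial logit form, in which each option's probability is proportional to exp(xβ). The function name `choice_probabilities` and the example numbers are illustrative assumptions, not part of any library.

```python
import numpy as np

def choice_probabilities(X, beta):
    """Multinomial logit choice probabilities for one agent.

    X    : (J, K) array, one row of covariates per alternative
    beta : (K,) array of coefficients shared across alternatives
    """
    utilities = X @ beta
    # Subtracting the maximum avoids overflow in exp() and leaves the
    # resulting probabilities unchanged
    exp_u = np.exp(utilities - utilities.max())
    return exp_u / exp_u.sum()

# Illustrative covariates for three alternatives and two features
X_example = np.array([[1.0, 2.0],
                      [1.5, 1.2],
                      [0.8, 1.6]])
beta_example = np.array([0.5, -0.3])

probs = choice_probabilities(X_example, beta_example)
print("Choice probabilities:", np.round(probs, 4))
print("Sum of probabilities:", probs.sum())
```

The alternative with the largest value of xβ receives the highest probability, and every alternative keeps a strictly positive probability.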
Multinomial choice models are used in practically every field of study. Brainstorm (or research) some applications of multinomial choice models in your chosen area (whether it’s your major, your anticipated career path, or just a field you find interesting). 


In [3]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, accuracy_score

# Load the dataset
file_path = r"D:\BINGHAMTON University\Spring Semester 2025\DATA 580E_E- Numeric Methods for Optimization\transportation_ml.csv"
df = pd.read_csv(file_path)

# Display basic info
print("Dataset Head:\n", df.head())
df.info()  # info() prints its summary itself and returns None, so it is not wrapped in print()

# The target column in this file is 'choice'. There is no 'Transport_Mode'
# column, so indexing by that name raises KeyError: 'Transport_Mode'.
target = 'choice'
print("\nClass Distribution:\n", df[target].value_counts())

# Drop columns that are not features: the row id, the numeric copy of the
# target (choice_code would leak the answer), and the 'Unnamed' columns,
# which are all NaN in the preview
unnamed_cols = [c for c in df.columns if c.startswith('Unnamed')]
X = df.drop(columns=[target, 'id', 'choice_code'] + unnamed_cols)
y = df[target]

# Optional: encode categorical variables if needed
X = pd.get_dummies(X, drop_first=True)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features, fitting the scaler on the training set only so that
# no information from the test set reaches the model
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit a multinomial logistic regression. With the lbfgs solver, scikit-learn
# fits the multinomial model by default for a multiclass target; the
# multi_class argument is deprecated in recent releases, so it is omitted.
model = LogisticRegression(solver='lbfgs', max_iter=1000)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print("\nClassification Report:\n", classification_report(y_test, y_pred))
print("Accuracy Score:", accuracy_score(y_test, y_pred))

Dataset Head:
    id  time_Bike  time_Bus  time_Car  cost_Bike  cost_Bus  cost_Car  \
0   1    19.4296   30.4610   32.4836          0    1.5683    6.3994   
1   2    21.8453   35.6981   29.3087          0    1.9844    5.9246   
2   3    20.2897   37.9320   33.2384          0    2.0090    5.0596   
3   4    22.2600   49.4384   37.6151          0    2.2363    4.3531   
4   5    23.9292   42.7828   28.8292          0    1.3166    5.6982   

   risk_Bike  risk_Bus  risk_Car choice  choice_code  Unnamed: 12  Unnamed: 13  
0     0.0379    0.0087    0.0166    Car            2          NaN          NaN  
1     0.0122    0.0086    0.0193    Car            2          NaN          NaN  
2     0.0371    0.0046    0.0160    Bus            1          NaN          NaN  
3     0.0277    0.0090    0.0185   Bike            0          NaN          NaN  
4     0.0371    0.0122    0.0105    Bus            1          NaN          NaN  
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999

In [7]:
import numpy as np

# Simulated covariates (x_ij) for one individual and three alternatives (j = A, B, C)
x_A = np.array([1.0, 2.0])   # features for alternative A
x_B = np.array([1.5, 1.2])   # features for alternative B
x_C = np.array([0.8, 1.6])   # features for alternative C

# Coefficients β
beta = np.array([0.5, -0.3])

# Compute linear utility x_ij * beta
u_A = x_A @ beta
u_B = x_B @ beta
u_C = x_C @ beta

# Compute exp(x_ij * beta)
exp_A = np.exp(u_A)
exp_B = np.exp(u_B)
exp_C = np.exp(u_C)

# With all 3 alternatives
denominator_full = exp_A + exp_B + exp_C
pi_A_full = exp_A / denominator_full
pi_B_full = exp_B / denominator_full

# With only 2 alternatives: A and B (remove C)
denominator_AB = exp_A + exp_B
pi_A_ab = exp_A / denominator_AB
pi_B_ab = exp_B / denominator_AB

# Compare the ratios
odds_full = pi_A_full / pi_B_full
odds_ab = pi_A_ab / pi_B_ab

# Show that the ratio stays the same
print("π_A / π_B with all 3 choices   =", round(odds_full, 4))
print("π_A / π_B with only A and B    =", round(odds_ab, 4))
print("✅ Ratios equal (IIA holds)?    =", np.isclose(odds_full, odds_ab))

π_A / π_B with all 3 choices   = 0.6126
π_A / π_B with only A and B    = 0.6126
✅ Ratios equal (IIA holds)?    = True


## Can you prove algebraically that the relative probabilities of two choices do not depend on the presence or absence of other choices? (Hint: use the equation above, and find the ratio π_ij/π_ik for j≠k.)
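
One way to write out the argument, assuming the multinomial logit form used in the code cells, where π_ij is proportional to exp(x_ij β). The symbol C_i, the set of alternatives available to agent i, is notation introduced here for convenience.

```latex
\pi_{ij} = \frac{\exp(x_{ij}\beta)}{\sum_{m \in C_i} \exp(x_{im}\beta)}

\frac{\pi_{ij}}{\pi_{ik}}
  = \frac{\exp(x_{ij}\beta) \,/\, \sum_{m \in C_i} \exp(x_{im}\beta)}
         {\exp(x_{ik}\beta) \,/\, \sum_{m \in C_i} \exp(x_{im}\beta)}
  = \frac{\exp(x_{ij}\beta)}{\exp(x_{ik}\beta)}
  = \exp\big((x_{ij} - x_{ik})\beta\big), \qquad j \neq k
```

The shared denominator cancels, so the ratio depends only on the covariates of alternatives j and k. Adding or removing any other alternative changes C_i, and therefore the denominator, but not the ratio. This property is the independence of irrelevant alternatives (IIA).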

In [2]:
import numpy as np
import pandas as pd

# Simulated β coefficients for 3 features
beta = np.array([0.5, -0.3, 0.2])

# Simulate covariates x_ij for 1 agent (i=1) and 3 alternatives (j=1,2,3)
x = {
    'Alt1': np.array([1.0, 2.0, 1.5]),
    'Alt2': np.array([1.2, 1.8, 1.0]),
    'Alt3': np.array([0.8, 2.2, 1.3])
}

# Compute utility and exp(xβ) for each alternative
utilities = {alt: x[alt] @ beta for alt in x}
exp_utilities = {alt: np.exp(utilities[alt]) for alt in x}

# Full choice set: Alt1, Alt2, Alt3
denominator_full = sum(exp_utilities.values())
prob_full = {alt: exp_utilities[alt] / denominator_full for alt in x}

# Subset choice set: Alt1, Alt2 only
denominator_subset = exp_utilities['Alt1'] + exp_utilities['Alt2']
prob_subset = {
    'Alt1': exp_utilities['Alt1'] / denominator_subset,
    'Alt2': exp_utilities['Alt2'] / denominator_subset
}

# Compute relative odds
odds_full = prob_full['Alt1'] / prob_full['Alt2']
odds_subset = prob_subset['Alt1'] / prob_subset['Alt2']

print("Odds (Alt1 vs Alt2) with 3 choices:", round(odds_full, 4))
print("Odds (Alt1 vs Alt2) with 2 choices:", round(odds_subset, 4))
print("Are odds equal?", np.isclose(odds_full, odds_subset))

Odds (Alt1 vs Alt2) with 3 choices: 0.9418
Odds (Alt1 vs Alt2) with 2 choices: 0.9418
Are odds equal? True


## 3.

In [8]:
import pandas as pd

# Define voter groups and their rankings
voters = pd.DataFrame({
    'group': ['G1', 'G2', 'G3'],
    'percent': [0.25, 0.40, 0.35],
    '1st': ['A', 'B', 'C'],
    '2nd': ['B', 'C', 'A'],
    '3rd': ['C', 'A', 'B']
})

# First-past-the-post: count only first-choice votes
fptp_results = voters.groupby('1st')['percent'].sum().sort_values(ascending=False)
print("📊 First-Past-the-Post Results:")
print(fptp_results)
print(f"\n🏆 Winner (FPTP): {fptp_results.idxmax()}")

# Top-two runoff
# Step 1: Identify top two from first round (same as FPTP)
top_two = fptp_results.index[:2].tolist()
print(f"\n🧮 Top Two Candidates for Runoff: {top_two}")

# Step 2: Redistribute votes in runoff between top two
def runoff_vote(row, top_two):
    # Go through preferences in order
    for col in ['1st', '2nd', '3rd']:
        if row[col] in top_two:
            return row[col]
    return None

voters['runoff_vote'] = voters.apply(lambda row: runoff_vote(row, top_two), axis=1)

runoff_results = voters.groupby('runoff_vote')['percent'].sum()
print("\n📊 Runoff Results:")
print(runoff_results)
print(f"\n🏆 Winner (Runoff): {runoff_results.idxmax()}")


📊 First-Past-the-Post Results:
1st
B    0.40
C    0.35
A    0.25
Name: percent, dtype: float64

🏆 Winner (FPTP): B

🧮 Top Two Candidates for Runoff: ['B', 'C']

📊 Runoff Results:
runoff_vote
B    0.65
C    0.35
Name: percent, dtype: float64

🏆 Winner (Runoff): B


## 4.