# Lecture 10

The data are taken from Greene and Hensher (1997). A sample of 210 individuals is surveyed about their choice of travel mode (air, train, bus or car) between Sydney, Canberra and Melbourne, and about the costs (time and money) associated with each alternative. There are therefore $840 = 4 \times 210$ observations, which we stack into `travelmodedataset`, a 3-dimensional array whose dimensions are mode, individual, and variable (the choice dummy plus the covariates).

In [1]:
thePath = getwd()
# as.matrix() yields a character matrix here, because the mode and choice columns are strings
travelmodedataset = as.matrix(read.csv(paste0(thePath,"/travelmodedata.csv"),sep=",", header=TRUE)) # loads the data
head(travelmodedataset)

individual,mode,choice,wait,vcost,travel,gcost,income,size
1,air,no,69,59,100,70,35,1
1,train,no,34,31,372,71,35,1
1,bus,no,35,25,417,70,35,1
1,car,yes,0,10,180,30,35,1
2,air,no,64,58,68,68,30,2
2,train,no,44,31,354,84,30,2


In [2]:
# Recode the string columns as numbers
# mode: air = 1, train = 2, bus = 3, car = 4; choice: no = 0, yes = 1
convertmode = function(inputtxt) match(inputtxt, c("air", "train", "bus", "car"))
convertchoice = function(x) ifelse(x == "no", 0, 1)
travelmodedataset[,2] = convertmode(travelmodedataset[,2])
travelmodedataset[,3] = convertchoice(travelmodedataset[,3])

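# Dimensions: number of stacked rows, number of individuals (4 rows each), number of columns
nobs = dim(travelmodedataset)[1]
nind = nobs / 4
ncols =  dim(travelmodedataset)[2]
# Reshape to a 4 x nind x ncols array (mode, individual, variable); this relies on the rows
# being sorted by individual, with the four modes in the same order for each individual
travelmodedataset = array(as.numeric(travelmodedataset),dim = c(4,nind,ncols))
# Column 3 is the choice dummy: choices[m, i] = 1 if individual i chose mode m
choices = travelmodedataset[,,3]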
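
The reshaping step works because `array()` fills its result column by column, so the first index (mode) varies fastest within each variable. A minimal sketch on a toy table may make the layout easier to see; the names `toy`, `nmodes`, `nind_toy` and `toy3d` and all the numbers are made up for illustration and are not part of the dataset.

```r
# Toy long-format table: 2 individuals x 4 modes, 2 variables per row.
# Rows are sorted by individual, then by mode (1 = air, ..., 4 = car).
# All values are invented for illustration.
toy = cbind(choice = c(0, 0, 0, 1,   1, 0, 0, 0),
            wait   = c(10, 20, 30, 40,   50, 60, 70, 80))
nmodes = 4
nind_toy = nrow(toy) / nmodes

# array() fills column by column: within each variable, the mode index
# varies fastest and the individual index next.
toy3d = array(toy, dim = c(nmodes, nind_toy, ncol(toy)))

print(toy3d[4, 1, 1])  # choice dummy of individual 1 for mode 4 (car)
print(toy3d[2, 2, 2])  # waiting time of individual 2 for mode 2 (train)
```

If the rows were sorted differently (for instance by mode first), the same call would silently mix up modes and individuals, so it is worth checking the ordering before reshaping.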

First, we compute the unconditional market shares:

In [3]:
s = apply(X = choices,FUN = mean, MARGIN = 1)
names(s)=c("air","train","bus","car")
print("Market shares:")
print(s)

[1] "Market shares:"
      air     train       bus       car 
0.2761905 0.3000000 0.1428571 0.2809524 


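Take "car" as the default (reference) alternative and normalize its utility to zero. In the logit model, the systematic utilities are then recovered from the market shares by the log-odds ratio formula $U_y = \log(s_y / s_{\text{car}})$: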

In [4]:
Ulogit = log(s[1:4]/s[4])
print("Systematic utilities (logit):")
print(Ulogit)

[1] "Systematic utilities (logit):"
        air       train         bus         car 
-0.01709443  0.06559728 -0.67634006  0.00000000 
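
As a sanity check, the logit choice probabilities $\exp(U_y) / \sum_{y'} \exp(U_{y'})$ implied by these utilities should reproduce the observed market shares. The sketch below hard-codes the shares as printed above so that it runs on its own; the names `s_check`, `U_check` and `p_check` are introduced here for illustration only.

```r
# Market shares as printed above (air, train, bus, car).
s_check = c(air = 0.2761905, train = 0.3000000, bus = 0.1428571, car = 0.2809524)

# Log-odds inversion with car as the reference alternative.
U_check = log(s_check / s_check["car"])

# Logit choice probabilities implied by these utilities.
p_check = exp(U_check) / sum(exp(U_check))

# Largest gap between implied and observed shares; it should be zero
# up to the rounding of the printed shares.
print(max(abs(p_check - s_check)))
```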


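Now compute these utilities using a nested logit model with two nests, "nocar" (air, train, bus) and "car", taking $\lambda=0.5$ in both nests. For an alternative $y$ in nest $n$, the utility is $U_y = \lambda_n \log s_y + (1-\lambda_n) \log \sum_{y' \in n} s_{y'}$, which we then normalize so that the utility of car is zero: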

In [5]:
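lambda = c(1/2,1/2) # nest parameters, in the order (nocar, car)

# Within each nest: U = lambda * log(share) + (1 - lambda) * log(total share of the nest)
Unocar = lambda[1]*log(s[1:3])+(1-lambda[1]) * log(sum(s[1:3]))
Ucar = lambda[2]*log(s[4])+(1-lambda[2]) * log(sum(s[4]))
# Normalize so that the utility of car is zero
Unested = c(Unocar,Ucar ) - Ucar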
print("Systematic utilities (nested logit):")
print(Unested)

print("Choice probabilities within nocar nest (predicted vs observed):")
print( exp(Unested[1:3]/lambda[1]) / sum(exp(Unested[1:3]/lambda[1])))
print(s[1:3]/sum(s[1:3]))

print("Choice probabilities of car nest (predicted vs observed):")
print( 1 / (sum(exp(Unested[1:3]/lambda[1]))^lambda[1]+1) )
print(unname(s[4]))

[1] "Systematic utilities (nested logit):"
      air     train       bus       car 
0.4613240 0.5026698 0.1317012 0.0000000 
[1] "Choice probabilities within nocar nest (predicted vs observed):"
      air     train       bus 
0.3841060 0.4172185 0.1986755 
      air     train       bus 
0.3841060 0.4172185 0.1986755 
[1] "Choice probabilities of car nest (predicted vs observed):"
[1] 0.2809524
[1] 0.2809524
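
The same inversion can be wrapped in a small function that accepts arbitrary nests and nest parameters. The following is a minimal sketch, not part of the original lecture code: `nested_logit_utilities` and its arguments are hypothetical names, and the shares are hard-coded from the output above so the block runs on its own. Setting $\lambda = 1$ in every nest should give back the plain logit utilities $\log(s_y / s_{\text{car}})$.

```r
# Hypothetical helper: invert market shares into nested logit utilities.
#   s      named vector of market shares
#   nests  list of character vectors, one per nest, naming its alternatives
#   lambda vector of nest parameters, one per nest
#   ref    name of the reference alternative (utility normalized to 0)
nested_logit_utilities = function(s, nests, lambda, ref) {
  U = s  # same names and length as s; every entry is overwritten below
  for (n in seq_along(nests)) {
    members = nests[[n]]
    U[members] = lambda[n] * log(s[members]) +
      (1 - lambda[n]) * log(sum(s[members]))
  }
  U - U[ref]
}

# Market shares as printed earlier (air, train, bus, car).
shares = c(air = 0.2761905, train = 0.3000000, bus = 0.1428571, car = 0.2809524)
nests = list(nocar = c("air", "train", "bus"), car = "car")

U_half = nested_logit_utilities(shares, nests, lambda = c(0.5, 0.5), ref = "car")
U_one = nested_logit_utilities(shares, nests, lambda = c(1, 1), ref = "car")

print(U_half)  # should agree with the nested logit utilities printed above
print(U_one)   # with lambda = 1 in every nest: log(shares / shares["car"])
```

Passing the nests as a named list keeps the function independent of the order of the alternatives in `s`, at the cost of requiring `s` to carry names.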
