37902 Foundation of Advanced Quantitative Marketing

Li Liu

#### Tasks
1) Compute the nested logit elasticities (if you haven’t already done so)

2) Try out the IIA tests

3) Work on the observable heterogeneity model – both a priori and with interactions

4) Fit the latent-segment / latent-class / discrete heterogeneity model with 2 segments

In [72]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.optimize as opt
pd.options.display.max_colwidth = 1000

### Yogurt100N Sales Data

In [73]:
df=pd.read_excel("Yogurt100N.csv.xlsx")
df.describe()
df.head()

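| | Pan I.D. | Expend $ | Income | HH Size | IPT | Quantity | Brand 1 | Brand 2 | Brand 3 | Brand 4 | Feature 1 | Feature 2 | Feature 3 | Feature 4 | Price 1 | Price 2 | Price 3 | Price 4 | PanelistFirstObs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 40.900002 | 9 | 2 | 5 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.108 | 0.081 | 0.061 | 0.079 | 1 |
| 1 | 1 | 16.809999 | 9 | 2 | 5 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.108 | 0.098 | 0.064 | 0.075 | 0 |
| 2 | 1 | 4.06 | 9 | 2 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.108 | 0.098 | 0.061 | 0.086 | 0 |
| 3 | 1 | 34.459999 | 9 | 2 | 4 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.108 | 0.098 | 0.061 | 0.086 | 0 |
| 4 | 1 | 8.39 | 9 | 2 | 7 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.125 | 0.098 | 0.049 | 0.079 | 0 |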


### Simple Logit on Yogurt Data

In [235]:
def crit(params,df):

    a1,a2,a3,bf,bp=params
    ev1=np.exp(a1+bf*df['Feature 1']+bp*df['Price 1'])
    ev2=np.exp(a2+bf*df['Feature 2']+bp*df['Price 2'])
    ev3=np.exp(a3+bf*df['Feature 3']+bp*df['Price 3'])
    ev4=np.exp(0+bf*df['Feature 4']+bp*df['Price 4'])
    denom=ev1+ev2+ev3+ev4
    pc=(ev1*df['Brand 1']+ev2*df['Brand  2']+ev3*df['Brand 3']+ev4*df['Brand 4'])/denom
    Inpc=np.log(pc)
    LL=np.sum(Inpc)
    return -LL
a1,a2,a3,bf,bp=1,1,1,1,1 #Initialization
params_init = np.array([a1,a2,a3,bf,bp])
results = opt.minimize(crit, params_init, args=(df,))
a1,a2,a3,bf,bp = results.x
sigmaR=results.hess_inv
print(" a1 (Intrinsic brand preference for Brand 1):",a1,"\n",
      "a2 (Intrinsic brand preference for Brand 2):",a2,"\n",
      "a3 (Intrinsic brand preference for Brand 3):",a3,"\n",
      "bf (Coefficients for feature variable):",bf,"\n",
      "bp (Coefficients for price variable):",bp,"\n",
      "Maximized Log Likelihood:",-results.fun)

 a1 (Intrinsic brand preference for Brand 1): 1.3877493848693059 
 a2 (Intrinsic brand preference for Brand 2): 0.6435046305879636 
 a3 (Intrinsic brand preference for Brand 3): -3.0861119572117355 
 bf (Coefficients for feature variable): 0.4874149107851659 
 bp (Coefficients for price variable): -37.057782766093105 
 Maximized Log Likelihood: -2658.5566975071233


### Elasticity with I.I.A assumption

In [75]:
#Own elasticities
e11=np.mean((bf*df['Feature 1']+bp*df['Price 1'])*(1-df['Price 1']))
e22=np.mean((bf*df['Feature 2']+bp*df['Price 2'])*(1-df['Price 2']))
e33=np.mean((bf*df['Feature 3']+bp*df['Price 3'])*(1-df['Price 3']))
e44=np.mean((bf*df['Feature 4']+bp*df['Price 4'])*(1-df['Price 4']))

In [76]:
#Property of the logit model (IIA): the cross elasticity with respect to brand k's price is the same for every brand j != k.
e21=e31=e41=np.mean(-(bf*df['Feature 1']+bp*df['Price 1'])*(df['Price 1']))
e12=e32=e42=np.mean(-(bf*df['Feature 2']+bp*df['Price 2'])*(df['Price 2']))
e13=e23=e43=np.mean(-(bf*df['Feature 3']+bp*df['Price 3'])*(df['Price 3']))
e14=e24=e34=np.mean(-(bf*df['Feature 4']+bp*df['Price 4'])*(df['Price 4']))

In [77]:
mat=pd.DataFrame({"Brand 1":[e11,e21,e31,e41], "Brand 2":[e12,e22,e32,e42],
                  "Brand 3":[e13,e23,e33,e43],'Brand 4':[e14,e24,e34,e44]})
mat.index=["Brand 1","Brand 2", "Brand 3", "Brand 4"]
print("Elasticity Matrix with Simple Logit Model")
mat

Elasticity Matrix with Simple Logit Model

| | Brand 1 | Brand 2 | Brand 3 | Brand 4 |
|---|---|---|---|---|
| Brand 1 | -3.478799 | 0.24966 | 0.108242 | 0.235253 |
| Brand 2 | 0.431428 | -2.752467 | 0.108242 | 0.235253 |
| Brand 3 | 0.431428 | 0.24966 | -1.86061 | 0.235253 |
| Brand 4 | 0.431428 | 0.24966 | 0.108242 | -2.692863 |
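
For reference, the standard multinomial logit price elasticities are $e_{jj} = \beta_p\, p_j\, (1 - P_j)$ for own price and $e_{jk} = -\beta_p\, p_k\, P_k$ for $j \neq k$, where $P_j$ is the choice probability of brand $j$. A minimal sketch on made-up prices follows; the helper name `logit_price_elasticities`, the toy intercepts, and the simulated prices are assumptions for illustration and are not taken from the notebook.

```python
import numpy as np

def logit_price_elasticities(alpha, bp, prices):
    """Average price-elasticity matrix of a multinomial logit model.

    alpha  : (J,) brand intercepts
    bp     : price coefficient
    prices : (N, J) prices, one row per observation
    Entry [j, k] is the elasticity of brand j's choice probability
    with respect to brand k's price, averaged over observations.
    """
    v = alpha + bp * prices                          # (N, J) utilities
    ev = np.exp(v - v.max(axis=1, keepdims=True))    # shift for numerical stability
    P = ev / ev.sum(axis=1, keepdims=True)           # choice probabilities
    J = prices.shape[1]
    cross = (-bp * prices * P).mean(axis=0)          # -bp * p_k * P_k, identical for all j != k
    own = (bp * prices * (1 - P)).mean(axis=0)       # bp * p_j * (1 - P_j)
    E = np.tile(cross, (J, 1))                       # every row starts as the cross terms
    np.fill_diagonal(E, own)                         # own elasticities on the diagonal
    return E

rng = np.random.default_rng(0)
prices = rng.uniform(0.05, 0.12, size=(500, 4))      # made-up prices
E = logit_price_elasticities(np.array([1.4, 0.6, -3.1, 0.0]), -37.0, prices)
print(np.round(E, 3))
```

With a negative price coefficient, the diagonal is negative and each column has a single positive off-diagonal value, which is the IIA pattern.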


### Nested Logit Model on Yogurt Data

#### Brands 1~3 in one nest and 4 in another 

In [78]:
def nestedlogit(params):

    a1,a2,a3,bf,bp,theta=params 
    rho=np.exp(theta)/(np.exp(theta)+1)
    ev1=np.exp((a1+bf*df['Feature 1']+bp*df['Price 1'])/rho)
    ev2=np.exp((a2+bf*df['Feature 2']+bp*df['Price 2'])/rho)
    ev3=np.exp((a3+bf*df['Feature 3']+bp*df['Price 3'])/rho)
    ev4=np.exp(0+bf*df['Feature 4']+bp*df['Price 4'])
    denom=ev1+ev2+ev3
    P4=ev4/(np.power(denom,rho)+ev4)
    P1=(1-P4)*ev1/denom
    P2=(1-P4)*ev2/denom
    P3=(1-P4)*ev3/denom
        
    pc=(P1*df['Brand 1']+P2*df['Brand  2']+P3*df['Brand 3']+P4*df['Brand 4'])
    Inpc=np.log(pc)
    LL=np.sum(Inpc)
    return -LL

In [79]:
a1,a2,a3,bf,bp,theta=1,1,1,1,1,1 #Initialization
params_init = np.array([a1,a2,a3,bf,bp,theta])
results = opt.minimize(nestedlogit, params_init)
a1,a2,a3,bf,bp,theta = results.x
rho=np.exp(theta)/(np.exp(theta)+1)
print(" a1 (Intrinsic brand preference for Brand 1):",a1,"\n",
      "a2 (Intrinsic brand preference for Brand 2):",a2,"\n",
      "a3 (Intrinsic brand preference for Brand 3):",a3,"\n",
      "bf (Coefficients for feature variable):",bf,"\n",
      "bp (Coefficients for price variable):",bp,"\n",
      "rho (Nest dissimilarity parameter):",rho,"\n",
      "Maximized Log Likelihood:",-results.fun)

 a1 (Intrinsic brand preference for Brand 1): 1.3816689432731886
 a2 (Intrinsic brand preference for Brand 2): 0.8394218694834024
 a3 (Intrinsic brand preference for Brand 3): -1.6585023784518325
 bf (Coefficients for feature variable): 0.3744679884335011
 bp (Coefficients for price variable): -26.58112038547648
 rho (Nest dissimilarity parameter): 0.6433848076286417
 Maximized Log Likelihood: -2653.7645999847773


In [80]:
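#BFGS returns an approximation to the inverse Hessian of the negative log-likelihood; it is used here as the variance-covariance matrix
vcv_mle = results.hess_inv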
stderr_a1_mle = np.sqrt(vcv_mle[0,0])
stderr_a2_mle = np.sqrt(vcv_mle[1,1])
stderr_a3_mle = np.sqrt(vcv_mle[2,2])
stderr_bf_mle = np.sqrt(vcv_mle[3,3])
stderr_bp_mle = np.sqrt(vcv_mle[4,4])
stderr_theta_mle = np.sqrt(vcv_mle[5,5])

print('Standard error for a1 estimate = ', stderr_a1_mle)
print('Standard error for a2 estimate = ', stderr_a2_mle)
print('Standard error for a3 estimate = ', stderr_a3_mle)
print('Standard error for bf estimate = ', stderr_bf_mle)
print('Standard error for bp estimate = ', stderr_bp_mle)
print('Standard error for theta estimate = ', stderr_theta_mle)

Standard error for a1 estimate =  0.07125745085397713
Standard error for a2 estimate =  0.0498364392238186
Standard error for a3 estimate =  0.023037025008121905
Standard error for bf estimate =  0.08637968526841336
Standard error for bp estimate =  1.4536600440077279
Standard error for theta estimate =  0.1084251269080594
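
The optimizer works with `theta`, so the last standard error above is for `theta` rather than for `rho`. A first-order (delta method) approximation converts it, using $d\rho/d\theta = \rho(1-\rho)$. The sketch below is an illustration only; `rho_and_se` is a hypothetical helper, and `theta` is backed out from the reported `rho`.

```python
import numpy as np

def rho_and_se(theta, se_theta):
    """Map theta to rho = exp(theta) / (1 + exp(theta)) and propagate the standard error."""
    rho = 1.0 / (1.0 + np.exp(-theta))
    # Delta method: se(rho) is approximately |d rho / d theta| * se(theta)
    return rho, rho * (1.0 - rho) * se_theta

rho_hat = 0.6433848076286417                    # rho reported above
theta_hat = np.log(rho_hat / (1.0 - rho_hat))   # inverse of the logistic transform
rho, se_rho = rho_and_se(theta_hat, 0.1084251269080594)
print(rho, se_rho)
```

Under these inputs the implied standard error of `rho` is roughly 0.025.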


### Elasticity when I.I.A assumption is violated

In [81]:
ev1=np.exp((a1+bf*df['Feature 1']+bp*df['Price 1'])/rho)
ev2=np.exp((a2+bf*df['Feature 2']+bp*df['Price 2'])/rho)
ev3=np.exp((a3+bf*df['Feature 3']+bp*df['Price 3'])/rho)
ev4=np.exp(0+bf*df['Feature 4']+bp*df['Price 4'])
denom=ev1+ev2+ev3
P4=ev4/(np.power(denom,rho)+ev4)
P1=(1-P4)*ev1/denom
P2=(1-P4)*ev2/denom
P3=(1-P4)*ev3/denom
#Own Elasticities
e11=np.mean((bf*df['Feature 1']+bp*df['Price 1'])*((1/rho)+(ev1/denom)*(1-1/rho)-P1))
e22=np.mean((bf*df['Feature 2']+bp*df['Price 2'])*((1/rho)+(ev2/denom)*(1-1/rho)-P2))
e33=np.mean((bf*df['Feature 3']+bp*df['Price 3'])*((1/rho)+(ev3/denom)*(1-1/rho)-P3))
e44=np.mean((bf*df['Feature 4']+bp*df['Price 4'])*((1/rho)+P4*(1-1/rho)-P1))

In [82]:
e41=np.mean(-(bf*df['Feature 1']+bp*df['Price 1'])*(P1))
e42=np.mean(-(bf*df['Feature 2']+bp*df['Price 2'])*(P2))
e43=np.mean(-(bf*df['Feature 3']+bp*df['Price 3'])*(P3))
e14=e24=e34=np.mean(-(bf*df['Feature 4']+bp*df['Price 4'])*(P4))

e12=np.mean((bf*df['Feature 2']+bp*df['Price 2'])*((1/rho)+(ev1/denom)*(1-1/rho)-P1))
e13=np.mean((bf*df['Feature 3']+bp*df['Price 3'])*((1/rho)+(ev1/denom)*(1-1/rho)-P1))
e21=np.mean((bf*df['Feature 1']+bp*df['Price 1'])*((1/rho)+(ev2/denom)*(1-1/rho)-P2))
e23=np.mean((bf*df['Feature 3']+bp*df['Price 3'])*((1/rho)+(ev2/denom)*(1-1/rho)-P2))
e32=np.mean((bf*df['Feature 2']+bp*df['Price 2'])*((1/rho)+(ev3/denom)*(1-1/rho)-P3))
e31=np.mean((bf*df['Feature 1']+bp*df['Price 1'])*((1/rho)+(ev3/denom)*(1-1/rho)-P3))

mat=pd.DataFrame({"Brand 1":[e11,e21,e31,e41], "Brand 2":[e12,e22,e32,e42],
                  "Brand 3":[e13,e23,e33,e43],'Brand 4':[e14,e24,e34,e44]})
mat.index=["Brand 1","Brand 2", "Brand 3", "Brand 4"]
print("Elasticity Matrix with Model 1 (Nest 1: 1~3; Nest 2: 4)")
mat

Elasticity Matrix with Model 1 (Nest 1: 1~3; Nest 2: 4)

| | Brand 1 | Brand 2 | Brand 3 | Brand 4 |
|---|---|---|---|---|
| Brand 1 | -2.86233 | -2.060837 | -1.370846 | 0.470184 |
| Brand 2 | -2.305927 | -1.894288 | -1.208777 | 0.470184 |
| Brand 3 | -4.206838 | -3.232809 | -2.129334 | 0.470184 |
| Brand 4 | 0.867566 | 0.839326 | 0.037066 | -2.288338 |
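
One way to sanity-check analytic nested logit elasticities is to compare them with finite-difference elasticities computed directly from the choice probabilities. The sketch below reuses the probability formulas from `nestedlogit` on made-up prices, with the feature terms omitted for brevity. The names `nested_probs` and `numeric_elasticities` and all toy parameter values are assumptions for illustration.

```python
import numpy as np

def nested_probs(alpha, bp, rho, prices):
    """Choice probabilities with brands 1-3 in one nest and brand 4 on its own."""
    v = alpha + bp * prices                      # (N, 4) utilities
    ev_nest = np.exp(v[:, :3] / rho)             # scaled utilities inside the nest
    ev4 = np.exp(v[:, 3])
    denom = ev_nest.sum(axis=1)
    P4 = ev4 / (denom ** rho + ev4)
    P_nest = (1 - P4)[:, None] * ev_nest / denom[:, None]
    return np.column_stack([P_nest, P4])

def numeric_elasticities(alpha, bp, rho, prices, h=1e-6):
    """E[j, k] = average of d log P_j / d log p_k, by central differences."""
    J = prices.shape[1]
    E = np.empty((J, J))
    for k in range(J):
        up, down = prices.copy(), prices.copy()
        up[:, k] *= 1 + h                        # raise brand k's price by a factor 1 + h
        down[:, k] *= 1 - h                      # lower it by a factor 1 - h
        dlogP = (np.log(nested_probs(alpha, bp, rho, up))
                 - np.log(nested_probs(alpha, bp, rho, down)))
        E[:, k] = (dlogP / (2 * h)).mean(axis=0)
    return E

rng = np.random.default_rng(1)
prices = rng.uniform(0.05, 0.12, size=(300, 4))  # made-up prices
alpha = np.array([1.4, 0.8, -1.7, 0.0])          # toy intercepts
E = numeric_elasticities(alpha, -26.6, 0.64, prices)
print(np.round(E, 3))
```

Setting `rho` to 1 collapses the model to the simple logit, so the same function can also be checked against the closed-form logit elasticities.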


### IIA Test

MTT (McFadden, Train, Tye)

In [90]:
#Evaluate the log-likelihood at the full-model estimates with a1 dropped (fixed at 0)
a2,a3,bf,bp=0.6435046305879636,-3.0861119572117355, 0.4874149107851659,-37.057782766093105 
ev1=np.exp(bf*df['Feature 1']+bp*df['Price 1'])
ev2=np.exp(a2+bf*df['Feature 2']+bp*df['Price 2'])
ev3=np.exp(a3+bf*df['Feature 3']+bp*df['Price 3'])
ev4=np.exp(0+bf*df['Feature 4']+bp*df['Price 4'])
denom=ev1+ev2+ev3+ev4
pc=(ev1*df['Brand 1']+ev2*df['Brand  2']+ev3*df['Brand 3']+ev4*df['Brand 4'])/denom
Inpc=np.log(pc)
LLFR=np.sum(Inpc)
LLFR

-3046.9511846920705

In [101]:
def crit2(params):

    a2,a3,bf,bp=params
    ev1=np.exp(bf*df['Feature 1']+bp*df['Price 1'])
    ev2=np.exp(a2+bf*df['Feature 2']+bp*df['Price 2'])
    ev3=np.exp(a3+bf*df['Feature 3']+bp*df['Price 3'])
    ev4=np.exp(0+bf*df['Feature 4']+bp*df['Price 4'])
    denom=ev1+ev2+ev3+ev4
    pc=(ev1*df['Brand 1']+ev2*df['Brand  2']+ev3*df['Brand 3']+ev4*df['Brand 4'])/denom
    Inpc=np.log(pc)
    LL=np.sum(Inpc)
    return -LL
#Start the optimizer from the full-model estimates (a1 stays fixed at 0)
params_init = np.array([a2,a3,bf,bp])
results = opt.minimize(crit2, params_init)
a2r,a3r,bfr,bpr = results.x
sigmaFR=results.hess_inv
LLR=-results.fun
LLR

-2811.1557606390343

In [89]:
#MTT statistic
MTT=-2*(LLFR-LLR)
MTT

471.59084810507375

In [86]:
from scipy.stats import chi2
1 - chi2.cdf(MTT, 3)

0.0
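
The statistic and its p-value can be wrapped in a small helper. This sketch assumes the statistic is twice the gap between the re-estimated log-likelihood (`LLR`) and the log-likelihood evaluated at the full-model estimates (`LLFR`). It uses `chi2.sf`, which keeps precision for very small p-values where `1 - chi2.cdf(...)` rounds to 0. The name `lr_test` is hypothetical.

```python
from scipy.stats import chi2

def lr_test(ll_fixed, ll_refit, dof):
    """Likelihood-ratio style statistic and its chi-square upper-tail probability."""
    stat = 2.0 * (ll_refit - ll_fixed)
    return stat, chi2.sf(stat, dof)

# Log-likelihood values copied from the two cells above
stat, pval = lr_test(-3046.9511846920705, -2811.1557606390343, 3)
print(stat, pval)
```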

Hausman-McFadden

In [131]:
from numpy import matrix
A=matrix(sigmaR[1:,1:]-sigmaFR)
paraR=np.array([a2r,a3r,bfr,bpr])
paraFR=np.array([a2,a3,bf,bp])
paradiff=(paraR-paraFR)
paradiff*A.I*paradiff.reshape(4,1)

matrix([[215.08755288]])
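
The same kind of quadratic form can be written with plain arrays and `np.linalg.solve`, which avoids `numpy.matrix`. This is a sketch under the assumption that the statistic has the usual Hausman form $(b_r - b_f)^\top (V_r - V_f)^{-1} (b_r - b_f)$, with subscript $r$ for the restricted fit and $f$ for the full fit. The helper `hausman_stat` and the toy inputs are illustrative and are not the notebook's values.

```python
import numpy as np
from scipy.stats import chi2

def hausman_stat(b_r, b_f, V_r, V_f):
    """Quadratic form in the coefficient difference, with a chi-square p-value."""
    d = np.asarray(b_r, dtype=float) - np.asarray(b_f, dtype=float)
    V = np.asarray(V_r, dtype=float) - np.asarray(V_f, dtype=float)
    stat = float(d @ np.linalg.solve(V, d))   # d' V^{-1} d without forming the inverse
    return stat, chi2.sf(stat, d.size)

# Toy inputs: two coefficients with diagonal covariance matrices
stat, pval = hausman_stat([1.0, 2.0], [0.5, 1.0],
                          np.diag([0.5, 0.5]), np.diag([0.25, 0.25]))
print(stat, pval)
```

In finite samples the covariance difference need not be positive definite, so a negative statistic is possible and is worth checking for before reading the p-value.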

#### Observable heterogeneity model with a priori segments

Motivation: use demographic information to divide customers into segments.

For the yogurt data, we divide the customers into four segments based on income level and household size, splitting each variable at its median (income 9, household size 3).

In [138]:
df[['Income','HH Size']].describe()

| | Income | HH Size |
|---|---|---|
| count | 2430.0 | 2430.0 |
| mean | 8.720988 | 2.802058 |
| std | 3.800654 | 1.173291 |
| min | 1.0 | 1.0 |
| 25% | 6.0 | 2.0 |
| 50% | 9.0 | 3.0 |
| 75% | 12.0 | 4.0 |
| max | 14.0 | 6.0 |


In [211]:
#Low Income, Small HH Size
seg1=df[(df['Income']<9) & (df['HH Size']<3)]
n1=seg1.shape[0]

In [212]:
#High Income, Small HH Size
seg2=df[(df['Income']>=9) & (df['HH Size']<3)]
n2=seg2.shape[0]

In [213]:
#Low Income, Large HH Size
seg3=df[(df['Income']<9) & (df['HH Size']>=3)]
n3=seg3.shape[0]

In [214]:
#High Income, Large HH Size
seg4=df[(df['Income']>=9) & (df['HH Size']>=3)]
n4=seg4.shape[0]

In [237]:
def BIC(data):
    params_init = np.array([a1,a2,a3,bf,bp])
    results = opt.minimize(crit, params_init,data)
    n=data.shape[0]
    return np.log(n)*(5+2)-2*(-results.fun)

In [260]:
tab=pd.DataFrame({'BIC ':[BIC(seg1),BIC(seg2),BIC(seg3),BIC(seg4),
                          (BIC(seg1)+BIC(seg2)+BIC(seg3)+BIC(seg4))/4,BIC(df)/4]},
                 index=["Seg 1","Seg 2","Seg 3","Seg 4","Avg BIC","25% of Full Model"])
tab

| | BIC |
|---|---|
| Seg 1 | 1345.463511 |
| Seg 2 | 1263.959099 |
| Seg 3 | 1003.496163 |
| Seg 4 | 1656.280668 |
| Avg BIC | 1317.29986 |
| 25% of Full Model | 1342.92073 |


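The average BIC of the four segment models (1317.30) is slightly below 25% of the BIC of the pooled logit model (1342.92). Equivalently, the four segment BICs sum to about 5269.2 against about 5371.7 for the pooled model, so the segmented specification is favored by this criterion.

The BIC also varies considerably across the four groups (from about 1003.5 to 1656.3), which is consistent with heterogeneity across segments, although BIC also grows with the number of observations in each segment.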
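
The BIC in the table is $k \ln n - 2\,\mathrm{LL}$. As a quick check, the sketch below reproduces the pooled-model entry from the simple logit log-likelihood reported earlier, with $n = 2430$ observations and the penalty count $k = 5 + 2$ used in `BIC`. The helper name `bic_value` is hypothetical.

```python
import numpy as np

def bic_value(loglik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return k * np.log(n) - 2.0 * loglik

# Log-likelihood of the simple logit model and row count from df.describe()
pooled = bic_value(-2658.5566975071233, k=7, n=2430)
print(pooled / 4)   # one quarter of the pooled-model BIC
```

The printed value agrees with the "25% of Full Model" row (1342.92).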

#### Observable heterogeneity model with interactions 

In [275]:
def crit(params,df):

    a11,a1i,a1h,a22,a2i,a2h,a33,a3i,a3h,bf,bp,bfi,bfh,bpi,bph=params
    
    a1=a11+a1i*df["Income"]+a1h*df["HH Size"]
    a2=a22+a2i*df["Income"]+a2h*df["HH Size"]
    a3=a33+a3i*df["Income"]+a3h*df["HH Size"]
    
    # Observation-level response coefficients (separate names avoid shadowing the bfi/bpi parameters)
    bf_obs=bf+bfi*df["Income"]+bfh*df["HH Size"]
    bp_obs=bp+bpi*df["Income"]+bph*df["HH Size"]
    ev1=np.exp(a1+bf_obs*df['Feature 1']+bp_obs*df['Price 1'])
    ev2=np.exp(a2+bf_obs*df['Feature 2']+bp_obs*df['Price 2'])
    ev3=np.exp(a3+bf_obs*df['Feature 3']+bp_obs*df['Price 3'])
    ev4=np.exp(0+bf_obs*df['Feature 4']+bp_obs*df['Price 4'])
    denom=ev1+ev2+ev3+ev4
    pc=(ev1*df['Brand 1']+ev2*df['Brand  2']+ev3*df['Brand 3']+ev4*df['Brand 4'])/denom
    Inpc=np.log(pc)
    LL=np.sum(Inpc)
    return -LL
a11,a1i,a1h,a22,a2i,a2h,a33,a3i,a3h,bf,bp,bfi,bfh,bpi,bph=1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
params_init = np.array([a11,a1i,a1h,a22,a2i,a2h,a33,a3i,a3h,bf,bp,bfi,bfh,bpi,bph])
results = opt.minimize(crit, params_init,df)
a11,a1i,a1h,a22,a2i,a2h,a33,a3i,a3h,bf,bp,bfi,bfh,bpi,bph= results.x
print("Maximized Log Likelihood:",-results.fun)

Maximized Log Likelihood: -2537.4825261515534


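$\alpha_1 = 0.51 - 0.097 \cdot \mathrm{Inc} + 0.617 \cdot \mathrm{HH}$

$\alpha_2 = 1.19 - 0.119 \cdot \mathrm{Inc} + 0.196 \cdot \mathrm{HH}$

$\alpha_3 = -1.88 - 0.293 \cdot \mathrm{Inc} + 0.373 \cdot \mathrm{HH}$

$\beta_f = 0.542 - 0.089 \cdot \mathrm{Inc} + 0.219 \cdot \mathrm{HH}$

$\beta_p = -40 + 1.22 \cdot \mathrm{Inc} - 2.458 \cdot \mathrm{HH}$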
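
To read these equations, plug in a household profile. The sketch below evaluates the coefficients from the rounded estimates above. The example profile (income code 9, household size 2) and the helper name `household_coefs` are illustrative assumptions.

```python
def household_coefs(inc, hh):
    """Household-specific coefficients implied by the rounded interaction estimates."""
    return {
        "alpha_1": 0.51 - 0.097 * inc + 0.617 * hh,
        "alpha_2": 1.19 - 0.119 * inc + 0.196 * hh,
        "alpha_3": -1.88 - 0.293 * inc + 0.373 * hh,
        "beta_f": 0.542 - 0.089 * inc + 0.219 * hh,
        "beta_p": -40 + 1.22 * inc - 2.458 * hh,
    }

coefs = household_coefs(inc=9, hh=2)
print(coefs["beta_p"], coefs["beta_f"])
```

For this profile the price coefficient is about -33.9. Holding household size fixed, a higher income code moves the price coefficient toward zero, and holding income fixed, a larger household makes it more negative.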


In [287]:
para=[a11,a1i,a1h,a22,a2i,a2h,a33,a3i,a3h,bf,bp,bfi,bfh,bpi,bph]
paraname=["a11","a1i","a1h","a22","a2i","a2h","a33","a3i","a3h","bf","bp","bfi","bfh","bpi","bph"]

vcv_mle = results.hess_inv
CIlow,CIhigh=[],[]
for i in range(len(para)):
    std=np.sqrt(vcv_mle[i,i])
    CIlow.append(para[i]-1.96*std)
    CIhigh.append(para[i]+1.96*std)
paratable=pd.DataFrame({"Mean Value of Parameters":para,"CI Left":CIlow,"CI Right":CIhigh},
                       index=paraname)
paratable

| | Mean Value of Parameters | CI Left | CI Right |
|---|---|---|---|
| a11 | 0.51029 | 0.187368 | 0.833211 |
| a1i | -0.097419 | -0.129433 | -0.065405 |
| a1h | 0.616646 | 0.519699 | 0.713593 |
| a22 | 1.190373 | 0.987831 | 1.392914 |
| a2i | -0.118501 | -0.143341 | -0.093661 |
| a2h | 0.196365 | 0.099227 | 0.293502 |
| a33 | -1.884975 | -2.344343 | -1.425606 |
| a3i | -0.292851 | -0.368287 | -0.217416 |
| a3h | 0.372958 | 0.175329 | 0.570587 |
| bf | 0.54225 | 0.136067 | 0.948433 |


None of the 95% confidence intervals contains 0, so all of the estimates are statistically significant at the 5% level. Households with different income and household size therefore differ in how they respond to feature and price.