# Similarity elasticities and diversion ratios

This notebook discusses the computation of elasticities and diversion ratios in the Similarity Model. We also compare the Similarity elasticities and diversion ratios to those of a Multinomial Logit Model using publicly available data on the European car market from Frank Verboven's website at https://sites.google.com/site/frankverbo/data-and-software/data-set-on-the-european-car-market.

In [1]:
import numpy as np
import pandas as pd
#pd.options.mode.chained_assignment = None
pd.set_option('display.max_rows', 500)
import os
import sys
from numpy import linalg as la
from scipy import optimize
import scipy.stats as scstat
from matplotlib import pyplot as plt
import itertools as iter
%load_ext line_profiler

# Files
module_path = os.path.abspath(os.path.join('../..'))
if module_path not in sys.path:
    sys.path.append(module_path)

data_path = os.path.join(module_path, 'data')

from utilities.Logit_file import estimate_logit, logit_se, logit_t_p, q_logit, logit_score, logit_score_unweighted, logit_ccp, LogitBLP_estimator, LogitBLP_se
from data.Eurocarsdata_file import Eurocars_cleandata

In [2]:
# Load dataset and variable names
descr = (pd.read_stata(os.path.join(data_path,'eurocars.dta'), iterator = True)).variable_labels() # Obtain variable descriptions
dat_file = pd.read_csv(os.path.join(data_path, 'eurocars.csv')) # reads in the data set as a pandas dataframe.

In [3]:
# Outside option is included if OO == True, otherwise analysis is done on the inside options only.
OO = True

# Choose which variables to include in the analysis, and assign them either as discrete variables or continuous.

x_discretevars = [ 'brand', 'home', 'cla']
x_contvars = ['cy', 'hp', 'we', 'le', 'wi', 'he', 'li', 'sp', 'ac', 'pr']
z_IV_contvars = ['xexr']
z_IV_discretevars = []
x_allvars =  [*x_contvars, *x_discretevars]
z_allvars = [*z_IV_contvars, *z_IV_discretevars]

if OO:
    nest_contvars = [var for var in x_contvars if var != 'pr'] # We nest over all variables other than price, but an alternative list can be specified here if desired.
    nest_discvars = ['in_out', *x_discretevars]
    nest_vars = ['in_out', *nest_contvars, *x_discretevars]
else:
    nest_contvars = [var for var in x_contvars if (var != 'pr')]
    nest_discvars = x_discretevars # See above
    nest_vars = [*nest_contvars, *nest_discvars]

G = len(nest_vars)

# Print list of chosen variables as a dataframe
pd.DataFrame(descr, index=['description'])[x_allvars].transpose().reset_index().rename(columns={'index' : 'variable names'})

,variable names,description
0,cy,cylinder volume or displacement (in cc)
1,hp,horsepower (in kW)
2,we,weight (in kg)
3,le,length (in cm)
4,wi,width (in cm)
5,he,height (in cm)
6,li,"average of li1, li2, li3 (used in papers)"
7,sp,maximum speed (km/hour)
8,ac,time to acceleration (in seconds from 0 to 100...
9,pr,price (in destination currency including V.A.T.)


In [4]:
dat, dat_org, x_vars, z_vars, N, pop_share, T, J, K = Eurocars_cleandata(dat_file, x_contvars, x_discretevars, z_IV_contvars, z_IV_discretevars, outside_option=OO)

In [5]:
# Create dictionaries of numpy arrays for each market. This allows the size of the data set to vary over markets.

dat = dat.reset_index(drop = True).sort_values(by = ['market', 'co']) # Sort data so that the reshape below is successful

x = {t: dat[dat['market'] == t][x_vars].values.reshape((J[t],K)) for t in np.arange(T)} # Dict of explanatory variables
y = {t: dat[dat['market'] == t]['ms'].to_numpy().reshape((J[t])) for t in np.arange(T)} # Dict of market shares
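
The per-market dictionaries above can be illustrated on a toy data set. The following is a minimal sketch, assuming a small hypothetical DataFrame `toy` with the same `market`, `co` and `ms` columns; the values and the names `x_toy` and `y_toy` are made up for illustration and are not part of the Eurocars data. It uses `groupby` instead of boolean masks, which gives the same arrays when the data is sorted first.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: market 0 has three products, market 1 has two
toy = pd.DataFrame({
    'market': [0, 0, 0, 1, 1],
    'co':     [2, 1, 3, 1, 2],
    'hp':     [50., 60., 45., 55., 40.],
    'pr':     [10., 12., 9., 11., 8.],
    'ms':     [0.2, 0.5, 0.3, 0.6, 0.4],
})

# Sort so that products appear in the same order within every market
toy = toy.sort_values(by=['market', 'co']).reset_index(drop=True)

# One array per market; the number of rows may differ across markets
x_toy = {t: g[['hp', 'pr']].to_numpy() for t, g in toy.groupby('market')}
y_toy = {t: g['ms'].to_numpy() for t, g in toy.groupby('market')}

print({t: x_toy[t].shape for t in x_toy})
```

Storing one array per market avoids padding markets with fewer products, at the cost of looping over markets in later computations.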

### Demand derivatives and price Elasticity

While the demand derivatives in the Similarity Model are not quite as simple as in the Logit Model, they are still easy to compute.
Let $q=P(u|\theta)$ be a vector of Similarity choice probabilities, which may be computed by methods presented in ... , for some vector $\theta = (\beta', \lambda')'$ of characteristic and nesting parameters. The derivative of demand with respect to the utility indexes $u$ is then given by
$$
\nabla_u P(u|\theta)=\left(\nabla^2_{qq}\Omega(q|\lambda)\right)^{-1}-qq'
$$
where $(\cdot)^{-1}$ denotes the matrix inverse and $\Omega$ is the Similarity Perturbation Function described in .... The derivatives with respect to any characteristic $x_{tk\ell}$ can now easily be computed by the chain rule,
$$
    \frac{\partial P_j(u_t|\theta)}{\partial x_{tk\ell}}=\frac{\partial P_j(u_t|\theta)}{\partial u_{tk}}\frac{\partial u_{tk}}{\partial x_{tk\ell}}=\frac{\partial P_j(u_t|\theta)}{\partial u_{tk}}\beta_\ell.
$$

Finally, the price semi-elasticity follows exactly as in the Logit Model. If $x_{tk\ell}$ is the price of product $k$ in market $t$, then
$$
    \mathcal{E}_{jk}= \frac{\partial P_j(u_t|\theta)}{\partial x_{tk\ell}}\frac{1}{P_j(u_t|\theta)}=\frac{\partial P_j(u_t|\theta)}{\partial u_{tk}}\frac{1}{P_j(u_t|\theta)}\beta_\ell=\frac{\partial \ln P_j(u_t|\theta)}{\partial u_{tk}}\beta_\ell.
$$
Multiplying $\mathcal{E}_{jk}$ by the price $x_{tk\ell}$ gives the corresponding elasticity. The matrix of log-derivatives with respect to utilities can be written compactly as
$$
\nabla_u \ln P(u|\theta)=\mathrm{diag}(P(u|\theta))^{-1}\nabla_u P(u|\theta) = \mathrm{diag}(q)^{-1}\left[\left(\nabla^2_{qq}\Omega(q|\lambda)\right)^{-1}-qq'\right].
$$
Note that these elasticities may deviate significantly from the Logit elasticities. In particular, the IIA property does not generally hold in the Similarity Model. Additionally, the Similarity Model can capture both substitution and complementarity between products, in contrast to the well-known Nested Logit and Additive Random Utility Models, in which products can only be substitutes.
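
As a sanity check on the derivative formula, consider the Logit special case. The sketch below assumes that the perturbation function reduces to the negative entropy $\Omega(q)=\sum_j q_j \ln q_j$, so that its Hessian is $\mathrm{diag}(q)^{-1}$ and the formula gives $\mathrm{diag}(q)-qq'$. This is an illustration only: the general Similarity $\Omega$ is not reproduced here, and `softmax` and `logit_jacobian` are hypothetical helper names.

```python
import numpy as np

def softmax(u):
    """Logit choice probabilities for a vector of utility indexes u."""
    e = np.exp(u - u.max())  # subtract the max for numerical stability
    return e / e.sum()

def logit_jacobian(q):
    """Evaluate (Hessian of Omega)^{-1} - qq' assuming Omega(q) = sum_j q_j ln q_j."""
    omega_hess = np.diag(1.0 / q)  # Hessian of the negative entropy
    return np.linalg.inv(omega_hess) - np.outer(q, q)

u = np.array([0.5, -0.2, 1.0, 0.0])  # illustrative utility indexes
q = softmax(u)
analytic = logit_jacobian(q)

# Central finite differences: column k holds dP/du_k
h = 1e-6
numeric = np.column_stack([
    (softmax(u + h * e_k) - softmax(u - h * e_k)) / (2 * h)
    for e_k in np.eye(len(u))
])

# Log-derivatives diag(q)^{-1} [diag(q) - qq']: entry (j, k) is 1{j=k} - q_k
log_grad = analytic / q[:, None]

print(np.allclose(analytic, numeric, atol=1e-8))
```

In this special case every off-diagonal entry in column $k$ of `log_grad` equals $-q_k$ regardless of $j$, which is the IIA pattern that the Similarity Model relaxes.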

In [None]:
def compute_pertubation_hessian(q, x, Theta, model):
    r'''
    This function calculates the Hessian of the perturbation function \Omega

    Args.
        q: a dictionary of T numpy arrays (J[t],) of choice probabilities for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        Theta: a numpy array (K+G,) of parameters
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'
    
    Returns
        Hess: a dictionary of T numpy arrays (J[t],J[t]) of second partial derivatives of the perturbation function \Omega for each market t
    '''
    psi = model['psi']
    T = len(q.keys())
    K = x[0].shape[1]

    Gamma = Create_Gamma(Theta[K:], model) # Find the \Gamma matrices 
    
    Hess={}
    for t in np.arange(T):
        psi_q = np.einsum('cj,j->c', psi[t], q[t]) # Compute the matrix-vector product \psi q
        Hess[t] = np.einsum('cj,c,cl->jl', Gamma[t], 1/psi_q, psi[t], optimize=True) # Computes the product \Gamma' diag(\psi q)^{-1} \psi (but faster)
        
    return Hess
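
The three-operand `einsum` call in `compute_pertubation_hessian` can be checked against the explicit matrix product it replaces. The following is a minimal sketch with made-up inputs: the small `psi` (an identity block stacked on two nest rows) and the row-scaled `Gamma` are assumptions chosen for illustration, not the output of `Create_Gamma`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical psi: identity on top, then two nests {0,1} and {2,3}
psi = np.vstack([np.eye(4), [[1, 1, 0, 0], [0, 0, 1, 1]]])
C, J = psi.shape

# Hypothetical Gamma: psi with each row scaled by a positive weight (an assumption)
Gamma = psi * rng.uniform(0.1, 1.0, size=(C, 1))

q = rng.dirichlet(np.ones(J))  # strictly positive probabilities summing to one

psi_q = psi @ q  # same as np.einsum('cj,j->c', psi, q)

# einsum version used in the notebook
hess_einsum = np.einsum('cj,c,cl->jl', Gamma, 1 / psi_q, psi, optimize=True)

# Explicit version: Gamma' diag(psi q)^{-1} psi
hess_matmul = Gamma.T @ np.diag(1 / psi_q) @ psi

print(np.allclose(hess_einsum, hess_matmul))
```

The `einsum` form avoids building the dense `(C, C)` diagonal matrix, which matters when the number of nests is large.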

In [None]:
def ccp_gradient(q, x, Theta, model):
    
    '''
    This function calculates the gradient of the choice probabilities wrt. utilities

    Args.
        q: a dictionary of T numpy arrays (J[t],) of choice probabilities for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        Theta: a numpy array (K+G,) of parameters
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'
    
    Returns
        Grad: a dictionary of T numpy arrays (J[t],J[t]) of partial derivatives of the choice probabilities wrt. utilities for each market t
    '''

    T = len(q.keys())
    Grad = {}
    Hess = compute_pertubation_hessian(q, x, Theta, model) # Compute the hessian of the pertubation function

    for t in np.arange(T):
        inv_omega_hess = la.inv(Hess[t]) # (J,J) for each t=1,...,T , computes the inverse of the Hessian
        qqT = q[t][:,None]*q[t][None,:] # (J,J) outerproduct of ccp's for each market t
        Grad[t] = inv_omega_hess - qqT  # Compute Similarity gradient of ccp's wrt. utilities

    return Grad

In [None]:
def Similarity_u_grad_Log_ccp(q, x, Theta, model):
    '''
    This function calculates the gradient of the log choice probabilities wrt. utilities

    Args.
        q: a dictionary of T numpy arrays (J[t],) of choice probabilities for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        Theta: a numpy array (K+G,) of parameters
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'
    
    Returns
        Epsilon: a dictionary of T numpy arrays (J[t],J[t]) of partial derivatives of the log choice probabilities of products j wrt. utilities of products k for each market t
    '''

    T = len(q.keys())
    Epsilon = {}
    Grad = ccp_gradient(q, x, Theta, model) # Find the gradient of ccp's wrt. utilities
    
    for t in np.arange(T):
        Epsilon[t] = Grad[t]/q[t][:,None] # Computes diag(q)^{-1}Grad[t]

    return Epsilon

In [None]:
def Similarity_elasticity(q, x, Theta, model, char_number = K-1):
    ''' 
    This function calculates the elasticity of choice probabilities wrt. any characteristic or nest grouping of products

    Args.
        q: a dictionary of T numpy arrays (J[t],) of choice probabilities for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        Theta: a numpy array (K+G,) of parameters
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'
        char_number: an integer index into Theta of the parameter wrt. which we wish to calculate the semi-elasticity. Default is the index for the parameter of 'pr'.

    Returns
        a dictionary of T numpy arrays (J[t],J[t]) of choice probability semi-elasticities for each market t
    '''
    T = len(q.keys())
    Epsilon = {}
    Grad = Similarity_u_grad_Log_ccp(q, x, Theta, model) # Find the gradient of log ccp's wrt. utilities

    for t in np.arange(T):
        Epsilon[t] = Grad[t]*Theta[char_number] # Calculate semi-elasticities

    return Epsilon
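
`Similarity_elasticity` returns semi-elasticities, that is, derivatives of log choice probabilities with respect to the level of a characteristic. A minimal sketch of one way to obtain elasticities from them, assuming entry `(j, k)` refers to the share of product `j` and the characteristic of product `k`: multiply column `k` by product `k`'s value of that characteristic. The helper `semi_to_elasticity` and the toy numbers are hypothetical and not part of the notebook's utilities.

```python
import numpy as np

def semi_to_elasticity(semi, x_char):
    """Scale column k of each (J[t], J[t]) semi-elasticity matrix by x_char[t][k].

    semi:   dict of (J[t], J[t]) arrays, entry (j, k) = d ln P_j / d x_k
    x_char: dict of (J[t],) arrays with the characteristic (e.g. price) of each product
    """
    return {t: semi[t] * x_char[t][None, :] for t in semi}

# Toy example with one market and two products (made-up numbers)
semi_toy = {0: np.array([[-2.0, 0.5],
                         [0.4, -1.0]])}
price_toy = {0: np.array([10.0, 20.0])}

elast_toy = semi_to_elasticity(semi_toy, price_toy)
print(elast_toy[0])
```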

### Diversion ratios for the Similarity Model

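The diversion ratio to product $j$ from product $k$ is the share of consumers leaving product $k$ who switch to product $j$ when the price of product $k$ increases, expressed in percent. Hence we have

$$
\mathcal{D}_{tjk} = -100 \cdot \frac{\partial P_j(u_t|\theta) / \partial x_{tk\ell}}{\partial P_k(u_t|\theta) / \partial x_{tk\ell}} = -100 \cdot \frac{\partial P_j(u_t|\theta) / \partial u_{tk}}{\partial P_k(u_t|\theta) / \partial u_{tk}},
$$

where the second equality holds because the common factor $\beta_\ell$ cancels, and $\mathcal{D}_{t} = \left( \mathcal{D}_{tjk} \right)_{j,k \in \{0,1,\ldots,J_t\}}$ is the matrix of diversion ratios for market $t$. This can be written more compactly as

$$
\mathcal{D}_t = -100 \cdot  \nabla_u P(u_t|\theta)\,(\nabla_u P(u_t|\theta) \circ I_J)^{-1}.
$$

Since $\nabla_u P(u_t|\theta)$ is symmetric (it is the inverse of a Hessian minus the outer product $qq'$), scaling rows rather than columns, $-100 \cdot (\nabla_u P(u_t|\theta) \circ I_J)^{-1}\nabla_u P(u_t|\theta)$, gives the transpose $\mathcal{D}_t'$, whose $(j,k)$ entry is the diversion from product $j$ to product $k$. This transposed form is what `Similarity_diversion_ratio` below returns.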
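
The diversion ratio formula can be illustrated in the Logit special case, where the closed form $\mathcal{D}_{jk} = 100 \cdot q_j/(1-q_k)$ for $j \neq k$ is available. The sketch below uses the Logit gradient $\mathrm{diag}(q)-qq'$ as a stand-in for the Similarity gradient; the shares are made up for illustration.

```python
import numpy as np

q = np.array([0.5, 0.2, 0.2, 0.1])   # illustrative shares, index 0 as the outside option
G = np.diag(q) - np.outer(q, q)      # Logit dP/du, a stand-in for the Similarity gradient

own = np.diag(G)                     # own derivatives dP_k/du_k

# Elementwise definition: D[j, k] = -100 * G[j, k] / G[k, k] (scale columns)
D = -100 * G / own[None, :]

# Row-scaled variant: entry (j, k) = -100 * G[j, k] / G[j, j]
D_rows = -100 * G / own[:, None]

print(D[0, 1], 100 * q[0] / (1 - q[1]))
```

Because choice probabilities sum to one, the off-diagonal entries in each column of `D` add up to 100, and because `G` is symmetric the row-scaled variant is simply the transpose of `D`.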

In [None]:
def Similarity_diversion_ratio(q, x, Theta, model):
    '''
    This function calculates diversion ratios from the Similarity Model

    Args.
        q: a dictionary of T numpy arrays (J[t],) of choice probabilities for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        Theta: a numpy array (K+G,) of parameters
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'

    Returns
        DR: a dictionary of T numpy arrays (J[t],J[t]) of diversion ratios (in percent) from product j to product k for each market t
    '''

    T = len(q.keys())

    Grad = ccp_gradient(q, x, Theta, model) # Find the derivatives of ccp's wrt. utilities
    inv_diaggrad = {t: np.divide(1, np.diag(Grad[t]), out = np.zeros_like(np.diag(Grad[t])), where = (np.diag(Grad[t]) != 0)) for t in np.arange(T)}  # Compute the inverse of the 'own'-derivatives of ccp's (set to zero where the own-derivative is zero)
    DR = {t: np.multiply(-100, np.einsum('j,jk->jk', inv_diaggrad[t], Grad[t])) for t in np.arange(T)} # Scale row j of the gradient by -100 times the inverse own-derivative of product j.
    
    return DR 
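
The guarded reciprocal used for `inv_diaggrad` relies on the `out` and `where` arguments of `np.divide`. A minimal sketch with a made-up vector of own-derivatives shows the behaviour: positions where the condition is false keep the value already stored in `out` (here zero) instead of producing a division-by-zero warning.

```python
import numpy as np

own_deriv = np.array([-0.25, 0.0, -0.5])  # made-up own-derivatives, one of them zero

# Divide only where the denominator is nonzero; other entries keep the zeros from `out`
inv_own = np.divide(1.0, own_deriv,
                    out=np.zeros_like(own_deriv),
                    where=(own_deriv != 0))

print(inv_own)
```

Passing `out` explicitly matters here: without it, the entries skipped by `where` would be left uninitialised.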

Calculating the implied diversion ratios $\mathcal{D}_t$ from our estimates $\hat \theta^{\text{Similarity}}$, we find for the first market (index `0` in the code):

In [None]:
DR_hat = Similarity_diversion_ratio(qOpt, z_logit, ThetaOptBLP, Model)
pd.DataFrame(DR_hat[0]).rename_axis(index = 'Diversion ratio of product', columns = 'Diversion ratio wrt. product')

Diversion ratio of product \ Diversion ratio wrt. product,0,1,2,3,4,5,6,7,8,9,...,35,36,37,38,39,40,41,42,43,44
0,-100.0,1.38748,2.660693,4.575202,3.318107,4.1119,2.27298,8.275516,0.617707,2.461866,...,1.677369,1.352992,3.017614,1.038177,0.769841,0.062257,2.963479,1.502768,3.624784,3.173936
1,60.4695,-100.0,1.529167,3.866165,1.749631,-1.189705,2.128258,-12.153008,-0.913291,-2.391402,...,1.606052,1.603856,1.928039,0.887003,0.730656,-0.093614,1.928391,0.626792,0.347782,-4.175364
2,61.653282,0.81303,-100.0,-0.086513,1.266147,1.517779,-0.027602,5.370454,0.294729,1.492123,...,0.163766,0.185656,-1.613793,-0.480251,-0.278623,0.030838,-1.329252,0.854338,2.510353,2.458086
3,59.907182,1.161554,-0.048886,-100.0,5.88146,2.639372,-2.763588,9.979497,0.479174,2.201255,...,-0.255751,-1.340111,0.136624,-0.133162,-0.068884,0.039358,0.03448,0.927468,-0.789652,8.21918
4,60.680062,0.734163,0.99926,8.21433,-100.0,2.039063,0.543389,4.498427,0.327,1.057969,...,-1.14879,0.295994,0.654812,0.399442,0.263775,0.000592,0.991939,-0.926073,1.060851,7.577503
5,68.006506,-0.451479,1.083316,3.333804,1.844093,-100.0,1.729358,-4.069884,-0.107392,-0.550378,...,1.122771,1.057597,1.720739,0.61401,0.478332,0.051016,1.605609,0.984403,0.972293,-1.858056
6,59.978718,1.288594,-0.031433,-5.569371,0.784073,2.75917,-100.0,24.218127,1.578768,2.512724,...,2.472756,1.21134,-0.471091,0.002155,0.135048,0.033321,-0.302582,0.955377,-1.004276,4.259038
7,61.229248,-2.063185,1.7148,5.639023,1.81999,-1.820698,6.790518,-100.0,0.380763,-2.951378,...,4.475344,3.692894,2.694916,0.888912,0.404741,-0.064511,2.492233,-0.358617,0.174512,-5.332898
8,60.265815,-2.044505,1.240938,3.570366,1.744541,-0.633506,5.83721,5.02087,-100.0,-1.865979,...,4.078448,3.471582,1.805713,0.605888,0.415755,0.111108,1.642932,0.099422,0.717657,-3.651505
9,64.887494,-1.446238,1.697227,4.430966,1.524804,-0.877101,2.509804,-10.513748,-0.504098,-100.0,...,1.712118,1.523249,2.311494,0.761393,0.649434,-0.065878,2.026014,0.322942,-0.240353,-3.694033
