Picking up from the discussion of simultaneous equations above, let y be N x k and X be N x l, with y = Xβ + u. If cov(u|X) = Ω, then this generalizes the assumption of homoskedasticity to a multivariate setting; the resulting structure is called a system of Seemingly Unrelated Regressions (SUR).

(1) If Ω isn't diagonal then there's a sense in which the different equations in the system are dependent, since observing a realization of, say, y1 may change our prediction of y2. (This is why the system is called seemingly unrelated.) Describe this dependence formally.

(Answer) If the covariance matrix Ω is not diagonal, then the errors in different equations of the system are correlated: cov(u_i, u_j | X) = Ω_ij ≠ 0 for some i ≠ j. Given β, observing a realization of y1 reveals the error u1 = y1 - Xβ1 (with β1 the first column of β), which is informative about u2, so it may change our prediction of y2. If the errors are jointly normal, the joint density of u is multivariate normal with covariance matrix Ω; it no longer factors into a product of independent normal densities, and E(u2 | u1, X) = (Ω_21/Ω_11) u1.
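
The dependence described in this answer can be checked numerically. The following is only a minimal sketch, assuming two equations with jointly normal errors; the particular Ω, the seed, and the names `slope_theory` and `slope_sample` are illustrative choices, not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # Seed chosen arbitrarily for reproducibility

# Illustrative (assumed) error covariance with a non-zero off-diagonal element
Omega = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# Draw the errors of both equations jointly
u = rng.multivariate_normal(np.zeros(2), Omega, size=100_000)
u1, u2 = u[:, 0], u[:, 1]

# Under joint normality, E(u2 | u1) = (Omega_21 / Omega_11) * u1
slope_theory = Omega[1, 0] / Omega[0, 0]

# Sample analogue: slope from regressing u2 on u1 (no intercept; both have mean zero)
slope_sample = (u1 @ u2) / (u1 @ u1)

print(slope_theory, slope_sample)
```

If Ω were diagonal the theoretical slope would be zero, and learning u1 would not move the prediction of u2.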

(2) Adapt the code in weighted_regression.ipynb so that the data-generating process for u can accommodate a general covariance matrix such as Ω, and let X = T. Estimate β.

(Answer) To accommodate a general covariance matrix for u, we can modify the generate_data function to draw errors from a multivariate normal distribution with mean 0 and covariance matrix Ω instead of the identity matrix I. The code is as follows:


In [29]:
%matplotlib inline 
#to enable displaying plots in line
import numpy as np
from scipy.stats import multivariate_normal

k = 3 # Number of observables in T

mu = [0]*k  #mean of the multivariate normal distribution
Sigma=[[1,0.5,0],
       [0.5,2,0],
       [0,0,3]]

#Sigma is the covariance matrix

T = multivariate_normal(mu,Sigma) #defining a multivariate normal distribution T
u = multivariate_normal(cov=0.2) #defining a univariate normal distribution u


In [30]:
beta = [1/2,1] # vector of coefficients that will be used to predict the response variable y

D = np.random.random(size=(3,2)) # Generate random 3x2 matrix

N=1000 # Sample size

# Now: Transform rvs into a sample
T = T.rvs(N)

u = u.rvs(N) # Replace u with a sample

X = (T**3)@D  # Note use of ** operator for exponentiation

y = X@beta + u # Note use of @ operator for matrix multiplication

In [31]:
from scipy.linalg import inv, sqrtm

b = np.linalg.lstsq(T.T@X,T.T@y,rcond=None)[0] # lstsq returns several results; rcond=None avoids a FutureWarning

e = y - X@b

print(b)

TXplus = np.linalg.pinv(T.T@X) # Moore-Penrose pseudo-inverse

# Covariance matrix of b
vb = e.var()*TXplus@T.T@T@TXplus.T  # u is known to be homoskedastic

print(vb)

[0.50189069 0.9981324 ]
[[ 4.56001040e-06 -5.41981870e-06]
 [-5.41981870e-06  7.86523997e-06]]



In [32]:
# for SUR

import numpy as np
from scipy.stats import multivariate_normal
from scipy.stats import norm

# Define the parameters of the model
k1 = 3 # Number of observables in T1
k2 = 4 # Number of observables in T2
mu1 = [0]*k1 # Mean of T1
mu2 = [0]*k2 # Mean of T2
Sigma1 = [[1,0.5,0.2], [0.5,2,0], [0.2,0,3]] # Covariance of T1
Sigma2 = [[2,0,0,0.5], [0,1,0.2,0], [0,0.2,1,0], [0.5,0,0,3]] # Covariance of T2

# Generate the predictor variables
T1 = multivariate_normal(mu1, Sigma1)
T2 = multivariate_normal(mu2, Sigma2)

# Generate the error terms
u1 = multivariate_normal(cov=0.2)
u2 = multivariate_normal(cov=0.3)

In [33]:
# for SUR

beta1 = [1/2,1,-1] # Coefficients for the first equation
beta2 = [1,-1/2,0,1] # Coefficients for the second equation
rho = 0.5 # Weight on u1 in the second equation's error; makes the errors correlated

# One random mixing matrix per equation, sized so that X1 and X2 conform with beta1 and beta2
D1 = np.random.random(size=(k1,len(beta1))) # Random 3x3 matrix
D2 = np.random.random(size=(k2,len(beta2))) # Random 4x4 matrix

N=1000 # Sample size

# Now: Transform rvs into a sample
T1 = T1.rvs(N)
T2 = T2.rvs(N)

# Replace u with a sample
u1 = u1.rvs(N)
u2 = u2.rvs(N)

X1 = (T1**3)@D1
X2 = (T2**3)@D2

# Generate the response variables
y1 = X1 @ beta1 + u1
# Mixing u1 into the second equation's error makes Omega non-diagonal
y2 = X2 @ beta2 + rho * u1 + np.sqrt(1 - rho**2) * u2

In [28]:
# Estimate each equation separately, using T1 and T2 as instruments

def iv_estimate(T,X,y):
    b = np.linalg.lstsq(T.T@X,T.T@y,rcond=None)[0] # lstsq returns several results
    e = y - X@b
    TXplus = np.linalg.pinv(T.T@X) # Moore-Penrose pseudo-inverse
    vb = e.var()*TXplus@T.T@T@TXplus.T # Covariance matrix of b; u is known to be homoskedastic
    return b, vb, e

b1, vb1, e1 = iv_estimate(T1,X1,y1)
b2, vb2, e2 = iv_estimate(T2,X2,y2)

print(b1)
print(b2)

# Sample analogue of Omega; a non-zero off-diagonal element reflects the correlated errors
print(np.cov(e1,e2))

(3) How are the estimates obtained from this SUR system different from what one would obtain if one estimated equation by equation using OLS?

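(Answer) The SUR estimator takes into account the correlation between the errors in the different equations. Estimating each equation separately by OLS ignores that correlation; under the usual exogeneity assumptions the OLS estimates remain unbiased and consistent, but they are in general less efficient. The SUR estimator is a GLS estimator that weights the stacked system using Ω, so it uses information from all the equations to obtain more efficient estimates. There are two well-known cases in which the two sets of estimates coincide: when Ω is diagonal, and when every equation has the same regressors X, as in the system y = Xβ + u above.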
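
To make the comparison concrete, here is a minimal sketch of a two-step (feasible GLS) SUR estimator next to equation-by-equation OLS. Everything in it is an assumption made for illustration: the two-equation design with different regressors, the coefficient values, Ω, and the seed are not taken from weighted_regression.ipynb, and the regressors are treated as exogenous.

```python
import numpy as np

rng = np.random.default_rng(0)  # Arbitrary seed
N = 2000

# Hypothetical design: the two equations have different regressors
X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
X2 = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
beta1 = np.array([1.0, 0.5])
beta2 = np.array([-1.0, 2.0, 0.3])

# Errors correlated across equations (assumed Omega)
Omega = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
U = rng.multivariate_normal(np.zeros(2), Omega, size=N)
y1 = X1 @ beta1 + U[:, 0]
y2 = X2 @ beta2 + U[:, 1]

# Step 1: OLS equation by equation, then estimate Omega from the residuals
b1_ols = np.linalg.lstsq(X1, y1, rcond=None)[0]
b2_ols = np.linalg.lstsq(X2, y2, rcond=None)[0]
E = np.column_stack([y1 - X1 @ b1_ols, y2 - X2 @ b2_ols])
Omega_hat = E.T @ E / N

# Step 2: GLS on the stacked system, built block by block so that the
# (2N x 2N) weighting matrix inv(Omega_hat) kron I_N is never formed
W = np.linalg.inv(Omega_hat)
Xs, ys = [X1, X2], [y1, y2]
A = np.block([[W[i, j] * Xs[i].T @ Xs[j] for j in range(2)] for i in range(2)])
c = np.concatenate([sum(W[i, j] * Xs[i].T @ ys[j] for j in range(2))
                    for i in range(2)])
b_sur = np.linalg.solve(A, c)

print("OLS:", np.concatenate([b1_ols, b2_ols]))
print("SUR:", b_sur)
```

In a single sample both sets of estimates land near the true coefficients; the efficiency gain from SUR shows up as a smaller sampling variance across repeated draws, which this sketch does not simulate.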

