# Cross-sectional asset pricing
This notebook runs a Fama-MacBeth cross-sectional test on US temperature growth and controls, using 25 Fama-French portfolios sorted on Size and Book-to-Market.

**Source:** Essentials of Financial Economics  
**Authors:** Michael Donadelli, Michele Costola, Ivan Gufler  
**Date:** May 8, 2025


## 0. Preliminaries

This Python notebook estimates a Fama-MacBeth cross-sectional test on: 
- Market risk
- Consumption growth 
- Temperature growth

We use returns on the 25 portfolios sorted on Size and Book-to-Market, from January 1975 to December 2016. The risk-free rate is taken from the Fama-French data library and proxied by the one-month US Treasury bill rate.



## 1. Import libraries

In [5]:
import pandas as pd
import numpy as np
import statsmodels.api as sm

## 2. Load and Prepare Data

In [6]:
Ret = pd.read_excel('../Data/PortfoliosLong.xlsx')
FFactors = pd.read_excel('../Data/FF_FactorsLong.xlsx')
Temperature = pd.read_excel('../Data/Temperature.xlsx')
Consumption = pd.read_excel('../Data/ConsumptionLong.xlsx')

# Calculate temperature change as the difference in log temperature
T = np.diff(np.log(Temperature.iloc[:, 1].values))

# Calculate percentage change in Consumption
# Consumption change: (Current Consumption / Previous Consumption) - 1
C = Consumption.iloc[1:, 1].values / Consumption.iloc[:-1, 1].values - 1

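# Market factor and risk-free rate, converted from percent to decimals.
# The first observation is dropped to align with the differenced series T and C.
Mkt = FFactors.iloc[1:, 1].values / 100
Rf = FFactors.iloc[1:, 4].values / 100

# Portfolio excess returns: returns in decimals minus the risk-free rate
ExRet = Ret.iloc[1:, 1:].values / 100 - Rf.reshape(-1, 1)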

Factors = pd.DataFrame({'Mkt': Mkt, 'C': C, 'T': T})
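
The two growth definitions used in this cell can be checked on a toy series. The sketch below uses a made-up `levels` array (hypothetical, not the notebook's data) to show that both transformations drop the first observation and that log growth equals `log(1 + simple growth)`.

```python
import numpy as np

# Hypothetical toy level series (illustrative only, not the notebook's data)
levels = np.array([100.0, 102.0, 101.0, 105.0])

# Log growth: difference of log levels, as used for temperature
log_growth = np.diff(np.log(levels))

# Simple growth: ratio of consecutive levels minus one, as used for consumption
simple_growth = levels[1:] / levels[:-1] - 1

# Both transformations lose the first observation
print(log_growth.shape, simple_growth.shape)

# The two measures are linked by log_growth = log(1 + simple_growth)
print(np.allclose(log_growth, np.log1p(simple_growth)))
```

This is why every other series is sliced with `iloc[1:]`: the factor and return arrays must have the same length as the differenced series.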


## 3. First-stage regression

For each portfolio we estimate a CAPM augmented with consumption growth and temperature growth. Specifically, for each asset $n$, we run the following time-series regression:

$$
(R_{n,t} - r_{f,t}) = \alpha_n
+ \beta_{n,M}(R_{M,t} - r_{f,t})
+ \beta_{n,\Delta C}\,\Delta C_t
+ \beta_{n,\Delta T}\,\Delta T_t
+ \varepsilon_{n,t} \quad \forall n
$$

We store the slope coefficients $\beta_{n,i}$ with $i \in \{M, \Delta C, \Delta T\}$.

With 25 test assets, this gives a matrix of first-stage coefficients $B$ of size $25 \times 3$. The notebook's `CoefAll` array stores its transpose ($3 \times 25$), which is why the printed shape is `(3, 25)`.
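
As a rough illustration of the first stage, the sketch below runs all 25 time-series regressions in one least-squares call on synthetic data. The arrays `F`, `B_true`, and `R` are simulated placeholders (assumptions for this example, not the notebook's data), and `np.linalg.lstsq` stands in for the per-portfolio `sm.OLS` loop.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_assets, n_factors = 200, 25, 3

# Synthetic factors and loadings (illustrative assumptions only)
F = rng.normal(size=(n_obs, n_factors))
B_true = rng.normal(size=(n_assets, n_factors))
alpha_true = rng.normal(scale=0.01, size=n_assets)
noise = rng.normal(scale=0.1, size=(n_obs, n_assets))

# Simulated excess returns: one column per portfolio
R = alpha_true + F @ B_true.T + noise

# Prepend a column of ones for the intercept and solve every regression at once
X = np.column_stack([np.ones(n_obs), F])
coef, *_ = np.linalg.lstsq(X, R, rcond=None)  # shape (1 + n_factors, n_assets)

# Drop the intercept row and transpose to get one row per portfolio
B_hat = coef[1:].T
resid = R - X @ coef

print(B_hat.shape)
```

Solving against the full return matrix gives the same coefficients as looping over columns, because OLS with a common regressor matrix treats each column of `R` independently.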

In [12]:
# Get the size of the excess returns and factors data
n1, n2 = ExRet.shape # n1 = number of observations, n2 = number of portfolios
nF = Factors.shape[1] # nF = number of factors

# Initialize arrays to store results from the first stage
CoefAll = np.empty((nF, n2))
# Res stores the residuals from each regression
Res = np.empty((n1, n2))

# Loop through each portfolio to perform time-series regression
for i in range(n2):
    # Regress portfolio excess returns on the factors
    # Model: ExRet_i = alpha_i + beta_Mkt*Mkt + beta_dC*dC + beta_T*T + epsilon_i
    # Add a constant to the factors for the intercept (alpha) in OLS
    model = sm.OLS(ExRet[:, i], sm.add_constant(Factors)).fit()

    # Store the factor betas (excluding the intercept, which is at index 0), rounded to 5 decimals
    CoefAll[:, i] = np.round(np.asarray(model.params)[1:], 5)
    # Store the residuals, rounded to 5 decimals
    Res[:, i] = np.round(np.asarray(model.resid), 5)


# Calculate the variance-covariance matrix of the residuals
VarCovErr = np.cov(Res.T)

print("Size of B: ", CoefAll.shape)

Size of B:  (3, 25)


## 4. Second-stage regression

In the second stage we use the coefficients estimated in step 1 as regressors. The price of risk, $\lambda$, can be estimated using two equivalent methods:
- cross-sectional regression on average returns:

$$
\mathbb{E}[R_n - r_f]
= \lambda_M \hat{\beta}_{n,M}
+ \lambda_{\Delta C}\,\hat{\beta}_{n,\Delta C}
+ \lambda_{\Delta T}\,\hat{\beta}_{n,\Delta T}
+ \nu_n
$$

- $T$ cross-sectional regressions, one for each time observation:

$$
(R_{n,t} - r_{f,t})
= \lambda_{t,M}\,\hat{\beta}_{n,M}
+ \lambda_{t,\Delta C}\,\hat{\beta}_{n,\Delta C}
+ \lambda_{t,\Delta T}\,\hat{\beta}_{n,\Delta T} + \nu_{n,t}
$$

Using the second approach, the price of risk is the time-series average of the period-by-period estimates: $\lambda_i = \frac{1}{T} \sum_{t=1}^T \lambda_{t,i} \quad i=\{M, \Delta C, \Delta T\}$, where $T$ here denotes the number of time periods.
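
The equivalence of the two methods can be checked numerically. The sketch below uses simulated betas `B` and excess returns `R` (both hypothetical placeholders) and compares the single regression on average returns with the average of the period-by-period estimates. The last line also computes the usual Fama-MacBeth standard error from the time series of estimates, which is an addition for illustration and is not computed in this notebook.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_assets, n_factors = 120, 25, 3

# Hypothetical first-stage betas and excess returns (illustrative only)
B = rng.normal(size=(n_assets, n_factors))
R = rng.normal(scale=0.05, size=(n_obs, n_assets))

# Method 1: a single regression of average returns on betas (no intercept)
lam_avg, *_ = np.linalg.lstsq(B, R.mean(axis=0), rcond=None)

# Method 2: one cross-sectional regression per period, then average over time
lam_t, *_ = np.linalg.lstsq(B, R.T, rcond=None)  # shape (n_factors, n_obs)
lam_fm = lam_t.mean(axis=1)

# With constant betas the two estimates coincide, since OLS is linear in the returns
print(np.allclose(lam_avg, lam_fm))

# Fama-MacBeth standard errors from the time-series variation of the estimates
se_fm = lam_t.std(axis=1, ddof=1) / np.sqrt(n_obs)
```

The two estimates agree only because the same full-sample betas are used in every period. With rolling or time-varying betas the two methods would generally differ.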


Additionally, we correct for errors-in-variables, i.e. the fact that the second-stage regressors, the $\hat{\beta}$s, are themselves estimates. The Shanken correction applied to the variance of the price of risk reads:

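$$
\sigma^2_{\mathrm{OLS}}(\hat{\lambda})
= \frac{1}{T}
\left[
(\beta' \beta)^{-1}
\beta' \, \mathbb{E}(\varepsilon \varepsilon') \, \beta
(\beta' \beta)^{-1}
\left( 1 + \lambda' \Sigma_f^{-1} \lambda \right)
+ \Sigma_f
\right]
$$

where $\mathbb{E}(\varepsilon \varepsilon')$ is the covariance matrix of the first-stage residuals and $\Sigma_f$ is the covariance matrix of the factors.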
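
A minimal sketch of this correction as a reusable function, taking the final additive term to be the factor covariance matrix, as in the notebook's code. The function name `shanken_cov` and all input values are hypothetical and chosen only to make the example run.

```python
import numpy as np

def shanken_cov(B, resid_cov, lam, factor_cov, n_obs):
    """Shanken-corrected covariance of OLS risk-premium estimates (sketch)."""
    BtB_inv = np.linalg.inv(B.T @ B)
    # Sandwich term: (B'B)^-1 B' Sigma B (B'B)^-1
    sandwich = BtB_inv @ B.T @ resid_cov @ B @ BtB_inv
    # Scalar errors-in-variables adjustment: lambda' Sigma_f^-1 lambda
    c = lam @ np.linalg.solve(factor_cov, lam)
    return (sandwich * (1.0 + c) + factor_cov) / n_obs

# Hypothetical inputs (illustrative only)
rng = np.random.default_rng(2)
n_assets, n_factors, n_obs = 25, 3, 500
B = rng.normal(size=(n_assets, n_factors))
resid_cov = np.diag(rng.uniform(0.5, 1.5, size=n_assets)) * 1e-3
factor_cov = np.diag([2e-3, 1e-4, 5e-4])
lam = np.array([0.005, 0.001, -0.002])

V = shanken_cov(B, resid_cov, lam, factor_cov, n_obs)
se = np.sqrt(np.diag(V))
print(se)
```

Using `np.linalg.solve` instead of an explicit inverse for the quadratic form is a numerical-stability choice and does not change the formula.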

In [19]:
# Calculate the average excess return for each portfolio
MeanRet = np.mean(ExRet, axis=0)

# Transpose the factor betas matrix for the cross-sectional regression
Betas = CoefAll.T

# Perform cross-sectional regression of average returns on factor betas
model = sm.OLS(MeanRet, Betas).fit()

# Extract and store the factor risk premia (Lambdas) and their standard errors
SE = model.bse # Standard Errors of the Lambdas
Lambda = model.params # Factor Risk Premia (Lambdas)
Tstat = Lambda / SE # T-statistics (standard)

# Calculate Shanken-corrected standard errors
Sigma_f = np.cov(Factors, rowvar=False) # Covariance matrix of factors

B = Betas # Rename Betas for clarity in formula
BtB_inv = np.linalg.inv(B.T @ B) # Inverse of (Betas' * Betas)
# Calculate the correction term for Shanken standard errors
correction = 1 + Lambda.T @ np.linalg.inv(Sigma_f) @ Lambda
# Calculate the Shanken-corrected covariance matrix of Lambdas
VarLam = BtB_inv @ B.T @ VarCovErr @ B @ BtB_inv * correction + Sigma_f
# Divide by the number of observations (n1) as per the Fama-MacBeth approach
VarLam = VarLam / n1

SE_Shanken = np.sqrt(np.diag(VarLam)) # Shanken-corrected Standard Errors (diagonal of covariance matrix)
Tstat_Shanken = Lambda / SE_Shanken # Shanken-corrected T-statistics

# --- Time-series of cross-sectional regressions (for Newey-West standard errors) ---
nF = Betas.shape[1] # Number of factors (same as number of betas)
LambdaFull = np.full((n1, nF), np.nan) # Matrix to store Lambdas from each cross-sectional regression

# Loop through each time period (observation)
for j in range(n1):
    # Use returns from a single time period for the cross-sectional regression
    MeanRet_j = ExRet[j, :]
    # Fit cross-sectional model for the current time period (using betas from the first stage)
    model_j = sm.OLS(MeanRet_j, Betas).fit()
    # Store the estimated Lambdas for this time period
    LambdaFull[j, :] = model_j.params

# Calculate the mean of the estimated Lambdas across all time periods
LambdaMean = np.mean(LambdaFull, axis=0)


## 5. Display results

In [20]:
# Get the names of the portfolios from the original data
NamePort = Ret.iloc[:, 1:].columns

# Create a pandas DataFrame for the First Stage results (Factor Betas)
FirstStageReg = pd.DataFrame(
    Betas,
    columns=['Mkt', 'dC', 'T'], # Column names for factors
    index=NamePort # Row names for portfolios
)
print("First Stage Regression (Factor Betas):")
print(FirstStageReg)

# Optional: save first-stage estimates
#FirstStageReg.to_csv('FirstStage_T.csv', index_label='Portfolio') # Write to CSV

# Create a pandas DataFrame for the Second Stage results (Factor Risk Premia and T-stats)
SecondStage = pd.DataFrame(
    np.vstack([Lambda, Tstat, Tstat_Shanken]),
    columns=['Mkt', 'dC', 'T'], # Column names for factors
    index=['Lambda', 'tstat', 't-stat Shanken'] # Row names for statistics
)
print("\nSecond Stage Regression (Factor Risk Premia and T-statistics):")
print(SecondStage)

# Optional: save second-stage estimates
#SecondStage.to_csv('SecondStage_T.csv', index_label='Statistic') # Write to CSV


First Stage Regression (Factor Betas):
                Mkt       dC        T
SMALL LoBM  1.36650  1.36065  0.01576
ME1 BM2     1.18326  1.36758  0.01516
ME1 BM3     1.05255  0.98371  0.01875
ME1 BM4     0.96856  0.97596  0.01289
SMALL HiBM  1.00317  1.12686  0.02201
ME2 BM1     1.35882  0.78949  0.00634
ME2 BM2     1.13810  0.76580  0.01216
ME2 BM3     1.01990  0.59346  0.01099
ME2 BM4     0.97170  0.64785  0.00660
ME2 BM5     1.10008  0.93404  0.01578
ME3 BM1     1.28180  0.67498  0.00334
ME3 BM2     1.09406  0.62481  0.00632
ME3 BM3     0.98577  0.51610  0.00982
ME3 BM4     0.95959  0.55827  0.00358
ME3 BM5     1.03026  0.56430  0.00475
ME4 BM1     1.20770  0.36820  0.00240
ME4 BM2     1.06962  0.33236  0.00071
ME4 BM3     0.99785  0.26022  0.00604
ME4 BM4     0.94700  0.12704  0.00401
ME4 BM5     1.03653  0.31221  0.00820
BIG LoBM    0.98531 -0.29247 -0.00525
ME5 BM2     0.94152 -0.26532 -0.00100
ME5 BM3     0.87812 -0.23282  0.00002
ME5 BM4     0.89075  0.01969 -0.00225
BIG HiBM   