US Macroeconomic Interdependencies: A VAR and Granger Causality Analysis
--

This project aims to explore the long-term dynamic interactions among key macroeconomic indicators of the U.S. economy from January 1978 to May 2025 using multivariate time series techniques. The selected variables span critical dimensions of the economy:

Personal Consumption Expenditure (PCE): Represents household spending and reflects aggregate demand from the consumer side.

Money Supply (M2): A broad monetary aggregate that reflects liquidity conditions and is monitored as an indicator of the monetary environment shaped by Federal Reserve policy.

Industrial Production Index (INDPRO): Measures the output of the manufacturing, mining, and utilities sectors, capturing supply and real economic activity.

University of Michigan Consumer Sentiment Index (UMCSENT): Captures consumer confidence and expectations, which influence future spending behavior and overall economic outlook.

Together, these variables provide a comprehensive view of the U.S. macroeconomy, covering demand, supply, liquidity, and psychological factors. The goal is to understand how these variables influence one another over time using a Vector Autoregression (VAR) framework and Granger causality tests.

In [2]:
import warnings 
warnings.filterwarnings('ignore')

In [3]:
pip install openpyxl


Note: you may need to restart the kernel to use updated packages.


Step 1: Importing and Inspecting the Data
-

We begin by loading the dataset TS.xlsx, which contains monthly observations of four key U.S. macroeconomic variables (UMCSENT, M2, PCE, and INDPRO) from January 1978 to May 2025.

The Month column represents the timestamp for each observation. We use pandas to read the Excel file and display the first five rows using df.head() to confirm the dataset has been loaded correctly and is in the expected format.

From the output, we can see that the dataset appears well-structured with no obvious missing values in the initial rows.

In [4]:
import pandas as pd

# Read an Excel file (.xlsx)
df = pd.read_excel("TS.xlsx")

# Show the first few rows
print(df.head())


       Month  UMCSENT      M2     PCE   INDPRO
0 1978-01-01     83.7  1279.7  1329.5  47.7512
1 1978-02-01     84.3  1285.5  1355.1  48.0098
2 1978-03-01     78.8  1292.2  1377.5  48.9358
3 1978-04-01     81.6  1300.4  1396.4  49.9074
4 1978-05-01     82.9  1310.5  1412.0  50.1513
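Looking at the first five rows cannot rule out gaps further down the file. A minimal sketch of a fuller check follows, assuming the `Month` column layout shown above. The helper `check_monthly_panel` and the `sample` frame are hypothetical illustrations (the sample values are copied from the printed rows), not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of TS.xlsx (values copied from the printed output)
sample = pd.DataFrame({
    "Month": pd.to_datetime(["1978-01-01", "1978-02-01", "1978-03-01",
                             "1978-04-01", "1978-05-01"]),
    "UMCSENT": [83.7, 84.3, 78.8, 81.6, 82.9],
    "M2": [1279.7, 1285.5, 1292.2, 1300.4, 1310.5],
    "PCE": [1329.5, 1355.1, 1377.5, 1396.4, 1412.0],
    "INDPRO": [47.7512, 48.0098, 48.9358, 49.9074, 50.1513],
})

def check_monthly_panel(frame, date_col="Month"):
    """Return per-column missing-value counts and any calendar months absent from date_col."""
    missing_counts = frame.isna().sum()
    # Every month-start date between the first and last observation
    expected = pd.date_range(frame[date_col].min(), frame[date_col].max(), freq="MS")
    missing_months = expected.difference(pd.DatetimeIndex(frame[date_col]))
    return missing_counts, missing_months

missing_counts, missing_months = check_monthly_panel(sample)
print(missing_counts)
print("Months missing from the calendar:", list(missing_months))
```

Running the same helper on the full `df` would confirm whether the whole sample, not only its head, is complete.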


In [5]:
df

         Month  UMCSENT       M2      PCE    INDPRO
0   1978-01-01     83.7   1279.7   1329.5   47.7512
1   1978-02-01     84.3   1285.5   1355.1   48.0098
2   1978-03-01     78.8   1292.2   1377.5   48.9358
3   1978-04-01     81.6   1300.4   1396.4   49.9074
4   1978-05-01     82.9   1310.5   1412.0   50.1513
..         ...      ...      ...      ...       ...
564 2025-01-01     71.7  21519.8  20370.0  102.9343
565 2025-02-01     64.7  21613.5  20436.3  104.0003
566 2025-03-01     57.0  21706.4  20578.5  103.7483
567 2025-04-01     52.2  21862.4  20622.0  103.8216


Step 2: Log Transformation of Variables
-

In [6]:
import numpy as np

To stabilize the variance and convert exponential growth trends into linear trends, we apply a natural logarithmic transformation to three of the four macroeconomic variables:

M2 → log_M2

PCE → log_PCE

INDPRO → log_INDPRO

We do not apply a log transformation to UMCSENT (Consumer Sentiment), because it is a survey-based index that fluctuates within a limited range and does not exhibit the exponential growth seen in the monetary and real activity variables.

Log-transforming these variables also makes the interpretation of percentage changes easier in later analysis (e.g., in differenced series, changes represent approximate growth rates).

In [7]:
#Convert variables to logarithmic scale
df['log_M2'] = np.log(df['M2'])
df['log_PCE'] = np.log(df['PCE'])
df['log_INDPRO'] = np.log(df['INDPRO'])
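To illustrate why differences of logged series approximate growth rates, the sketch below compares the exact month-over-month percentage change with the log difference for the first two M2 observations in the dataset. The variable names are chosen for this example only.

```python
import numpy as np

# First two M2 observations from the dataset (January and February 1978)
m2_jan, m2_feb = 1279.7, 1285.5

pct_change = (m2_feb - m2_jan) / m2_jan      # exact month-over-month growth rate
log_diff = np.log(m2_feb) - np.log(m2_jan)   # first difference of the logged series

print(f"percentage change: {pct_change:.6f}")
print(f"log difference:    {log_diff:.6f}")
```

For small monthly changes like these the two measures nearly coincide; the gap widens as the change gets larger.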


Step 3: DateTime Conversion and Indexing
-

The Month column is converted to DateTime format to ensure proper handling of time series operations (such as resampling, lagging, or plotting trends over time).

We then set the Month column as the index of the DataFrame, making it a time series index.
This allows libraries like statsmodels and pandas to recognize the temporal ordering of observations, which is crucial for forecasting and causality tests.

In [8]:
df['Month'] = pd.to_datetime(df['Month'])
df.set_index('Month', inplace=True)
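A `DatetimeIndex` built this way carries no explicit frequency, and statsmodels may warn that it has to infer one. One possible way to attach it is `asfreq("MS")` (month-start). This is a sketch on a hypothetical miniature frame named `demo`, and it assumes every observation is dated on the first of the month, as in the rows shown earlier.

```python
import pandas as pd

# Hypothetical miniature frame with the same 'Month' layout as the dataset
demo = pd.DataFrame({
    "Month": ["1978-01-01", "1978-02-01", "1978-03-01", "1978-04-01"],
    "M2": [1279.7, 1285.5, 1292.2, 1300.4],
})
demo["Month"] = pd.to_datetime(demo["Month"])
demo = demo.set_index("Month")

# asfreq("MS") attaches a month-start frequency; a missing month would show up as a NaN row
demo = demo.asfreq("MS")
print(demo.index.freq)
```

Because `asfreq` inserts NaN rows for absent months, it doubles as a check that the monthly calendar has no gaps.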


Step 4: Confirming the Transformed DataFrame
-


After applying log transformations and setting the time-based index, we inspect the first few rows of the updated DataFrame using df.head().

This confirms that:

The Month column has successfully become the index.

The new log-transformed variables (log_M2, log_PCE, and log_INDPRO) are now present in the dataset alongside the original columns.

This structure sets the foundation for conducting time series diagnostics and building a multivariate model.

In [9]:
df.head()

            UMCSENT      M2     PCE   INDPRO    log_M2   log_PCE  log_INDPRO
Month
1978-01-01     83.7  1279.7  1329.5  47.7512  7.154381  7.192558    3.866004
1978-02-01     84.3  1285.5  1355.1  48.0098  7.158903  7.211631    3.871405
1978-03-01     78.8  1292.2  1377.5  48.9358  7.164101  7.228026    3.890509
1978-04-01     81.6  1300.4  1396.4  49.9074  7.170427  7.241653    3.910169
1978-05-01     82.9  1310.5  1412.0  50.1513  7.178164  7.252762    3.915044


Step 5: Testing for Stationarity – ADF Test
-

We now assess the stationarity of each variable using the Augmented Dickey-Fuller (ADF) test, which checks for the presence of a unit root (a sign of non-stationarity).

Null Hypothesis (H₀): The series has a unit root (non-stationary).

Alternative Hypothesis (H₁): The series is stationary.

We apply the test to the following variables: log_M2 (Money Supply), log_PCE (Consumption Expenditure), log_INDPRO (Industrial Production), and UMCSENT (Consumer Sentiment).

For each variable, we inspect:

The ADF test statistic (more negative → stronger evidence against H₀)

The p-value (if < 0.05, we reject H₀ → series is stationary)

The critical values at the 1%, 5%, and 10% significance levels

🔍 Interpretation:
If a variable is found to be non-stationary, we will take its first difference in the next step to induce stationarity, which is essential for reliable VAR modeling and Granger causality analysis.

In [10]:
from statsmodels.tsa.stattools import adfuller

In [11]:
for col in ['log_M2', 'log_PCE', 'log_INDPRO', 'UMCSENT']:
    result = adfuller(df[col].dropna())
    print(f'ADF Test for {col}:')
    print(f'  Test Statistic: {result[0]:.4f}')
    print(f'  p-value: {result[1]:.4f}')
    print(f'  Critical Values: {result[4]}')
    print('-' * 40)


ADF Test for log_M2:
  Test Statistic: -0.6181
  p-value: 0.8668
  Critical Values: {'1%': -3.442102384299813, '5%': -2.8667242618524233, '10%': -2.569531046591633}
----------------------------------------
ADF Test for log_PCE:
  Test Statistic: -2.9787
  p-value: 0.0369
  Critical Values: {'1%': -3.4421447800270673, '5%': -2.8667429272780858, '10%': -2.5695409929766093}
----------------------------------------
ADF Test for log_INDPRO:
  Test Statistic: -1.2145
  p-value: 0.6673
  Critical Values: {'1%': -3.4421447800270673, '5%': -2.8667429272780858, '10%': -2.5695409929766093}
----------------------------------------
ADF Test for UMCSENT:
  Test Statistic: -2.0859
  p-value: 0.2502
  Critical Values: {'1%': -3.4420185006698127, '5%': -2.8666873299250253, '10%': -2.5695113665058726}
----------------------------------------


log_M2, log_INDPRO, and UMCSENT have p-values > 0.05, so we fail to reject the null hypothesis. These series are non-stationary.

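log_PCE has a p-value of 0.0369, so the unit-root null is rejected at the 5% level but not at the 1% level (the test statistic of -2.9787 lies above the 1% critical value of about -3.4421). Because this evidence is borderline and the other three series are non-stationary, we difference all four variables so that every series enters the multivariate analysis in the same form.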

📌 Conclusion:
Most variables are non-stationary in levels. Therefore, we will transform all four variables into first differences to achieve stationarity across the dataset before moving on to VAR modeling and Granger causality testing.

In [12]:
df['d_log_M2'] = df['log_M2'].diff()
df['d_log_INDPRO'] = df['log_INDPRO'].diff()
df['d_UMCSENT'] = df['UMCSENT'].diff()
df['d_log_PCE'] = df['log_PCE'].diff()
df.dropna(inplace=True)

Step 6: Making the Variables Stationary – First Differencing
-

Since most of our variables were non-stationary at level, we applied first differencing to each of them:

d_log_M2: First difference of log-transformed Money Supply

d_log_PCE: First difference of log-transformed Personal Consumption Expenditure

d_log_INDPRO: First difference of log-transformed Industrial Production

d_UMCSENT: First difference of Consumer Sentiment

We now re-apply the ADF test to these differenced series.

In [13]:
for col in ['d_log_M2','d_log_PCE', 'd_log_INDPRO', 'd_UMCSENT']:
    result = adfuller(df[col].dropna())
    print(f'ADF Test for {col}:')
    print(f'  Test Statistic: {result[0]:.4f}')
    print(f'  p-value: {result[1]:.4f}')
    print(f'  Critical Values: {result[4]}')
    print('-' * 40)


ADF Test for d_log_M2:
  Test Statistic: -4.7374
  p-value: 0.0001
  Critical Values: {'1%': -3.442102384299813, '5%': -2.8667242618524233, '10%': -2.569531046591633}
----------------------------------------
ADF Test for d_log_PCE:
  Test Statistic: -5.9894
  p-value: 0.0000
  Critical Values: {'1%': -3.4421447800270673, '5%': -2.8667429272780858, '10%': -2.5695409929766093}
----------------------------------------
ADF Test for d_log_INDPRO:
  Test Statistic: -6.2304
  p-value: 0.0000
  Critical Values: {'1%': -3.4421447800270673, '5%': -2.8667429272780858, '10%': -2.5695409929766093}
----------------------------------------
ADF Test for d_UMCSENT:
  Test Statistic: -13.6122
  p-value: 0.0000
  Critical Values: {'1%': -3.4420185006698127, '5%': -2.8666873299250253, '10%': -2.5695113665058726}
----------------------------------------


🟩 All differenced variables now have p-values < 0.05 and test statistics lower than the 1% critical value, indicating strong evidence against the null hypothesis of a unit root.

📌 Conclusion:
All series are now stationary after first differencing and ready to be included in a VAR model. This satisfies a key precondition for performing multivariate time series modeling and Granger causality testing reliably.

Step 7: Selecting the Optimal Lag Length for VAR
-

Before fitting a Vector Autoregression (VAR) model, we must determine the optimal lag length, which is crucial to accurately capturing the intertemporal dynamics among the variables.

We use the select_order() method from the statsmodels library to test multiple lag lengths (up to 12 lags) and choose the best one based on standard information criteria:

AIC (Akaike Information Criterion)

BIC (Bayesian Information Criterion)

FPE (Final Prediction Error)

HQIC (Hannan-Quinn Information Criterion)

In [14]:
from statsmodels.tsa.api import VAR

# Step 1: Create a DataFrame with all your stationary series
df_model = df[['d_log_PCE', 'd_log_M2', 'd_log_INDPRO', 'd_UMCSENT']]

# Step 2: Fit a lag-order selection model
model = VAR(df_model)
lag_order_results = model.select_order(maxlags=12)

# Step 3: Print results
print(lag_order_results.summary())


 VAR Order Selection (* highlights the minimums)  
       AIC         BIC         FPE         HQIC   
--------------------------------------------------
0       -27.20      -27.16   1.545e-12      -27.18
1       -28.09     -27.94*   6.312e-13      -28.03
2       -28.14      -27.86   6.010e-13     -28.03*
3       -28.17      -27.76   5.856e-13      -28.01
4       -28.20      -27.67   5.642e-13      -28.00
5      -28.22*      -27.57  5.539e-13*      -27.97
6       -28.21      -27.43   5.602e-13      -27.91
7       -28.20      -27.29   5.685e-13      -27.84
8       -28.22      -27.19   5.553e-13      -27.82
9       -28.22      -27.07   5.552e-13      -27.77
10      -28.21      -26.94   5.588e-13      -27.72
11      -28.20      -26.80   5.686e-13      -27.65
12      -28.20      -26.67   5.683e-13      -27.60
--------------------------------------------------




Step 8: Estimating the VAR Model
-

With the lag length set to 5, the order that minimizes both AIC and FPE in the selection table (BIC favors 1 lag and HQIC favors 2), we now fit a Vector Autoregression (VAR) model using the stationary time series variables:

d_log_PCE: Growth in Personal Consumption Expenditures

d_log_M2: Growth in Money Supply (M2)

d_log_INDPRO: Growth in Industrial Production

d_UMCSENT: Change in Consumer Sentiment Index

Each equation in the VAR system models one variable as a function of its own lags and the lags of all other variables, allowing us to explore dynamic interrelationships among the economic indicators.
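Written out for a lag length of 5, the system takes the standard VAR form below. The symbols $y_t$, $c$, $A_i$, and $u_t$ are notation introduced for this sketch and do not appear in the notebook.

```latex
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + A_3 y_{t-3} + A_4 y_{t-4} + A_5 y_{t-5} + u_t,
\qquad
y_t =
\begin{pmatrix}
\Delta \log \mathrm{PCE}_t \\
\Delta \log \mathrm{M2}_t \\
\Delta \log \mathrm{INDPRO}_t \\
\Delta \mathrm{UMCSENT}_t
\end{pmatrix}
```

Here $c$ is a $4 \times 1$ intercept vector, each $A_i$ is a $4 \times 4$ coefficient matrix, and $u_t$ is the vector of error terms. Each of the four equations therefore has $1 + 4 \times 5 = 21$ coefficients.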

In [15]:
from statsmodels.tsa.api import VAR

# Use your cleaned stationary DataFrame
df_model = df[['d_log_PCE', 'd_log_M2', 'd_log_INDPRO', 'd_UMCSENT']]

# Fit VAR with selected lag
model = VAR(df_model)
var_result = model.fit(5)

# Summary
print(var_result.summary())




  Summary of Regression Results   
Model:                         VAR
Method:                        OLS
Date:           Thu, 24, Jul, 2025
Time:                     11:48:35
--------------------------------------------------------------------
No. of Equations:         4.00000    BIC:                   -27.5978
Nobs:                     563.000    HQIC:                  -27.9920
Log likelihood:           4839.34    FPE:                5.41610e-13
AIC:                     -28.2444    Det(Omega_mle):     4.67810e-13
--------------------------------------------------------------------
Results for equation d_log_PCE
                     coefficient       std. error           t-stat            prob
----------------------------------------------------------------------------------
const                   0.002244         0.000869            2.583           0.010
L1.d_log_PCE            0.161384         0.061468            2.625           0.009
L1.d_log_M2            -0.084103         0.11345

📊 Interpretation of Key Coefficients
-

We examine the regression results equation-by-equation:

Equation for d_log_PCE (Consumption growth):
-

Lag 2 of d_log_M2 (money supply) has a positive and significant effect on PCE growth (p < 0.001).

Lag 3 of d_log_INDPRO (industrial output) is also positively significant, suggesting a feedback from production to consumption.

Most other variables are not statistically significant at conventional levels.

Equation for d_log_M2 (Money supply growth):
-

Strong autocorrelation: lag 1 of M2 is highly significant, showing persistence in monetary expansion.

Lag 1 of d_log_PCE is negatively significant, indicating a potential inverse short-term feedback from consumption to money growth.

Lags 3 and 5 of d_log_INDPRO are also statistically significant, suggesting some production-related influence on money dynamics.

Equation for d_log_INDPRO (Production growth):
-

Lag 1 of d_log_PCE is positive and highly significant, suggesting strong influence of consumption on production.

Lag 2 of d_log_M2 is significant, indicating a delayed monetary effect on production.

Lag 3 of industrial production itself is also significant, confirming persistence.

Equation for d_UMCSENT (Consumer sentiment):
-

Most coefficients are insignificant; the exceptions are lags 2 and 5 of sentiment itself, indicating some mean-reverting or autoregressive behavior.

Lag 2 of d_log_PCE has a marginal negative effect. Overall, changes in sentiment appear to be driven mainly by their own past values rather than by the economic fundamentals included in this model.

🔗 Residual Correlation Matrix
-
The correlation matrix of residuals shows the contemporaneous correlations between error terms of each equation:

d_log_PCE and d_log_INDPRO have a strong positive correlation (~0.68), suggesting a close relationship between consumption and production growth.

d_log_PCE and d_log_M2 are negatively correlated (-0.44), possibly reflecting an inverse monetary-consumption adjustment in the short run.


Step 9: Granger Causality Tests
-
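Granger causality tests whether past values of one time series help predict the current value of another, beyond what the second series' own past values already explain. It is a statement about predictive content, not about structural cause and effect.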
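The idea can be made concrete with a hand-rolled version of the SSR-based F test: regress the effect on its own lags (restricted model), add lags of the candidate cause (unrestricted model), and compare the residual sums of squares. This is a minimal NumPy sketch on synthetic data. The helpers `lagged` and `granger_f` are hypothetical, and the sample length of 568 is chosen so that the lag-1 denominator degrees of freedom equal the `df_denom=564` reported in this section's test output.

```python
import numpy as np

def lagged(series, p):
    """Matrix whose columns are lags 1..p of `series`, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])

def ssr(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(np.sum((y - X @ beta) ** 2))

def granger_f(effect, cause, p):
    """F statistic for H0: lags 1..p of `cause` add nothing beyond own lags of `effect`."""
    y = effect[p:]
    const = np.ones((len(y), 1))
    X_restricted = np.hstack([const, lagged(effect, p)])   # own lags only
    X_full = np.hstack([X_restricted, lagged(cause, p)])   # own lags plus cause lags
    ssr_r, ssr_u = ssr(X_restricted, y), ssr(X_full, y)
    df_num, df_denom = p, len(y) - X_full.shape[1]
    f_stat = ((ssr_r - ssr_u) / df_num) / (ssr_u / df_denom)
    return f_stat, df_num, df_denom

rng = np.random.default_rng(5)
n = 568
x = rng.standard_normal(n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + rng.standard_normal(n - 1)   # y depends on lagged x by construction

f_stat, df_num, df_denom = granger_f(y, x, p=1)
print(f"F = {f_stat:.2f}, df_num = {df_num}, df_denom = {df_denom}")
```

A large F statistic means the lags of the cause reduce the residual sum of squares by more than chance would explain.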

In [19]:
# Granger Causality Test Function
from statsmodels.tsa.stattools import grangercausalitytests

# Test each pair: Does 'cause' help predict 'effect'?
def granger_tests(df_model, maxlag):
    variables = df_model.columns
    for cause in variables:
        for effect in variables:
            if cause != effect:
                print(f"\nTesting: Does {cause} Granger-cause {effect}?")
                grangercausalitytests(df_model[[effect, cause]], maxlag=maxlag, verbose=True)

# Run for your dataset (maxlag = 5)
granger_tests(df_model, maxlag=5)



Testing: Does d_log_PCE Granger-cause d_log_M2?

Granger Causality
number of lags (no zero) 1
ssr based F test:         F=29.9937 , p=0.0000  , df_denom=564, df_num=1
ssr based chi2 test:   chi2=30.1533 , p=0.0000  , df=1
likelihood ratio test: chi2=29.3788 , p=0.0000  , df=1
parameter F test:         F=29.9937 , p=0.0000  , df_denom=564, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=14.8134 , p=0.0000  , df_denom=561, df_num=2
ssr based chi2 test:   chi2=29.8908 , p=0.0000  , df=2
likelihood ratio test: chi2=29.1283 , p=0.0000  , df=2
parameter F test:         F=14.8134 , p=0.0000  , df_denom=561, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=13.5808 , p=0.0000  , df_denom=558, df_num=3
ssr based chi2 test:   chi2=41.2534 , p=0.0000  , df=3
likelihood ratio test: chi2=39.8169 , p=0.0000  , df=3
parameter F test:         F=13.5808 , p=0.0000  , df_denom=558, df_num=3

Granger Causality
number of lags (no zero) 4

In [27]:
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

def summarize_granger_results(df, variables, maxlag=5, alpha=0.05):
    results = []

    for cause in variables:
        for effect in variables:
            if cause == effect:
                continue
            test_result = grangercausalitytests(df[[effect, cause]], maxlag=maxlag, verbose=False)
            
            # Collect p-value from the F-test at specified lag
            p_value = test_result[maxlag][0]['ssr_ftest'][1]
            is_causal = 'Yes' if p_value < alpha else 'No'
            
            results.append({
                'Causing': cause,
                'Caused': effect,
                'Lag': maxlag,
                'p-value': p_value,
                'Granger Causal': is_causal
            })

    return pd.DataFrame(results)

# Run on your differenced dataset
variables = ['d_log_PCE', 'd_log_M2', 'd_log_INDPRO', 'd_UMCSENT']
granger_summary = summarize_granger_results(df_model, variables, maxlag=5)
print(granger_summary)


         Causing        Caused  Lag       p-value Granger Causal
0      d_log_PCE      d_log_M2    5  2.580752e-09            Yes
1      d_log_PCE  d_log_INDPRO    5  4.148106e-11            Yes
2      d_log_PCE     d_UMCSENT    5  8.802134e-02             No
3       d_log_M2     d_log_PCE    5  8.628393e-07            Yes
4       d_log_M2  d_log_INDPRO    5  6.397463e-10            Yes
5       d_log_M2     d_UMCSENT    5  6.320615e-01             No
6   d_log_INDPRO     d_log_PCE    5  1.352862e-02            Yes
7   d_log_INDPRO      d_log_M2    5  4.948240e-07            Yes
8   d_log_INDPRO     d_UMCSENT    5  3.614982e-01             No
9      d_UMCSENT     d_log_PCE    5  4.711025e-01             No
10     d_UMCSENT      d_log_M2    5  3.442779e-01             No
11     d_UMCSENT  d_log_INDPRO    5  6.631679e-03            Yes


🧠 Interpretation and Insights
-
🔹 Production Side (INDPRO):
Granger-caused by: M2, PCE, UMCSENT
→ Suggests that changes in money supply, consumer spending, and even consumer sentiment have predictive power over future industrial production activity.

🔹 Consumer Side (PCE):
Granger-caused by: M2, INDPRO
→ Indicates that lagged money growth and production growth help predict future consumption growth.

🔹 Monetary Side (M2):
Granger-caused by: PCE, INDPRO
→ Reflects a potential feedback loop, where monetary aggregates respond to consumption and production cycles.

🔹 Consumer Sentiment (UMCSENT):
Not Granger-caused by any variable
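→ Suggests that sentiment is largely autonomous within this system, possibly driven by external shocks or exogenous information rather than by lagged money, consumption, or production growth. The closest candidate is PCE (p ≈ 0.088), which falls short of the 5% threshold.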

📌 Conclusion:
This Granger causality analysis points to dynamic interlinkages between the money supply (M2), consumer behavior (PCE and sentiment), and production (INDPRO). The bidirectional Granger causality between M2 and both PCE and INDPRO is consistent with a feedback-driven economic structure, while consumer sentiment appears more autonomous or influenced by unmodeled factors. These results come from pairwise tests at lag 5 and describe predictive relationships, not structural causation.