# Macro research
## Quantifying geopolitical uncertainty that isn’t yet reflected in the markets


### Introduction

-  The world is currently experiencing more conflicts than at any time since the post-9/11 Iraq war.
-  I believe the risk of nuclear war is at its highest level since the Cold War.
-  Despite these risks, the S&P 500 keeps reaching new highs, driven by the tech boom that ChatGPT initiated.
-  Markets seem aware of war risks, as shown by Bitcoin's quick drop when Iran attacked Israel with drones. The S&P 500 did not reflect this drop because equity markets were closed for the weekend.
-  This leads me to believe that markets expect AI's economic benefits to outweigh the potential negative impacts of war.
-  This study aims to:
  1. Measure the positive effects of AI and negative effects of war on the economy
  2. Analyze how these two factors are currently priced into markets
  3. Compare the current situation with the 2000 tech boom and the Cold War era to identify possible future scenarios

### Hypotheses

1. **AI's Economic Impact vs. War Risks**  
   Hypothesis: The S&P 500 and other global indices are more influenced by AI-driven growth trends than by geopolitical risks such as conflicts or war.

   - Test: Analyze correlations between AI-driven tech indices (e.g., NASDAQ Composite or specific AI-related ETFs) and macroeconomic indicators (e.g., GDP, unemployment). Compare these correlations during different periods of conflict.

2. **Bitcoin as a Geopolitical Risk Barometer**  
   Hypothesis: Bitcoin responds more rapidly to geopolitical tensions compared to traditional equity indices due to its decentralized nature.

   - Test: Analyze intraday price movements of Bitcoin and compare them to the S&P 500 during major geopolitical events, such as the Iran-Israel drone attack.

3. **Market Resilience During Conflict**  
   Hypothesis: The current resilience of the S&P 500 to geopolitical shocks is comparable to market behavior during the Cold War and the 2000 Tech Boom, suggesting that technological optimism outweighs geopolitical fears.

   - Test: Compare volatility indices (e.g., VIX) and treasury yields during key geopolitical tensions across the Cold War, 2000 Tech Boom, and post-2022 AI boom.

4. **Geopolitical Risk Premium in Fixed Income**  
   Hypothesis: Treasury yields are more sensitive to geopolitical risks than equity indices, because Treasuries act as a haven during periods of heightened tensions.

   - Test: Compare movements in 10-year and 30-year Treasury yields during major geopolitical events with equity indices' performance.

5. **Liquidity Indicators and War Risks**  
   Hypothesis: Increased liquidity injections by central banks during geopolitical crises stabilize markets, reducing immediate impacts of conflict-related news.

   - Test: Compare changes in the Federal Reserve Balance Sheet (WALCL) and Overnight Reverse Repo Agreements (RRPONTSYD) during major conflicts.
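
The event-based tests above compare assets that trade on different calendars: Bitcoin trades every day, while equity indices only have weekday closes. The sketch below shows one way to measure a post-event move so that a weekend event falls back to the last available close. It is a minimal illustration on synthetic random-walk prices, and `window_return` is a hypothetical helper, not a function defined elsewhere in this notebook.

```python
import numpy as np
import pandas as pd


def window_return(prices: pd.Series, event_date: str, days: int = 5) -> float:
    """Percentage change between the last price at or before the event
    and the last price at or before `days` calendar days later."""
    start = pd.Timestamp(event_date)
    end = start + pd.Timedelta(days=days)
    # Series.asof returns the most recent non-NaN value at or before a date,
    # so a weekend event falls back to Friday's close for equities.
    return (prices.asof(end) / prices.asof(start) - 1.0) * 100.0


# Synthetic prices (assumption: random walks, not market data)
rng = np.random.default_rng(0)
all_days = pd.date_range("2024-04-01", "2024-04-30", freq="D")  # trades daily, like Bitcoin
weekdays = pd.bdate_range("2024-04-01", "2024-04-30")           # weekdays only, like equities
crypto = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.02, len(all_days)))), index=all_days)
equity = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(weekdays)))), index=weekdays)

event = "2024-04-13"  # a Saturday
print(f"Crypto 5-day move: {window_return(crypto, event):.2f}%")
print(f"Equity 5-day move: {window_return(equity, event):.2f}%")
```

If the event falls on a trading day, the starting price is that day's close, which may already include part of the reaction. Intraday data would be needed to separate the two.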

### Methodology

1. **Time Period Analysis**  
   Segment the dataset into the predefined periods:  
   - Post-2022 AI Tech Boom  
   - Cold War  
   - 2000 Tech Boom  

   Use these periods to analyze how macroeconomic indicators, equity indices, and fixed-income instruments responded to technological trends and geopolitical risks.

2. **Volatility and Sensitivity Analysis**  
   - Use historical daily returns for indices (`df_yahoo_returns`) to calculate rolling volatilities during conflict periods.  
   - Apply event studies to assess market reactions to major geopolitical events.

3. **Regression Analysis**  
   - Use equity returns (e.g., S&P 500) as the dependent variable and macro indicators (e.g., treasury spreads, inflation, GDP) as independent variables.  
   - Add dummy variables for conflict events to measure their incremental impact.

4. **Scenario Comparisons**  
   - Compare Sharpe ratios, volatility, and drawdowns for indices across the three periods.  
   - Evaluate how liquidity measures and stress indices (e.g., STLFSI) varied during high-conflict periods.

5. **Correlation Analysis**  
   - Assess correlations between Bitcoin and equity indices during geopolitical events.  
   - Compare correlations of growth indicators (e.g., GDP) with equity indices across periods.

6. **Visualizations**  
   - Plot historical returns with time-period-specific shading using `time_period_colors` for intuitive comparison.  
   - Overlay macroeconomic indicators like treasury yields, volatility (VIX), and Bitcoin prices on the same timeline.
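
The regression step (a macro indicator plus a conflict dummy) can be sketched as follows. This is an illustration on synthetic data, not the notebook's fitted model: the coefficients, the conflict window, and all variable names are assumptions. It uses ordinary least squares via `numpy.linalg.lstsq` rather than the `LinearRegression` class imported later, so that the example depends only on NumPy.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # number of synthetic trading days

# Synthetic regressors (assumptions, not real data)
spread = rng.normal(0.0, 1.0, n)   # stand-in for a macro indicator, e.g. a treasury spread
conflict = np.zeros(n)
conflict[200:260] = 1.0            # dummy variable: 1 during a hypothetical conflict window

# Synthetic equity returns with a known conflict effect of -0.5
returns = 0.05 + 0.2 * spread - 0.5 * conflict + rng.normal(0.0, 0.1, n)

# Design matrix: intercept, macro indicator, conflict dummy
X = np.column_stack([np.ones(n), spread, conflict])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
intercept, beta_spread, beta_conflict = coef

print(f"intercept={intercept:.3f}, spread={beta_spread:.3f}, conflict={beta_conflict:.3f}")
```

The coefficient on the dummy estimates the average shift in returns during the conflict window after controlling for the macro indicator. With real data, serial correlation and overlapping events would call for more careful standard errors than this sketch provides.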

### TODO: Add implied-volatility predictions from SPY options data for forecasts

### DOWNLOAD DATA

In [5]:
import pandas as pd
from yahooquery import Ticker
from fredapi import Fred
import os
import warnings
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

warnings.filterwarnings("ignore")

# Initialize Yahoo Finance and FRED tickers
yahoo_tickers = {
    # Equity Indices
    "^GSPC": "S&P 500",
    "^N225": "Nikkei 225",
    "^FTSE": "FTSE 100",
    "^HSI": "Hang Seng Index",
    "^IXIC": "NASDAQ Composite",

    # Fixed Income
    "^TNX": "10-Year Treasury Yield",
    "^TYX": "30-Year Treasury Yield",
    "^FVX": "5-Year Treasury Yield",
    "^IRX": "13-Week Treasury Bill Yield",

    # Volatility
    "^VIX": "CBOE Volatility Index",

    # Crypto
    "BTC-USD": "Bitcoin_USD",
}

fred_series = {
    # Growth and Employment
    'GDP': 'Gross Domestic Product',
    'UNRATE': 'Unemployment Rate',
    'INDPRO': 'Industrial Production',
    'PAYEMS': 'Nonfarm Payrolls',
    'CIVPART': 'Labor Force Participation Rate',

    # Inflation and Prices
    'CPIAUCSL': 'Consumer Price Index',
    'PPIACO': 'Producer Price Index',
    'PCEPILFE': 'Core PCE Price Index',

    # Trade and Globalization
    'BOPGSTB': 'Trade Balance',
    'DTWEXBGS': 'US Dollar Index',

    # Financial Conditions
    'FEDFUNDS': 'Federal Funds Rate',
    'T10Y2Y': '10-Year Treasury Minus 2-Year Treasury Spread',
    'BAA10Y': 'Moody’s BAA Corporate Bond Yield Spread',
    'STLFSI': 'St. Louis Fed Financial Stress Index',
    
    # Liquidity Indicators
    'WALCL': 'Federal Reserve Balance Sheet',
    'M1SL': 'M1 Money Stock', # NARROW
    'M2SL': 'M2 Money Stock',
    'RRPONTSYD': 'Overnight Reverse Repo Agreements',
    'TOTALSL': 'Total Consumer Credit Owned and Securitized',
    'DFF': 'Effective Federal Funds Rate'
}

# Initialize data containers
yahoo_data = {}
fred_data = {}

# Fetch data from Yahoo Finance
for ticker, name in yahoo_tickers.items():
    try:
        data = Ticker(ticker)
        history = data.history(period="max")
        if not history.empty:  # Ensure data exists for the ticker
            history['Ticker'] = ticker
            history['Name'] = name
            yahoo_data[ticker] = history
            print(f"Fetched data for {name} ({ticker})")
        else:
            print(f"No data for {name} ({ticker})")
            
    except Exception as e:
        print(f"Error fetching data for {name} ({ticker}): {e}")

# Fetch data from the FRED API. The key is read from the FRED_API_KEY
# environment variable, so set that variable before running this cell.
fred_api_key = os.getenv('FRED_API_KEY')
fred = Fred(api_key=fred_api_key)

for series_id, series_name in fred_series.items():
    try:
        series_data = fred.get_series(series_id)
        if series_data is not None and not series_data.empty:  # Ensure data exists for the series
            series_df = pd.DataFrame(series_data, columns=['Value'])
            series_df['Ticker'] = series_id
            series_df['Name'] = series_name
            fred_data[series_id] = series_df
            print(f"Fetched data for {series_name} ({series_id})")
        else:
            print(f"No data for {series_name} ({series_id})")
            
    except Exception as e:
        print(f"Error fetching data for {series_name} ({series_id}): {e}")

# Define time periods and their corresponding colors
time_periods = {
    "Post-2022 AI Tech Boom": ("2022-01-01", "2025-01-01"),
    "Cold War": ("1947-01-01", "1991-12-31"),
    "2000 Tech Boom": ("1995-01-01", "2002-12-31")
}
time_period_colors = {
    "Post-2022 AI Tech Boom": "lightblue",
    "Cold War": "lightgreen",
    "2000 Tech Boom": "lightcoral"
}

# Combine Yahoo Finance data into a single DataFrame and save to CSV
if yahoo_data:
    df_yahoo = pd.concat(yahoo_data.values(), ignore_index=False)
    df_yahoo.reset_index(inplace=True)
    df_yahoo.to_csv('yahoo_data.csv', index=False)
else:
    print("No Yahoo Finance data to save.")

# Combine FRED data into a single DataFrame and save to CSV
if fred_data:
    df_fred = pd.concat(fred_data.values(), ignore_index=False)
    df_fred.reset_index(inplace=True)
    df_fred.rename(columns={"index":"date"}, inplace=True)
    df_fred.to_csv('fred_data.csv', index=False)
else:
    print("No FRED data to save.")

# Set 'date' as index for easier manipulation
df_yahoo.set_index('date', inplace=True)
df_fred.set_index('date', inplace=True)

# Calculate daily returns for each ticker in df_yahoo (using 'adjclose' column)
tickers_yahoo = df_yahoo['Ticker'].unique()
tickers_fred = df_fred['Ticker'].unique()

# Exclude fixed-income yield tickers for returns calculations
exclude_tickers = ['^TNX', '^TYX', "^FVX", "^IRX"]  # Exclude Treasury yields from returns calculations

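# Collect one returns Series per ticker, then align them on the union of dates
returns_by_ticker = {}

# Loop through tickers_yahoo (the list of tickers you want to process)
for ticker in tickers_yahoo:
    if ticker not in exclude_tickers:  # Only include tickers that are not in the exclude list
        # Filter the dataframe for the current ticker
        ticker_data = df_yahoo[df_yahoo['Ticker'] == ticker]

        # Calculate percentage change for adjusted close prices and drop NaNs
        returns_by_ticker[ticker] = ticker_data['adjclose'].pct_change().dropna()

# Concatenating column-wise keeps dates on which only some tickers trade
# (e.g. Bitcoin on weekends). Assigning columns one by one to an empty
# DataFrame would reindex every ticker to the first ticker's dates.
df_yahoo_returns = pd.concat(returns_by_ticker, axis=1)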

def date_filter(df, start_date, end_date):
    """
    Filter a DataFrame by the given date range.
    Assumes the index is a datetime index.
    """
    return df[(df.index >= start_date) & (df.index <= end_date)]

def ensure_datetime_index(df):
    """
    Ensure the index is a timezone-naive datetime index.
    If the index is timezone-aware, convert to UTC and remove the timezone.
    If it's naive, just ensure it's a correct datetime format.
    """
    if isinstance(df.index, pd.DatetimeIndex):
        if df.index.tz is not None:
            print("Timezone-aware datetime detected. Converting to naive datetime.")
            df.index = df.index.tz_convert('UTC').tz_localize(None)
        else:
            print("Timezone-naive datetime detected.")
            df.index = pd.to_datetime(df.index)  # Ensure it's a valid datetime
    else:
        print("Index is not datetime, converting to naive datetime.")
        df.index = pd.to_datetime(df.index)

    return df

def handle_mixed_datetime(df):
    """
    Handle mixed timezone-aware and timezone-naive datetimes in the DataFrame index.
    Convert all datetimes to timezone-naive.
    """
    # Parsing with utc=True coerces naive and aware values (including an
    # object-dtype index that mixes both) to one UTC DatetimeIndex,
    # which is then made timezone-naive.
    df.index = pd.to_datetime(df.index, utc=True).tz_localize(None)

    return df

# Apply the data cleaning functions to your DataFrames
df_fred = handle_mixed_datetime(df_fred)
df_yahoo = handle_mixed_datetime(df_yahoo)
df_yahoo_returns = handle_mixed_datetime(df_yahoo_returns)

df_fred = ensure_datetime_index(df_fred)
df_yahoo = ensure_datetime_index(df_yahoo)
df_yahoo_returns = ensure_datetime_index(df_yahoo_returns)

df_fred.dropna(inplace=True)

def event_driven_change(event_date, df, column_name, days=30):
    """
    Calculate the percentage change in a specified column after a given event date.
    Tries to find the closest available date if the exact event date is not in the DataFrame.
    
    Parameters:
    - event_date (pd.Timestamp): The event date.
    - df (pd.DataFrame): DataFrame containing the data.
    - column_name (str): The column name for which we want to calculate the percentage change.
    - days (int): The number of days after the event to calculate the change (default is 30).
    
    Returns:
    - (float): The percentage change in the value of the specified column after the given number of days.
    """
    # Ensure the column exists in the DataFrame
    if column_name not in df.columns:
        raise ValueError(f"Column {column_name} does not exist in the DataFrame.")
    
    # Check if the event_date is in the dataframe
    if event_date not in df.index:
        # Find the closest available date (nearest)
        timedeltas = df.index - event_date
        closest_date = df.index[(np.abs(timedeltas)).argmin()]
        print(f"Event date {event_date} not found. Using closest available date: {closest_date}")
        event_date = closest_date

    # Get the value on the event date for the selected column
    value_on_event_date = df.loc[event_date, column_name]

    # Calculate the end date (`days` calendar days after the event)
    end_date = event_date + pd.Timedelta(days=days)

    # Check if end_date exists in the DataFrame, if not, use the closest available date
    if end_date not in df.index:
        # Find the nearest available date, which may fall before or after the target
        timedeltas = df.index - end_date
        closest_end_date = df.index[(np.abs(timedeltas)).argmin()]
        print(f"End date {end_date} not found. Using closest available date: {closest_end_date}")
        end_date = closest_end_date

    # Get the value at the end of the window
    value_after_window = df.loc[end_date, column_name]

    # Calculate the percentage change over the window
    percentage_change = (value_after_window - value_on_event_date) / value_on_event_date * 100
    return percentage_change


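# Example: Event-driven analysis for Bitcoin vs. S&P 500 during an event.
# event_driven_change compares levels, so pass adjusted close prices per ticker
# rather than the returns DataFrame (a percentage change of returns is not meaningful).
event_date = pd.Timestamp('2023-10-01')
btc_prices = df_yahoo.loc[df_yahoo['Ticker'] == 'BTC-USD', ['adjclose']]
sp500_prices = df_yahoo.loc[df_yahoo['Ticker'] == '^GSPC', ['adjclose']]
print("Bitcoin event-driven change: ", event_driven_change(event_date, btc_prices, 'adjclose'))
print("S&P 500 event-driven change: ", event_driven_change(event_date, sp500_prices, 'adjclose'))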

# Define correlation function for Bitcoin and S&P 500 during conflict periods
def correlation_analysis(df_returns, ticker1, ticker2, start_date, end_date):
    df_event = date_filter(df_returns, start_date, end_date)
    correlation = df_event[[ticker1, ticker2]].corr().iloc[0, 1]
    return correlation

# Example: Correlation between Bitcoin and S&P 500 during a geopolitical event
event_start_date = pd.Timestamp('2023-10-01')
event_end_date = pd.Timestamp('2023-10-15')
correlation_btc_sp500 = correlation_analysis(df_yahoo_returns, 'BTC-USD', '^GSPC', event_start_date, event_end_date)
print(f"Correlation between Bitcoin and S&P 500 during the event: {correlation_btc_sp500}")


Fetched data for S&P 500 (^GSPC)
Fetched data for Nikkei 225 (^N225)
Fetched data for FTSE 100 (^FTSE)
Fetched data for Hang Seng Index (^HSI)
Fetched data for Hang Seng Index (^IXIC)
Fetched data for 10-Year Treasury Yield (^TNX)
Fetched data for 30-Year Treasury Yield (^TYX)
Fetched data for 5-Year Treasury Yield (^FVX)
Fetched data for 13-Week Treasury Bill Yield (^IRX)
Fetched data for CBOE Volatility Index (^VIX)
Fetched data for Bitcoin_USD (BTC-USD)
Fetched data for Gross Domestic Product (GDP)
Fetched data for Unemployment Rate (UNRATE)
Fetched data for Industrial Production (INDPRO)
Fetched data for Nonfarm Payrolls (PAYEMS)
Fetched data for Labor Force Participation Rate (CIVPART)
Fetched data for Consumer Price Index (CPIAUCSL)
Fetched data for Producer Price Index (PPIACO)
Fetched data for Core PCE Price Index (PCEPILFE)
Fetched data for Trade Balance (BOPGSTB)
Fetched data for US Dollar Index (DTWEXBGS)
Fetched data for Federal Funds Rate (FEDFUNDS)
Fetched data for 10-Yea