# "Digital Gold": A Visual Analysis of Bitcoin as a Hedge Against Inflation and Market Volatility

As inflation creates general unease in the economy, more investors are turning to alternative commodities and assets said to act as a "hedge": an investment expected to hold or increase its value over time, even as the cost of everything else goes up. One of these assets is Bitcoin ($BTC), a cryptocurrency popularly dubbed "Digital Gold."

As governments have printed significant amounts of money in recent years, many investors have looked for assets that can protect their wealth from the resulting inflation. Proponents claim Bitcoin, with its fixed supply, is a perfect candidate. This project investigates that claim, asking:

- **Does Bitcoin's volatility change during periods of high vs. low inflation? A good hedge should ideally be stable, but what if Bitcoin becomes more chaotic and unpredictable precisely when you need it to be a safe haven?**
- **During major stock market crashes, does Bitcoin act as a 'safe haven' by holding its value, or does it crash even harder?**
- **Does the relationship between Bitcoin and inflation stay the same over time, or does it change depending on the market environment?**
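To make "volatility" in the first question concrete, one common measure is the annualized rolling standard deviation of daily log returns. The sketch below runs on synthetic prices. The 30-day window and the 365-day annualization factor are assumptions chosen for illustration, not settings taken from this project.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices (illustrative only, not real Bitcoin data).
rng = np.random.default_rng(seed=0)
dates = pd.date_range("2024-01-01", periods=120, freq="D")
simulated_returns = rng.normal(loc=0.0, scale=0.03, size=len(dates))
prices = pd.Series(100 * np.exp(np.cumsum(simulated_returns)), index=dates, name="price")

# Daily log returns: ln(P_t / P_{t-1}). The first value is NaN by construction.
log_returns = np.log(prices / prices.shift(1))

# Rolling standard deviation over WINDOW days, annualized with sqrt(365)
# on the assumption that the asset trades every calendar day.
WINDOW = 30
rolling_vol = log_returns.rolling(WINDOW).std() * np.sqrt(365)

print(rolling_vol.dropna().head())
```

Comparing the average of such a series across high-inflation and low-inflation periods is one possible way to approach the first question.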

## Mining Metrics: Sourcing Our Datasets
Several key financial and economic datasets are needed to investigate the relationship between Bitcoin, inflation, and broader market behavior. The following data was sourced from Yahoo Finance (via the `yfinance` library) and the Federal Reserve Economic Data (FRED) database.

**1. Bitcoin (BTC-USD)**
> This dataset contains the daily price history of Bitcoin valued in U.S. Dollars, which is central to evaluating its performance as an asset.

- **Source:** Yahoo Finance (`yfinance` Ticker: BTC-USD)
- **Link:** https://finance.yahoo.com/quote/BTC-USD/
- **Features**:
    - **Date:** The trading day
    - **Open/High/Low/Close:** The opening, highest, lowest, and closing prices for the day.
    - **Volume:** The total number of Bitcoins traded.

**2. S&P 500 Index (^GSPC)**
> This data tracks the performance of 500 of the largest publicly-traded companies in the United States, offering a snapshot of the overall health of the U.S. stock market. This is essential for analyzing how Bitcoin behaves during broad market movements and crashes.

- **Source:** Yahoo Finance (`yfinance` Ticker: ^GSPC)
- **Link:** https://finance.yahoo.com/quote/^GSPC/
- **Features:** Includes the same trading day, OHLC (Open, High, Low, Close) and Volume data points as the Bitcoin dataset, but for the S&P 500 index.

**3. CBOE Volatility Index (^VIX)**
> Also known as the "fear index," the VIX measures expected market volatility. It is crucial for understanding how Bitcoin's own volatility and price action correlate with periods of market fear and uncertainty.

- **Source:** Yahoo Finance (`yfinance` Ticker: ^VIX)
- **Link:** https://finance.yahoo.com/quote/^VIX/
- **Features:** Contains OHLC data points representing the daily values of the index.

**4. Gold Futures (GC=F)**
> This dataset tracks the price of gold, the traditional safe-haven asset. It provides a direct benchmark to compare against Bitcoin's performance as an inflation hedge and store of value.

- **Source:** Yahoo Finance (`yfinance` Ticker: GC=F)
- **Link:** https://finance.yahoo.com/quote/GC=F/
- **Features:** Contains OHLC and Volume data for gold futures contracts.

**5. U.S. Inflation (CPIAUCSL)**
> This is the primary measure of inflation used in this project: the Consumer Price Index for All Urban Consumers. The dataset tracks the average change in prices paid by urban consumers for a basket of goods and services.

- **Source:** Federal Reserve Economic Data (FRED Ticker: CPIAUCSL)
- **Link:** https://fred.stlouisfed.org/series/CPIAUCSL
- **Features:**
    - **Date:** The date of the observation
    - **Value:** A seasonally adjusted index value (1982-1984 = 100) representing the relative cost of goods and services.
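Because CPIAUCSL is an index level rather than a rate, an inflation rate has to be derived from it. A minimal sketch of the usual year-over-year calculation follows, using made-up index values (not FRED data). The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical monthly CPI index levels: 100, 101, ..., 112 (not real data).
cpi = pd.Series(
    [100.0 + i for i in range(13)],
    index=pd.date_range("2020-01-01", periods=13, freq="MS"),
    name="CPI",
)

# Year-over-year inflation: percent change versus the same month one year earlier.
# The first 12 months have no year-ago value and come out as NaN.
yoy_inflation = cpi.pct_change(periods=12) * 100

print(yoy_inflation.dropna())
```

In this toy series the index rises from 100 to 112 over twelve months, so the single non-missing value is roughly 12 percent.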

**6. Effective Federal Funds Rate (DFF)**
> This dataset tracks the interest rate at which commercial banks lend reserves to each other overnight. It reflects the U.S. monetary policy stance, providing essential context on the macroeconomic environment influencing asset prices.

- **Source:** Federal Reserve Economic Data (FRED Ticker: DFF)
- **Link:** https://fred.stlouisfed.org/series/DFF
- **Features:**
    - **Date:** The date of the observation
    - **Value:** The effective federal funds rate, expressed as a percentage.

## Panning for Gold: Preprocessing the Datasets

The goal is to create a clean dataset in which all variables are aligned in time and transformed into metrics for analysis.

This involves five main preprocessing steps:


**Step 1: Loading the Datasets**

We need to load all six datasets into our Jupyter Notebook. We'll use a common date range to ensure we're looking at the same period for all assets.

We'll import the necessary Python libraries (`pandas`, `yfinance`, `pandas_datareader`) and download the time series for each of the six datasets, storing them in pandas DataFrames:

In [1]:
!pip install pandas yfinance pandas_datareader


In [None]:
# Import the required libraries
import pandas as pd
import yfinance as yf
import pandas_datareader as pdr
import datetime

# Define tickers and date range. The range runs from the start of 2010 to today,
# so it covers the full available history of every series.
yf_tickers = ['BTC-USD', '^GSPC', '^VIX', 'GC=F']
fred_tickers = ['CPIAUCSL', 'DFF']
start_date = '2010-01-01'
end_date = datetime.datetime.now().strftime('%Y-%m-%d')

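# Download the data from Yahoo Finance using yf.download()
#
# Args:
#   yf_tickers (list[str]): List of tickers to download from Yahoo Finance
#   start (str): Start date for the data
#   end (str): End date for the data
#   auto_adjust (bool): Adjust prices for splits and dividends. Passed explicitly
#                       because recent yfinance releases default to True and
#                       warn when the argument is left implicit.
# Returns:
#   yf_data (pd.DataFrame): DataFrame containing the downloaded data
yf_data = yf.download(yf_tickers, start=start_date, end=end_date, auto_adjust=True)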
print("--- Market Data (from yfinance) ---")
print(yf_data.head())
print("\n")

# Download the data from FRED using pdr.DataReader()
#
# Args:
#   fred_tickers (list[str]): List of tickers to download from FRED
#   'fred' (str): The name of the data source to use.
#   start (str): Start date for the data
#   end (str): End date for the data
# Returns:
#   fred_data (pd.DataFrame): DataFrame containing the downloaded data
fred_data = pdr.DataReader(fred_tickers, 'fred', start_date, end_date)
print("--- Economic Data (from FRED) ---")
print(fred_data.head())




--- Market Data (from yfinance) ---
Price        Close                                         High               \
Ticker     BTC-USD         GC=F        ^GSPC       ^VIX BTC-USD         GC=F   
Date                                                                           
2010-01-04     NaN  1117.699951  1132.989990  20.040001     NaN  1122.300049   
2010-01-05     NaN  1118.099976  1136.520020  19.350000     NaN  1126.500000   
2010-01-06     NaN  1135.900024  1137.140015  19.160000     NaN  1139.199951   
2010-01-07     NaN  1133.099976  1141.689941  19.059999     NaN  1133.099976   
2010-01-08     NaN  1138.199951  1144.979980  18.129999     NaN  1138.199951   

Price                                  Low                            \
Ticker            ^GSPC       ^VIX BTC-USD         GC=F        ^GSPC   
Date                                                                   
2010-01-04  1133.869995  21.680000     NaN  1097.099976  1116.560059   
2010-01-05  1136.630005  20.129999 

**Step 2: Resampling and Merging**

The market data (BTC, S&P 500, VIX, Gold) and the policy rate (DFF) are daily, while the inflation data (CPI) is monthly. We need to get everything onto a single daily timeline.

We'll select a single, representative column from each of the daily market datasets (e.g., `Close`). Then, we'll merge the daily datasets into one primary DataFrame using the date as the common index.
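The download above returns columns with two levels (price field, then ticker), so picking "one column per asset" means selecting on the top level. The sketch below mimics that layout with a small made-up frame. The values and the two-ticker subset are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Build a small frame with two-level columns (field, ticker), mimicking the
# layout of the multi-ticker download. Values are made up.
dates = pd.date_range("2024-01-01", periods=3, freq="D")
columns = pd.MultiIndex.from_product(
    [["Close", "High"], ["BTC-USD", "^GSPC"]], names=["Price", "Ticker"]
)
raw = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns)

# Selecting the top-level key returns one column per ticker,
# which can then be renamed to friendlier labels.
close = raw["Close"].rename(columns={"BTC-USD": "Bitcoin", "^GSPC": "SP500"})

print(close)
```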

To align the monthly CPI data with the daily datasets, we will use forward-filling (`ffill`). This is a reasonable approach because the CPI value for a given month is treated as the prevailing level until the next observation becomes available. The `ffill` method achieves this by carrying the last valid observation forward to fill any subsequent gaps (e.g., a series like `[10, NaN, NaN, 15]` becomes `[10, 10, 10, 15]`).
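As a small illustration of forward-filling, the sketch below first reproduces the toy series from the paragraph above, then spreads two hypothetical monthly values over a daily calendar. It uses `reindex(..., method="ffill")`, which is a different route from the merge-then-fill used in this notebook. The values are made up.

```python
import numpy as np
import pandas as pd

# The toy series from the text: gaps are filled with the last valid value.
toy = pd.Series([10, np.nan, np.nan, 15])
print(toy.ffill().tolist())  # [10.0, 10.0, 10.0, 15.0]

# Hypothetical monthly index values stamped on the first day of each month.
monthly = pd.Series(
    [300.0, 301.5],
    index=pd.to_datetime(["2024-01-01", "2024-02-01"]),
    name="CPI",
)

# Reindex onto a daily calendar; each day takes the most recent monthly value.
daily_index = pd.date_range("2024-01-30", "2024-02-02", freq="D")
daily_cpi = monthly.reindex(daily_index, method="ffill")

print(daily_cpi)
```

One design note: `reindex` with `method="ffill"` looks up the most recent monthly value even when the monthly date stamp itself is absent from the daily index, whereas a left merge only matches on dates present in both.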

In [None]:
# We are selecting the 'Close' price for each asset.
prices = yf_data['Close'].copy()

# Renaming columns for clarity and consistency.
prices.rename(columns={'BTC-USD': 'Bitcoin', '^GSPC': 'SP500', '^VIX': 'VIX', 'GC=F': 'Gold'}, inplace=True)

# Combine All Daily Datasets:
# DFF data is already daily, so we can merge it directly with our daily prices.
# We'll use pd.merge() to combine them based on their date index.
#
# Args:
#   prices (pd.DataFrame): The left DataFrame with daily stock prices.
#   fred_data['DFF'] (pd.Series): The right Series with the daily Federal Funds Rate.
#   left_index (bool): Use the index from the `prices` DataFrame as the join key.
#   right_index (bool): Use the index from the `fred_data['DFF']` Series as the join key.
#   how (str): Type of merge. 'left' keeps all rows/indices from the left DataFrame.
#
# Returns:
#   daily_data (pd.DataFrame): A new DataFrame containing the merged data.
daily_data = pd.merge(prices, fred_data['DFF'], left_index=True, right_index=True, how='left')

# Merge the monthly CPI data into our daily DataFrame.
# 'how='left'' ensures we keep all the dates from our daily data.
combined_df = pd.merge(daily_data, fred_data['CPIAUCSL'], left_index=True, right_index=True, how='left')

# Forward-fill the 'CPIAUCSL' column to propagate the last known monthly value
# across the days that follow it. Assigning the result back (instead of calling
# fillna(method='ffill', inplace=True) on a column selection) avoids the chained
# in-place pattern that pandas warns about and that stops working in pandas 3.0.
combined_df['CPIAUCSL'] = combined_df['CPIAUCSL'].ffill()

# Forward-fill the entire DataFrame to handle missing values from weekends and holidays
# for the market assets (Gold, SP500, VIX).
combined_df = combined_df.ffill()

# Drop any remaining rows with NaN values. This removes the initial period
# before Bitcoin's data was available, ensuring all series start on the same day.
analysis_df = combined_df.dropna()


print("--- Final Analysis-Ready DataFrame Head ---")
print(analysis_df.head())

print("\n--- Final Analysis-Ready DataFrame Tail ---")
print(analysis_df.tail())

# The missing values count for all columns should now be 0.
print("\n--- Final Missing Values Check ---")
print(analysis_df.isnull().sum())

--- Final Analysis-Ready DataFrame Head ---
               Bitcoin         Gold        SP500    VIX   DFF  CPIAUCSL
Date                                                                   
2014-09-17  457.334015  1234.400024  2001.569946  12.65  0.09    237.46
2014-09-18  424.440002  1225.699951  2011.359985  12.03  0.09    237.46
2014-09-19  394.795990  1215.300049  2010.400024  12.11  0.09    237.46
2014-09-20  408.903992  1215.300049  2010.400024  12.11  0.09    237.46
2014-09-21  398.821014  1215.300049  2010.400024  12.11  0.09    237.46

--- Final Analysis-Ready DataFrame Tail ---
                  Bitcoin         Gold        SP500    VIX   DFF  CPIAUCSL
Date                                                                      
2025-08-30  108808.070312  3473.699951  6460.259766  15.36  4.33   322.132
2025-08-31  108236.710938  3473.699951  6460.259766  15.36  4.33   322.132
2025-09-01  109250.593750  3473.699951  6460.259766  15.36  4.33   322.132
2025-09-02  111200.585938  3549.



**Step 3: Handling Missing Values (NaNs)**

Financial datasets often have missing values, typically on weekends and holidays when markets are closed. Our merged dataset will have these gaps.

We will again use the forward-fill (`ffill`) method on the entire dataset. This will propagate the last valid observation forward to fill any gaps.

We treat the value of an asset on a non-trading day (such as a Saturday) as its closing price on the last trading day (Friday). Forward-filling handles these non-trading periods by repeating the last observed price, so the time series is continuous and free of `NaN` values without interpolating prices that were never observed.
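The two cleaning operations described here can be sketched on a tiny made-up frame. The column names `Stock` and `Crypto` and all values are hypothetical. One column has weekend gaps and the other starts a day late.

```python
import numpy as np
import pandas as pd

# Hypothetical frame covering Thursday 2024-01-04 through Monday 2024-01-08.
dates = pd.date_range("2024-01-04", periods=5, freq="D")
df = pd.DataFrame(
    {
        "Stock": [100.0, 101.0, np.nan, np.nan, 102.0],  # no Sat/Sun quotes
        "Crypto": [np.nan, 50.0, 51.0, 52.0, 53.0],      # history starts Friday
    },
    index=dates,
)

filled = df.ffill()        # weekend gaps take Friday's close
cleaned = filled.dropna()  # the leading row with no Crypto history is dropped

print(cleaned)
```

Note that forward-filled prices imply a zero return on the filled days, which may matter later if returns or volatility are computed on the daily calendar.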

In [23]:
# DataFrame.ffill() forward-fills every column, handling gaps from weekends and holidays.
# It replaces the deprecated fillna(method='ffill') form.
cleaned_df = combined_df.ffill()

# After forward-filling, there might still be NaNs at the very start of the dataset
# if one asset's history began later than others (like Bitcoin).
# .dropna() will remove these initial rows where we have incomplete data.
#
# Args:
#   (no arguments): Uses default settings to drop rows, which are the following:
#   - axis=0: Drops rows containing missing values (default).
#   - how='any': Drops a row if at least one NaN is present (default).
#
# Returns:
#   analysis_df (pd.DataFrame): A new DataFrame with rows containing any NaN values removed.
analysis_df = cleaned_df.dropna()


print("--- Cleaned DataFrame Head ---")
print(analysis_df.head())

print("\n--- Cleaned DataFrame Tail ---")
print(analysis_df.tail())

# The missing values count for all columns should now be 0.
print("\n--- Missing Values Check After Cleaning ---")
print(analysis_df.isnull().sum())


--- Cleaned DataFrame Head ---
               Bitcoin         Gold        SP500    VIX   DFF  CPIAUCSL
Date                                                                   
2014-09-17  457.334015  1234.400024  2001.569946  12.65  0.09    237.46
2014-09-18  424.440002  1225.699951  2011.359985  12.03  0.09    237.46
2014-09-19  394.795990  1215.300049  2010.400024  12.11  0.09    237.46
2014-09-20  408.903992  1215.300049  2010.400024  12.11  0.09    237.46
2014-09-21  398.821014  1215.300049  2010.400024  12.11  0.09    237.46

--- Cleaned DataFrame Tail ---
                  Bitcoin         Gold        SP500    VIX   DFF  CPIAUCSL
Date                                                                      
2025-08-30  108808.070312  3473.699951  6460.259766  15.36  4.33   322.132
2025-08-31  108236.710938  3473.699951  6460.259766  15.36  4.33   322.132
2025-09-01  109250.593750  3473.699951  6460.259766  15.36  4.33   322.132
2025-09-02  111200.585938  3549.399902  6415.540039  17.17

