# Quantifying the Speculative Bubble Component in Bitcoin Markets: A Comparative Analysis of Statistical Models and Empirical Applications

Amin Boulouma, 2023

## Model 1: Heterogeneous agent model developed by Brock and Hommes (1998)


The first model we use is based on the standard framework for estimating the speculative bubble component in asset prices. This model assumes that asset prices can be decomposed into two components: a fundamental component and a speculative component. The fundamental component is driven by the intrinsic value of the asset, while the speculative component is driven by market sentiment and investors' expectations.

To estimate the fundamental component of Bitcoin prices, we use a range of economic indicators, including the hash rate, transaction volume, and mining difficulty. We also consider the macroeconomic environment, such as inflation rates and interest rates, to account for the broader economic context in which Bitcoin operates.

To estimate the speculative component of Bitcoin prices, we use a variety of technical indicators, including moving averages, relative strength index (RSI), and the stochastic oscillator. We also use sentiment analysis of social media and news articles to gauge market sentiment and investor expectations.

The heterogeneous agent model developed by Brock and Hommes (1998) assumes that the asset price $P_t$ can be decomposed into a fundamental component $F_t$ and a speculative component $S_t$ as follows:

$$P_t = F_t + S_t$$

The fundamental component of Bitcoin prices can be estimated using the following equation:

$$F_t = \omega_0 + \sum_{j=1}^{N} \omega_j X_{j,t}$$

where $F_t$ is the fundamental component of Bitcoin prices at time $t$, $X_{j,t}$ are the **economic indicators** and **macroeconomic factors** at time $t$, $N$ is the total number of indicators and factors, and $\omega_j$ are the corresponding weights.

With the five indicators used here, this becomes:

$$ F_t = \omega_0 + \omega_1 \cdot \text{hash rate}_t + \omega_2 \cdot \text{transaction volume}_t + \omega_3 \cdot \text{mining difficulty}_t + \omega_4 \cdot \text{inflation rate}_t + \omega_5 \cdot \text{interest rate}_t $$

Where:

- $F_t$ is the fundamental component of Bitcoin prices at time $t$
- $\text{hash rate}_t$ is the hash rate of the Bitcoin network at time $t$
- $\text{transaction volume}_t$ is the transaction volume on the Bitcoin network at time $t$
- $\text{mining difficulty}_t$ is the mining difficulty of the Bitcoin network at time $t$
- $\text{inflation rate}_t$ is the inflation rate at time $t$
- $\text{interest rate}_t$ is the interest rate at time $t$
- $\omega_0$ is the intercept, and $\omega_1, \omega_2, \omega_3, \omega_4,$ and $\omega_5$ are the weights assigned to each of the economic indicators and macroeconomic factors, respectively.
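
One way to obtain the weights is an ordinary least squares regression of the observed price on the indicators. The sketch below is illustrative only: the text does not say which estimator is used, the data are synthetic, and the helper names `estimate_fundamental_weights` and `fundamental_component` are hypothetical.

```python
import numpy as np

def estimate_fundamental_weights(X, price):
    """Least-squares estimate of (omega_0, omega_1, ..., omega_N)."""
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # intercept column first
    omega, *_ = np.linalg.lstsq(design, np.asarray(price, dtype=float), rcond=None)
    return omega

def fundamental_component(X, omega):
    """Evaluate F_t = omega_0 + sum_j omega_j * X[t, j]."""
    X = np.asarray(X, dtype=float)
    return omega[0] + X @ omega[1:]

# Synthetic data: 200 periods and 5 standardized indicators (assumption)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_omega = np.array([10.0, 2.0, -1.0, 0.5, 0.0, 3.0])
price = true_omega[0] + X @ true_omega[1:] + rng.normal(scale=0.1, size=200)

omega_hat = estimate_fundamental_weights(X, price)
F_hat = fundamental_component(X, omega_hat)
S_hat = price - F_hat  # residual, read as S_t under P_t = F_t + S_t
```

Under this reading the speculative component is simply whatever the fundamentals regression leaves unexplained, so its size depends heavily on which indicators are included.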

The speculative component of Bitcoin prices can be estimated using the following equation:

$$S_t = \sum_{j=1}^{M} \alpha_j Y_{j,t} + \beta S_{t-1}$$

where $S_t$ is the speculative component of Bitcoin prices at time $t$, $Y_{j,t}$ are the **technical indicators** and **sentiment measures** at time $t$, $M$ is the total number of technical indicators and sentiment measures, $\alpha_j$ are the corresponding weights, and $\beta$ is the persistence parameter, which controls how much of the previous period's speculative component carries over.
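
The recursion for $S_t$ can be evaluated period by period once the weights are known. The following is a minimal sketch with made-up indicator values and weights; the function name `speculative_component` and the starting value `s0` are assumptions introduced for illustration.

```python
import numpy as np

def speculative_component(Y, alpha, beta, s0=0.0):
    """Iterate S_t = sum_j alpha_j * Y[t, j] + beta * S_{t-1}, with S_{-1} = s0."""
    Y = np.asarray(Y, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    S = np.empty(len(Y))
    prev = s0
    for t in range(len(Y)):
        prev = Y[t] @ alpha + beta * prev
        S[t] = prev
    return S

# Two hypothetical indicators observed over four periods
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])
S = speculative_component(Y, alpha=[2.0, 4.0], beta=0.5)
# S is 2.0, 5.0, 2.5, 1.25: once the indicators go quiet, S halves each period
```

With $|\beta| < 1$ a shock to the indicators decays geometrically; values of $\beta$ at or above 1 would make the speculative component persist or grow without bound.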

$Y_{j,t}$, which represents the $j$th technical indicator or sentiment measure at time $t$, can be written as:

$$Y_{j,t} = f_j (P_t, V_t, D_t, N_t, Z_t, A_t, E_t)$$

where $P_t$ is the price of Bitcoin at time $t$, $V_t$ is the trading volume of Bitcoin at time $t$, $D_t$ is the mining difficulty of Bitcoin at time $t$, $N_t$ is the number of active Bitcoin nodes at time $t$, $Z_t$ is the market sentiment of Bitcoin at time $t$, $A_t$ is the adoption rate of Bitcoin at time $t$, and $E_t$ is the external news and events related to Bitcoin at time $t$. The symbols $D_t$ and $Z_t$ keep these inputs distinct from the indicator count $M$ and the speculative component $S_t$. The function $f_j$ represents the specific technical indicator or sentiment measure being used, and may have different inputs and parameters depending on the indicator.

For example, the formula for the **moving average** indicator ($MA$) with a window size of $k$ can be written as:

$$Y_{MA,t} = \frac{1}{k} \sum_{i=t-k+1}^{t} P_i$$

where $P_i$ is the price of Bitcoin at time $i$.
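
As a concrete illustration, the moving average can be computed with a cumulative sum. This is a sketch on toy prices; the function name `moving_average` is hypothetical, and leaving the first $k-1$ entries undefined is a choice made here, not something the text specifies.

```python
import numpy as np

def moving_average(prices, k):
    """Trailing k-period mean; the first k-1 entries are NaN (window incomplete)."""
    prices = np.asarray(prices, dtype=float)
    out = np.full(len(prices), np.nan)
    csum = np.cumsum(np.insert(prices, 0, 0.0))
    # Sum over the window ending at t is csum[t+1] - csum[t+1-k]
    out[k - 1:] = (csum[k:] - csum[:-k]) / k
    return out

ma = moving_average([10, 12, 14, 16, 18], k=3)
# ma is NaN, NaN, 12.0, 14.0, 16.0
```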

Similarly, the formula for the **relative strength index** ($RSI$) with a window size of $k$ can be written as:

$$Y_{RSI,t} = 100 - \frac{100}{1 + RS}$$

where $RS$ is the relative strength at time $t$, the ratio of the summed price gains to the summed price losses over the window:

$$RS = \frac{\sum_{i=t-k+1}^{t} \max(P_i - P_{i-1}, 0)}{\sum_{i=t-k+1}^{t} \max(P_{i-1} - P_i, 0)}$$
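
A small sketch of the RSI follows, using the conventional definition of $RS$ as summed gains divided by summed losses over the last $k$ price changes. It uses simple sums rather than the exponentially smoothed averages common in charting software, and the handling of windows without losses is an assumption made here.

```python
import numpy as np

def rsi(prices, k):
    """Unsmoothed RSI over the last k price changes; NaN until k changes exist."""
    prices = np.asarray(prices, dtype=float)
    diff = np.diff(prices)            # diff[i] = P[i+1] - P[i]
    gains = np.maximum(diff, 0.0)
    losses = np.maximum(-diff, 0.0)
    out = np.full(len(prices), np.nan)
    for t in range(k, len(prices)):
        gain = gains[t - k:t].sum()
        loss = losses[t - k:t].sum()
        if loss == 0:
            # No losses: RS is unbounded, so RSI is 100; a flat window is set to 50 (assumption)
            out[t] = 100.0 if gain > 0 else 50.0
        else:
            out[t] = 100.0 - 100.0 / (1.0 + gain / loss)
    return out

values = rsi([10, 11, 10, 12, 11], k=4)
# Gains sum to 3 and losses to 2, so RS = 1.5 and values[4] = 60.0
```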

The formula for the **stochastic oscillator** ($SO$) with a window size of $k$ can be written as:

$$Y_{SO,t} = \frac{P_t - \min_{k}(P)}{\max_{k}(P) - \min_{k}(P)} \times 100$$

where $\min_{k}(P)$ and $\max_{k}(P)$ are the minimum and maximum prices of Bitcoin over the past $k$ periods, respectively.
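
The oscillator can be sketched in the same way. As in the formula, the window minimum and maximum are taken from the price series itself rather than from separate high and low series; the function name and the value returned for a flat window are assumptions.

```python
import numpy as np

def stochastic_oscillator(prices, k):
    """Position of P_t within its k-period range, scaled to 0..100."""
    prices = np.asarray(prices, dtype=float)
    out = np.full(len(prices), np.nan)
    for t in range(k - 1, len(prices)):
        window = prices[t - k + 1:t + 1]
        lo, hi = window.min(), window.max()
        # A flat window has zero range; report the midpoint (assumption)
        out[t] = 50.0 if hi == lo else 100.0 * (prices[t] - lo) / (hi - lo)
    return out

so = stochastic_oscillator([10, 14, 12, 11, 13], k=3)
# so is NaN, NaN, 50.0, 0.0, 100.0
```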

The **sentiment analysis** indicator ($SA$) at time $t$ can be written as:

$$Y_{SA,t} = f_{SA}(T_t, A_t, E_t)$$

where $T_t$ is the text data extracted from news articles and social media related to Bitcoin at time $t$, and $f_{SA}$ is a function that processes the text data to generate a sentiment score. The sentiment score may be based on techniques such as keyword analysis, natural language processing, or machine learning.
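
A keyword-based version of $f_{SA}$, the simplest of the techniques listed, might look like the sketch below. The word lists, the scoring rule, and the function name `sentiment_score` are all illustrative assumptions, and the sketch uses only the text input $T_t$, ignoring $A_t$ and $E_t$.

```python
import re

# Tiny hand-picked lexicons (assumption; a real study would use a validated lexicon)
POSITIVE = {"bullish", "rally", "adoption", "surge", "gain"}
NEGATIVE = {"bearish", "crash", "ban", "hack", "loss"}

def sentiment_score(texts):
    """Return (pos - neg) / (pos + neg) in [-1, 1]; 0.0 when no keyword matches."""
    pos = neg = 0
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            pos += token in POSITIVE
            neg += token in NEGATIVE
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

texts = [
    "Bitcoin rally continues as adoption grows",
    "Exchange hack sparks crash fears",
    "Analysts turn bullish",
]
score = sentiment_score(texts)  # 3 positive and 2 negative hits give 0.2
```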

## Model 2: Modified version of the model proposed by Phillips et al. (2011)

The second model we use is based on the behavioral finance literature, which suggests that investors' irrational behavior can create speculative bubbles in financial markets. 

This model assumes that market sentiment and investor behavior are driven by a range of psychological biases, including herding behavior, overconfidence, and confirmation bias.

To estimate the speculative bubble component of Bitcoin prices using this model, we use a range of behavioral finance indicators, including the ratio of Bitcoin to the market capitalization of all cryptocurrencies, the Google Trends search volume index for the term "Bitcoin," and the number of Bitcoin-related tweets and Reddit posts.


1. I will use the second model, which is based on behavioral finance literature, to estimate the speculative bubble component of Bitcoin prices.


In [None]:
# Load necessary libraries
import pandas as pd
import numpy as np

# Load Bitcoin price data
bitcoin_data = pd.read_csv('bitcoin_price_data.csv')

# Define a function to calculate the speculative bubble component using behavioral finance indicators
def calculate_speculative_bubble(data):
    # Bitcoin's share of the total cryptocurrency market capitalization (already a ratio)
    bitcoin_market_cap_ratio = data['Bitcoin Market Share']

    # Google Trends search volume index for the term "Bitcoin"
    google_trends_index = data['Google Trends Index']

    # Number of Bitcoin-related tweets and Reddit posts
    twitter_count = data['Twitter Count']
    reddit_count = data['Reddit Count']

    # The indicators are on very different scales, so rescale each to [0, 1]
    # before combining them (min-max scaling and equal weights are assumptions)
    indicators = pd.concat(
        [bitcoin_market_cap_ratio, google_trends_index, twitter_count, reddit_count], axis=1)
    scaled = (indicators - indicators.min()) / (indicators.max() - indicators.min())

    # Combine the behavioral finance indicators into a single metric
    speculative_bubble = scaled.mean(axis=1)

    return speculative_bubble

# Calculate the speculative bubble component for the Bitcoin price data
speculative_bubble = calculate_speculative_bubble(bitcoin_data)


2. I assume that market sentiment and investor behavior are driven by psychological biases such as herding behavior, overconfidence, and confirmation bias.


In [None]:
# Define a function to calculate the speculative bubble component using behavioral finance indicators and psychological biases
def calculate_speculative_bubble_with_bias(data):
    # Calculate the speculative bubble component using the calculate_speculative_bubble function defined in Step 1
    speculative_bubble = calculate_speculative_bubble(data)
    
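    # Apply a weighting function to the speculative bubble component based on psychological biases.
    # The multipliers are illustrative placeholders, not estimated values.
    herding_bias = 1.2
    overconfidence_bias = 0.8
    confirmation_bias = 1.1
    # Average the three multipliers into a single weight
    bias_weight = (herding_bias + overconfidence_bias + confirmation_bias) / 3
    speculative_bubble_with_bias = speculative_bubble * bias_weight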
    
    return speculative_bubble_with_bias

# Calculate the speculative bubble component with psychological biases for the Bitcoin price data
speculative_bubble_with_bias = calculate_speculative_bubble_with_bias(bitcoin_data)


3. I will use the speculative bubble component calculated in Steps 1 and 2 to estimate the overall speculative component of Bitcoin prices.


In [None]:
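# Map the bias-adjusted score to a speculative share of the price, between 0 and 1
speculative_share = speculative_bubble_with_bias / (1 + speculative_bubble_with_bias)

# Scale the shares by the observed price so that the components are in price units
# (assumes the CSV has a 'Price' column)
overall_speculative_component = speculative_share * bitcoin_data['Price']

# Calculate the fundamental component of Bitcoin prices
fundamental_component = (1 - speculative_share) * bitcoin_data['Price']

# Combine the fundamental and speculative components into a single metric
bitcoin_price_estimate = fundamental_component + overall_speculative_component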


## Model 3: The Log-Periodic Power Law (LPPL) by Didier Sornette (1990)

The Log-Periodic Power Law (LPPL) model is a tool used to detect speculative bubbles in financial markets and to estimate when they are likely to end. It is based on the theory that, during a bubble, asset prices grow faster than exponentially, with oscillations that speed up as a critical time approaches. Around that critical time the bubble is most likely to end in a crash or a change of regime, as market participants realize that the asset is overvalued.

To apply the LPPL model to Bitcoin prices, we first need to gather historical price data for Bitcoin. We can do this by accessing an API that provides historical price data, such as the Coinbase API.

Once we have the historical price data, we can fit the LPPL model to the data using nonlinear regression. The LPPL model has several parameters that need to be estimated, including the critical time, the amplitude, and the frequency.
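
For reference, the functional form usually fitted in LPPL studies is given below, in the same parameterization as the `lppl` function used in this section:

$$P(t) = A + B\,(t_c - t)^{m}\left[1 + C\cos\left(\omega \ln(t_c - t) + \phi\right)\right], \qquad t < t_c$$

Here $t_c$ is the critical time, $A$ is the value the price approaches at $t_c$, $B$ sets the amplitude of the power-law trend, $m$ is its exponent, $C$ is the relative amplitude of the oscillations, $\omega$ is their angular log-frequency, and $\phi$ is a phase. The expression is only defined for $t < t_c$, so any fitting routine has to keep $t_c$ beyond the last observation. In the literature the left-hand side is often the logarithm of the price rather than the price itself.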

After estimating the LPPL parameters, we can use the model to judge when a speculative bubble is likely to end: the estimated critical time marks the most probable date of a crash or change of regime. We can also monitor the divergence between the predicted price and the actual price, which indicates how well the fitted bubble pattern still describes the market.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Get historical Bitcoin prices
bitcoin_data = pd.read_csv('bitcoin_prices.csv')

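# Define the LPPL function (only defined for t < tc)
def lppl(t, A, B, C, tc, m, omega, phi):
    dt = tc - t
    return A + B*(dt**m)*(1 + C*np.cos(omega*np.log(dt) + phi))

# Define the time variable
t = np.arange(len(bitcoin_data))
n = len(t)
price = bitcoin_data['Price'].to_numpy(dtype=float)

# Starting values and bounds keep tc beyond the sample so that tc - t stays positive.
# The specific numbers are illustrative assumptions, not calibrated choices.
p0 = [price.max(), -1.0, 0.1, n + 30, 0.5, 8.0, 0.0]
lower = [-np.inf, -np.inf, -1.0, n, 0.1, 2.0, -2*np.pi]
upper = [np.inf, 0.0, 1.0, n + 365, 0.9, 25.0, 2*np.pi]

# Fit the LPPL model to the Bitcoin prices
popt, pcov = curve_fit(lppl, t, price, p0=p0, bounds=(lower, upper), max_nfev=20000)

# Extract the LPPL parameters
A, B, C, tc, m, omega, phi = popt

# Extrapolate the LPPL path up to, but not beyond, the critical time tc
future_t = np.arange(n, min(n + 365, int(np.ceil(tc))))
future_prices = lppl(future_t, A, B, C, tc, m, omega, phi)

# Plot the Bitcoin prices, the in-sample fit, and the extrapolation
plt.plot(t, price, label='Bitcoin Prices')
plt.plot(t, lppl(t, *popt), label='LPPL Fit')
plt.plot(future_t, future_prices, label='LPPL Extrapolation')
plt.legend()
plt.show()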

## Model 4: "Beauty Contest" model of Keynes (1936)

1. Assemble a group of participants: In the beauty contest model, participants are asked to guess the average of a particular value, in this case, the future price of Bitcoin. To simulate this, we can create a list of "participants" and ask them to guess the future price.

In [None]:
import random

num_participants = 10
guesses = []
for i in range(num_participants):
    guess = random.uniform(0, 100000)
    guesses.append(guess)


2. Calculate the average guess: In the beauty contest model, what matters is not the true value but the average of the participants' guesses. To simulate this, we can calculate the average of the guesses.


In [None]:
average_guess = sum(guesses) / len(guesses)


3. Calculate the winning guess: The winning guess in the beauty contest model is the guess that is closest to two-thirds of the average guess. To simulate this, we can calculate two-thirds of the average guess and find the guess that is closest to it.


In [None]:
two_thirds_average = (2 / 3) * average_guess
winning_guess = min(guesses, key=lambda x: abs(x - two_thirds_average))
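
The two-thirds target rewards guessing below the crowd, which is why repeated reasoning pushes guesses downward. The sketch below illustrates this with a simple level-k rule; the rule and the function name `level_k_guess` are assumptions added for illustration and are not part of the simulation above.

```python
def level_k_guess(k, upper=100000.0, p=2 / 3):
    """Guess of a level-k player.

    Level 0 picks the midpoint of the range; each higher level assumes everyone
    else is one level lower and best-responds by multiplying that guess by p.
    """
    return (upper / 2) * p ** k

guesses_by_level = [level_k_guess(k) for k in range(4)]
# Roughly 50000, 33333, 22222, 14815: each level of reasoning lowers the guess
```

Random uniform guesses, as used in the simulation, correspond to a population of players who do no such reasoning at all.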


4. Use the winning guess to estimate Bitcoin prices: Once we have the winning guess, we can use it to estimate the future price of Bitcoin. Because each guess is already a price level, the winning guess itself is the estimate, and comparing it with the current price gives the implied return.


In [None]:
current_price = 50000
predicted_price = winning_guess  # each guess is already a price level
implied_return = predicted_price / current_price - 1


5. Repeat the process: To get a more accurate estimate of the future price, we can repeat this process multiple times with different groups of participants and take the average of the predicted prices.


In [None]:
num_trials = 10
predicted_prices = []
for i in range(num_trials):
    guesses = []
    for j in range(num_participants):
        guess = random.uniform(0, 100000)
        guesses.append(guess)

    average_guess = sum(guesses) / len(guesses)
    two_thirds_average = (2 / 3) * average_guess
    winning_guess = min(guesses, key=lambda x: abs(x - two_thirds_average))

    predicted_price = winning_guess  # each guess is already a price level
    predicted_prices.append(predicted_price)

average_predicted_price = sum(predicted_prices) / len(predicted_prices)


## Model 5: Shiller's CAPE Ratio (2000)


Shiller's CAPE ratio has been applied to the valuation of Bitcoin in the literature. However, it is important to note that the applicability of traditional stock market valuation models, such as the CAPE ratio, to the cryptocurrency market is still a matter of debate and further research is needed to determine their effectiveness in predicting Bitcoin prices.

1. Import the necessary libraries: We need to import the libraries required for data manipulation, analysis, and visualization. The most commonly used libraries are pandas, numpy, and matplotlib.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


2. Load the Bitcoin price data: We need to load the historical price data of Bitcoin to calculate its CAPE ratio. We can use any reliable cryptocurrency data source such as CoinMarketCap or Yahoo Finance to obtain the data. Here, we assume the data is in CSV format with the date and closing price columns.


In [None]:
data = pd.read_csv('bitcoin_prices.csv')


3. Calculate the average earnings: To calculate the CAPE ratio, we need to determine the average earnings of Bitcoin over the long term. In the case of Bitcoin, earnings can be approximated by miners' daily revenue (block rewards plus transaction fees). We can obtain this data from blockchain.info.


In [None]:
url = "https://api.blockchain.info/charts/miners-revenue?timespan=all&format=csv"
# Assumes the endpoint returns two unlabeled columns: timestamp and value
miners_revenue = pd.read_csv(url, header=None, names=['x', 'y'])
avg_earnings = miners_revenue['y'].mean()


4. Calculate the 10-year moving average of Bitcoin prices: We calculate the 10-year moving average of Bitcoin's closing price, using 3,650 daily observations as the window. Shiller's CAPE applies its 10-year average to earnings rather than to price, so this series serves as a smoothed reference level and does not enter the ratio in the next step.


In [None]:
data['10-year MA'] = data['Closing Price'].rolling(window=3650).mean()


5. Calculate the CAPE ratio: The CAPE ratio is calculated by dividing the current price by the average earnings over the long term. In the case of Bitcoin, we divide the current closing price by the average daily miners' revenue.


In [None]:
data['CAPE Ratio'] = data['Closing Price'] / avg_earnings


6. Plot the CAPE ratio: Finally, we plot the CAPE ratio over time.


In [None]:
plt.plot(pd.to_datetime(data['Date']), data['CAPE Ratio'])
plt.title('Bitcoin CAPE Ratio')
plt.xlabel('Year')
plt.ylabel('CAPE Ratio')
plt.show()
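
The ratio above divides every price by a single full-sample average of earnings. Shiller's original measure instead divides the price by a trailing 10-year average of inflation-adjusted earnings. A rolling variant closer to that definition could be sketched as follows; the function name `cape_style_ratio`, the toy numbers, and the short window are assumptions for illustration, and no inflation adjustment is made.

```python
import pandas as pd

def cape_style_ratio(price, earnings, window):
    """Price divided by the trailing `window`-period mean of earnings.

    The result is NaN until a full window of earnings is available.
    """
    smoothed = earnings.rolling(window=window, min_periods=window).mean()
    return price / smoothed

# Toy series; with daily data a 10-year window would be about 3650 observations
price = pd.Series([100.0, 110.0, 120.0, 130.0])
earnings = pd.Series([4.0, 6.0, 5.0, 7.0])
ratio = cape_style_ratio(price, earnings, window=3)
# Trailing means are 5.0 and 6.0, so the last two ratios are 24.0 and about 21.67
```

Because the denominator moves with recent earnings, this version responds to changes in miners' revenue, whereas dividing by a constant only rescales the price series.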


## Other models from the literature
- Autoregressive Integrated Moving Average (ARIMA) models: Developed by George Box and Gwilym Jenkins in the 1970s.
- Artificial Neural Networks (ANN): The concept of artificial neural networks has its roots in the work of Warren McCulloch and Walter Pitts in the 1940s. The development of modern neural networks, including backpropagation, can be attributed to multiple researchers including Paul Werbos in 1974, David Rumelhart and James McClelland in 1986, and Geoffrey Hinton in the 2000s.
- Long Short-Term Memory (LSTM) models: Developed by Sepp Hochreiter and Jürgen Schmidhuber in 1997.
- Support Vector Regression (SVR) models: Developed by Vladimir Vapnik and colleagues in the 1990s.
- Random Forest (RF) models: Developed by Leo Breiman in 2001.
- Bayesian Neural Networks (BNN): Bayesian neural networks have been developed by multiple researchers including David MacKay in the 1990s and Radford Neal in 1995.
- Bayesian regression model: Bayesian regression has been developed by multiple researchers including Harold Jeffreys in the 1940s and Bruno de Finetti in the 1950s.
- Vector Autoregression (VAR) models: Introduced by Christopher Sims in 1980.
- Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models: Introduced by Tim Bollerslev in 1986, generalizing the ARCH model of Robert Engle (1982).
- Multivariate GARCH (MGARCH) models: Developed from the late 1980s onward by Tim Bollerslev, Robert Engle, Jeffrey Wooldridge, and others.
- Markov switching GARCH (MSGARCH) models: Developed by several researchers, including Stephen Gray in 1996 and Franc Klaassen in 2002.
- Markov regime-switching model: Popularized by James Hamilton in 1989.
- Heterogeneous agent model: Heterogeneous agent models have been developed by multiple researchers including Cars Hommes in the 1990s and Blake LeBaron in the 2000s.