# Kraken - Structural breaks

Do you want to use your data science skills to build models on financial data, design algorithmic trading strategies, or simply run some basic analyses of Bitcoin exchange data? This series of notebooks starts with the fundamental step of acquiring historical trade data and builds from there toward financial analytics and algorithmic trading.

This notebook is the third in a series where I work on obtaining and deriving insights from data from the [Kraken Exchange](https://www.kraken.com/), using [CCXT](https://github.com/ccxt/ccxt), a widely used open-source library for connecting to cryptocurrency exchanges. The input of this notebook is the data gathered in my previous notebook.


# Content

- What are structural breaks?
- What is the CUSUM filter?
- How do you apply CUSUM filters on price tickers?
- Why would you apply a CUSUM filter on variance?
- How do you apply a CUSUM filter on variance?

# What are structural breaks?
In the context of K-lines (candlestick bars), which are used in the technical analysis of financial markets, a structural break is a significant shift in the pattern of price movements that the bars represent. Such breaks can indicate changes in market sentiment, investor behavior, or underlying market dynamics, and may signal the start of a new trend or the end of an existing one. Traders look for them when deciding whether to buy, sell, or hold an asset.

In [1]:
import pandas as pd
import numpy as np
import plotly.figure_factory as ff
import plotly.graph_objects as go

In [2]:
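def apply_cusum_filter(historical_trades_df, limit=5000, column: str = "open"):
    """Return the index labels of the bars where the symmetric CUSUM filter fires.

    An event is recorded whenever the cumulative rise or the cumulative fall
    of `column` since the last reset exceeds `limit`.
    """
    events = []
    differences = historical_trades_df[column].diff()
    # Running sums of upward and downward moves since the last reset
    sum_pos = 0
    sum_neg = 0
    for index in differences.index[1:]:
        difference = differences.loc[index]
        if pd.isna(difference):
            # Skip undefined differences, e.g. the warm-up rows of a rolling statistic
            continue
        sum_pos = max(0, sum_pos + difference)
        sum_neg = min(0, sum_neg + difference)
        if sum_neg < -limit:
            sum_neg = 0
            events.append(index)
        if sum_pos > limit:
            sum_pos = 0
            events.append(index)
    return events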
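
To see what the filter does on a series small enough to check by hand, here is a minimal sketch. It keeps two running sums, $S^+_t = \max(0, S^+_{t-1} + \Delta p_t)$ and $S^-_t = \min(0, S^-_{t-1} + \Delta p_t)$, and records an event (then resets that sum) whenever one of them moves past the limit. The helper `cusum_events` and the toy prices are assumptions made for illustration; they are not part of the Kraken data.

```python
import pandas as pd


def cusum_events(series: pd.Series, limit: float) -> list:
    """Illustrative symmetric CUSUM filter on a plain Series (hypothetical helper)."""
    events = []
    sum_pos, sum_neg = 0.0, 0.0
    # diff() leaves a NaN in the first position, so drop it before looping
    for index, difference in series.diff().dropna().items():
        sum_pos = max(0.0, sum_pos + difference)
        sum_neg = min(0.0, sum_neg + difference)
        if sum_neg < -limit:
            sum_neg = 0.0
            events.append(index)
        if sum_pos > limit:
            sum_pos = 0.0
            events.append(index)
    return events


# Made-up prices: a rise, a fall, then a sudden jump
toy_prices = pd.Series([100, 101, 103, 106, 105, 104, 100, 99, 99, 105])
toy_events = cusum_events(toy_prices, limit=5)
print(toy_events)
```

With a limit of 5, the sketch flags indices 3, 6 and 9: a cumulative rise of 6, a cumulative fall of 6, and a single jump of 6. Small moves in between never accumulate enough to trigger an event, which is the point of the filter.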


In [3]:
historical_time_bars_df = pd.read_csv("/kaggle/input/kraken-processed-historical-data/kraken_btcusdt_2022_time_bar.csv")
historical_tick_bars_df = pd.read_csv("/kaggle/input/kraken-processed-historical-data/kraken_btcusdt_2022_tick_bar.csv")

In [4]:
# Compute the structural breaks on the open price of the time bars
events = apply_cusum_filter(historical_time_bars_df, limit=5000)

# Look up the timestamp and open price of each flagged bar
# (the filter returns index labels, hence .loc)
structural_break_open_times = historical_time_bars_df.loc[events, "open_time"].tolist()
structural_break_open_prices = historical_time_bars_df.loc[events, "open"].tolist()

fig = go.Figure(data=[go.Candlestick(x=historical_time_bars_df['open_time'],
                open=historical_time_bars_df['open'],
                high=historical_time_bars_df['high'],
                low=historical_time_bars_df['low'],
                close=historical_time_bars_df['close'])])

# Adding title
fig.update_layout(title="Hourly Sampled Candlestick Chart")
# Mark each structural break with a cross at that bar's open price
fig.add_trace(go.Scatter(x=structural_break_open_times,
                         y=structural_break_open_prices,
                         mode='markers',
                         marker=dict(symbol='x', size=10, color='black'),
                         name='Structural Break'))

# Adding axis labels
fig.update_layout(xaxis=dict(title="Time"),
                  yaxis=dict(title="Price"))

fig.show()

In [5]:
# Compute the structural breaks on the open price of the tick bars
events = apply_cusum_filter(historical_tick_bars_df, limit=5000)

# Look up the timestamp and open price of each flagged bar
# (the filter returns index labels, hence .loc)
structural_break_open_times = historical_tick_bars_df.loc[events, "open_time"].tolist()
structural_break_open_prices = historical_tick_bars_df.loc[events, "open"].tolist()

fig = go.Figure(data=[go.Candlestick(x=historical_tick_bars_df['open_time'],
                open=historical_tick_bars_df['open'],
                high=historical_tick_bars_df['high'],
                low=historical_tick_bars_df['low'],
                close=historical_tick_bars_df['close'])])

# Adding title
fig.update_layout(title="Tick Bar Candlestick Chart")
# Mark each structural break with a cross at that bar's open price
fig.add_trace(go.Scatter(x=structural_break_open_times,
                         y=structural_break_open_prices,
                         mode='markers',
                         marker=dict(symbol='x', size=10, color='black'),
                         name='Structural Break'))

# Adding axis labels
fig.update_layout(xaxis=dict(title="Time"),
                  yaxis=dict(title="Price"))
fig.show()

# Why would you apply a CUSUM filter on the variance?
Applying a CUSUM filter to the variance, rather than to the price itself, detects structural breaks in volatility: the filter fires when the rolling variance drifts far enough from its recent level. This is valuable because a change in volatility regime, for example after an unexpected event, can profoundly affect trading strategies and risk management even when the price level has barely moved. Monitoring the variance this way helps practitioners adapt to evolving market conditions.

# How do you apply a CUSUM filter on the variance?

1. Compute the rolling variance of the open price, here with a window size of $5$.
2. Re-use the CUSUM filter defined earlier, this time on the rolling-variance column, to flag the possible structural breaks.
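
The two steps can be sanity-checked on a toy series before running them on the Kraken bars. The sketch below is only an illustration under stated assumptions: the helper `cusum_events`, the alternating toy prices, and the limit of 50 are all made up for this example. The series oscillates by 1 around 20,000 for the first 100 bars and by 20 afterwards, so the only change is in volatility.

```python
import numpy as np
import pandas as pd


def cusum_events(series: pd.Series, limit: float) -> list:
    """Illustrative symmetric CUSUM filter on a plain Series (hypothetical helper)."""
    events = []
    sum_pos, sum_neg = 0.0, 0.0
    # dropna() skips the warm-up rows where the rolling variance is undefined
    for index, difference in series.diff().dropna().items():
        sum_pos = max(0.0, sum_pos + difference)
        sum_neg = min(0.0, sum_neg + difference)
        if sum_neg < -limit:
            sum_neg = 0.0
            events.append(index)
        if sum_pos > limit:
            sum_pos = 0.0
            events.append(index)
    return events


# Toy open prices: same mean throughout, but the swing grows from 1 to 20 at bar 100
bar_numbers = np.arange(200)
signs = np.where(bar_numbers % 2 == 0, 1.0, -1.0)
amplitude = np.where(bar_numbers < 100, 1.0, 20.0)
toy_open = pd.Series(20_000 + signs * amplitude)

# Step 1: rolling variance with a window of 5 bars
toy_rolling_var = toy_open.rolling(window=5).var()

# Step 2: CUSUM filter on the rolling variance
variance_events = cusum_events(toy_rolling_var, limit=50)
print(variance_events)
```

The events land on bars 100 to 104, the five bars over which the rolling window fills up with high-volatility observations. No event is raised inside either regime, because the rolling variance is flat there. The limit has to be chosen relative to the scale of the variance, which is why the real data below uses a much larger value than the price-based filter.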

In [6]:
historical_time_bars_df["rolling_var"] = historical_time_bars_df["open"].rolling(window=5).var()

In [7]:
fig = go.Figure()

# Add a line plot for the rolling variance column
fig.add_trace(go.Scatter(
    x=historical_time_bars_df["open_time"],
    y=historical_time_bars_df["rolling_var"],
    mode='lines',
    name='Rolling Variance'
))

# Update layout
fig.update_layout(
    title='Rolling Variance Over Time',
    xaxis_title='Date',
    yaxis_title='Rolling Variance',
    template='plotly_dark'  # Choose a plotly template
)
# Show plot
fig.show()

In [8]:
# Compute the structural breaks on the rolling variance of the time bars
events = apply_cusum_filter(historical_time_bars_df, limit=500_000, column="rolling_var")

# Look up the timestamp and open price of each flagged bar
# (the filter returns index labels, hence .loc)
structural_break_open_times = historical_time_bars_df.loc[events, "open_time"].tolist()
structural_break_open_prices = historical_time_bars_df.loc[events, "open"].tolist()

fig = go.Figure(data=[go.Candlestick(x=historical_time_bars_df['open_time'],
                open=historical_time_bars_df['open'],
                high=historical_time_bars_df['high'],
                low=historical_time_bars_df['low'],
                close=historical_time_bars_df['close'])])

# Adding title
fig.update_layout(title="Hourly Sampled Candlestick Chart with Variance-Based Structural Breaks")
# Mark each structural break with a cross at that bar's open price
fig.add_trace(go.Scatter(x=structural_break_open_times,
                         y=structural_break_open_prices,
                         mode='markers',
                         marker=dict(symbol='x', size=10, color='black'),
                         name='Structural Break'))

# Adding axis labels
fig.update_layout(xaxis=dict(title="Time"),
                  yaxis=dict(title="Price"))
fig.show()