# Volatility Analysis

Volatility analysis is a critical aspect of financial market strategies, providing insights into price fluctuations and helping traders assess risk and identify profitable opportunities. In this analysis, several volatility metrics are calculated to give a comprehensive view of asset behavior and market risk. Here's how the analysis works:

- **Identifying Volatility Metrics**: A set of volatility indicators is computed to assess the magnitude and frequency of price movements. These metrics include:
  - **Standard Deviation of Returns**: Measures the overall price variability over a set period. Higher values indicate higher volatility.
  - **Average True Range (ATR) Relative to Price**: Measures the average range of price movement, adjusted for the closing price, with higher values indicating greater volatility.
  - **Relative Strength Index (RSI)**: Assesses whether an asset is overbought or oversold. Extreme values (>70 or <30) can signal increased volatility potential.
  - **Price Range Relative to Price**: Evaluates the average daily price movement relative to the closing price, with higher values indicating more volatile assets.
  - **Skewness of Returns**: Indicates asymmetry in the asset's price distribution, with negative skewness suggesting the likelihood of larger downward movements.
  - **Choppiness Index (CHOP)**: Measures how erratic the price movement is. A higher value indicates more choppy, volatile price behavior.
  - **Cumulative Volatility**: Sums the rolling volatility over a set period, highlighting overall market fluctuation.
  - **Logarithmic Returns Volatility**: Provides a more nuanced volatility measurement by considering the compounding of price changes over time.

<!-- - **Analyzing Historical Performance**: Historical price data is analyzed to determine how the asset behaves under different market conditions. These metrics allow traders to identify periods of high and low volatility, helping in the decision-making process for risk management and trade entry/exit. -->

- **Executing Trades**: By identifying assets with high volatility, traders can use this information to adjust their trading strategies. For example, increased volatility may present short-term trading opportunities, while lower volatility periods may suggest a more cautious approach. Traders might enter positions during periods of high volatility if they anticipate price movement in their favor or avoid positions during low volatility to reduce risk.

- **Monitoring and Adjusting**: Volatility analysis requires continuous monitoring of asset behavior and market conditions. Traders adjust their strategies in real-time, responding to sudden volatility spikes or drops, ensuring that risk is managed appropriately.

In this notebook, volatility analysis is demonstrated using historical price data from various cryptocurrency exchanges such as Binance, OKX, and Bybit. The computed metrics offer valuable insights for market participants seeking to understand asset behavior and make informed decisions in volatile markets.


## Prepare your Environment

Ensure that the correct kernel is selected for this notebook. If you are following the instructions in *README.md*, the kernel is 'crypto-trading-analysis'. To check, click on 'Kernel' at the top bar, select 'Change Kernel...' and select the correct kernel. For convenience, ensure that 'Always start the preferred kernel' is ticked. Click 'Select' to confirm the setting.

Install the environment's dependencies using the command below. After installation, restart the kernel to use the updated packages. To restart, click on 'Kernel' at the top bar, select 'Restart Kernel', then click 'Restart'. Skip this step if you have already done it.

In [None]:
pip install -r requirements.txt

## Import packages

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import sys
from datetime import datetime
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from statsmodels.tsa.stattools import coint
from scipy.stats import skew
from itertools import combinations
from utils import calculate_profit, plot_strategy
from data_manager import load_ts_df, process_data, sanitize_data

## Process Price Dataframe

- Before proceeding, ensure that the price data has been downloaded using ***'data_manager.py'***.
- Enter the ***cex*** (Centralized Exchange) and ***interval*** values used for data download to load the relevant *.pkl* files and retrieve the dataframe.
- You can specify a batch of pairs to load using the ***selected_pairs*** variable. If no pairs are selected, all available pairs will be loaded by default.
- Note that some pairs might be new and may lack sufficient data within the downloaded timeframe. Such pairs will be removed based on the ***nan_remove_threshold*** setting, which defines the maximum fraction of NaN values allowed relative to the total data points. For example, with a ***nan_remove_threshold*** of 0.1, if a pair has 100 data points and 15 are NaN (15% exceeds the 10% limit), the pair will be excluded.
- From the remaining pairs, you can filter the top N volume pairs using the ***top_n_volume_pairs*** parameter.
- This part of the code will also ensure that all timeseries columns have the same number of data points.
- The earliest and latest dates for all pairs will be recorded. These dates can then be used to determine the timeframe for slicing the data in the next step.
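
The NaN-threshold rule described above can be sketched as follows. This is only an illustration and not the implementation in `data_manager.py`: the helper name `drop_sparse_pairs` is hypothetical, and the sketch assumes the `<PAIR>_<Field>` column naming visible in the output further down.

```python
import numpy as np
import pandas as pd


def drop_sparse_pairs(df, nan_remove_threshold):
    """Drop every pair that has a column whose NaN fraction exceeds the threshold.

    Hypothetical helper; assumes columns are named '<PAIR>_<Field>'.
    """
    pairs = np.array([col.rsplit("_", 1)[0] for col in df.columns])
    nan_fraction = df.isna().mean()                      # NaN share per column
    worst_per_pair = nan_fraction.groupby(pairs).max()   # worst column of each pair
    keep = set(worst_per_pair[worst_per_pair <= nan_remove_threshold].index)
    return df.loc[:, [pair in keep for pair in pairs]]


prices = pd.DataFrame({
    "AAAUSDT_Close": np.linspace(1.0, 2.0, 100),
    "BBBUSDT_Close": np.linspace(5.0, 6.0, 100),
})
prices.loc[:14, "AAAUSDT_Close"] = np.nan   # 15 of 100 points missing: 0.15 > 0.1
prices.loc[:4, "BBBUSDT_Close"] = np.nan    # 5 of 100 points missing: 0.05 <= 0.1

filtered = drop_sparse_pairs(prices, nan_remove_threshold=0.1)
print(list(filtered.columns))   # ['BBBUSDT_Close']
```

With the notebook's default `nan_remove_threshold = 1`, a rule like this would keep every pair, because a NaN fraction can never exceed 1.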

### Inputs

In [2]:
##### INPUTS #####
cex = 'binance'
interval = '1h'
nan_remove_threshold = 1

# Select only the pairs below to analyse. All pairs will be selected if the list is empty.
selected_pairs = []

# Select only the top N mean volume pairs from the selected pairs to analyse.
top_n_volume_pairs = 100

# Select volume filter mode. Options: ['rolling', 'mean'].
volume_filter_mode = 'rolling'
##################

In [3]:
print("\nMode: Volatility Strategy")
print("CEX: {}".format(str(cex).capitalize()))
print("Interval: {}".format(interval))
print("NaN Remove Threshold: {}".format(nan_remove_threshold))
print("Selected pairs to analyse: {}".format(selected_pairs))
print("Top N Volume Pairs: {}".format(top_n_volume_pairs))
print("Volume Filter Mode: {}".format(str(volume_filter_mode).capitalize()))

merged_df = process_data('volatility', cex, interval, nan_remove_threshold, selected_pairs,
                 top_n_volume_pairs, volume_filter_mode)

print("\n")


Mode: Volatility Strategy
CEX: Binance
Interval: 1h
NaN Remove Threshold: 1
Selected pairs to analyse: []
Top N Volume Pairs: 100
Volume Filter Mode: Rolling

Columns that contains NaN values:
                 Pair  NaN Count          Remark
16       BERAUSDT_Low        662  To Interpolate
15      BERAUSDT_High        662  To Interpolate
17     BERAUSDT_Close        662  To Interpolate
26      VVVUSDT_Close        463  To Interpolate
25        VVVUSDT_Low        463  To Interpolate
24       VVVUSDT_High        463  To Interpolate
9     PIPPINUSDT_High        345  To Interpolate
11   PIPPINUSDT_Close        345  To Interpolate
10     PIPPINUSDT_Low        345  To Interpolate
8      VINEUSDT_Close        345  To Interpolate
7        VINEUSDT_Low        345  To Interpolate
6       VINEUSDT_High        345  To Interpolate
1       ANIMEUSDT_Low        328  To Interpolate
2     ANIMEUSDT_Close        328  To Interpolate
0      ANIMEUSDT_High        328  To Interpolate
20     VTHOUSDT_Close 

## Sanitize the dataframe

- Slice the dataframe according to the specified ***start_date*** and ***end_date***. Choose ***start_date*** and ***end_date*** within the timeframe shown by the output of the previous cell.
- Interpolate any missing values in the dataframe.
- If the interpolation fails, just backfill with the latest valid value.
- Verify that all is as expected with an `assert` and check the shapes of 2 random pairs, which should have the same dimensions.
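
The steps above can be sketched for a single pair as shown below. This is a minimal illustration under stated assumptions, not the body of `sanitize_data`: the helper name `sanitize_frame` is hypothetical, the frame is assumed to have a `DatetimeIndex`, and the date bounds are converted to timestamps so that the end bound means midnight of `end_date`.

```python
import numpy as np
import pandas as pd


def sanitize_frame(df, start_date, end_date):
    """Slice to [start_date, end_date], interpolate gaps, then fill leftovers.

    Hypothetical helper for one pair's High/Low/Close frame.
    """
    sliced = df.loc[pd.Timestamp(start_date):pd.Timestamp(end_date)]
    filled = sliced.interpolate(method="linear")   # fills gaps between valid points
    filled = filled.bfill().ffill()                # covers gaps at the edges of the slice
    assert not filled.isna().any().any(), "NaN values remain after sanitizing"
    return filled


# Synthetic hourly data starting two days before the slice.
index = pd.date_range("2025-01-30", periods=300, freq=pd.Timedelta(hours=1))
close = pd.Series(np.linspace(100.0, 130.0, 300), index=index)
frame = pd.DataFrame({"High": close + 1, "Low": close - 1, "Close": close})
frame.iloc[48:52] = np.nan     # gap at the very start of the slice
frame.iloc[100:103] = np.nan   # gap in the middle of the slice

clean = sanitize_frame(frame, "2025-02-01", "2025-02-09")
print(clean.shape)   # (193, 3): 8 days of hourly bars plus the closing midnight bar
```

Linear interpolation cannot fill values that precede the first valid point, which is why the sketch falls back to a backfill for the leading gap.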

### Inputs

In [4]:
##### INPUTS #####
start_date = '2025-02-01'
end_date = '2025-02-09'
##################

In [5]:
print("\n")

data_sanitized, sorted_available_pairs = sanitize_data(merged_df, start_date, end_date, is_volatility_strategy=True)

if data_sanitized:
    print("-Data Check-")
    # Print the shapes of the first two pairs; they should match.
    for key in list(data_sanitized)[:2]:
        print("{}'s Data Shape: {}".format(key, data_sanitized[key].shape))

else:
    print("No data found.")

print("\n")



-Data Check-
1000BONKUSDT's Data Shape: (193, 3)
1000FLOKIUSDT's Data Shape: (193, 3)




## Volatility Metrics Calculation

This function calculates several volatility metrics used to assess the risk and fluctuation of asset prices over time. These metrics are commonly used in financial analysis and trading strategies to identify periods of high and low volatility.

1. **Standard Deviation of Returns**  
   The standard deviation of returns quantifies the variability of asset returns. A higher value indicates greater volatility in the asset's price movement. It is calculated as the sample standard deviation of the bar-to-bar percentage change in the asset's closing price (hourly bars for the `1h` interval used here).

   $$ \sigma_{\text{returns}} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (R_i - \bar{R})^2} $$  
   where:  
   - $R_i$ represents the return of bar $i$,  
   - $\bar{R}$ is the mean return over the period.

2. **Average True Range (ATR) Relative to Price**  
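   ATR measures market volatility as the rolling mean of the True Range. The True Range of each bar is the largest of the high minus the low, the absolute difference between the high and the previous close, and the absolute difference between the low and the previous close, so gaps between bars are included. The ATR is computed over a rolling window of `window_length` bars, and its latest value is divided by the latest closing price to normalize volatility across different price levels. Higher values indicate greater volatility.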

   $$ \text{ATR/Price} = \frac{\text{ATR}}{\text{Close}} \times 100 $$

3. **Relative Strength Index (RSI)**  
   RSI is a momentum oscillator that measures the speed and change of price movements. It ranges from 0 to 100, with values above 70 conventionally read as overbought and values below 30 as oversold; this notebook treats either extreme as a sign of higher volatility potential. RSI is derived from the ratio of the average gain to the average loss over the last `window_length` bars, using simple rolling means rather than Wilder's smoothing.

   $$ \text{RSI} = 100 - \frac{100}{1 + \frac{\text{Average Gain}}{\text{Average Loss}}} $$

4. **Coin's Daily Price Range Relative to Price**  
   This metric calculates the average difference between the high and low prices of each bar over the whole sample, normalized by the most recent closing price. Higher values indicate larger fluctuations per bar, suggesting greater volatility.

   $$ \text{Price Range/Price} = \frac{\text{Average High-Low Range}}{\text{Close}} \times 100 $$

5. **Skewness of Returns**  
   Skewness measures the asymmetry of the return distribution.

   - **High negative skewness** implies higher risk and higher potential volatility, because the asset can experience large downward moves. Extreme losses can drive volatility up in the long run, even if the asset's typical movements are relatively stable.
   - **High positive skewness** suggests fewer extreme losses and occasional larger upside moves. The price can still be volatile because of those large upward changes.
   - **Skewness as a risk indicator**: a negative skew often signals asymmetric risk, that is, a greater potential for sharp drops than for large gains. It is useful for understanding the tail risks that could cause sharp drops in price.

6. **Choppiness Index (CHOP)**  
   The Choppiness Index measures how much the price is fluctuating (choppy) over a given period, with higher values indicating more erratic price movement. This notebook uses a simplified ratio: the sum of the high-low ranges over the last `window_length` bars divided by the overall price range of the sample. Because the numerator adds up many bar ranges, the result is not capped at 100.

   $$ \text{CHOP} = 100 \times \frac{\sum_{i=1}^{n} (H_i - L_i)}{\text{Max Range}} $$  
   where $H_i$ and $L_i$ are the high and low prices of bar $i$ in the window, $n$ is `window_length`, and Max Range is the difference between the highest high and the lowest low over the whole sample.

7. **Cumulative Volatility**  
   This metric sums the rolling standard deviation of returns, computed over `window_length` bars, across the whole sample, providing a cumulative measure of volatility. A higher value signifies higher total price fluctuation over the period.

8. **Logarithmic Returns Volatility**  
   Logarithmic returns are used for calculating volatility as they account for compounding over time. The standard deviation of the log returns over the sample is used as a volatility measure, where higher values indicate higher volatility.

   $$ \text{Log Return} = \log\left(\frac{\text{Close}_t}{\text{Close}_{t-1}}\right) $$  
   where $\log$ denotes the natural logarithm.
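
For comparison with the CHOP ratio in the list above, the commonly cited form of the Choppiness Index takes the range of the window itself and applies a logarithmic normalization, so that values typically fall between 0 and 100. The sketch below is an illustration of that standard form, not the computation used in this notebook; the function name `choppiness_index` and the 14-bar default are assumptions.

```python
import numpy as np
import pandas as pd


def choppiness_index(df, window=14):
    """Standard-form CHOP: 100 * log10(sum(TR) / (max high - min low)) / log10(window)."""
    prev_close = df["Close"].shift(1)
    true_range = pd.concat(
        [df["High"] - df["Low"],
         (df["High"] - prev_close).abs(),
         (df["Low"] - prev_close).abs()],
        axis=1,
    ).max(axis=1)
    window_range = df["High"].rolling(window).max() - df["Low"].rolling(window).min()
    return 100 * np.log10(true_range.rolling(window).sum() / window_range) / np.log10(window)


def to_ohlc(close):
    # Synthetic bars: each bar spans 0.5 above and below its close.
    return pd.DataFrame({"High": close + 0.5, "Low": close - 0.5, "Close": close})


steps = np.arange(30)
trend_chop = choppiness_index(to_ohlc(pd.Series(100.0 + steps))).iloc[-1]
sideways_chop = choppiness_index(to_ohlc(pd.Series(100.0 + steps % 2))).iloc[-1]
print(round(trend_chop, 1), round(sideways_chop, 1))   # 15.4 89.1
```

In this form a steadily trending series scores low and a series that oscillates in place scores high, so a high reading describes sideways, choppy movement rather than large directional moves.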

By combining these volatility metrics, traders can get a clearer picture of market dynamics and asset behavior, allowing them to make informed decisions about risk management and strategy optimization.
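
One possible way to combine several metrics into a single ranking is to rescale each metric to the range 0 to 1 and average the results. The sketch below is a hypothetical illustration: the function name `composite_volatility_score`, the equal weighting, and the choice of two metrics are assumptions, and the sample values are three rows copied from the results table later in this notebook.

```python
import pandas as pd


def composite_volatility_score(metrics):
    """Min-max scale each metric column to [0, 1], then average across columns."""
    span = metrics.max() - metrics.min()
    scaled = (metrics - metrics.min()) / span   # a constant column becomes NaN
    # mean() skips NaN, so constant columns do not affect the score.
    return scaled.mean(axis=1).sort_values(ascending=False)


sample = pd.DataFrame(
    {
        "Standard Deviation": [0.011117, 0.021202, 0.035261],
        "ATR/Price": [1.674521, 2.859751, 6.088297],
    },
    index=["1000SATSUSDT", "ADAUSDT", "AI16ZUSDT"],
)

score = composite_volatility_score(sample)
print(score.round(3))   # AI16ZUSDT 1.0, ADAUSDT 0.343, 1000SATSUSDT 0.0
```

Metrics where a lower or more negative value means more volatility, such as skewness under the convention above, would need their sign flipped before being included in a score like this.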


### Inputs

In [7]:
##### INPUTS #####
window_length = 7*24 # 1 week
##################
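
Because `window_length` is counted in bars rather than days, the right value depends on `interval`. The helper below is a hypothetical sketch for converting a duration in days into a bar count; the name `bars_in_window` and the assumption that interval strings look like `15m`, `1h`, or `1d` are not taken from `data_manager.py`.

```python
SECONDS_PER_UNIT = {"m": 60, "h": 3600, "d": 86400}


def bars_in_window(interval, days):
    """Number of whole bars of the given interval that fit into `days` days."""
    bar_seconds = int(interval[:-1]) * SECONDS_PER_UNIT[interval[-1]]
    return int(days * 86400 // bar_seconds)


print(bars_in_window("1h", 7))    # 168, the same as 7*24
print(bars_in_window("15m", 7))   # 672
print(bars_in_window("1d", 14))   # 14
```

A rolling window only produces a value once it is full. With the 193-row slice shown in the data check above, a 168-bar window leaves 193 - 168 + 1 = 26 complete windows.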

In [19]:
def calculate_volatility_metrics(df, window_length):
    # Standard Deviation of Returns
    df['Return'] = df['Close'].pct_change()
    volatility_std = df['Return'].std()

    # Average True Range (ATR) Relative to Price (ATR/Price)
    df['HL'] = df['High'] - df['Low']
    df['HC'] = abs(df['High'] - df['Close'].shift(1))
    df['LC'] = abs(df['Low'] - df['Close'].shift(1))
    df['True Range'] = df[['HL', 'HC', 'LC']].max(axis=1)
    atr = df['True Range'].rolling(window=window_length).mean()
    atr_relative_to_price = (atr / df['Close']) * 100
    atr_relative_to_price = atr_relative_to_price.iloc[-1]

    # Relative Strength Index (RSI)
    delta = df['Close'].diff()
    gain = (delta.where(delta > 0, 0)).rolling(window=window_length).mean()
    loss = (-delta.where(delta < 0, 0)).rolling(window=window_length).mean()
    rs = gain / loss
    rsi = 100 - (100 / (1 + rs)).iloc[-1]

    # Coin's Daily Price Range Relative to Price (Price Range/Price)
    price_range = (df['High'] - df['Low']).mean()
    price_range_relative_to_price = (price_range / df['Close'].iloc[-1]) * 100

    # Skewness of Returns
    returns_skewness = skew(df['Return'].dropna())

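    # Choppiness Index (CHOP), simplified: the rolling sum of high-low ranges is divided
    # by the range of the whole frame (not of the window), so values can exceed 100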
    hl_range = df['High'] - df['Low']
    max_range = df['High'].max() - df['Low'].min()
    chop = 100 * (hl_range.rolling(window=window_length).sum() / max_range).iloc[-1]

    # Cumulative Volatility
    cum_volatility = df['Return'].rolling(window=window_length).std().sum()

    # Logarithmic Returns Volatility
    df['Log Return'] = np.log(df['Close'] / df['Close'].shift(1))
    log_volatility = df['Log Return'].std()

    return {
        'Standard Deviation (Higher = Higher Volatility)': volatility_std,
        'ATR/Price (Higher = Higher Volatility)': atr_relative_to_price,
        'Cumulative Volatility (Higher = More Volatile)': cum_volatility,
        'Logarithmic Returns Volatility (Higher = More Volatile)': log_volatility,
        'RSI (>70 or <30 = Higher Volatility)': rsi,
        'CHOP (Higher = More Choppy/Volatile)': chop,
        'Price Range/Price (Higher = Higher Volatility)': price_range_relative_to_price,
        'Skewness (More Negative = Higher Volatility)': returns_skewness,
    }

volatility_metrics = {}

for coin, df in data_sanitized.items():
    metrics = calculate_volatility_metrics(df, window_length)
    volatility_metrics[coin] = metrics

volatility_df = pd.DataFrame(volatility_metrics).T  # Transpose to have coins as rows
styled_df = volatility_df.style.background_gradient(cmap="coolwarm", axis=0)

styled_df

| Pair | Standard Deviation (Higher = Higher Volatility) | ATR/Price (Higher = Higher Volatility) | Cumulative Volatility (Higher = More Volatile) | Logarithmic Returns Volatility (Higher = More Volatile) | RSI (>70 or <30 = Higher Volatility) | CHOP (Higher = More Choppy/Volatile) | Price Range/Price (Higher = Higher Volatility) | Skewness (More Negative = Higher Volatility) |
|---|---|---|---|---|---|---|---|---|
| 1000BONKUSDT | 0.019007 | 2.941858 | 0.499999 | 0.019205 | 44.995778 | 762.732326 | 2.899729 | -0.776541 |
| 1000FLOKIUSDT | 0.019665 | 2.938626 | 0.516804 | 0.019873 | 43.533224 | 687.605549 | 2.915139 | -0.684945 |
| 1000PEPEUSDT | 0.019306 | 3.075348 | 0.506494 | 0.019477 | 44.204953 | 782.619269 | 3.05592 | -0.615807 |
| 1000SATSUSDT | 0.011117 | 1.674521 | 0.285406 | 0.011206 | 49.593496 | 1630.875576 | 1.754574 | -1.252072 |
| 1000SHIBUSDT | 0.017348 | 2.265057 | 0.458729 | 0.017573 | 46.964251 | 795.549738 | 2.164533 | -1.204127 |
| AAVEUSDT | 0.018745 | 2.88517 | 0.495677 | 0.018999 | 44.70032 | 822.434266 | 2.774299 | -1.059963 |
| ACHUSDT | 0.028224 | 4.753771 | 0.68437 | 0.027781 | 40.909503 | 821.530704 | 5.44799 | 1.49998 |
| ACTUSDT | 0.020059 | 3.685544 | 0.523943 | 0.019888 | 49.768262 | 1360.576699 | 3.551143 | 0.971811 |
| ADAUSDT | 0.021202 | 2.859751 | 0.564002 | 0.021982 | 43.793411 | 749.19607 | 2.700786 | -2.730286 |
| AI16ZUSDT | 0.035261 | 6.088297 | 0.9204 | 0.035338 | 46.517067 | 896.125461 | 6.106476 | 0.256128 |