# Herding Behavior in Cryptocurrency Markets

## Overview

This notebook explores herding behavior in cryptocurrency markets. Herding occurs when investors follow the collective actions of the market rather than their own independent analysis. This phenomenon is particularly relevant in cryptocurrency markets due to:

- High retail investor participation
- Strong social media influence
- Information asymmetry
- Market immaturity and volatility

## Key Herding Measures

We will implement two established dispersion measures and a state-dependent extension:

1. **CSSD (Cross-Sectional Standard Deviation)** - Christie and Huang (1995)
2. **CSAD (Cross-Sectional Absolute Deviation)** - Chang, Cheng, and Khorana (2000)
3. **State-dependent herding analysis**

## References

- Christie, W.G., & Huang, R.D. (1995). Following the Pied Piper: Do Individual Returns Herd around the Market? *Financial Analysts Journal*, 51(4), 31-37.
- Chang, E.C., Cheng, J.W., & Khorana, A. (2000). An examination of herd behavior in equity markets: An international perspective. *Journal of Banking & Finance*, 24(10), 1651-1679.
- Ballis, A., & Drakos, K. (2020). Testing for herding in the cryptocurrency market. *Finance Research Letters*, 33, 101210.

## 1. Setup and Data Collection

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Statistical libraries
from scipy import stats
import statsmodels.api as sm
from statsmodels.regression.linear_model import OLS

# Set plotting style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("Libraries imported successfully!")

### Data Collection

For this analysis, you can use data from various sources:
- **CoinGecko API** (free, no API key required for basic usage)
- **CoinMarketCap API** (requires free API key)
- **Binance API** (for high-frequency data)
- **Yahoo Finance** (via yfinance library)

Below is an example using simulated data. Replace this with actual API calls for real analysis.

In [None]:
# Example: Using yfinance for cryptocurrency data
# Uncomment and install if needed: !pip install yfinance

# import yfinance as yf

# crypto_tickers = ['BTC-USD', 'ETH-USD', 'BNB-USD', 'XRP-USD', 'ADA-USD', 
#                   'SOL-USD', 'DOT-USD', 'DOGE-USD', 'AVAX-USD', 'MATIC-USD']

# start_date = '2022-01-01'
# end_date = '2024-01-01'

# # Download all tickers in one call. With auto_adjust=True the 'Close' column
# # holds adjusted prices; recent yfinance releases adjust by default and may
# # not return an 'Adj Close' column at all.
# prices_df = yf.download(crypto_tickers, start=start_date, end=end_date,
#                         auto_adjust=True)['Close']
# prices_df.columns = [col.replace('-USD', '') for col in prices_df.columns]

# For demonstration, let's create simulated data
np.random.seed(42)
dates = pd.date_range(start='2022-01-01', end='2024-01-01', freq='D')
n_cryptos = 10
crypto_names = ['BTC', 'ETH', 'BNB', 'XRP', 'ADA', 'SOL', 'DOT', 'DOGE', 'AVAX', 'MATIC']

# Simulate correlated returns (to create potential herding)
market_factor = np.random.normal(0.001, 0.03, len(dates))
prices_df = pd.DataFrame(index=dates, columns=crypto_names)

for crypto in crypto_names:
    idiosyncratic = np.random.normal(0, 0.02, len(dates))
    beta = np.random.uniform(0.8, 1.5)  # Market sensitivity
    returns = market_factor * beta + idiosyncratic
    prices_df[crypto] = 100 * (1 + returns).cumprod()

print(f"Data shape: {prices_df.shape}")
print(f"\nFirst few rows:")
prices_df.head()

In [None]:
# Calculate returns
returns_df = prices_df.pct_change().dropna()

print(f"Returns shape: {returns_df.shape}")
print(f"\nSummary statistics:")
returns_df.describe()

In [None]:
# Visualize price movements
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

# Normalized prices
normalized_prices = prices_df / prices_df.iloc[0] * 100
normalized_prices.plot(ax=axes[0], alpha=0.7)
axes[0].set_title('Normalized Cryptocurrency Prices (Base = 100)', fontsize=14, fontweight='bold')
axes[0].set_ylabel('Normalized Price')
axes[0].legend(loc='best', ncol=5)
axes[0].grid(True, alpha=0.3)

# Daily returns over time
returns_df.plot(ax=axes[1], alpha=0.5, legend=False)
axes[1].set_title('Daily Returns', fontsize=14, fontweight='bold')
axes[1].set_ylabel('Return')
axes[1].set_xlabel('Date')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 2. Cross-Sectional Standard Deviation (CSSD)

### Methodology (Christie and Huang, 1995)

The CSSD measures the dispersion of individual asset returns around the market return:

$$CSSD_t = \sqrt{\frac{\sum_{i=1}^{N}(R_{i,t} - R_{m,t})^2}{N-1}}$$

Where:
- $R_{i,t}$ = return of cryptocurrency $i$ at time $t$
- $R_{m,t}$ = equally-weighted market return at time $t$
- $N$ = number of cryptocurrencies

### Interpretation

- **Low CSSD during extreme market movements** suggests herding (investors converge toward market consensus)
- **High CSSD** suggests investors acting on diverse information

Christie and Huang test for herding during extreme market stress by examining if CSSD is significantly lower during periods of large market movements.
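
As a quick sanity check on the formula, the sketch below computes CSSD for a tiny, made-up panel of returns. The DataFrame `toy_returns` and its values are hypothetical and chosen only for illustration. Because the market return here is the equally-weighted mean, CSSD coincides with the row-wise sample standard deviation (`ddof=1`).

```python
import numpy as np
import pandas as pd

# Hypothetical toy returns: 3 days x 4 assets (illustrative values only)
toy_returns = pd.DataFrame(
    [[0.01, 0.02, -0.01, 0.00],
     [0.05, 0.05, 0.05, 0.05],    # all assets move together -> zero dispersion
     [-0.04, 0.03, 0.00, 0.01]],
    columns=["A", "B", "C", "D"],
)

# Equally-weighted market return for each day
toy_market = toy_returns.mean(axis=1)

# CSSD from the formula: sqrt(sum of squared deviations / (N - 1))
n_assets = toy_returns.shape[1]
toy_cssd = np.sqrt(
    (toy_returns.sub(toy_market, axis=0) ** 2).sum(axis=1) / (n_assets - 1)
)

# With an equally-weighted market return this is the row-wise sample std
assert np.allclose(toy_cssd, toy_returns.std(axis=1, ddof=1))
print(toy_cssd)
```

The second row shows the limiting case: when every asset earns the same return, dispersion is zero no matter how large the market move is.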

In [None]:
def calculate_cssd(returns_df):
    """
    Calculate Cross-Sectional Standard Deviation (CSSD)
    
    Parameters:
    -----------
    returns_df : pd.DataFrame
        DataFrame of asset returns
    
    Returns:
    --------
    pd.Series
        CSSD values for each time period
    """
    # Calculate equally-weighted market return
    market_return = returns_df.mean(axis=1)
    
    # Calculate squared deviations from market return
    squared_deviations = (returns_df.sub(market_return, axis=0)) ** 2
    
    # Calculate CSSD
    cssd = np.sqrt(squared_deviations.sum(axis=1) / (len(returns_df.columns) - 1))
    
    return cssd

# Calculate CSSD
cssd = calculate_cssd(returns_df)

print("CSSD Statistics:")
print(cssd.describe())
print(f"\nMean CSSD: {cssd.mean():.6f}")
print(f"Median CSSD: {cssd.median():.6f}")

In [None]:
# Visualize CSSD over time
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

# CSSD time series
axes[0].plot(cssd.index, cssd.values, alpha=0.7, linewidth=1)
axes[0].axhline(y=cssd.mean(), color='red', linestyle='--', label=f'Mean: {cssd.mean():.4f}')
axes[0].set_title('Cross-Sectional Standard Deviation (CSSD) Over Time', fontsize=14, fontweight='bold')
axes[0].set_ylabel('CSSD')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# CSSD distribution
axes[1].hist(cssd, bins=50, alpha=0.7, edgecolor='black')
axes[1].axvline(x=cssd.mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {cssd.mean():.4f}')
axes[1].axvline(x=cssd.median(), color='green', linestyle='--', linewidth=2, label=f'Median: {cssd.median():.4f}')
axes[1].set_title('Distribution of CSSD', fontsize=14, fontweight='bold')
axes[1].set_xlabel('CSSD')
axes[1].set_ylabel('Frequency')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

### Testing for Herding with CSSD

We test whether CSSD is lower during extreme market movements by regressing it on dummy variables for the tails of the market-return distribution:

$$CSSD_t = \alpha + \beta_1 D_t^L + \beta_2 D_t^U + \epsilon_t$$

Where:
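- $D_t^L$ = dummy variable (1 if the market return on day $t$ lies in the lower tail of the return distribution, 0 otherwise)
- $D_t^U$ = dummy variable (1 if the market return on day $t$ lies in the upper tail of the return distribution, 0 otherwise)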

**Herding Evidence**: Negative and significant coefficients ($\beta_1$, $\beta_2$ < 0)
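
One way to read the coefficients: because the two dummies never overlap, $\alpha$ is the average dispersion on non-extreme days, and $\beta_1$ and $\beta_2$ are the differences between average tail-day dispersion and that baseline. The sketch below checks this on simulated data using plain NumPy least squares. The series `market` and `dispersion` are hypothetical and built without herding, so the betas come out positive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: simulated market returns and a dispersion series
# that rises with |market return| (i.e. no herding is built in)
market = rng.normal(0.0, 0.03, size=1000)
dispersion = 0.02 + 0.3 * np.abs(market) + rng.normal(0.0, 0.002, size=1000)

# Tail dummies at the 5th and 95th percentiles of the market return
lower_cut, upper_cut = np.percentile(market, [5, 95])
d_lower = (market <= lower_cut).astype(float)
d_upper = (market >= upper_cut).astype(float)

# OLS of dispersion on [1, D_L, D_U]
X = np.column_stack([np.ones_like(market), d_lower, d_upper])
alpha, beta_1, beta_2 = np.linalg.lstsq(X, dispersion, rcond=None)[0]

# The same quantities computed directly from group means
middle = (d_lower == 0) & (d_upper == 0)
mean_middle = dispersion[middle].mean()
mean_lower = dispersion[d_lower == 1].mean()
mean_upper = dispersion[d_upper == 1].mean()

print(f"alpha  = {alpha:.5f}  vs middle mean      = {mean_middle:.5f}")
print(f"beta_1 = {beta_1:.5f}  vs lower - middle   = {mean_lower - mean_middle:.5f}")
print(f"beta_2 = {beta_2:.5f}  vs upper - middle   = {mean_upper - mean_middle:.5f}")
```

A negative beta would mean tail days are, on average, less dispersed than ordinary days, which is the herding signature this test looks for.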

In [None]:
def test_cssd_herding(returns_df, cssd, percentile=5):
    """
    Test for herding using CSSD methodology
    
    Parameters:
    -----------
    returns_df : pd.DataFrame
        DataFrame of asset returns
    cssd : pd.Series
        CSSD values
    percentile : int
        Percentile for defining extreme movements (default 5%)
    
    Returns:
    --------
    dict
        Regression results and interpretation
    """
    # Calculate market return
    market_return = returns_df.mean(axis=1)
    
    # Define extreme movement thresholds
    lower_threshold = np.percentile(market_return, percentile)
    upper_threshold = np.percentile(market_return, 100 - percentile)
    
    # Create dummy variables
    D_lower = (market_return <= lower_threshold).astype(int)
    D_upper = (market_return >= upper_threshold).astype(int)
    
    # Prepare regression data
    regression_data = pd.DataFrame({
        'CSSD': cssd,
        'D_lower': D_lower,
        'D_upper': D_upper
    })
    
    # Run regression
    X = sm.add_constant(regression_data[['D_lower', 'D_upper']])
    y = regression_data['CSSD']
    
    model = OLS(y, X).fit()
    
    return {
        'model': model,
        'lower_threshold': lower_threshold,
        'upper_threshold': upper_threshold,
        'n_lower_extreme': D_lower.sum(),
        'n_upper_extreme': D_upper.sum()
    }

# Test for herding
cssd_results = test_cssd_herding(returns_df, cssd, percentile=5)

print("="*70)
print("CSSD HERDING TEST RESULTS")
print("="*70)
print(f"\nExtreme movement thresholds (5th and 95th percentiles):")
print(f"  Lower tail: {cssd_results['lower_threshold']:.4f}")
print(f"  Upper tail: {cssd_results['upper_threshold']:.4f}")
print(f"\nNumber of extreme observations:")
print(f"  Lower tail: {cssd_results['n_lower_extreme']}")
print(f"  Upper tail: {cssd_results['n_upper_extreme']}")
print("\n" + "="*70)
print("REGRESSION RESULTS")
print("="*70)
print(cssd_results['model'].summary())
print("\n" + "="*70)
print("INTERPRETATION")
print("="*70)
print("Herding is present if coefficients on D_lower and D_upper are NEGATIVE and significant.")
print("Negative coefficients indicate lower dispersion during extreme market movements.")
print("="*70)

## 3. Cross-Sectional Absolute Deviation (CSAD)

### Methodology (Chang, Cheng, and Khorana, 2000)

CSAD is less sensitive to outliers than CSSD because it averages absolute rather than squared deviations:

$$CSAD_t = \frac{1}{N}\sum_{i=1}^{N}|R_{i,t} - R_{m,t}|$$

### Testing for Herding

Under rational asset pricing models such as the Capital Asset Pricing Model (CAPM), CSAD should increase linearly with the absolute market return $|R_{m,t}|$. Under herding, this relationship becomes non-linear, which a quadratic term can capture:

$$CSAD_t = \alpha + \gamma_1|R_{m,t}| + \gamma_2 R_{m,t}^2 + \epsilon_t$$

**Herding Evidence**: Negative and significant $\gamma_2$ coefficient

The non-linear term captures the tendency for return dispersion to decrease (rather than increase) during periods of extreme market movements.
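
To illustrate why a linear relation is the no-herding benchmark, the sketch below simulates a hypothetical one-factor market with fixed betas and no idiosyncratic noise. In that setting CSAD equals a constant times $|R_{m,t}|$, so a quadratic fit should return $\gamma_2 \approx 0$. All names and parameter values are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 10 assets with fixed betas and NO idiosyncratic noise
betas = rng.uniform(0.8, 1.5, size=10)
factor = rng.normal(0.001, 0.03, size=500)

asset_returns = np.outer(factor, betas)            # shape (500, 10)
market = asset_returns.mean(axis=1)                # equally-weighted market return
csad_linear = np.abs(asset_returns - market[:, None]).mean(axis=1)

# Fit CSAD = alpha + gamma_1 * |Rm| + gamma_2 * Rm^2
# (np.polyfit returns the highest-degree coefficient first)
gamma_2, gamma_1, alpha = np.polyfit(np.abs(market), csad_linear, deg=2)

# Slope implied by the factor structure: mean|beta_i - mean(beta)| / mean(beta)
expected_slope = np.abs(betas - betas.mean()).mean() / betas.mean()

print(f"gamma_1 = {gamma_1:.6f} (implied slope {expected_slope:.6f})")
print(f"gamma_2 = {gamma_2:.2e}")
```

With real or noisy data $\gamma_2$ will not be exactly zero, which is why the test relies on its sign and statistical significance.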

In [None]:
def calculate_csad(returns_df):
    """
    Calculate Cross-Sectional Absolute Deviation (CSAD)
    
    Parameters:
    -----------
    returns_df : pd.DataFrame
        DataFrame of asset returns
    
    Returns:
    --------
    pd.Series
        CSAD values for each time period
    """
    # Calculate equally-weighted market return
    market_return = returns_df.mean(axis=1)
    
    # Calculate absolute deviations from market return
    abs_deviations = (returns_df.sub(market_return, axis=0)).abs()
    
    # Calculate CSAD (mean of absolute deviations)
    csad = abs_deviations.mean(axis=1)
    
    return csad

# Calculate CSAD
csad = calculate_csad(returns_df)
market_return = returns_df.mean(axis=1)

print("CSAD Statistics:")
print(csad.describe())
print(f"\nMean CSAD: {csad.mean():.6f}")
print(f"Median CSAD: {csad.median():.6f}")

In [None]:
# Visualize CSAD
fig, axes = plt.subplots(2, 2, figsize=(16, 10))

# CSAD time series
axes[0, 0].plot(csad.index, csad.values, alpha=0.7, linewidth=1)
axes[0, 0].axhline(y=csad.mean(), color='red', linestyle='--', label=f'Mean: {csad.mean():.4f}')
axes[0, 0].set_title('Cross-Sectional Absolute Deviation (CSAD) Over Time', fontsize=12, fontweight='bold')
axes[0, 0].set_ylabel('CSAD')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)

# CSAD distribution
axes[0, 1].hist(csad, bins=50, alpha=0.7, edgecolor='black')
axes[0, 1].axvline(x=csad.mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {csad.mean():.4f}')
axes[0, 1].set_title('Distribution of CSAD', fontsize=12, fontweight='bold')
axes[0, 1].set_xlabel('CSAD')
axes[0, 1].set_ylabel('Frequency')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3)

# CSAD vs absolute market return
axes[1, 0].scatter(market_return.abs(), csad, alpha=0.5, s=20)
axes[1, 0].set_title('CSAD vs Absolute Market Return', fontsize=12, fontweight='bold')
axes[1, 0].set_xlabel('|Market Return|')
axes[1, 0].set_ylabel('CSAD')
axes[1, 0].grid(True, alpha=0.3)

# CSAD vs market return (showing non-linearity)
axes[1, 1].scatter(market_return, csad, alpha=0.5, s=20)
axes[1, 1].set_title('CSAD vs Market Return', fontsize=12, fontweight='bold')
axes[1, 1].set_xlabel('Market Return')
axes[1, 1].set_ylabel('CSAD')
axes[1, 1].axvline(x=0, color='black', linestyle='-', linewidth=0.5)
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
def test_csad_herding(returns_df, csad):
    """
    Test for herding using CSAD methodology (Chang et al., 2000)
    
    Parameters:
    -----------
    returns_df : pd.DataFrame
        DataFrame of asset returns
    csad : pd.Series
        CSAD values
    
    Returns:
    --------
    dict
        Regression results and interpretation
    """
    # Calculate market return
    market_return = returns_df.mean(axis=1)
    
    # Prepare regression variables
    regression_data = pd.DataFrame({
        'CSAD': csad,
        'abs_Rm': market_return.abs(),
        'Rm_squared': market_return ** 2
    })
    
    # Run regression: CSAD = α + γ1|Rm| + γ2(Rm^2) + ε
    X = sm.add_constant(regression_data[['abs_Rm', 'Rm_squared']])
    y = regression_data['CSAD']
    
    model = OLS(y, X).fit()
    
    return {
        'model': model,
        'regression_data': regression_data
    }

# Test for herding
csad_results = test_csad_herding(returns_df, csad)

print("="*70)
print("CSAD HERDING TEST RESULTS")
print("="*70)
print("\nRegression Model: CSAD = α + γ1|Rm| + γ2(Rm²) + ε")
print("="*70)
print(csad_results['model'].summary())
print("\n" + "="*70)
print("INTERPRETATION")
print("="*70)
print("Herding is present if γ2 (coefficient on Rm_squared) is NEGATIVE and significant.")
print("\nA negative γ2 indicates that return dispersion DECREASES (rather than increases)")
print("during extreme market movements, suggesting investors are herding.")
print("\nUnder rational asset pricing (CAPM), γ2 should be zero or positive.")
print("="*70)

In [None]:
# Visualize the relationship between CSAD and market return
fig, ax = plt.subplots(figsize=(12, 7))

# Scatter plot
ax.scatter(csad_results['regression_data']['abs_Rm'], 
           csad_results['regression_data']['CSAD'], 
           alpha=0.5, s=30, label='Observed CSAD')

# Fitted values
sorted_indices = csad_results['regression_data']['abs_Rm'].argsort()
sorted_abs_Rm = csad_results['regression_data']['abs_Rm'].iloc[sorted_indices]
fitted_values = csad_results['model'].fittedvalues.iloc[sorted_indices]

ax.plot(sorted_abs_Rm, fitted_values, 'r-', linewidth=2, label='Fitted (with Rm² term)')

# Linear fit (without squared term) for comparison
linear_model = OLS(
    csad_results['regression_data']['CSAD'],
    sm.add_constant(csad_results['regression_data']['abs_Rm'])
).fit()
linear_fitted = linear_model.fittedvalues.iloc[sorted_indices]
ax.plot(sorted_abs_Rm, linear_fitted, 'g--', linewidth=2, label='Linear fit (without Rm² term)')

ax.set_title('CSAD vs Absolute Market Return: Testing for Non-Linear Relationship', 
             fontsize=14, fontweight='bold')
ax.set_xlabel('|Market Return|', fontsize=12)
ax.set_ylabel('CSAD', fontsize=12)
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)

# Add annotation
gamma2 = csad_results['model'].params['Rm_squared']
pval = csad_results['model'].pvalues['Rm_squared']
annotation_text = f'γ2 = {gamma2:.6f}\np-value = {pval:.4f}'
if gamma2 < 0 and pval < 0.05:
    annotation_text += '\n\nEvidence of HERDING'
ax.text(0.05, 0.95, annotation_text, transform=ax.transAxes, 
        fontsize=11, verticalalignment='top',
        bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))

plt.tight_layout()
plt.show()

## 4. State-Dependent Herding Analysis

Herding behavior may vary across different market conditions:

- **Bull vs Bear markets**
- **High vs Low volatility periods**
- **Up vs Down markets**

We can extend the CSAD model to test for state-dependent herding:

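$$CSAD_t = \alpha + \delta D_t + \gamma_1|R_{m,t}| + \gamma_2 R_{m,t}^2 + \gamma_3 D_t|R_{m,t}| + \gamma_4 D_t R_{m,t}^2 + \epsilon_t$$

Where $D_t$ is a dummy variable for the market state (e.g., 1 in a down market, 0 otherwise) and $\delta$ lets the intercept differ across states, matching the `D_down` regressor in the code. In the state with $D_t = 1$ the herding coefficient is $\gamma_2 + \gamma_4$, so $\gamma_4$ measures how much the non-linearity differs between states.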
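
The sketch below shows how the pooled interaction model relates to separate up-market and down-market regressions. It assumes the pooled regression also includes the dummy itself as an intercept shift, as the `D_down` regressor does in this notebook's implementation. Under that assumption the point estimates agree: $\gamma_2$ matches the up-market quadratic coefficient and $\gamma_2 + \gamma_4$ matches the down-market one. The data are simulated and every name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical simulated inputs (not real market data)
market = rng.normal(0.0, 0.03, size=800)
abs_rm, rm_sq = np.abs(market), market ** 2
down = (market < 0).astype(float)

# Dispersion with a more concave response on down days
dispersion = (0.02 + 0.5 * abs_rm - 1.0 * rm_sq - 2.0 * down * rm_sq
              + rng.normal(0.0, 0.002, size=800))

def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Pooled model: [const, |Rm|, Rm^2, D, D*|Rm|, D*Rm^2]
pooled = ols([abs_rm, rm_sq, down, down * abs_rm, down * rm_sq], dispersion)

# Split-sample models: [const, |Rm|, Rm^2]
up_fit = ols([abs_rm[down == 0], rm_sq[down == 0]], dispersion[down == 0])
down_fit = ols([abs_rm[down == 1], rm_sq[down == 1]], dispersion[down == 1])

gamma_2, gamma_4 = pooled[2], pooled[5]
print(f"up market:   gamma_2           = {gamma_2:.4f}  (split fit {up_fit[2]:.4f})")
print(f"down market: gamma_2 + gamma_4 = {gamma_2 + gamma_4:.4f}  (split fit {down_fit[2]:.4f})")
```

The pooled form is convenient when the question is whether the two states differ, since that is a test on $\gamma_4$ alone; the split regressions only give each state's coefficient separately.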

In [None]:
def test_state_dependent_herding(returns_df, csad):
    """
    Test for state-dependent herding (up vs down markets)
    
    Parameters:
    -----------
    returns_df : pd.DataFrame
        DataFrame of asset returns
    csad : pd.Series
        CSAD values
    
    Returns:
    --------
    dict
        Regression results for up and down markets
    """
    # Calculate market return
    market_return = returns_df.mean(axis=1)
    
    # Create dummy for down market
    D_down = (market_return < 0).astype(int)
    
    # Prepare regression variables
    regression_data = pd.DataFrame({
        'CSAD': csad,
        'abs_Rm': market_return.abs(),
        'Rm_squared': market_return ** 2,
        'D_down': D_down,
        'D_abs_Rm': D_down * market_return.abs(),
        'D_Rm_squared': D_down * (market_return ** 2)
    })
    
    # Full model with interaction terms
    X_full = sm.add_constant(regression_data[['abs_Rm', 'Rm_squared', 'D_down', 'D_abs_Rm', 'D_Rm_squared']])
    y = regression_data['CSAD']
    model_full = OLS(y, X_full).fit()
    
    # Separate regressions for up and down markets
    up_market_data = regression_data[regression_data['D_down'] == 0]
    down_market_data = regression_data[regression_data['D_down'] == 1]
    
    X_up = sm.add_constant(up_market_data[['abs_Rm', 'Rm_squared']])
    model_up = OLS(up_market_data['CSAD'], X_up).fit()
    
    X_down = sm.add_constant(down_market_data[['abs_Rm', 'Rm_squared']])
    model_down = OLS(down_market_data['CSAD'], X_down).fit()
    
    return {
        'model_full': model_full,
        'model_up': model_up,
        'model_down': model_down,
        'regression_data': regression_data,
        'n_up': (D_down == 0).sum(),
        'n_down': (D_down == 1).sum()
    }

# Test for state-dependent herding
state_results = test_state_dependent_herding(returns_df, csad)

print("="*70)
print("STATE-DEPENDENT HERDING TEST")
print("="*70)
print(f"\nNumber of UP market days: {state_results['n_up']}")
print(f"Number of DOWN market days: {state_results['n_down']}")
print("\n" + "="*70)
print("FULL MODEL WITH INTERACTION TERMS")
print("="*70)
print(state_results['model_full'].summary())
print("\n" + "="*70)
print("UP MARKET MODEL")
print("="*70)
print(state_results['model_up'].summary())
print("\n" + "="*70)
print("DOWN MARKET MODEL")
print("="*70)
print(state_results['model_down'].summary())

In [None]:
# Visualize state-dependent herding
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Separate data
up_data = state_results['regression_data'][state_results['regression_data']['D_down'] == 0]
down_data = state_results['regression_data'][state_results['regression_data']['D_down'] == 1]

# UP MARKET
axes[0].scatter(up_data['abs_Rm'], up_data['CSAD'], alpha=0.5, s=30, color='green', label='Observed')
sorted_up = up_data['abs_Rm'].argsort()
axes[0].plot(up_data['abs_Rm'].iloc[sorted_up], 
             state_results['model_up'].fittedvalues.iloc[sorted_up], 
             'r-', linewidth=2, label='Fitted')
axes[0].set_title('UP MARKET: CSAD vs |Market Return|', fontsize=12, fontweight='bold')
axes[0].set_xlabel('|Market Return|')
axes[0].set_ylabel('CSAD')
gamma2_up = state_results['model_up'].params['Rm_squared']
pval_up = state_results['model_up'].pvalues['Rm_squared']
annotation_up = f'γ2 = {gamma2_up:.6f}\np-value = {pval_up:.4f}'
if gamma2_up < 0 and pval_up < 0.05:
    annotation_up += '\n\nHERDING detected'
axes[0].text(0.05, 0.95, annotation_up, transform=axes[0].transAxes, 
             fontsize=10, verticalalignment='top',
             bbox=dict(boxstyle='round', facecolor='lightgreen', alpha=0.8))
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# DOWN MARKET
axes[1].scatter(down_data['abs_Rm'], down_data['CSAD'], alpha=0.5, s=30, color='red', label='Observed')
sorted_down = down_data['abs_Rm'].argsort()
axes[1].plot(down_data['abs_Rm'].iloc[sorted_down], 
             state_results['model_down'].fittedvalues.iloc[sorted_down], 
             'b-', linewidth=2, label='Fitted')
axes[1].set_title('DOWN MARKET: CSAD vs |Market Return|', fontsize=12, fontweight='bold')
axes[1].set_xlabel('|Market Return|')
axes[1].set_ylabel('CSAD')
gamma2_down = state_results['model_down'].params['Rm_squared']
pval_down = state_results['model_down'].pvalues['Rm_squared']
annotation_down = f'γ2 = {gamma2_down:.6f}\np-value = {pval_down:.4f}'
if gamma2_down < 0 and pval_down < 0.05:
    annotation_down += '\n\nHERDING detected'
axes[1].text(0.05, 0.95, annotation_down, transform=axes[1].transAxes, 
             fontsize=10, verticalalignment='top',
             bbox=dict(boxstyle='round', facecolor='lightcoral', alpha=0.8))
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 5. Additional Analysis: Volatility-Dependent Herding

We can also examine whether herding varies with market volatility.

In [None]:
# Calculate rolling volatility
rolling_vol = market_return.rolling(window=30).std()

# Create high volatility dummy (above median)
high_vol_threshold = rolling_vol.median()
D_high_vol = (rolling_vol > high_vol_threshold).astype(int)

# Prepare data (drop NaN from rolling calculation)
vol_data = pd.DataFrame({
    'CSAD': csad,
    'abs_Rm': market_return.abs(),
    'Rm_squared': market_return ** 2,
    'D_high_vol': D_high_vol,
    'rolling_vol': rolling_vol
}).dropna()

# Add interaction terms
vol_data['D_abs_Rm'] = vol_data['D_high_vol'] * vol_data['abs_Rm']
vol_data['D_Rm_squared'] = vol_data['D_high_vol'] * vol_data['Rm_squared']

# Run regression
X_vol = sm.add_constant(vol_data[['abs_Rm', 'Rm_squared', 'D_high_vol', 'D_abs_Rm', 'D_Rm_squared']])
model_vol = OLS(vol_data['CSAD'], X_vol).fit()

print("="*70)
print("VOLATILITY-DEPENDENT HERDING TEST")
print("="*70)
print(f"\nHigh volatility threshold (30-day rolling std): {high_vol_threshold:.6f}")
print(f"Number of HIGH volatility periods: {vol_data['D_high_vol'].sum()}")
print(f"Number of LOW volatility periods: {(vol_data['D_high_vol'] == 0).sum()}")
print("\n" + "="*70)
print(model_vol.summary())
print("\n" + "="*70)
print("INTERPRETATION")
print("="*70)
print("Look at the coefficient on D_Rm_squared:")
print("  - Negative & significant: Additional herding during high volatility")
print("  - Positive & significant: Less herding during high volatility")
print("="*70)

## 6. Summary and Conclusions

### Key Takeaways

1. **CSSD Method**: Tests if return dispersion is lower during extreme market movements
2. **CSAD Method**: Tests for non-linear relationship between dispersion and market returns
3. **State-Dependent Analysis**: Examines herding across different market conditions

### Cryptocurrency Market Implications

Herding in cryptocurrency markets can be driven by:
- Social media sentiment (Twitter, Reddit, etc.)
- Influencer activity
- Fear of missing out (FOMO)
- Limited fundamental analysis tools
- Market immaturity

### Extensions for Further Research

1. **Sentiment Analysis**: Incorporate social media sentiment data
2. **Network Effects**: Analyze herding across different cryptocurrency categories (DeFi, NFTs, etc.)
3. **High-Frequency Analysis**: Use intraday data for more granular insights
4. **Cross-Market Herding**: Examine herding between crypto and traditional markets
5. **Machine Learning**: Use ML models to predict herding episodes

In [None]:
# Create comprehensive summary table
summary_results = pd.DataFrame({
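    # Note: the High Vol row reports the interaction coefficient (the incremental
    # effect in high-volatility periods), not a stand-alone γ2 for that regime
    'Method': ['CSSD - Lower Tail', 'CSSD - Upper Tail', 'CSAD - Overall', 
               'CSAD - Up Market', 'CSAD - Down Market', 'CSAD - High Vol (interaction)'],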
    'Coefficient': [
        cssd_results['model'].params['D_lower'],
        cssd_results['model'].params['D_upper'],
        csad_results['model'].params['Rm_squared'],
        state_results['model_up'].params['Rm_squared'],
        state_results['model_down'].params['Rm_squared'],
        model_vol.params['D_Rm_squared']
    ],
    'P-value': [
        cssd_results['model'].pvalues['D_lower'],
        cssd_results['model'].pvalues['D_upper'],
        csad_results['model'].pvalues['Rm_squared'],
        state_results['model_up'].pvalues['Rm_squared'],
        state_results['model_down'].pvalues['Rm_squared'],
        model_vol.pvalues['D_Rm_squared']
    ]
})

# Add significance stars
def add_stars(p):
    if p < 0.01:
        return '***'
    elif p < 0.05:
        return '**'
    elif p < 0.10:
        return '*'
    else:
        return ''

summary_results['Significance'] = summary_results['P-value'].apply(add_stars)

# Determine herding evidence
def herding_evidence(row):
    # Same criterion for CSSD and CSAD rows: negative coefficient with p < 0.05
    return 'Yes' if (row['Coefficient'] < 0 and row['P-value'] < 0.05) else 'No'

summary_results['Herding Evidence'] = summary_results.apply(herding_evidence, axis=1)

print("="*90)
print(" "*30 + "SUMMARY OF HERDING TESTS")
print("="*90)
print(summary_results.to_string(index=False))
print("="*90)
print("\nSignificance levels: *** p<0.01, ** p<0.05, * p<0.10")
print("\nHerding Evidence Criteria:")
print("  - CSSD: Negative coefficient & p < 0.05")
print("  - CSAD: Negative coefficient on Rm² term & p < 0.05")
print("="*90)

## 7. Next Steps and Real Data Implementation

To use this notebook with real cryptocurrency data:

```python
# Install required packages
!pip install yfinance pycoingecko ccxt

# Example with yfinance
import yfinance as yf

crypto_tickers = ['BTC-USD', 'ETH-USD', 'BNB-USD', 'XRP-USD', 'ADA-USD', 
                  'SOL-USD', 'DOT-USD', 'DOGE-USD', 'AVAX-USD', 'MATIC-USD']

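# auto_adjust=True puts adjusted prices in 'Close'; recent yfinance releases
# adjust by default and may not return an 'Adj Close' column
prices_df = yf.download(crypto_tickers, start='2020-01-01', end='2024-01-01',
                        auto_adjust=True)['Close']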
```
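
Real price panels usually contain gaps, for example for coins listed partway through the sample. Below is a minimal sketch of one way to clean such a panel before computing CSSD or CSAD. The helper `prepare_returns` and its `min_coverage` threshold are hypothetical choices for illustration, not part of any library or of the cited papers.

```python
import numpy as np
import pandas as pd

def prepare_returns(prices, min_coverage=0.95):
    """Hypothetical helper: clean a price panel and return simple returns.

    min_coverage is an assumed threshold (share of non-missing prices an
    asset needs in order to be kept); adjust it to the data at hand.
    """
    # Keep only assets with enough non-missing prices
    coverage = prices.notna().mean()
    kept = prices.loc[:, coverage >= min_coverage]
    # Simple returns; fill_method=None avoids silently forward-filling gaps
    returns = kept.pct_change(fill_method=None)
    # Keep only dates on which every remaining asset has a return
    return returns.dropna(how="any")

# Tiny made-up panel: BBB has one gap, CCC is mostly missing
demo_prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0, 133.1],
     "BBB": [50.0, np.nan, 55.0, 60.5],
     "CCC": [np.nan, np.nan, np.nan, 10.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)
demo_returns = prepare_returns(demo_prices, min_coverage=0.7)
print(demo_returns)
```

Dropping incomplete dates keeps $N$ constant across days, so the dispersion measures stay comparable over time; the cost is a shorter sample when many assets have gaps.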

### Additional Resources

- CoinGecko API: https://www.coingecko.com/en/api
- Binance API: https://github.com/binance/binance-spot-api-docs
- CCXT Library: https://github.com/ccxt/ccxt
- Crypto Fear & Greed Index: https://alternative.me/crypto/fear-and-greed-index/