# Bitcoin Price Prediction Using Deep Learning Techniques

## Part 2: Fractional Differencing and Data Preparation

In this notebook, we implement fractional differencing as an advanced technique for achieving stationarity while preserving long-term memory in the time series. We then prepare the data for our deep learning models.

In [None]:
# Import Libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import yfinance as yf

# Visualization Settings
import matplotlib as mpl
mpl.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['axes.grid'] = True
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False

# Suppress Warnings
import warnings
warnings.filterwarnings("ignore")

# Set random seed for reproducibility
np.random.seed(42)

# Load data from Part 1 or fetch it again
try:
    # Try to load data from a saved file if available
    btc_data = pd.read_csv('btc_data.csv', index_col=0, parse_dates=True)
    print("Loaded data from saved file.")
except FileNotFoundError:
    # If file doesn't exist, fetch data again
    print("Fetching data from Yahoo Finance...")
    start_date = "2020-01-01"
    end_date = "2023-12-31"
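    btc_data = yf.download("BTC-USD", start=start_date, end=end_date)
    
    # Newer yfinance releases can return MultiIndex columns such as ('Close', 'BTC-USD');
    # keep only the price-field level so that btc_data['Close'] is a Series
    if isinstance(btc_data.columns, pd.MultiIndex):
        btc_data.columns = btc_data.columns.get_level_values(0)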
    
    # Calculate log returns
    btc_data['Log_Returns'] = np.log(btc_data['Close'] / btc_data['Close'].shift(1))
    
    # Save data for future use
    btc_data.to_csv('btc_data.csv')
    print("Data fetched and saved.")

# Display data info
print("\nData Information:")
print(f"Shape: {btc_data.shape}")
print(f"Date Range: {btc_data.index.min()} to {btc_data.index.max()}")

## Fractional Differencing

### Theoretical Background

Fractional differencing makes a time series stationary while preserving more of its long-term memory than traditional integer differencing does.

In traditional differencing, we use integer orders (typically 1 or 2) to make a time series stationary. However, this often removes more than the trend: after first differencing, the transformed values carry little information about the original price level. Fractional differencing allows non-integer orders (e.g., 0.4 or 0.5), so we can look for the smallest order that is just enough to pass a stationarity test and thereby retain as much information as possible.

The fractional differencing operation is defined as:

$(1-L)^d X_t = \sum_{k=0}^{\infty} \binom{d}{k} (-1)^k X_{t-k}$

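where $L$ is the lag operator ($L^k X_t = X_{t-k}$), $d$ is the differencing order (which can be fractional), and $\binom{d}{k} = \frac{d(d-1)\cdots(d-k+1)}{k!}$ is the generalized binomial coefficient, which is also defined for non-integer $d$.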
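As a quick illustration of the formula, the weights $w_k = (-1)^k \binom{d}{k}$ can be generated with the recurrence $w_0 = 1$, $w_k = -w_{k-1} \frac{d-k+1}{k}$, which follows from $\binom{d}{k} = \binom{d}{k-1} \frac{d-k+1}{k}$. The sketch below uses only the standard library; the helper names `binomial_weights` and `closed_form_weight` are hypothetical and not part of this notebook's pipeline. It suggests that $d = 1$ reduces to ordinary first differencing, while a fractional order such as $d = 0.4$ keeps a small, slowly decaying weight on every past lag.

```python
import math

def binomial_weights(d, size):
    """Return the first `size` weights w_k = (-1)^k * C(d, k) via the recurrence."""
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights

def closed_form_weight(d, k):
    """Gamma-function form of (-1)^k * C(d, k); valid for non-integer d only."""
    return (-1) ** k * math.gamma(d + 1) / (math.gamma(k + 1) * math.gamma(d - k + 1))

w_int = binomial_weights(1.0, 5)   # integer order: only lags 0 and 1 are non-zero
w_frac = binomial_weights(0.4, 5)  # fractional order: every lag keeps a non-zero weight

print("d = 1.0:", w_int)
print("d = 0.4:", [round(w, 4) for w in w_frac])
```

For non-integer $d$, the recurrence can be cross-checked against the Gamma-function form, which is what `closed_form_weight` is for.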

Let's implement this technique and apply it to our Bitcoin price data.

In [None]:
def get_weights(d, size):
    """
    Compute weights for fractional differencing.
    
    Parameters:
    -----------
    d : float
        Fractional differencing order
    size : int
        Number of weights to compute
        
    Returns:
    --------
    numpy.ndarray
        Array of weights
    """
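    # w_0 = 1; each later weight follows w_k = -w_{k-1} * (d - k + 1) / k,
    # which reproduces (-1)^k * binom(d, k) from the definition above
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return np.array(weights)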

def fractional_difference(series, d, threshold=1e-5):
    """
    Apply fractional differencing to a time series.
    
    Parameters:
    -----------
    series : pandas.Series
        Time series to difference
    d : float
        Fractional differencing order
    threshold : float
        Threshold for truncating weights
        
    Returns:
    --------
    pandas.Series
        Fractionally differenced series
    """
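    # Get weights
    weights = get_weights(d, len(series))
    
    # Truncate weights based on threshold
    weights_threshold = weights[np.abs(weights) > threshold]
    width = len(weights_threshold)
    
    # Apply weights to series: the value at position i combines observations
    # i, i-1, ..., i-width+1, with weights_threshold[0] on the current observation
    values = series.to_numpy(dtype=float)
    df_diff = pd.Series(np.nan, index=series.index, dtype=float)
    for i in range(width - 1, len(series)):
        window = values[i - width + 1:i + 1][::-1]
        df_diff.iloc[i] = np.dot(weights_threshold, window)
    
    return df_diff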

def find_optimal_d(series, d_range=np.arange(0, 1.01, 0.1), threshold=1e-5, significance=0.05):
    """
    Find the optimal fractional differencing order that achieves stationarity
    while minimizing information loss.
    
    Parameters:
    -----------
    series : pandas.Series
        Time series to analyze
    d_range : numpy.ndarray
        Range of d values to test
    threshold : float
        Threshold for truncating weights
    significance : float
        Significance level for ADF test
        
    Returns:
    --------
    float
        Optimal differencing order
    pandas.DataFrame
        Results for all tested d values
    """
    results = []
    
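    for d in d_range:
        d = round(float(d), 4)  # avoid float artifacts such as 0.30000000000000004
        
        # Apply fractional differencing
        diff_series = fractional_difference(series, d, threshold)
        diff_series = diff_series.dropna()
        
        # A small d combined with a small threshold can need more weights than there
        # are observations. Record such orders as untestable instead of letting
        # adfuller fail. The minimum of 20 observations is an arbitrary safety margin.
        if len(diff_series) < 20:
            results.append({
                'd': d, 'ADF Statistic': np.nan, 'p-value': np.nan,
                'Critical Value (5%)': np.nan, 'Stationary': False,
                'Information Retention': np.nan
            })
            continue
        
        # Perform ADF test
        adf_stat, p_val, _, _, critical_values, _ = adfuller(diff_series, maxlag=1)
        
        # Calculate information retention (correlation with original series)
        common_index = diff_series.index.intersection(series.index)
        if len(common_index) > 0:
            corr = np.corrcoef(diff_series.loc[common_index], series.loc[common_index])[0, 1]
        else:
            corr = np.nan
        
        # Store results
        results.append({
            'd': d,
            'ADF Statistic': adf_stat,
            'p-value': p_val,
            'Critical Value (5%)': critical_values['5%'],
            'Stationary': bool(p_val < significance),
            'Information Retention': corr
        })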
    
    # Convert to DataFrame
    results_df = pd.DataFrame(results)
    
    # Find optimal d (smallest d that achieves stationarity)
    stationary_results = results_df[results_df['Stationary']]
    if len(stationary_results) > 0:
        optimal_d = stationary_results['d'].min()
    else:
        optimal_d = 1.0  # Default to 1.0 if no stationary result found
    
    return optimal_d, results_df

# Find optimal fractional differencing order for Bitcoin prices
optimal_d, results_df = find_optimal_d(btc_data['Close'])

print(f"Optimal fractional differencing order: {optimal_d}")
display(results_df)

# Plot results
plt.figure(figsize=(12, 6))
plt.plot(results_df['d'], results_df['ADF Statistic'], marker='o', label='ADF Statistic')
plt.axhline(y=results_df['Critical Value (5%)'].iloc[0], color='r', linestyle='--', label='5% Critical Value')
plt.axvline(x=optimal_d, color='g', linestyle='--', label=f'Optimal d = {optimal_d}')
plt.xlabel('Fractional Differencing Order (d)')
plt.ylabel('ADF Statistic')
plt.title('Finding Optimal Fractional Differencing Order')
plt.legend()
plt.grid(True)
plt.show()

# Plot information retention
plt.figure(figsize=(12, 6))
plt.plot(results_df['d'], results_df['Information Retention'], marker='o')
plt.axvline(x=optimal_d, color='g', linestyle='--', label=f'Optimal d = {optimal_d}')
plt.xlabel('Fractional Differencing Order (d)')
plt.ylabel('Information Retention (Correlation)')
plt.title('Information Retention vs. Differencing Order')
plt.legend()
plt.grid(True)
plt.show()

## Apply Fractional Differencing

Now that we've found the optimal fractional differencing order, let's apply it to our Bitcoin price data and analyze the resulting series.
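One practical point is worth checking before applying the filter: every output value needs a full window of past prices, so the first `width - 1` values of the differenced series are undefined (NaN), and small orders need long windows. The sketch below is a standard-library illustration under stated assumptions: `window_width` is a hypothetical helper, and the cap of 1,460 weights is only a stand-in for roughly four years of daily data.

```python
def window_width(d, threshold=1e-5, max_size=1460):
    """Count the leading weights with |w_k| > threshold, capped at max_size."""
    weight, width = 1.0, 0
    for k in range(1, max_size + 1):
        if abs(weight) <= threshold:
            break
        width += 1
        # Advance from w_{k-1} to w_k with the binomial recurrence
        weight = -weight * (d - k + 1) / k
    return width

for d in (0.1, 0.5, 0.9, 1.0):
    width = window_width(d)
    print(f"d = {d}: window of {width} weights, first {width - 1} outputs undefined")
```

If the window is as long as the sample itself, almost nothing is left to test or model. In that case, a larger threshold (for example `1e-4`) or a longer price history may be needed. Which trade-off is appropriate depends on the data at hand.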

In [None]:
# Apply fractional differencing with the optimal order
btc_data['Frac_Diff'] = fractional_difference(btc_data['Close'], optimal_d)

# Plot the original and fractionally differenced series
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

# Original price series
axes[0].plot(btc_data.index, btc_data['Close'], color='#1f77b4')
axes[0].set_title('Original Bitcoin Price Series', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Date')
axes[0].set_ylabel('Price (USD)')
axes[0].grid(True, alpha=0.3)

# Fractionally differenced series
axes[1].plot(btc_data.index, btc_data['Frac_Diff'], color='#ff7f0e')
axes[1].set_title(f'Fractionally Differenced Series (d={optimal_d})', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Date')
axes[1].set_ylabel('Value')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Test stationarity of the fractionally differenced series
frac_diff_series = btc_data['Frac_Diff'].dropna()
adf_result = adfuller(frac_diff_series)

print("Augmented Dickey-Fuller Test for Fractionally Differenced Series:")
print(f"ADF Statistic: {adf_result[0]:.4f}")
print(f"p-value: {adf_result[1]:.8f}")
print("Critical Values:")
for key, value in adf_result[4].items():
    print(f"\t{key}: {value:.4f}")

if adf_result[1] <= 0.05:
    print("\nThe fractionally differenced series is stationary.")
else:
    print("\nThe fractionally differenced series is non-stationary.")

# Plot ACF and PACF of fractionally differenced series
fig, axes = plt.subplots(2, 1, figsize=(14, 10))

# ACF
plot_acf(frac_diff_series, lags=40, ax=axes[0])
axes[0].set_title('ACF of Fractionally Differenced Series', fontsize=14, fontweight='bold')
axes[0].grid(True, alpha=0.3)

# PACF
plot_pacf(frac_diff_series, lags=40, ax=axes[1])
axes[1].set_title('PACF of Fractionally Differenced Series', fontsize=14, fontweight='bold')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Data Preparation for Modeling

Now that we have our original series and the fractionally differenced series, let's prepare the data for our deep learning models. We'll create sequences of past observations to predict future values.
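To make the windowing concrete, here is a minimal sketch on a toy list. `make_windows` is a hypothetical, list-based stand-in for the `create_sequences` function defined in the next cell, and the numbers are made up for illustration. Each input holds `window_size` consecutive values, the target is the value that immediately follows, and the split keeps chronological order rather than shuffling.

```python
def make_windows(values, window_size):
    """Pair each run of `window_size` consecutive values with the value that follows it."""
    inputs, targets = [], []
    for start in range(len(values) - window_size):
        inputs.append(values[start:start + window_size])
        targets.append(values[start + window_size])
    return inputs, targets

toy_series = [10, 11, 12, 13, 14, 15]
X_toy, y_toy = make_windows(toy_series, window_size=3)
# X_toy -> [[10, 11, 12], [11, 12, 13], [12, 13, 14]]
# y_toy -> [13, 14, 15]

# Chronological split: the earliest windows are used for training, the latest for testing
train_size = int(len(X_toy) * 0.8)
X_train_toy, X_test_toy = X_toy[:train_size], X_toy[train_size:]
y_train_toy, y_test_toy = y_toy[:train_size], y_toy[train_size:]

print("train:", X_train_toy, y_train_toy)
print("test: ", X_test_toy, y_test_toy)
```

A series of length `n` yields `n - window_size` samples, so a 30-day window costs the first 30 observations.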

In [None]:
def create_sequences(data, target, window_size):
    """
    Create sequences of past observations for time series prediction.
    
    Parameters:
    -----------
    data : array-like
        Input time series values (e.g. the output of Series.values)
    target : array-like
        Target time series values (can be the same as data for direct prediction)
    window_size : int
        Number of past observations to use for prediction
        
    Returns:
    --------
    tuple
        (X, y) where X is the input sequences and y is the target values
    """
    # Work on plain arrays so that indexing is always positional,
    # even when a pandas Series with a DatetimeIndex is passed in
    data = np.asarray(data)
    target = np.asarray(target)
    
    X, y = [], []
    for i in range(len(data) - window_size):
        X.append(data[i:i+window_size])
        y.append(target[i+window_size])
    return np.array(X), np.array(y)

def train_test_split_time_series(X, y, train_ratio=0.8):
    """
    Split time series data into training and testing sets.
    
    Parameters:
    -----------
    X : numpy.ndarray
        Input sequences
    y : numpy.ndarray
        Target values
    train_ratio : float
        Proportion of data to use for training
        
    Returns:
    --------
    tuple
        (X_train, X_test, y_train, y_test)
    """
    train_size = int(len(X) * train_ratio)
    X_train, X_test = X[:train_size], X[train_size:]
    y_train, y_test = y[:train_size], y[train_size:]
    return X_train, X_test, y_train, y_test

# Parameters
WINDOW_SIZE = 30  # Use 30 days of past data to predict the next day
TRAIN_RATIO = 0.8

# Prepare data for original price series
# Drop NaN values
price_data = btc_data['Close'].dropna()
X_price, y_price = create_sequences(price_data.values, price_data.values, WINDOW_SIZE)
X_train_price, X_test_price, y_train_price, y_test_price = train_test_split_time_series(
    X_price, y_price, TRAIN_RATIO
)

# Prepare data for fractionally differenced series
frac_diff_data = btc_data['Frac_Diff'].dropna()
X_frac, y_frac = create_sequences(frac_diff_data.values, frac_diff_data.values, WINDOW_SIZE)
X_train_frac, X_test_frac, y_train_frac, y_test_frac = train_test_split_time_series(
    X_frac, y_frac, TRAIN_RATIO
)

# Print data shapes
print("Original Price Data:")
print(f"X_train shape: {X_train_price.shape}")
print(f"y_train shape: {y_train_price.shape}")
print(f"X_test shape: {X_test_price.shape}")
print(f"y_test shape: {y_test_price.shape}")

print("\nFractionally Differenced Data:")
print(f"X_train shape: {X_train_frac.shape}")
print(f"y_train shape: {y_train_frac.shape}")
print(f"X_test shape: {X_test_frac.shape}")
print(f"y_test shape: {y_test_frac.shape}")

## Save Prepared Data

Let's save our prepared data for use in the next parts of our analysis.
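For reference, a later notebook could read the file back with `pickle.load`. The round trip below is only a sketch under stated assumptions: the file name `toy_prepared_data.pkl` and the payload values are placeholders (0.5 is not a result from this analysis), and small lists stand in for the real arrays. Pickle files should only be loaded from sources you trust.

```python
import os
import pickle
import tempfile

# Placeholder payload that mirrors a few keys of the dictionary saved in this notebook
toy_payload = {
    'optimal_d': 0.5,            # placeholder value, not a computed result
    'window_size': 30,
    'X_train_frac': [[0.1, 0.2], [0.2, 0.3]],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'toy_prepared_data.pkl')
    # Write the dictionary in binary mode
    with open(path, 'wb') as f:
        pickle.dump(toy_payload, f)
    # Read it back the same way a later notebook would
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored['optimal_d'], restored['window_size'])
```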

In [None]:
# Save prepared data
import pickle

data_dict = {
    'btc_data': btc_data,
    'optimal_d': optimal_d,
    'X_train_price': X_train_price,
    'X_test_price': X_test_price,
    'y_train_price': y_train_price,
    'y_test_price': y_test_price,
    'X_train_frac': X_train_frac,
    'X_test_frac': X_test_frac,
    'y_train_frac': y_train_frac,
    'y_test_frac': y_test_frac,
    'window_size': WINDOW_SIZE
}

with open('btc_prepared_data.pkl', 'wb') as f:
    pickle.dump(data_dict, f)

print("Data prepared and saved successfully.")

## Summary of Part 2

In this part of our analysis, we've:

1. Implemented fractional differencing as an advanced technique for achieving stationarity while preserving long-term memory
2. Found the optimal fractional differencing order for our Bitcoin price data
3. Applied fractional differencing and verified the stationarity of the resulting series
4. Prepared sequences of past observations for both the original and fractionally differenced series
5. Split the data into training and testing sets for model development

In the next parts, we'll:
- Build and train MLP models on both the original and fractionally differenced data
- Create GAF image representations and build CNN models
- Evaluate and compare the performance of different approaches