# Experimenting with a Deep Neural Network

This notebook demonstrates a feed-forward neural network approach to stock price prediction. Instead of predicting absolute prices directly, we predict returns (percentage changes, here the 3-day forward return), which offers several advantages for financial time series.

## Model Philosophy
The key idea is to transform the price prediction problem into a returns prediction problem. This approach:
- Makes the target variable more stationary
- Reduces the scale dependency of the model
- Better aligns with how financial markets actually move (in percentage terms)
- Allows for more meaningful confidence intervals
- Makes the learning task more manageable for the network
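
As a minimal sketch of this transformation, the snippet below turns a price series into forward returns in percent and then recovers the future price from the return. The prices are made up for illustration, and the 3-day horizon is taken from the target computed later in this notebook.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([100.0, 102.0, 101.0, 105.0, 104.0, 108.0])
horizon = 3  # trading days ahead, matching the target used later in this notebook

# Forward return in percent: the quantity the network is asked to predict
forward_return = (close.shift(-horizon) / close - 1) * 100

# Converting a return back to a price only needs the current close
reconstructed = close * (1 + forward_return / 100)

print(forward_return.round(3).tolist())
print(reconstructed.tolist())
```

The last `horizon` rows have no forward return, which is why the most recent days drop out of the training data.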

## Network Architecture
A minimal fully-connected (dense) feed-forward neural network with:
- Input layer accepting standardized technical indicators
- Two hidden layers with 8 units each
- ReLU activation in hidden layers
- Batch normalization after each hidden layer
- Dropout (0.3 after the first hidden layer, 0.25 after the second) for regularization
- Linear output layer for returns prediction
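
To get a feel for how small this network is, the following sketch counts its trainable parameters by hand. It assumes the 28 input features reported later in this notebook and Keras' default `BatchNormalization` settings (a trainable scale and shift per unit, with the moving statistics not trainable). The helper names are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_trainable_params(n_units):
    """Scale (gamma) and shift (beta); moving mean/variance are not trainable."""
    return 2 * n_units


n_features = 28  # feature count reported later in this notebook
hidden = 8       # units per hidden layer

layers = {
    "dense_1": dense_params(n_features, hidden),
    "bn_1": batchnorm_trainable_params(hidden),
    "dense_2": dense_params(hidden, hidden),
    "bn_2": batchnorm_trainable_params(hidden),
    "output": dense_params(hidden, 1),
}
total_trainable = sum(layers.values())
print(layers)
print("Total trainable parameters:", total_trainable)
```

A few hundred parameters against a few thousand training rows is a deliberately conservative ratio for noisy financial data.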

## Feature Engineering
1. **Price Returns** (1 to 10, 30, and 60 days)
   - Capture momentum at different timeframes
   - Already in percentage form, matching our target

2. **Moving Average Returns** (5, 10, 20, 30, and 60 days)
   - Price relative to different trend periods
   - Measure trend strength and divergence

3. **Volatility Features** (5, 10, 20, 30, and 60 days)
   - Rolling standard deviation of daily returns and of daily volume changes
   - Risk indicators at multiple horizons

4. **Volume Indicators**
   - Volume trend (5-day MA / 20-day MA)
   - Capture trading activity patterns

## Training Approach
- Mean Squared Error loss on returns
- Adam optimizer with learning rate 0.0005
- Cyclic learning rate (0.0005 to 0.005)
- Early stopping (patience=50) to prevent overfitting
- Batch size of 32 for stable gradients
- Chronological 70/20/10 train/validation/test split (no shuffling)
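
The cyclic learning rate can be inspected on its own, without training anything. The sketch below reproduces a triangular schedule with the bounds listed above; the half-cycle length of 8 epochs is taken from the training cell later in this notebook, and the function name is hypothetical.

```python
import math


def triangular_lr(epoch, base_lr=0.0005, max_lr=0.005, step_size=8):
    """Triangular cyclic schedule: rise for step_size epochs, then fall back."""
    cycle = math.floor(1 + epoch / (2 * step_size))
    x = abs(epoch / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)


# One full cycle: 8 epochs up, 8 epochs down (epochs are 0-indexed)
schedule = [triangular_lr(e) for e in range(17)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

With these settings the rate peaks every 16 epochs, so an early-stopping patience of 50 spans roughly three full cycles before training is cut off.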

## Post-Processing
- Convert predicted returns back to prices
- Calculate approximate confidence bands (±2 standard deviations of the training-set prediction errors)
- Evaluate trading metrics (win rates, returns, risk measures)
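
The three steps above can be sketched on a handful of made-up numbers. All values below are hypothetical. The point is that the error band is built in return space (percent) and only then converted to prices, so every quantity shares the same units.

```python
import numpy as np

# Hypothetical values, for illustration only
current_close = np.array([100.0, 200.0, 50.0])
pred_return_pct = np.array([1.0, -0.5, 2.0])    # model output after inverse scaling
actual_return_pct = np.array([0.4, 0.3, 1.5])   # realized forward returns
residual_std_pct = 1.2                          # std of prediction errors, in percent

# Point forecast and a +/- 2 sigma band, computed in return space first
pred_price = current_close * (1 + pred_return_pct / 100)
lower = current_close * (1 + (pred_return_pct - 2 * residual_std_pct) / 100)
upper = current_close * (1 + (pred_return_pct + 2 * residual_std_pct) / 100)

# Long-only signal: trade when the forecast price is above the current close
signal = pred_price > current_close
win_rate = (signal & (actual_return_pct > 0)).sum() / signal.sum() * 100

print("Predicted prices:", pred_price)
print("Band:", lower, upper)
print("Win rate on signalled days (%):", win_rate)
```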

The combination of returns prediction, non-linear activations, and regularization techniques aims to create a model that can capture market patterns while remaining robust to noise and outliers.

## [ADVANCED] Future Directions: Sequence Modeling

While our feed-forward network provides a solid baseline, recurrent architectures could potentially capture more complex temporal patterns:

### LSTM (Long Short-Term Memory)
- Better suited for learning long-term dependencies
- Can maintain memory of market regimes
- Advantages:
  * Handles variable-length sequences naturally
  * Gates help control information flow
  * Can learn when to "forget" outdated market information
  * May better capture market regime changes

### RNN with Attention
- Could learn which historical periods are most relevant
- Advantages:
  * Dynamic focus on relevant timeframes
  * More interpretable (attention weights show important periods)
  * Might handle market shifts more effectively
  * Could identify similar historical patterns automatically

### Why These Could Help
1. **Market Memory**
   - Markets often exhibit "memory effects"
   - Past patterns can repeat at irregular intervals
   - Sequential models could capture these better than fixed windows

2. **Adaptive Timeframes**
   - Different features matter at different times
   - Attention mechanisms could dynamically adjust feature importance
   - More flexible than our fixed lookback periods

3. **Regime Detection**
   - Markets switch between different regimes
   - LSTM gates could help detect and adapt to regime changes
   - More sophisticated than our current volatility features

4. **Non-Linear Temporal Patterns**
   - Market relationships evolve over time
   - Recurrent architectures might capture these dynamics better
   - Could improve on our static feature engineering

The key challenge would be preventing overfitting given the increased model complexity. This would require careful regularization and possibly larger datasets.
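
Recurrent layers such as Keras' `LSTM` expect a 3D input of shape `(samples, timesteps, features)`, whereas the feature matrix in this notebook is 2D. A minimal sketch of the reshaping step, on random data, is shown below. The `make_windows` helper and the 20-day lookback are assumptions for illustration, not part of this notebook's pipeline.

```python
import numpy as np


def make_windows(features, targets, lookback):
    """Stack the last `lookback` feature rows for each target row."""
    X, y = [], []
    for end in range(lookback, len(features) + 1):
        X.append(features[end - lookback:end])  # rows end-lookback .. end-1
        y.append(targets[end - 1])              # target aligned with the window's last row
    return np.stack(X), np.array(y)


rng = np.random.default_rng(0)
features = rng.normal(size=(100, 28))  # 100 days, 28 features (random placeholder data)
targets = rng.normal(size=100)

X_seq, y_seq = make_windows(features, targets, lookback=20)
print(X_seq.shape, y_seq.shape)
```

Each window ends on the day whose target it predicts, so no future feature values leak into the input.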

In [646]:
# Imports
import yfinance as yf
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px
import tensorflow as tf
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping
import warnings

warnings.simplefilter("ignore", DeprecationWarning)
warnings.simplefilter("ignore", FutureWarning)

# Check TensorFlow version
print(f"TensorFlow version: {tf.__version__}")

TensorFlow version: 2.16.2


**Note on data sources.** A reliable download path is needed because Yahoo Finance access fails intermittently. See the forum discussion at https://www.pythonanywhere.com/forums/topic/35201/, from which these comments are taken:

  * "Yahoo finance is down again. Been working great since late Oct 2024. Same issue that Yahoo Finance is blocking or is it something new?"
  * "I personally implemented a try - except mechanism where if it fails to retrieve data from yfinance it fetches a combination of data from openexchangerates api and exchange rate api - found this was a super stable solution for my use case and have not even noticed it was down this time!"


In [647]:
def download_stock_data(ticker_symbol):
    """
    Download stock data with fallback mechanism:
    1. Try yfinance first
    2. If fails, try stooq with SSL verification disabled
    """
    def try_yfinance():
        try:
            yticker = yf.Ticker(ticker_symbol)
            df = yticker.history(period='max')
            if not df.empty:
                return df
        except Exception as e:
            print(f"yfinance error: {str(e)}")
        return None
    
    def try_stooq():
        try:
            import urllib3
            import certifi
            
            # WARNING: certificate verification is disabled (CERT_NONE) as a
            # workaround; prefer cert_reqs='CERT_REQUIRED' wherever it works
            http = urllib3.PoolManager(
                cert_reqs='CERT_NONE',
                ca_certs=certifi.where()
            )
            
            url = f'https://stooq.com/q/d/l/?s={ticker_symbol.lower()}&i=d'
            response = http.request('GET', url)
            
            if response.status == 200:
                # Use StringIO to create a file-like object from the response data
                from io import StringIO
                csv_data = StringIO(response.data.decode('utf-8'))
                df = pd.read_csv(csv_data)
                
                if not df.empty:
                    # Rename columns to match our expected format
                    df.columns = [col.capitalize() for col in df.columns]
                    df['Date'] = pd.to_datetime(df['Date'])
                    df.set_index('Date', inplace=True)
                    # Sort index in ascending order (oldest to newest)
                    df.sort_index(inplace=True)
                    return df
        except Exception as e:
            print(f"stooq error: {str(e)}")
        return None
    
    try:
        # Try yfinance first
        print("Attempting to download from yfinance...")
        df = try_yfinance()
        
        # If yfinance fails, try stooq
        if df is None:
            print("yfinance failed, trying stooq...")
            df = try_stooq()
        
        if df is None:
            raise ValueError(f"Failed to download data for {ticker_symbol} from all sources")
        
        print(f"Successfully downloaded {len(df)} days of {ticker_symbol} data")
        
        # Basic validation
        required_columns = ['Open', 'Close', 'Volume']
        missing_cols = [col for col in required_columns if col not in df.columns]
        if missing_cols:
            raise ValueError(f"Missing required columns: {missing_cols}")
        
        return df
    
    except Exception as e:
        print(f"Error downloading {ticker_symbol}: {str(e)}")
        return None

In [648]:
# Try both QQQ (Yahoo Finance) and "qqq.us" (stooq)
ticker = "QQQ"
df = download_stock_data(ticker)

if df is None:
    print("\nTrying alternative symbol...")
    ticker = "qqq.us"
    df = download_stock_data(ticker)

if df is not None:
    print("\nRaw data sample:")
    print(df.tail(1))

Attempting to download from yfinance...


$QQQ: possibly delisted; no price data found  (1d 1926-03-17 -> 2025-02-20)


yfinance failed, trying stooq...




Failed to get ticker 'QQQ.US' reason: Expecting value: line 1 column 1 (char 0)
$QQQ.US: possibly delisted; no timezone found


Error downloading QQQ: Failed to download data for QQQ from all sources

Trying alternative symbol...
Attempting to download from yfinance...
yfinance failed, trying stooq...






Successfully downloaded 6528 days of qqq.us data

Raw data sample:
              Open   High     Low   Close      Volume
Date                                                 
2025-02-20  538.73  539.1  532.46  537.23  26329519.0


In [649]:
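def transform_stock_data_for_dnn(df, ticker):
    """
    Transform stock data for DNN model with returns prediction.

    Builds the 3-day forward-return target, the feature columns, and a
    chronological 70/20/10 split label. `ticker` is currently unused.
    """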
    if df is None:
        return None
    
    # Reset index to make the date a column
    df = df.reset_index()
    
    # Create DataFrame
    dnn_df = pd.DataFrame()
    dnn_df['ds'] = pd.to_datetime(df['Date']).dt.tz_localize(None)
    
    # Target: 3-day future return
    dnn_df['y_return'] = (df['Close'].shift(-3) / df['Close'] - 1) * 100
    
    # Store actual future price for evaluation
    dnn_df['y'] = df['Close'].shift(-3)
    
    # Store actual close price for later conversion
    dnn_df['close'] = df['Close']
    
    # Recent returns at different timeframes
    for i in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30, 60]:
        dnn_df[f'return_{i}d'] = (
            df['Close'].pct_change(periods=i) * 100
        )
    
    # Moving average returns
    for window in [5, 10, 20, 30, 60]:
        ma = df['Close'].rolling(window=window).mean()
        dnn_df[f'ma{window}_return'] = (
            (df['Close'] / ma - 1) * 100
        )
    
    # Volatility features
    for window in [5, 10, 20, 30, 60]:
        # Return volatility
        dnn_df[f'volatility_{window}d'] = (
            df['Close'].pct_change().rolling(window=window).std() * 100
        )
        
        # Volume volatility
        dnn_df[f'volume_volatility_{window}d'] = (
            df['Volume'].pct_change().rolling(window=window).std() * 100
        )
    
    # Volume ratios
    dnn_df['volume_ma5'] = df['Volume'].rolling(window=5).mean()
    dnn_df['volume_ma20'] = df['Volume'].rolling(window=20).mean()
    dnn_df['volume_ratio'] = dnn_df['volume_ma5'] / dnn_df['volume_ma20']
    
    # Split data
    total_days = len(df)
    train_end = int(total_days * 0.7)
    val_end = int(total_days * 0.9)

    dnn_df['split'] = 'train'
    dnn_df.loc[train_end:val_end-1, 'split'] = 'validation'
    dnn_df.loc[val_end:, 'split'] = 'test'
    
    return dnn_df

In [650]:
# Transform data into DNN format
dnn_df = transform_stock_data_for_dnn(df, ticker)

print("\nTransformed data sample:")
print(dnn_df.tail(4))

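# Clean data by removing NaN values; .copy() avoids SettingWithCopyWarning
# when prediction columns are added to this frame later
dnn_df_clean = dnn_df.dropna().copy()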
print(f"\nOriginal shape: {dnn_df.shape}")
print(f"Clean shape: {dnn_df_clean.shape}")


Transformed data sample:
             ds  y_return       y   close  return_1d  return_2d  return_3d  \
6524 2025-02-14 -0.170956  537.23  538.15   0.419854   1.864471   1.924279   
6525 2025-02-18       NaN     NaN  539.37   0.226703   0.647509   2.095400   
6526 2025-02-19       NaN     NaN  539.52   0.027810   0.254576   0.675499   
6527 2025-02-20       NaN     NaN  537.23  -0.424451  -0.396759  -0.170956   

      return_4d  return_5d  return_6d  ...  volatility_20d  \
6524   1.681625   2.912491   1.614426  ...        1.097437   
6525   2.155344   1.912140   3.145797  ...        1.043615   
6526   2.123793   2.183753   1.940482  ...        1.039391   
6527   0.248181   1.690327   1.750033  ...        1.010881   

      volume_volatility_20d  volatility_30d  volume_volatility_30d  \
6524              58.841894        1.155332              49.093461   
6525              56.187457        1.153198              48.997392   
6526              54.474927        1.120643              48.79

In [651]:
def prepare_dnn_data(df):
    """
    Prepare data for DNN model
    """
    # Select features
    feature_columns = []
    
    # Returns features
    for i in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30, 60]:
        feature_columns.append(f'return_{i}d')
    
    # Moving average returns
    for window in [5, 10, 20, 30, 60]:
        feature_columns.append(f'ma{window}_return')
    
    # Volatility features
    for window in [5, 10, 20, 30, 60]:
        feature_columns.append(f'volatility_{window}d')
        feature_columns.append(f'volume_volatility_{window}d')
    
    # Volume features
    feature_columns.extend(['volume_ratio'])
    
    target_column = 'y_return'  # Changed from 'y' to 'y_return'
    
    print(f"Number of features: {len(feature_columns)}")
    print("\nFeatures used:")
    for i, feat in enumerate(feature_columns, 1):
        print(f"{i}. {feat}")
    
    # Create scalers
    feature_scaler = StandardScaler()
    target_scaler = StandardScaler()
    
    # Split data
    train_data = df[df['split'] == 'train']
    val_data = df[df['split'] == 'validation']
    test_data = df[df['split'] == 'test']
    
    # Fit scalers on training data only
    feature_scaler.fit(train_data[feature_columns])
    target_scaler.fit(train_data[[target_column]])
    
    # Transform all sets
    X_train = feature_scaler.transform(train_data[feature_columns])
    y_train = target_scaler.transform(train_data[[target_column]])
    
    X_val = feature_scaler.transform(val_data[feature_columns])
    y_val = target_scaler.transform(val_data[[target_column]])
    
    X_test = feature_scaler.transform(test_data[feature_columns])
    y_test = target_scaler.transform(test_data[[target_column]])
    
    return (X_train, y_train, X_val, y_val, X_test, y_test, 
            feature_scaler, target_scaler, feature_columns)

In [652]:
# Prepare data for DNN
(X_train, y_train, X_val, y_val, X_test, y_test,
 feature_scaler, target_scaler, feature_columns) = prepare_dnn_data(dnn_df_clean)

print("\nTraining set shape:", X_train.shape)
print("Validation set shape:", X_val.shape)
print("Test set shape:", X_test.shape)

Number of features: 28

Features used:
1. return_1d
2. return_2d
3. return_3d
4. return_4d
5. return_5d
6. return_6d
7. return_7d
8. return_8d
9. return_9d
10. return_10d
11. return_30d
12. return_60d
13. ma5_return
14. ma10_return
15. ma20_return
16. ma30_return
17. ma60_return
18. volatility_5d
19. volume_volatility_5d
20. volatility_10d
21. volume_volatility_10d
22. volatility_20d
23. volume_volatility_20d
24. volatility_30d
25. volume_volatility_30d
26. volatility_60d
27. volume_volatility_60d
28. volume_ratio

Training set shape: (4509, 28)
Validation set shape: (1306, 28)
Test set shape: (650, 28)
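
The split sizes printed above can be reproduced by hand, which is a useful check that the NaN cleanup removed only the rows it should. The sketch assumes the 6528 downloaded rows reported earlier, the 60-day maximum feature window, and the 3-day target horizon.

```python
total_rows = 6528   # rows downloaded, per the log earlier in this notebook
max_lookback = 60   # longest feature window; the first 60 rows contain NaN
horizon = 3         # the last 3 rows have no 3-day forward return

# Split boundaries are computed on the full frame, before dropping NaN rows
train_end = int(total_rows * 0.7)
val_end = int(total_rows * 0.9)

n_train = train_end - max_lookback        # warm-up rows fall in the training split
n_val = val_end - train_end               # validation loses nothing
n_test = total_rows - val_end - horizon   # the tail rows fall in the test split

print(n_train, n_val, n_test)
```

Because the boundaries are set before cleaning, the effective proportions differ slightly from an exact 70/20/10.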


In [653]:
# SIMPLE ARCHITECTURE

# def build_dnn_model(input_dim):
#     """
#     Build minimal DNN model for returns prediction
#     """
#     model = Sequential([
#         # First layer
#         Dense(32, activation='relu', input_dim=input_dim),
#         Dropout(0.2),
        
#         # Second layer
#         Dense(16, activation='relu'),
#         Dropout(0.1),
        
#         # Output layer
#         Dense(1, activation='linear')
#     ])
    
#     model.compile(
#         optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # Slightly higher learning rate
#         loss='huber',  # More robust to outliers
#         metrics=[tf.keras.metrics.RootMeanSquaredError(name='rmse')]
#     )
    
#     return model

# def train_dnn_model(X_train, y_train, X_val, y_val, input_dim):
#     """
#     Train DNN model with simple training regime
#     """
#     # Build model
#     model = build_dnn_model(input_dim)
    
#     # Early stopping
#     early_stopping = EarlyStopping(
#         monitor='val_loss',
#         patience=30,
#         restore_best_weights=True,
#         mode='min'
#     )
    
#     # Reduce LR on plateau
#     reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
#         monitor='val_loss',
#         factor=0.5,
#         patience=10,
#         min_lr=1e-5,
#         mode='min',
#         verbose=1
#     )
    
#     # Train model
#     history = model.fit(
#         X_train, y_train,
#         validation_data=(X_val, y_val),
#         epochs=200,
#         batch_size=32,
#         callbacks=[early_stopping, reduce_lr],
#         verbose=1
#     )
    
#     return model, history

In [654]:
# COMPLEX ARCHITECTURE: (Trainable params: 58,945 (230.25 KB))
# def build_dnn_model(input_dim):
#     """
#     Build more complex DNN model with residual connections
#     """
#     inputs = tf.keras.Input(shape=(input_dim,))
    
#     # First block
#     x = Dense(128, activation=None)(inputs)
#     x = BatchNormalization()(x)
#     x = tf.keras.layers.Activation('relu')(x)
#     x = Dropout(0.3)(x)
    
#     # Residual block 1
#     r = x
#     x = Dense(128, activation=None)(x)
#     x = BatchNormalization()(x)
#     x = tf.keras.layers.Activation('relu')(x)
#     x = Dropout(0.3)(x)
#     x = Dense(128, activation=None)(x)
#     x = BatchNormalization()(x)
#     x = tf.keras.layers.Add()([x, r])  # Skip connection
#     x = tf.keras.layers.Activation('relu')(x)
    
#     # Residual block 2
#     r = x
#     x = Dense(64, activation=None)(x)
#     x = BatchNormalization()(x)
#     x = tf.keras.layers.Activation('relu')(x)
#     x = Dropout(0.2)(x)
#     x = Dense(64, activation=None)(x)
#     x = BatchNormalization()(x)
#     x = tf.keras.layers.Add()([x, Dense(64)(r)])  # Skip connection with dimension matching
#     x = tf.keras.layers.Activation('relu')(x)
    
#     # Final layers
#     x = Dense(32, activation='relu')(x)
#     x = BatchNormalization()(x)
#     x = Dropout(0.1)(x)
    
#     outputs = Dense(1, activation='linear')(x)
    
#     model = tf.keras.Model(inputs=inputs, outputs=outputs)
    
#     # Simple optimizer with fixed learning rate
#     optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
    
#     # Compile with Huber loss for robustness
#     model.compile(
#         optimizer=optimizer,
#         loss=tf.keras.losses.Huber(delta=1.0),
#         metrics=[
#             tf.keras.metrics.RootMeanSquaredError(name='rmse'),
#             tf.keras.metrics.MeanAbsoluteError(name='mae')
#         ]
#     )
    
#     return model
# def train_dnn_model(X_train, y_train, X_val, y_val, input_dim):
#     """
#     Train model with more sophisticated training regime
#     """
#     # Build model
#     model = build_dnn_model(input_dim)
    
#     # Early stopping with longer patience
#     early_stopping = EarlyStopping(
#         monitor='val_loss',
#         patience=100,
#         restore_best_weights=True,
#         mode='min'
#     )
    
#     # Reduce LR on plateau with more aggressive reduction
#     reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
#         monitor='val_loss',
#         factor=0.2,  # More aggressive reduction
#         patience=30,
#         min_lr=1e-6,
#         mode='min',
#         verbose=1
#     )
    
#     # Train with larger batch size and more epochs
#     history = model.fit(
#         X_train, y_train,
#         validation_data=(X_val, y_val),
#         epochs=1000,
#         batch_size=64,
#         callbacks=[early_stopping, reduce_lr],
#         verbose=1
#     )
    
#     return model, history

In [655]:
# MODERATE ARCHITECTURE

# def build_dnn_model(input_dim):
#     """
#     Build DNN model for returns prediction
#     """
#     model = Sequential([
#         # Input normalization
#         BatchNormalization(input_dim=input_dim),
        
#         # First block
#         Dense(64, activation='relu'),
#         BatchNormalization(),
#         Dropout(0.3),
        
#         # Second block
#         Dense(64, activation='relu'),
#         BatchNormalization(),
#         Dropout(0.3),
        
#         # Output layer (returns prediction)
#         Dense(1, activation='linear')
#     ])
    
#     model.compile(
#         optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
#         loss='mse',
#         metrics=[tf.keras.metrics.RootMeanSquaredError(name='rmse')]
#     )
    
#     return model

# def train_dnn_model(X_train, y_train, X_val, y_val, input_dim):
#     """
#     Train DNN model with early stopping
#     """
#     # Build model
#     model = build_dnn_model(input_dim)
    
#     # Early stopping callback
#     early_stopping = EarlyStopping(
#         monitor='val_rmse',
#         patience=50,
#         restore_best_weights=True,
#         mode='min'
#     )
    
#     # Reduce learning rate on plateau
#     reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
#         monitor='val_rmse',
#         factor=0.2,
#         patience=20,
#         min_lr=1e-6,
#         mode='min',
#         verbose=1
#     )
    
#     # Train model
#     history = model.fit(
#         X_train, y_train,
#         validation_data=(X_val, y_val),
#         epochs=500,
#         batch_size=32,
#         callbacks=[early_stopping, reduce_lr],
#         verbose=1
#     )
    
#     return model, history

In [656]:
# MINIMAL ARCHITECTURE (active; the variants above are commented out)
def build_dnn_model(input_dim):
    """
    Build simple two-layer DNN model with 8 units each
    """
    inputs = tf.keras.Input(shape=(input_dim,))
    
    # First layer
    x = Dense(8, activation='relu')(inputs)
    x = BatchNormalization()(x)
    x = Dropout(0.3)(x)
    
    # Second layer
    x = Dense(8, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(0.25)(x)
    
    # Output layer
    outputs = Dense(1)(x)
    
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
        loss='mse',
        metrics=[tf.keras.metrics.RootMeanSquaredError(name='rmse')]
    )
    
    return model

def train_dnn_model(X_train, y_train, X_val, y_val, input_dim):
    """
    Train with cyclic learning rate
    """
    model = build_dnn_model(input_dim)
    
    # Early stopping
    early_stopping = EarlyStopping(
        monitor='val_rmse',
        patience=50,
        restore_best_weights=True,
        mode='min'
    )
    
    # Cyclic learning rate
    initial_learning_rate = 0.0005
    maximal_learning_rate = 0.005
    step_size = 8
    
    def cyclic_lr(epoch):
        cycle = np.floor(1 + epoch/(2 * step_size))
        x = np.abs(epoch/step_size - 2 * cycle + 1)
        lr = initial_learning_rate + (maximal_learning_rate - initial_learning_rate) * max(0, (1-x))
        return lr
    
    lr_scheduler = tf.keras.callbacks.LearningRateScheduler(cyclic_lr)
    
    # Train model
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=300,
        batch_size=32,
        callbacks=[early_stopping, lr_scheduler],
        verbose=1
    )
    
    return model, history

In [657]:
# Train the DNN model
print("\nTraining DNN model...")
model, history = train_dnn_model(X_train, y_train, X_val, y_val, len(feature_columns))

# Plot training history
fig = go.Figure()

fig.add_trace(go.Scatter(
    y=history.history['loss'],
    name='Train Loss'
))

fig.add_trace(go.Scatter(
    y=history.history['val_loss'],
    name='Validation Loss'
))

fig.update_layout(
    title='Model Training History',
    xaxis_title='Epoch',
    yaxis_title='Loss',
    template='plotly_white'
)

fig.show()

# Print final loss values
print("\nFinal training loss:", history.history['loss'][-1])
print("Final validation loss:", history.history['val_loss'][-1])


Training DNN model...
Epoch 1/300
141/141 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - loss: 3.5428 - rmse: 1.8810 - val_loss: 0.8325 - val_rmse: 0.9124 - learning_rate: 5.0000e-04
Epoch 2/300
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 2.2259 - rmse: 1.4905 - val_loss: 0.7772 - val_rmse: 0.8816 - learning_rate: 0.0011
Epoch 3/300
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1.5791 - rmse: 1.2556 - val_loss: 0.7140 - val_rmse: 0.8450 - learning_rate: 0.0016
Epoch 4/300
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1.2721 - rmse: 1.1272 - val_loss: 0.6868 - val_rmse: 0.8287 - learning_rate: 0.0022
Epoch 5/300
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1.1581 - rmse: 1.0753 - val_loss: 0.6783 - val_rmse: 0.8236 - learning_rate: 0.0027
Epoch 6/300
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 


Final training loss: 0.9677157998085022
Final validation loss: 0.6885048747062683


In [658]:
def make_predictions(model, X_data, target_scaler, dnn_df):
    """
    Make predictions and convert returns to prices
    """
    # Predict returns
    y_pred_scaled = model.predict(X_data)
    y_pred_returns = target_scaler.inverse_transform(y_pred_scaled)
    
    # Convert returns to prices
    current_prices = dnn_df['close'].values
    y_pred_prices = current_prices * (1 + y_pred_returns.flatten()/100)
    
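    # Estimate the error spread on the training set, in return units (percent),
    # so it matches the unscaled predictions it is added to below.
    # NOTE: relies on the notebook-level X_train and y_train arrays.
    train_pred_returns = target_scaler.inverse_transform(model.predict(X_train))
    train_true_returns = target_scaler.inverse_transform(y_train)
    error_std = np.std(train_pred_returns - train_true_returns)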
    
    # Add confidence intervals (2 standard deviations)
    conf_interval = error_std * 2
    y_pred_returns_lower = y_pred_returns.flatten() - conf_interval
    y_pred_returns_upper = y_pred_returns.flatten() + conf_interval
    
    y_pred_lower = current_prices * (1 + y_pred_returns_lower/100)
    y_pred_upper = current_prices * (1 + y_pred_returns_upper/100)
    
    return y_pred_prices, y_pred_lower, y_pred_upper

In [659]:
# Make predictions and convert to prices
print("\nMaking predictions...")
train_pred, train_lower, train_upper = make_predictions(model, X_train, target_scaler, dnn_df_clean[dnn_df_clean['split'] == 'train'])
val_pred, val_lower, val_upper = make_predictions(model, X_val, target_scaler, dnn_df_clean[dnn_df_clean['split'] == 'validation'])
test_pred, test_lower, test_upper = make_predictions(model, X_test, target_scaler, dnn_df_clean[dnn_df_clean['split'] == 'test'])

# Add predictions back to DataFrame
dnn_df_clean['yhat'] = np.nan
dnn_df_clean['yhat_lower'] = np.nan
dnn_df_clean['yhat_upper'] = np.nan

train_idx = dnn_df_clean[dnn_df_clean['split'] == 'train'].index
val_idx = dnn_df_clean[dnn_df_clean['split'] == 'validation'].index
test_idx = dnn_df_clean[dnn_df_clean['split'] == 'test'].index

dnn_df_clean.loc[train_idx, 'yhat'] = train_pred
dnn_df_clean.loc[train_idx, 'yhat_lower'] = train_lower
dnn_df_clean.loc[train_idx, 'yhat_upper'] = train_upper

dnn_df_clean.loc[val_idx, 'yhat'] = val_pred
dnn_df_clean.loc[val_idx, 'yhat_lower'] = val_lower
dnn_df_clean.loc[val_idx, 'yhat_upper'] = val_upper

dnn_df_clean.loc[test_idx, 'yhat'] = test_pred
dnn_df_clean.loc[test_idx, 'yhat_lower'] = test_lower
dnn_df_clean.loc[test_idx, 'yhat_upper'] = test_upper


Making predictions...


141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
41/41 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 859us/step
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
141/141 ━━━━━━━━━━━━━━━━━━━━ 0s 778us/step




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy






In [660]:
def plot_dnn_analysis(forecast_df, split_column='split'):
    """
    Plot DNN predictions with confidence intervals and residuals
    """
    # Create a figure with two stacked subplots that share the x-axis
    fig = make_subplots(
        rows=2, cols=1, 
        shared_xaxes=True,
        vertical_spacing=0.1,
        subplot_titles=('Actual vs Predicted', 'Residuals')
    )
    
    # Calculate residuals
    forecast_df['residuals'] = forecast_df['y'] - forecast_df['yhat']
    
    # Define colors for each split
    split_colors = {
        'train': 'rgb(0, 114, 178)',      # Blue
        'validation': 'rgb(240, 228, 66)', # Yellow
        'test': 'rgb(230, 159, 0)'        # Orange
    }
    
    # Add traces for each split
    for split in forecast_df[split_column].unique():
        mask = forecast_df[split_column] == split
        split_data = forecast_df[mask]
        color = split_colors[split]
        
        # Main plot
        fig.add_trace(
            go.Scatter(
                x=split_data.ds, 
                y=split_data.y,
                name=f'Actual ({split})',
                mode='markers',
                marker=dict(color=color)
            ),
            row=1, col=1
        )
        
        fig.add_trace(
            go.Scatter(
                x=split_data.ds, 
                y=split_data.yhat,
                name=f'Predicted ({split})',
                mode='lines',
                line=dict(color=color)
            ),
            row=1, col=1
        )
        
        # Confidence intervals
        rgb_values = [int(x) for x in color.replace('rgb(', '').replace(')', '').split(',')]
        rgba_color = f'rgba({rgb_values[0]}, {rgb_values[1]}, {rgb_values[2]}, 0.2)'
        
        fig.add_trace(
            go.Scatter(
                x=split_data.ds,
                y=split_data.yhat_upper,
                fill=None,
                mode='lines',
                line=dict(width=0),
                showlegend=False
            ),
            row=1, col=1
        )
        
        fig.add_trace(
            go.Scatter(
                x=split_data.ds,
                y=split_data.yhat_lower,
                fill='tonexty',
                mode='lines',
                line=dict(width=0),
                fillcolor=rgba_color,
                name=f'CI ({split})'
            ),
            row=1, col=1
        )
        
        # Residuals plot
        fig.add_trace(
            go.Scatter(
                x=split_data.ds,
                y=split_data.residuals,
                name=f'Residuals ({split})',
                mode='markers',
                marker=dict(color=color)
            ),
            row=2, col=1
        )
    
    # Update layout
    fig.update_layout(
        height=800,
        showlegend=True,
        title_text="DNN Forecast Analysis with Residuals",
        template='plotly_white'
    )
    
    # Update axes labels
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price", row=1, col=1)
    fig.update_yaxes(title_text="Residual", row=2, col=1)
    
    return fig

In [661]:
# Plot results
fig = plot_dnn_analysis(dnn_df_clean)
fig.show()



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



In [662]:
# Calculate and display metrics
def calculate_metrics(df):
    metrics = {}
    
    for split in df['split'].unique():
        split_data = df[df['split'] == split]
        
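        # Trading signals: trade when the predicted price exceeds the current close
        signals = split_data['yhat'] > split_data['close']
        # Realized return (%) of the actual price y relative to the current close
        returns = (split_data['y'] - split_data['close']) / split_data['close'] * 100
        
        # Model win/loss rates among signal days (NaN when no signal fired,
        # which avoids dividing by zero)
        n_signals = signals.sum()
        win_rate = (signals & (returns > 0)).sum() / n_signals * 100 if n_signals > 0 else float('nan')
        loss_rate = (signals & (returns < 0)).sum() / n_signals * 100 if n_signals > 0 else float('nan')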
        
        # Unconditional Win/Loss metrics (if we traded every day)
        uncond_win_rate = (returns > 0).sum() / len(returns) * 100
        uncond_loss_rate = (returns < 0).sum() / len(returns) * 100
        
        # Win/Loss rate outperformance
        win_rate_outperf = win_rate - uncond_win_rate
        loss_rate_outperf = uncond_loss_rate - loss_rate
        
        # Returns metrics
        avg_return = returns[signals].mean()
        
        # Trading activity
        n_trades = signals.sum()
        trading_freq = (signals.sum() / len(signals)) * 100
        
        # Risk metrics
        wins = returns[(signals) & (returns > 0)]
        losses = returns[(signals) & (returns < 0)]
        pl_ratio = abs(wins.mean() / losses.mean()) if len(losses) > 0 else float('inf')
        
        metrics[split] = {
            'Win_Rate': win_rate,
            'Uncond_Win_Rate': uncond_win_rate,
            'Win_Rate_Outperf': win_rate_outperf,
            'Loss_Rate': loss_rate,
            'Uncond_Loss_Rate': uncond_loss_rate,
            'Loss_Rate_Outperf': loss_rate_outperf,
            'Avg_Return': avg_return,
            'N_Trades': n_trades,
            'Trading_Freq': trading_freq,
            'PL_Ratio': pl_ratio
        }
    
    return pd.DataFrame.from_dict(metrics, orient='index')
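
To make the conditional versus unconditional win rates concrete, here is a small worked example on made-up prices. It repeats the same signal and return definitions as `calculate_metrics`; the five rows of data are purely illustrative.

```python
import pandas as pd

# Five illustrative days: current close, predicted price, actual price
toy = pd.DataFrame({
    "close": [100.0, 100.0, 100.0, 100.0, 100.0],
    "yhat":  [101.0,  99.0, 102.0, 100.5,  98.0],
    "y":     [102.0, 101.0,  99.0, 100.4,  97.0],
})

# Signal fires on days 0, 2 and 3 (prediction above the close)
signals = toy["yhat"] > toy["close"]
# Realized returns in percent: positive on days 0, 1 and 3
returns = (toy["y"] - toy["close"]) / toy["close"] * 100

# Conditional win rate: wins among signal days only (2 of 3)
win_rate = (signals & (returns > 0)).sum() / signals.sum() * 100
# Unconditional win rate: wins if we traded every day (3 of 5)
uncond_win_rate = (returns > 0).sum() / len(returns) * 100
# Outperformance in percentage points
win_rate_outperf = win_rate - uncond_win_rate
```

In this toy case the model trades on three of five days and wins on two of them, so its win rate sits a few percentage points above the trade-every-day baseline.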

In [663]:
# Calculate metrics
metrics_df = calculate_metrics(dnn_df_clean)
print("\nTrading Metrics:")
print(metrics_df.round(2))


Trading Metrics:
            Win_Rate  Uncond_Win_Rate  Win_Rate_Outperf  Loss_Rate  \
train          56.34            55.93              0.41      43.40   
validation     57.98            58.88             -0.90      42.02   
test           59.26            56.77              2.49      40.74   

            Uncond_Loss_Rate  Loss_Rate_Outperf  Avg_Return  N_Trades  \
train                  43.76               0.36        0.21      3097   
validation             41.12              -0.90        0.25       902   
test                   43.23               2.49        0.45       432   

            Trading_Freq  PL_Ratio  
train              68.68      0.94  
validation         69.07      0.95  
test               66.46      1.12  


In [664]:
def plot_trading_metrics(metrics_df):
    fig = make_subplots(
        rows=4, cols=1,
        subplot_titles=(
            'Win/Loss Rates (%)',
            'Returns (%)',
            'Trading Activity',
            'Risk Metrics'
        ),
        vertical_spacing=0.1
    )
    
    split_colors = {
        'train': 'rgb(0, 114, 178)',      # Blue
        'validation': 'rgb(240, 228, 66)', # Yellow
        'test': 'rgb(230, 159, 0)'        # Orange
    }
    
    # 1. Win/Loss Rates with unconditional rates
    for split in metrics_df.index:
        fig.add_trace(
            go.Bar(
                name=split,
                legendgroup=split,   # shared legend group for this split
                x=['Win Rate', 'Uncond Win Rate', 'Win Outperf',
                   'Loss Rate', 'Uncond Loss Rate', 'Loss Outperf'],
                y=[metrics_df.loc[split, 'Win_Rate'],
                   metrics_df.loc[split, 'Uncond_Win_Rate'],
                   metrics_df.loc[split, 'Win_Rate_Outperf'],
                   metrics_df.loc[split, 'Loss_Rate'],
                   metrics_df.loc[split, 'Uncond_Loss_Rate'],
                   metrics_df.loc[split, 'Loss_Rate_Outperf']],
                marker_color=split_colors[split],
                text=[f"{v:.1f}%" for v in [
                    metrics_df.loc[split, 'Win_Rate'],
                    metrics_df.loc[split, 'Uncond_Win_Rate'],
                    metrics_df.loc[split, 'Win_Rate_Outperf'],
                    metrics_df.loc[split, 'Loss_Rate'],
                    metrics_df.loc[split, 'Uncond_Loss_Rate'],
                    metrics_df.loc[split, 'Loss_Rate_Outperf']]],
                textposition='auto',
                width=0.2
            ),
            row=1, col=1
        )
    
    # 2. Returns
    for split in metrics_df.index:
        fig.add_trace(
            go.Bar(
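                name=split,
                legendgroup=split,   # shared legend group for this split
                showlegend=False,    # legend entry already provided by the first panel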
                x=['Avg Return'],
                y=[metrics_df.loc[split, 'Avg_Return']],
                marker_color=split_colors[split],
                text=[f"{metrics_df.loc[split, 'Avg_Return']:.2f}%"],
                textposition='auto',
                width=0.2
            ),
            row=2, col=1
        )
    
    # 3. Trading Activity
    for split in metrics_df.index:
        fig.add_trace(
            go.Bar(
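                name=split,
                legendgroup=split,   # shared legend group for this split
                showlegend=False,    # legend entry already provided by the first panel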
                x=['N Trades', 'Trading Freq (%)'],
                y=[metrics_df.loc[split, 'N_Trades'],
                   metrics_df.loc[split, 'Trading_Freq']],
                marker_color=split_colors[split],
                text=[f"{int(metrics_df.loc[split, 'N_Trades'])}",
                      f"{metrics_df.loc[split, 'Trading_Freq']:.1f}%"],
                textposition='auto',
                width=0.2
            ),
            row=3, col=1
        )
    
    # 4. Risk Metrics
    for split in metrics_df.index:
        fig.add_trace(
            go.Bar(
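                name=split,
                legendgroup=split,   # shared legend group for this split
                showlegend=False,    # legend entry already provided by the first panel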
                x=['P/L Ratio'],
                y=[metrics_df.loc[split, 'PL_Ratio']],
                marker_color=split_colors[split],
                text=[f"{metrics_df.loc[split, 'PL_Ratio']:.2f}"],
                textposition='auto',
                width=0.2
            ),
            row=4, col=1
        )
    
    # Update layout
    fig.update_layout(
        width=1000,
        height=1200,
        showlegend=True,
        template='plotly_white',
        title=dict(
            text="DNN Trading Performance Metrics by Split",
            y=0.98,
            x=0.5,
            xanchor='center',
            yanchor='top',
            pad=dict(b=20)
        ),
        legend=dict(
            orientation="h",
            yanchor="bottom",
            y=1.02,
            xanchor="right",
            x=1
        ),
        bargap=0.15,
        bargroupgap=0.1
    )
    
    # Update y-axes titles
    fig.update_yaxes(title_text="Percentage (%)", row=1, col=1)
    fig.update_yaxes(title_text="Percentage (%)", row=2, col=1)
    fig.update_yaxes(title_text="Count/Percentage", row=3, col=1)
    fig.update_yaxes(title_text="Ratio", row=4, col=1)
    
    return fig

# Calculate and display metrics
metrics_df = calculate_metrics(dnn_df_clean)
print("\nTrading Metrics:")
print(metrics_df.round(2))

# Plot updated metrics
fig = plot_trading_metrics(metrics_df)
fig.show()



# Model Performance Comparison: ARIMA vs Prophet vs DNN

## Trading Metrics Analysis

### Test Set Performance

#### ARIMA
- Win Rate: ~54.5%
- Strong market outperformance (+3.3%)
- Higher average returns per trade (0.42%)
- Superior P/L ratio (1.45)
- Most consistent performance across splits

#### Prophet
- Win Rate: ~51.2%
- Modest market outperformance (+1.1%)
- Lower average returns (0.21%)
- Lower P/L ratio (1.15)
- More variance across splits

#### DNN
- Win Rate: ~59.3%
- Good market outperformance (+2.5 percentage points over the unconditional win rate)
- Strong average returns (0.45% per trade)
- Competitive P/L ratio (1.12)
- Moderate trading frequency (~66.5% of days)
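
As a rough sanity check on the DNN's test-set edge, the sketch below compares the win-rate outperformance with the sampling error of a win rate estimated from the number of test trades. It uses the test row of the printed metrics table and a simple binomial approximation, which assumes independent trades and treats the unconditional win rate as a fixed benchmark; both are simplifying assumptions.

```python
import math

# Values taken from the DNN test row of the printed metrics table
win_rate = 59.26          # % of signal days with a positive return
uncond_win_rate = 56.77   # % of all days with a positive return
n_trades = 432            # number of signal days

# Binomial standard error of the win rate, in percentage points
p = win_rate / 100
std_err_pp = math.sqrt(p * (1 - p) / n_trades) * 100

# Outperformance and how many standard errors it represents
edge_pp = win_rate - uncond_win_rate
z_score = edge_pp / std_err_pp
```

Under these assumptions the edge is about one standard error, so the test-set outperformance is suggestive rather than conclusive.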

## Key Findings

1. **Model Strengths**
   - ARIMA: Best overall performance, especially in risk-adjusted returns
   - Prophet: Good for trend identification, more conservative
   - DNN: Strong win rate, good balance of returns and risk

2. **Trading Style**
   - ARIMA: Aggressive with high frequency
   - Prophet: Conservative with lower conviction
   - DNN: Selective with moderate frequency

3. **Risk Management**
   - ARIMA: Best risk-adjusted returns
   - Prophet: Most conservative approach
   - DNN: Good balance of risk and return
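
One way to read win rate and P/L ratio together is through the break-even win rate. If every trade is either a win of average size W or a loss of average size L, the expected return is zero when the win rate equals 1 / (1 + W/L). The sketch below applies this to the DNN P/L ratios from the metrics table; the helper name `break_even_win_rate` is hypothetical, and the formula ignores transaction costs and zero-return days.

```python
def break_even_win_rate(pl_ratio: float) -> float:
    """Win rate (%) at which expected return per trade is zero.

    Hypothetical helper. Derivation: w * W - (1 - w) * L = 0
    gives w = L / (W + L) = 1 / (1 + W / L).
    """
    if pl_ratio <= 0:
        raise ValueError("pl_ratio must be positive")
    return 100.0 / (1.0 + pl_ratio)


# DNN P/L ratios per split, taken from the printed metrics table
dnn_pl_ratios = {"train": 0.94, "validation": 0.95, "test": 1.12}
break_even = {split: break_even_win_rate(r) for split, r in dnn_pl_ratios.items()}
```

The DNN win rates in the table (56.34%, 57.98%, 59.26%) are above the corresponding break-even levels on every split, which is consistent with the positive average returns reported there.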

## Model Characteristics

### ARIMA
- Excellent for short-term predictions
- Captures market momentum and mean reversion
- Strong performance in volatile periods
- Best for market timing decisions

### Prophet
- Better for long-term trend analysis
- Handles seasonality well
- More conservative predictions
- Suitable for strategic planning

### DNN
- Good balance of short and medium-term predictions
- Effective at filtering out noise
- More selective in trading signals
- Competitive with traditional models

## Conclusion

Each model shows distinct strengths:
- ARIMA proves most effective for short-term trading signals
- Prophet excels at long-term trend analysis
- DNN offers a balanced approach with good win rate and selective trading

The DNN's performance suggests that simple architectures with careful feature engineering can compete with traditional time series models, while potentially offering more flexibility in incorporating additional features.
