# Price ML Prediction

## Price Prediction's Caveats

  When an instrument's characteristics are far from stable, price predictions can be well off the mark.
  - Examples...
    - SERV
    - MEDS
    - XCUR
  
  Price prediction should be combined with prediction of market highs and lows: if you enter the market based on the predicted price alone, you could get washed out by manipulative market whales hitting your stops.
  
  The High, Low, Market Open, and Market Close should each be predicted by a separate model, all using the same model parameters.
  
  Before trading an instrument, review its median market price range versus the median prediction accuracy delta to verify whether the instrument is worth trading. I have a hunch that the smaller the trading range is relative to the prediction delta, the less likely the instrument is worth engaging with.
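
The range-versus-delta check above can be sketched in a few lines. This is only an illustration of the idea, not code from this notebook: the helper name `range_to_error_ratio` and the sample numbers are hypothetical, and the threshold at which an instrument becomes "worth trading" is left open.

```python
from statistics import median

def range_to_error_ratio(highs, lows, actual_closes, predicted_closes):
    """Median daily range divided by the median absolute prediction error."""
    daily_ranges = [h - l for h, l in zip(highs, lows)]
    abs_errors = [abs(a - p) for a, p in zip(actual_closes, predicted_closes)]
    median_error = median(abs_errors)
    if median_error == 0:
        return float("inf")  # no prediction error at all
    return median(daily_ranges) / median_error

# Made-up illustrative numbers, not real market data.
highs  = [10.5, 10.8, 10.6, 11.0]
lows   = [10.0, 10.2, 10.1, 10.4]
closes = [10.3, 10.6, 10.2, 10.9]
preds  = [10.2, 10.4, 10.3, 10.8]

ratio = range_to_error_ratio(highs, lows, closes, preds)
print(f"range/error ratio: {ratio:.2f}")
```

A ratio well above 1 means the typical daily range is large compared with the typical prediction error; a ratio near or below 1 matches the hunch above that the instrument may not be worth engaging with.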
<p>

## Current Enhancements

From a visual review, although the prediction strays from the price,
   the trend of the prediction seems quite accurate.
   Pulling the prediction closer to the price, by an average of the delta
   between the price and the prediction, makes the new prediction quite accurate.
<p>

## Enhancements (to add)

We should be able to add prediction targets for high and low for the day by 
   adding the target high-difference and low-difference to the data set,
   similar to the way we added the target close-difference.
   
Apparently we can also pull minute bars from Yahoo. With those, we can
   perform more granular predictions: to the hour, 15min, 5min, and
   1min period.
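
However the minute bars are obtained, coarser periods can be derived from them by resampling. The sketch below uses synthetic 1-minute bars (made-up values) and the same first/max/min/last/sum aggregation that appears in the commented-out weekly resample later in this notebook; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute bars (made-up values) standing in for downloaded data.
idx = pd.date_range("2024-08-12 09:30", periods=60, freq="min")
close = 100 + np.arange(60) * 0.01
minute_bars = pd.DataFrame({"Open": close,
                            "High": close + 0.05,
                            "Low": close - 0.05,
                            "Close": close,
                            "Volume": 100}, index=idx)

# How each column is combined when several bars collapse into one.
logic = {"Open": "first",
         "High": "max",
         "Low": "min",
         "Close": "last",
         "Volume": "sum"}

bars_15m = minute_bars.resample("15min").agg(logic)
print(bars_15m)
```

Changing `"15min"` to `"5min"` or `"1h"` gives the other granularities mentioned above.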

# Choose an existing model -or- Create a New Model


In [1]:
# Create a list of available .keras models from model_dir...
# Clear the namespace (%reset -f does not restart the kernel)
%reset -f
%gui asyncio

import ipywidgets as widgets
from ipywidgets import Output, Checkbox, TwoByTwoLayout
import re
import datetime
import os
import asyncio
import yfinance as yf

# os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'

global Ticker, DateStart, DateEnd, ticker, dateStart, dateEnd, CreateNewModel, model_dir, model_files, Selected_Model

Ticker    = 'AMZN'

DateStart = '2015-06-03'
DateEnd   = '2024-07-30'
CreateNewModel = "Create New Model"

# Get the list of files from the model directory...
model_dir = 'models/'
model_files = os.listdir(model_dir)
Selected_Model = None

def wait_for_change(widget, value):
    future = asyncio.Future()
    def getvalue(change):
        # make the new value available
        future.set_result(change.new)
        widget.unobserve(getvalue, value)
    widget.observe(getvalue, value)
    return future

def model_choices():
    global Ticker, DateStart, DateEnd, ticker, dateStart, dateEnd, CreateNewModel, model_dir, model_files, Selected_Model
    
    # Filter the list to only include .keras files...
    models_lst = [file for file in model_files if re.search(r'\.keras$', file)]
    models_lst.sort()
    model_files = [CreateNewModel] + models_lst
    # Display the list of files...
    Selected_Model = widgets.Select ( options=model_files,
                                      value=model_files[0],
                                      description='Model File: ',
                                      disabled=False )
    ticker     = widgets.Text(value=Ticker, description='TICKER: ')
    DateStart = "2015-01-01"
    dateStart = widgets.Text(value=DateStart, description='Start Date: ')
    chk_save_model = widgets.Checkbox(value=True, description='Save Model')
    # Set DateEnd to today's date in the format 'YYYY-MM-DD', determined by the system date.
    if Selected_Model.value == CreateNewModel:
        DateEnd = datetime.datetime.now().strftime('%Y-%m-%d')
    dateEnd   = widgets.Text(value=DateEnd, description='End Date: ')
    out = Output()
    
    async def f0():
        for i in range(10):
            out.append_stdout('did work ' + str(i) + '\n')
            x = await wait_for_change(Selected_Model, 'value')
            if Selected_Model.value != CreateNewModel:
                ticker.value = Selected_Model.value.split('_')[0]
                if '~' in ticker.value:
                    ticker.value = ticker.value.replace('~', '=')
            out.append_stdout('async function continued with value ' + str(x) + '\n')
        asyncio.ensure_future(f0())
    
    async def f1():
        for i in range(10):
            out.append_stdout('did work ' + str(i) + '\n')
            x = await wait_for_change(ticker, 'value')
            tkr = ticker.value
            tkr = re.sub(r'Invalid: ', '', tkr)
            if Selected_Model.value == CreateNewModel:
                # Check Yahoo to see whether the symbol is valid...
                try:
                    ticker_ = yf.Ticker(tkr).history(period='5d',interval='1d')
                except Exception:
                    ticker_ = []
                if len(ticker_) == 0:
                    ticker.value = 'Invalid: ' + tkr
                else:
                    # ticker is a Text widget; upper-case the symbol string instead.
                    ticker.value = tkr.upper()
            out.append_stdout('async function continued with value ' + str(x) + '\n')
        asyncio.ensure_future(f1())
    
    asyncio.ensure_future(f0())
    asyncio.ensure_future(f1())
    
    from IPython.display import display, Pretty, Markdown, Latex
    inst0 = """
        +-------------------------------+
        | Choose or Create a New Model  |
        +-------------------------------+
        
    Make your Model Selection and update the Start and End Dates.
    Or, create a New Model by leaving the Model File: as
    "Create New Model", and entering the Ticker, Start Date and End Date.
    """
    display(Pretty(inst0))
    
    # print(inst0, inst1)
    display(Selected_Model)
    print()
    display(ticker, dateStart, dateEnd, chk_save_model)
    print()
    # display(out)

    # if Selected_Model.value == CreateNewModel:
    # - Create a new model using the ticker.value, dateStart.value, dateEnd.value values.
    return(Selected_Model, ticker, dateStart, dateEnd, chk_save_model, asyncio)

Selected_Model, ticker, dateStart, dateEnd, chk_save_model, asyncio = model_choices()



        +-------------------------------+
        | Choose or Create a New Model  |
        +-------------------------------+
        
    Make your Model Selection and update the Start and End Dates.
    Or, create a New Model by leaving the Model File: as
    "Create New Model", and entering the Ticker, Start Date and End Date.
    

Select(description='Model File: ', options=('Create New Model', 'AAPL_2015-01-01_2024-07-30.keras', 'AMZN_2015…




Text(value='AMZN', description='TICKER: ')

Text(value='2015-01-01', description='Start Date: ')

Text(value='2024-08-12', description='End Date: ')

Checkbox(value=True, description='Save Model')




In [2]:
def print_model_choices(Selected_Model, ticker, dateStart, dateEnd, chk_save_model):
    print( "Selected_Model:", Selected_Model.value)
    print("Ticker: ", ticker.value)
    print("DateStart: ", dateStart.value)
    print("DateEnd: ", dateEnd.value)
    print("Save Model: ", chk_save_model.value)

print_model_choices(Selected_Model, ticker, dateStart, dateEnd, chk_save_model)

Selected_Model: MOH_2015-01-01_2024-08-08.keras
Ticker:  MOH
DateStart:  2015-01-01
DateEnd:  2024-08-12
Save Model:  True


#  Load the latest data for the selected Ticker...

In [3]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn
import pandas as pd
import yfinance as yf

global data, feature_cnt, feature_cnt_dl, model_path

def load_data():
    global Ticker, DateStart, DateEnd, ticker, dateStart, dateEnd, CreateNewModel, model_dir, model_files, model_path, Selected_Model
    global data_orig, data, feature_cnt, feature_cnt_dl

    model_dir = 'models/'

    if Selected_Model.value == CreateNewModel:
        modelDateStart = DateStart
        modelDateEnd = DateEnd
        model_file = ticker.value + "_" + dateStart.value + "_" + dateEnd.value + ".keras"
        model_path = model_dir + model_file
    else:
        model_path = model_dir + Selected_Model.value

    # If we are using an existing model, we still have to download the data as we need the
    # actual price data to adjust the prediction.

    # If the predicted value varies/swings about too much from the price,
    # the training period may be changed to exclude the prior period
    # that does not share present-time characteristics with regard to price movement.
    data_orig = yf.download(tickers = ticker.value, start = dateStart.value, end = dateEnd.value)
    data=data_orig.copy(deep=True)
    offset = pd.Timedelta(days=-30)
    # Resample to 'W'eekly or 'ME'(Month End)
    # logic = {'Open'  : 'first',
    #          'High'  : 'max',
    #          'Low'   : 'min',
    #          'Close' : 'last',
    #          'Adj Close': 'last',
    #          'Volume': 'sum'}
    # data = data.resample('W', offset=offset).apply(logic)
    print("Pulling Data for:", ticker.value, "from", dateStart.value, "to", dateEnd.value)
    print("data.len():", len(data), "data.shape:", data.shape)

    feature_cnt_dl = data.shape[1]
    feature_cnt = feature_cnt_dl + 1  # Actual count vs index count.

load_data()
# data.tail(10)
data.head(5), data.tail(5)


[*********************100%%**********************]  1 of 1 completed

Pulling Data for: MOH from 2015-01-01 to 2024-08-12
data.len(): 2417 data.shape: (2417, 6)





(                 Open       High        Low      Close  Adj Close  Volume
 Date                                                                     
 2015-01-02  53.900002  54.299999  52.349998  52.430000  52.430000  471200
 2015-01-05  52.130001  52.750000  51.060001  51.430000  51.430000  623400
 2015-01-06  51.779999  52.160000  49.849998  50.360001  50.360001  784400
 2015-01-07  50.630001  51.070000  49.939999  50.459999  50.459999  846900
 2015-01-08  51.000000  51.689999  50.349998  51.509998  51.509998  884700,
                   Open        High         Low       Close   Adj Close  Volume
 Date                                                                          
 2024-08-05  350.859985  353.739990  345.589996  346.970001  346.970001  696400
 2024-08-06  346.769989  350.730011  337.850006  338.720001  338.720001  599500
 2024-08-07  338.269989  342.420013  332.929993  334.600006  334.600006  660000
 2024-08-08  333.070007  341.109985  331.670013  336.880005  336.880005  4

##  - Add indicators to the data


In [4]:
# Add indicators to the data...

import pandas_ta as ta

def add_indicators(data, feature_cnt_dl):

    feature_cnt = feature_cnt_dl

    # Work on a copy of the frame that was passed in, leaving the caller's data untouched.
    data = data.copy(deep=True)

    # print("= Before Adding Indicators ========================================================")
    # print(data.tail(10))

    data['RSI']=ta.rsi(data.Close, length=3); feature_cnt += 1
    # data['EMAF']=ta.ema(data.Close, length=3); feature_cnt += 1
    # data['EMAM']=ta.ema(data.Close, length=6); feature_cnt += 1
    data['EMAS']=ta.ema(data.Close, length=9); feature_cnt += 1
    data['DPO3']=ta.dpo(data.Close, length=3, centered=True); feature_cnt += 1
    data['DPO6']=ta.dpo(data.Close, length=6, centered=True); feature_cnt += 1
    data['DPO9']=ta.dpo(data.Close, length=9, centered=True); feature_cnt += 1

    # print("= After Adding DPO2 ========================================================")
    # print(data.tail(10))
    #
    # On Balance Volume
    if data['Volume'].iloc[-1] > 0:
        data = data.join(ta.aobv(data.Close, data.Volume, fast=True, min_lookback=3, max_lookback=9))
        feature_cnt += 7 # ta.aobv adds 7 columns

    # print("= After Adding AOBV ========================================================")
    # print(data.tail(10))

    # Target is the difference between the adjusted close and the open price.
    data['Target'] = data['Adj Close']-data.Open
    feature_cnt += 1
    data['TargetH'] = data.High - data.Open
    feature_cnt += 1
    data['TargetL'] = data.Low - data.Open
    feature_cnt += 1
    # Each target is a difference from the open price (adjusted close, high, or low).
    # Our model will predict the targets for the next day, so we shift them up by one day.
    data['Target'] = data['Target'].shift(-1)
    data['TargetH'] = data['TargetH'].shift(-1)
    data['TargetL'] = data['TargetL'].shift(-1)

    # 1 if the price goes up, 0 if the price goes down.
    # Not a feature: Needed to test prediction accuracy.
    data['TargetClass'] = [1 if data['Target'].iloc[i]>0 else 0 for i in range(len(data))]
    target_cnt = 1
    # The TargetNextClose is the adjusted close price for the next day.
    # This is the value we want to predict.
    # Not a feature: Needed to test prediction accuracy.
    data['TargetNextClose'] = data['Adj Close'].shift(-1)
    target_cnt += 1
    # TargetNextHigh is the high price for the next day.
    data['TargetNextHigh'] = data['High'].shift(-1)
    target_cnt += 1
    # TargetNextLow is the low price for the next day.
    data['TargetNextLow'] = data['Low'].shift(-1)
    target_cnt += 1

    # Before scaling the data, we need to use the last good value for rows that have NaN values.
    data.ffill(inplace=True)

    # print("= After Adding Targets___ and ForwardFill ========================================================")
    # print(data.tail(10))

    # Reset the index of the dataframe.
    data.reset_index(inplace = True)
    data_date = data['Date'].copy()
    data.drop(['Date'], axis=1, inplace=True);   feature_cnt -= 1
    data.drop(['Close'], axis=1, inplace=True);  feature_cnt -= 1
    # data.drop(['Volume'], axis=1, inplace=True); feature_cnt -= 1

    # Add one more row to the data file, this will be our next day's prediction.
    data = pd.concat([data,data[-1:]])
    # And, reindex the dataframe.
    data.reset_index(inplace=True)

    # print("feature_cnt_d1:", feature_cnt_dl, " feature_cnt:", feature_cnt)
    return data, feature_cnt, data_date

data_aug, feature_cnt, data_date = add_indicators(data, feature_cnt_dl)

data_set = data_aug.copy(deep=True)
print("Len data:", len(data_aug), "Len data_set", len(data_set))
# data_set.tail(10)

Len data: 2418 Len data_set 2418


## - Scale the data

  The data is scaled to a range of 0 to 1.
  This normalizes every column to the same range so that the model can be trained on it.
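
As a reminder of what this scaling does, each column is mapped with `(x - min) / (max - min)` using that column's own minimum and maximum. The hand-rolled helpers below (`min_max_scale` and `min_max_unscale` are hypothetical names, not part of this notebook or of scikit-learn) are a minimal sketch of the same idea on a toy array.

```python
import numpy as np

def min_max_scale(values):
    """Scale each column to [0, 1]; also return the per-column min and range."""
    values = np.asarray(values, dtype=float)
    col_min = values.min(axis=0)
    col_range = values.max(axis=0) - col_min
    # Constant columns would divide by zero; map them to 0 instead.
    col_range[col_range == 0] = 1.0
    return (values - col_min) / col_range, col_min, col_range

def min_max_unscale(scaled, col_min, col_range):
    """Undo min_max_scale, returning values in their original units."""
    return scaled * col_range + col_min

# Toy data: a price-like column and a volume-like column.
prices = np.array([[50.0, 1000.0],
                   [75.0, 3000.0],
                   [100.0, 2000.0]])

scaled, col_min, col_range = min_max_scale(prices)
restored = min_max_unscale(scaled, col_min, col_range)
print(scaled)
```

Because every column has its own minimum and range, a scaled value only becomes meaningful again after it is mapped back with the parameters of the column it came from.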

In [5]:
# Scale the data for training...

import sklearn
from sklearn.preprocessing import MinMaxScaler


def scale_data(data_set):
    sc = MinMaxScaler(feature_range=(0,1))
    data_set_scaled = sc.fit_transform(data_set)
    return sc, data_set_scaled

sc, data_set_scaled = scale_data(data_set)

print("data_set.shape:", data_set.shape, "data_set_scaled.shape", data_set_scaled.shape)
# print(data_set_scaled)


data_set.shape: (2418, 25) data_set_scaled.shape (2418, 25)


In [6]:
# Check for NaN and/or 0 values...
# After scaling the data, we need to check for NaN and 0 values,
# and forward fill the data with the last known value.
# In the case of a series of 0 values beginning at index 0,
# we have to back fill those values with the first non-NaN/non-zero value available.
# *** Currently, we are not handling this case. ***
def scaled_data_cleanup(data_set_scaled):
    nan_indices = np.argwhere(np.isnan(data_set_scaled))
    zer_indices = np.argwhere(data_set_scaled == 0)

    # print("nan_indices:", nan_indices)
    # print('zer_indices:', zer_indices)

    # NOTE: as written, the replacement value is read from the adjacent column
    # of the same row (col - 1, or col + 1 for the first column), which differs
    # from a row-wise forward fill. The array is modified in place.
    for row, col in nan_indices:
        j = col - 1 if col > 0 else col + 1
        data_set_scaled[row, col] = data_set_scaled[row, j]
    for row, col in zer_indices:
        j = col - 1 if col > 0 else col + 1
        data_set_scaled[row, col] = data_set_scaled[row, j]

    nan_indices = np.argwhere(np.isnan(data_set_scaled))
    zer_indices = np.argwhere(data_set_scaled == 0)

    return data_set_scaled, nan_indices, zer_indices

data_set_scaled, nan_indices, zer_indices = scaled_data_cleanup(data_set_scaled)
# print("nan_indices:", nan_indices)
# print('zer_indices:', zer_indices)
pass
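
The comments in the cell above describe a forward fill with the last known value, plus a back fill for gaps that start at index 0. A minimal sketch of that time-wise fill for NaN values is shown here; it is an alternative illustration rather than the code this notebook runs, and `fill_nan_along_time` is a hypothetical helper name. Zeros are deliberately left alone in this sketch, since 0 is the legitimate scaled value of each column's minimum.

```python
import numpy as np
import pandas as pd

def fill_nan_along_time(scaled):
    """Forward-fill NaNs down each column, then back-fill any leading NaNs."""
    frame = pd.DataFrame(np.asarray(scaled, dtype=float))
    return frame.ffill().bfill().to_numpy()

# Toy scaled data: rows are periods, columns are features.
example = np.array([[np.nan, 0.2],
                    [0.1,    np.nan],
                    [np.nan, 0.4]])

filled = fill_nan_along_time(example)
print(filled)
```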

## - Prepare the scaled data for the model

  - The data is prepared for the model by creating a 3D array of the data.
  - The data is split into training and test data.
  - The training data is used to train the model.
  - The test data is used to test the model.
  - The model is then used to predict the next day's price.
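
The 3D array mentioned in the first bullet has the shape (samples, window, features): each sample holds the previous `window` periods of every feature. The sketch below builds such an array on toy data with NumPy's `sliding_window_view`; `make_windows` and its arguments are hypothetical names, and the notebook's own cell uses explicit loops instead.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(scaled, n_features, window):
    """Return X shaped (samples, window, n_features) and the row each sample predicts."""
    scaled = np.asarray(scaled)
    # sliding_window_view yields (n - window + 1, n_features, window); the last
    # window has no following row to predict, so it is dropped.
    views = sliding_window_view(scaled[:, :n_features], window, axis=0)[:-1]
    X = np.transpose(views, (0, 2, 1))
    target_rows = np.arange(window, len(scaled))
    return X, target_rows

# Toy data: 8 periods, 3 columns, of which the first 2 are treated as features.
data = np.arange(24, dtype=float).reshape(8, 3)
X, rows = make_windows(data, n_features=2, window=3)
print(X.shape, rows)
```

Sample `k` covers rows `k` to `k + window - 1`, and its target is taken from row `k + window`, which keeps the inputs strictly earlier than the value being predicted.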

In [7]:
# LSTM needs a rolling period of data for each feature to make predictions.
# Multiple features from the data are provided to the model.
global X, backcandles, y, yi, data_set
backcandles = 15 # Set the rolling window size.

def prep_model_inputs(data_set_scaled, backcandles):
    X = []
    # print(data_set_scaled[0].size)
    # data_set_scaled=data_set.values
    # print(data_set_scaled.shape[0])

    # Create a 3D array of the data. X[features][periods][candles]
    # Where candles is the number of candles that rolls by 1 period for each period.
    for j in range(feature_cnt): # data_set_scaled[0].size):# last 2 columns are target not X
        X.append([])
        for i in range(backcandles, data_set_scaled.shape[0]): # backcandles+2
            X[j].append(data_set_scaled[i-backcandles:i, j])

    # print("X.shape:", np.array(X).shape)
    X = np.array(X)
    # Move axis from 0 to position 2
    X=np.moveaxis(X, [0], [2])
    # print("X.shape:", X.shape)

    # Erase first elements of y because of backcandles to match X length
    # del(yi[0:backcandles])
    #X, yi = np.array(X), np.array(yi)
    # Choose -1 for last column, classification else -2...
    # Choose -4 for last 4 columns, classification (up/down), TargetNextClose, TargetNextHigh, and TargetNextLow ...
    X, yi = np.array(X), np.array(data_set_scaled[backcandles:, -4:])
    y = np.reshape(yi,(len(yi),4))
    #y=sc.fit_transform(yi)
    #X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
    
    # print("data_set.shape:",data_set.shape,"X.shape:",X.shape)
    # print(X)
    # print("=========================================================")
    # print(y.shape)
    # print(y)

    return X, y, yi

X, y, yi = prep_model_inputs(data_set_scaled, backcandles)

pass

In [8]:
# split data into train test sets

global splitlimit
splitlimit = int(len(X)*0.8)

def split_data(X, y, splitlimit):
    # print("lenX:",len(X), "splitLimit:",splitlimit)
    X_train, X_test = X[:splitlimit], X[splitlimit:] # Training data, Test Data
    y_train, y_test = y[:splitlimit], y[splitlimit:] # Training data, Test Data

    # print("X_train.shape:", X_train.shape)
    # print("y_train.shape:", y_train.shape)
    # print("X_test.shape:", X_test.shape)
    # print("y_test.shape:", y_test.shape)
    # print("== X_train ===========================================")
    # print(X_train[-20:-10])
    # print("== _train ===========================================")
    # print(y_train[-20:-10])    

    return X_train, X_test, y_train, y_test

X_train, X_test, y_train, y_test = split_data(X, y, splitlimit)


# Load an Existing Model -or- Model Training
 - Using LSTM (Long Short Term Memory) Model
 - Using Keras (Tensorflow) Library
 - LSTM --> Dense Layer --> Activation Layer


In [9]:

import tensorflow as tf
import keras
import numpy as np
import os
from keras import optimizers
from keras.callbacks import History
from keras.models import Model, Sequential
from keras.layers import Dense, Dropout, LSTM, Input, Activation, TimeDistributed, concatenate

def create_fit_model(X_train, y_train, backcandles, model_path):
    # If the Keras model file exists, load it. Otherwise, create a new model.

    # Train by default; training is skipped only when a compatible saved model loads.
    train_model = True
    if os.path.exists(model_path) and Selected_Model.value != CreateNewModel:
        # Load the model...
        model = keras.models.load_model(model_path)
        model_path = model_path.replace('~', '=')
        print("Model Loaded:", model_path)
        # The saved model must produce 4 outputs to match the 4 target columns.
        model_outputs = model.layers[3].output.shape[1]
        if model_outputs != 4:
            print("** Model needs to be retrained.")
        else:
            train_model = False
            
    if train_model:
        # Create a new model...

        #tf.random.set_seed(20)
        np.random.seed(10)

        lstm_input = Input(shape=(backcandles, feature_cnt), name='lstm_input')
        inputs = LSTM(200, name='first_layer')(lstm_input)
        inputs = Dense(4, name='dense_layer')(inputs)
        output = Activation('linear', name='output')(inputs)
        model = Model(inputs=lstm_input, outputs=output)
        adam = optimizers.Adam()
        model.compile(optimizer=adam, loss='mse')
        # model.fit(x=X_train, y=y_train, batch_size=15, epochs=30, shuffle=True, validation_split = 0.1)
        model.fit(x=X_train, y=y_train, batch_size=250, epochs=150, shuffle=True, validation_split = 0.1)

        # Save the model...
        if chk_save_model.value:
            model_path = model_path.replace('=', '~')
            model.save(model_path)
    
    return model

model = create_fit_model(X_train, y_train, backcandles, model_path)


2024-08-12 17:05:13.998268: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-08-12 17:05:14.008084: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-12 17:05:14.019859: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-12 17:05:14.023307: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-12 17:05:14.032638: I tensorflow/core/platform/cpu_feature_guar

Model Loaded: models/MOH_2015-01-01_2024-08-08.keras
** Model needs to be retrained.
Epoch 1/150


2024-08-12 17:05:15.077315: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
2024-08-12 17:05:15.077334: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:135] retrieving CUDA diagnostic information for host: motoko-7760
2024-08-12 17:05:15.077337: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:142] hostname: motoko-7760
2024-08-12 17:05:15.077472: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:166] libcuda reported version is: 555.42.6
2024-08-12 17:05:15.077485: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:170] kernel reported version is: NOT_FOUND: could not find kernel module information in driver version file contents: "NVRM version: NVIDIA UNIX Open Kernel Module for x86_64  555.42.06  Release Build  (dvs-builder@U16-I3-A13-3-4)  Tue Jun  4 00:45:31 UTC 2024
GCC version:  gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04) 
"


7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 44ms/step - loss: 0.0895 - val_loss: 0.0181
Epoch 2/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 23ms/step - loss: 0.0118 - val_loss: 0.0239
Epoch 3/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - loss: 0.0058 - val_loss: 0.0125
Epoch 4/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step - loss: 0.0056 - val_loss: 0.0162
Epoch 5/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 23ms/step - loss: 0.0038 - val_loss: 0.0092
Epoch 6/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step - loss: 0.0030 - val_loss: 0.0061
Epoch 7/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step - loss: 0.0028 - val_loss: 0.0078
Epoch 8/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step - loss: 0.0031 - val_loss: 0.0059
Epoch 9/150
7/7 ━━━━━━━━━━━━━━━━━━━━ 0

# Model Testing

In [10]:
# Show some y_pred values. If the prediction is no good, we might see NaN values.
y_pred = model.predict(X_test)

# Rescale the predicted values to dollars...
data_set_scaled_y = data_set_scaled[backcandles + splitlimit:, :].copy()
# Replace the last 4 columns in data_set_scaled_y with the predicted column values...
data_set_scaled_y[:, -4:] = y_pred

y_pred_rs = sc.inverse_transform(data_set_scaled_y)

pass

16/16 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step


## - Adjust the prediction (Alignment with the actual previous prices)
  Ideally, the prediction is as close to the actual price as possible.
  What was observed is that the prediction is often off by a delta:
  the difference between the actual price and the predicted price.
  After applying an average of the delta to the prediction, the prediction is closer to the actual price,
  usually within 1% of the actual price, which is a substantial improvement.  
  
  - Before we adjust the prediction, the data has to be rescaled to dollars because each prediction target
    is scaled independently.
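
To see why the comparison has to happen in dollars: min-max scaling divides each column by its own range, so the same dollar error maps to a different scaled error in each target column. The helper `scaled_error` and the column bounds below are hypothetical, made-up values used only to illustrate the point.

```python
def scaled_error(dollar_error, col_min, col_max):
    """Size of a dollar error after min-max scaling a column with these bounds."""
    return dollar_error / (col_max - col_min)

# Made-up bounds for two target columns with different ranges.
close_err = scaled_error(3.0, col_min=50.0, col_max=350.0)  # column range of 300
high_err = scaled_error(3.0, col_min=50.0, col_max=450.0)   # column range of 400

print(close_err, high_err)
```

The two scaled errors differ even though both represent the same three dollars, so deltas computed on scaled values would not be comparable across the close, high, and low targets.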
  
  The resulting numpy array y_p_adj is the adjusted prediction, which we will use as the prediction for the next day.
  
  TODO: Add code to find extreme deltas and adjust the prediction to be similar to the largest price move in the recent past.
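
The core of the delta adjustment described above can be sketched as follows. This is a simplified variant for illustration, not a reproduction of the cell below: `pull_toward_actual` is a hypothetical name, the numbers are made up, and the window alignment is an assumption of this sketch.

```python
import numpy as np

def pull_toward_actual(actual, predicted, window=2):
    """Shift each prediction toward the actual price by a trailing mean of |actual - predicted|."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    abs_delta = np.abs(actual - predicted)
    adjusted = predicted.copy()
    # The first (window - 1) predictions are returned unchanged.
    for i in range(window - 1, len(predicted)):
        avg_delta = abs_delta[i - window + 1:i + 1].mean()
        direction = 1.0 if actual[i] > predicted[i] else -1.0
        adjusted[i] = predicted[i] + direction * avg_delta
    return adjusted

# Made-up prices in dollars.
actual = np.array([100.0, 102.0, 101.0, 104.0])
predicted = np.array([97.0, 99.0, 99.0, 101.0])

adjusted = pull_toward_actual(actual, predicted, window=2)
print(adjusted)
```

Because the direction of the shift is taken from the actual price of the same period, this sketch only applies to periods whose actual price is already known.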
  

In [20]:
# Calculate the delta between actual price and prediction
# Bring the prediction closer to the price based on the delta
import math

def mov_avg(x, w):
    # return abs(np.convolve(np.flip(x), np.ones(w), 'valid') / w)
    return np.absolute(np.convolve(x, np.ones(w), 'valid') / w)

def apply_deltas(ac,pc,pac,p, t,w):
    """
    Calculate the delta between actual price and prediction
    :param ac:   # Actual close price or high or low
    :param pc:   # Centering close price or high or low
    :param pac:  # Predicted Adj Close price
    :param p:    # Predicted close price or high or low
    :param t:    # Type of prediction ('c'=close, 'h'=high, 'l'=low)
    :param w:    # Window size for moving average
    :return rl:  # Return List of adjusted prediction values 
    """
    al = list(ac)   # Actual
    cc = list(pc)   # Centering or Original Predicted Close
    pac = None if pac is None else list(pac)  # Predicted Adj Close (optional)
    pl = list(p)    # Predicted
    rl = []         # Return List (Adjusted Predictions)
    
    # Absolute difference between the prediction and the actual price
    # (close, high, or low).
    p_delta = (np.maximum(p, ac[:len(p)]) - np.minimum(p, ac[:len(p)]))

    # Shift delta to the left by 1 day so that the adjusted prediction
    # only uses known data, and duplicate the delta for the last day (which we don't know).
    # NOTE: dl is not used below; the moving average is taken over p_delta.
    dl = np.append(p_delta[1:], p_delta[-1])
    # Calculate the moving-average delta for the prediction.
    dl_ma = mov_avg(p_delta, w)

    if len(dl_ma) < len(pl):
        for i in range(len(pl) - len(dl_ma)):
            # Duplicate the first value in the list to make the list the same size as pl.
            dl_ma = np.insert(dl_ma, 0, dl_ma[0], axis=0)
            
    # We don't apply the average to our first predictions,
    # for the size of the moving average window.
    # For this area, we just return the predicted value.
    i = w - 1
    for j in range(0, i):
        rl.append(pl[j]) 

    for delta in dl_ma[:-1]:
        if i >= len(pl):
            break 
        # # print(i,"," ,end="")
        if t == 'c':
            if i >= len(dl_ma):
                break
            if al[i] > pl[i]:
                # If the actual price is greater than the predicted price, add the delta.
                val = pl[i] + delta
            else:
                # If the actual price is less than or equal to the predicted price, subtract the delta.
                val = pl[i] - delta
        elif t == 'h':
            if pac is None:
                # Use delta to adjust the predicted high.
                val = pl[i] + delta
            else:
                # Raise the predicted high by the absolute difference between
                # the actual value and the centering value.
                val = math.fabs(pl[i] + max(ac[i], cc[i]) - min(ac[i], cc[i]))
                if val < pac[i]:
                    # If high is less than the predicted high, set the high to the predicted high.
                    val = pac[i]
        elif t == 'l':
            if pac is None:
                # Use delta to adjust the predicted low.
                val = pl[i] - delta
            else:
                # Lower the predicted low by the absolute difference between
                # the actual value and the centering value (the parentheses
                # keep the whole difference subtracted, mirroring the high branch).
                val = math.fabs(pl[i] - (max(ac[i], cc[i]) - min(ac[i], cc[i])))
                if val > pac[i]:
                    # If low is greater than the predicted low, set the low to the predicted low.
                    val = pac[i]
        else:
            raise ValueError("Invalid type. Must be 'c', 'h', or 'l'.")

        # Append the adjusted value to the return list.
        rl.append(val)
        i += 1
        
    # Return the return list of adjusted values.
    return rl

def adjust_prediction(y_cntr, pc, pac, y_pred, pred_type, avg_win=5):
    """
    Calculate the delta between the actual price and the prediction, then
    bring the prediction closer to the price based on that delta.
    pred_type is 'c' (close), 'h' (high), or 'l' (low).
    """
        
    # print(y_cntr.shape, y_pred.shape, y_p_adj.shape, y_p_delta.shape)
    y_p_adj = apply_deltas(y_cntr, pc, pac, y_pred, pred_type, avg_win)
    y_p_adj = np.array(y_p_adj)

    return y_p_adj


# Gather the target data for the test period.
target_close = np.array(data_set['Adj Close'].iloc[-len(y_pred_rs):])
target_high  = np.array(data_set['High'].iloc[-len(y_pred_rs):])
target_low   = np.array(data_set['Low'].iloc[-len(y_pred_rs):])

# Gather the predicted data for the test period.
pred_close = y_pred_rs[:, -3]
pred_high  = y_pred_rs[:, -2]
pred_low   = y_pred_rs[:, -1]

# Generate the adjusted predictions for Close, High, and Low.

# Adjust the predicted close, centered on the actual close.
pred_adj_close = adjust_prediction(target_close, target_close, None,           pred_close,    'c', avg_win=2)
# Adjust the predicted high, centered on the actual close.
pred_adj_high  = adjust_prediction(target_close, pred_close,   pred_adj_close, pred_high,     'h', avg_win=2)
# Readjust the adjusted predicted high, centered on the actual high.
pred_adj_high  = adjust_prediction(target_high,  target_high,  None,           pred_adj_high, 'h', avg_win=2)
# Adjust the predicted low, centered on the actual close.
pred_adj_low   = adjust_prediction(target_close, pred_close,   pred_adj_close, pred_low,      'l', avg_win=2)
# Readjust the adjusted predicted low, centered on the actual low.
pred_adj_low   = adjust_prediction(target_low,   target_low,   None,           pred_adj_low,  'l', avg_win=2)

pass

# - Plot the Test Results

### - Rescaled (Target vs Prediction)

In [17]:
%matplotlib notebook

def plot_pred_results(target_close, pred_close, pred_high, pred_low):
    # Plot the rescaled (dollar) target and predicted data.
    plt.figure(figsize=(16,8))

    # Original Adj Close
    plt.plot(target_close,   color = 'black',  label = 'Test',       marker='.')

    # Original Prediction
    plt.plot(pred_close[1:], color = 'orange', label = 'Pred Close', marker='1')
    # Original High and Low prediction
    plt.plot(pred_high[1:],  color = 'blue',   label = 'Pred High',  marker='1')
    plt.plot(pred_low[1:],   color = 'red',    label = 'Pred Low',   marker='1')

    title = f"{ticker.value} ({dateStart.value} - {dateEnd.value})" 
    plt.title(title)
    plt.legend()
    plt.grid()
    plt.show()
    
plot_pred_results(target_close[1:-1], pred_close, pred_high, pred_low)

pass

### - Rescaled (Target vs Prediction vs Adjusted Prediction)

In [None]:
# Use an interactive plot to see the results
%matplotlib notebook

def plot_adj_pred_results(y_test, y_pred, pred_adj_close, pred_adj_high, pred_adj_low):
    # Plot the scaled test and predicted data.
    y_test_plot = y_test[:]
    plt.figure(figsize=(16,8))

    # Original Adj Close
    plt.plot(y_test_plot, color = 'black', label = 'Test',     marker='.')

    # # Original Prediction
    # plt.plot(y_pred[1:, 1],   color = 'yellow',  label = 'Pred Close', marker='1')
    # # Original High and Low prediction
    # plt.plot(y_pred[1:, 2],   color = 'blue',  label = 'Pred High', marker='1')
    # plt.plot(y_pred[1:, 3],   color = 'red',  label = 'Pred Low', marker='1')

    # Adjusted Predictions
    plt.plot(pred_adj_close,   color = 'yellow',  label = 'Pred Close', marker='1')
    # Adjusted High and Low prediction
    plt.plot(pred_adj_high,   color = 'blue',  label = 'Pred High', marker='1')
    plt.plot(pred_adj_low,   color = 'red',  label = 'Pred Low', marker='1')

    title = f"{ticker.value} ({dateStart.value} - {dateEnd.value})"
    plt.title(title)
    plt.legend()
    plt.grid()
    plt.show()
    return plt

y_p_adj = None
plt = plot_adj_pred_results(target_close[1:-1], pred_close[1:], pred_adj_close[1:], pred_adj_high[1:], pred_adj_low[1:])
# pd.DataFrame(y_test_plot).tail(5)

### - Rescaled Data ($USD) (Prediction vs Target)

In [18]:
# Plot using candlesticks using matplotlib...

# Use an interactive plot to see the results
%matplotlib notebook

def plot_candlesticks(plt, prices, title, width=1, width2=.25, clr_up='green', clr_dn='red', clr_wick='black'):
    """
    Plot Candlesticks...
    :param plt:    matplotlib.pyplot handle
    :param prices: The series of OHLC prices to plot
    :param title:  A title for the plot
    :param width:
    :param width2:
    :param clr_up:
    :param clr_dn:
    :param clr_wick:
    :return:
    """
    #define up and down prices
    up = prices[prices.Close>=prices.Open]
    down = prices[prices.Close<prices.Open]

    #plot up prices
    plt.bar(up.index,up.Close-up.Open,width,bottom=up.Open,color=clr_up, alpha=.30)
    # plt.bar(up.index,up.High-up.Close,width2,bottom=up.Close,color=clr_up, alpha=.25)
    # plt.bar(up.index,up.Low-up.Open,width2,bottom=up.Open,color=clr_up, alpha=.25)
    plt.bar(up.index,up.High-up.Low,width2,bottom=up.Low,color=clr_wick, alpha=.30)

    #plot down prices
    plt.bar(down.index,down.Open-down.Close,width,bottom=down.Close,color=clr_dn, alpha=.25)
    # plt.bar(down.index,down.High-down.Open,width2,bottom=down.Open,color=clr_dn, alpha=.25)
    # plt.bar(down.index,down.Low-down.Close,width2,bottom=down.Close,color=clr_dn, alpha=.25)
    plt.bar(down.index,down.High-down.Low,width2,bottom=down.Low,color=clr_wick, alpha=.25)

    #rotate x-axis tick labels
    plt.xticks(rotation=45, ha='right')

import math
import mplfinance as mpf

def plot_pred_results(target_close, pred_close, y_pred_adj_rs, y_high_adj_rs, y_low_adj_rs, data_set, data_date):
    # Plot the original Adj Close data that is in dollars...
    # The caller removes the last Adj Close, because it is just a copy of the previous one.
    plt_test_plot = np.array(target_close[:].reshape(-1,1))

    # Calculate the mean of the delta between the prediction and the actual price...
    # Used to create price range circles around the prediction...
    plt_delta = np.array(pd.DataFrame([abs(target_close[i - 1] - y_pred_adj_rs[i - 1]) for i in range(1, len(target_close))]).rolling(window=6).mean())

    # Dates...
    data_date_str = [str(date) for date in data_date]
    plt_test_dates = np.array((data_date_str[backcandles + splitlimit - 1:]), ).reshape(-1,1)
    plt_pred_dates = np.array((data_date_str[backcandles + splitlimit - 1:]),   ).reshape(-1,1)

    # empty_arr = np.zeros(((len(plt_test_plot)+1, 18)))
    # plt_pred_adj = sc.inverse_transform(np.append(y_p_adj, empty_arr, axis=1))[0:, 1]

    # Plot the data...
    plt.figure(figsize=(16,8))
    plt.autoscale(tight=True)

    plt_test_plot = plt_test_plot.reshape(-1,1)

    # Plot the actual price data ($USD)...
    plt.plot(plt_test_plot, color = 'black', label = 'Actual Adj Close', marker='+')
    # Plot the rescaled price data ($USD)...
    # plt.plot(plt_test_rescaled, color = 'purple', label = 'Test', marker='+')
    # Plot the adj predicted price data ($USD)...
    # plt.plot(plt_pred_adj, color = 'green', label = 'Test', marker='.')
    # Plot the original predicted price data ($USD)...
    plt.plot(pred_close, color = 'green', label = 'Orig Pred Close', marker='.')
    # Plot the adj predicted price data ($USD)...
    plt.plot(y_pred_adj_rs, color = 'orange', label = 'Adj Pred Close', marker='.')
    plt.plot(y_high_adj_rs, color = 'Blue', label = 'Adj Pred High', marker='.')
    plt.plot(y_low_adj_rs,  color = 'Red', label = 'Adj Pred Low', marker='.')

    # Plot the candlesticks...
    ds_tmp = data_set[backcandles + splitlimit + 1:-1]
    cs_data = ds_tmp.loc[backcandles + splitlimit:, ['Open','High','Low','Adj Close']].copy().reset_index()
    cs_data.rename(columns={'Adj Close':'Close'}, inplace=True)
    plot_candlesticks (plt, cs_data, 'Candlesticks')

    # Plot the text of the predicted price...
    plt.text(len(y_pred_adj_rs) -1, y_pred_adj_rs[-1], y_pred_adj_rs[-1].round(2), fontsize=8, color='black', ha='left', va='center')

    # plt.xlabel(range(0,len(plt_pred_dates)), plt_pred_dates.tolist(), rotation=45)
    #  .set_major_formatter(matplotlib.dates.DateFormatter('%Y')))

    # Plot a Price Range Circle Around the current Prediction...
    Plot_Pred_Circle = True
    if Plot_Pred_Circle:
        import matplotlib.patches as ptch
        lst_pred = len(y_pred_adj_rs) -1
        # ax.set_xlim = ([0, len(plt_pred_adj)])
        # ax.set_ylim = ([plt_pred_adj.min(), plt_pred_adj.max()])
        cpos = np.array([lst_pred, y_pred_adj_rs[-1]])
        circ_pred0 = ptch.Circle(cpos, radius=plt_delta[-1][0]/1,   color='grey',   fill=True, alpha=0.20)
        circ_pred1 = ptch.Circle(cpos, radius=plt_delta[-1][0]/1.5, color='blue',   fill=True, alpha=0.20)
        circ_pred2 = ptch.Circle(cpos, radius=plt_delta[-1][0]/2,   color='yellow', fill=True, alpha=0.20)
        circ_pred3 = ptch.Circle(cpos, radius=plt_delta[-1][0]/3,   color='red',    fill=True, alpha=0.20)
        ax = plt.gca()
        # ax.set_aspect("equal")
        ax.add_patch(circ_pred0)
        ax.add_patch(circ_pred1)
        ax.add_patch(circ_pred2)
        ax.add_patch(circ_pred3)

    title = f"{ticker.value} ({dateStart.value} - {dateEnd.value})" 
    plt.title(title)
    plt.grid(visible=True, which='both')
    plt.legend()
    plt.show()
    return plt

data_set_scaled_plus = np.append(data_set_scaled, data_set_scaled[-1].reshape(1, -1), axis=0)
plt = plot_pred_results(target_close[1:-1], pred_close[1:], pred_adj_close[1:], pred_adj_high[1:], pred_adj_low[1:], data_set, data_date)


### - Plot the Prediction Results (Using mplfinance)

In [21]:
# Plot using mplfinance...
import mplfinance as mpf
print('Ticker:', ticker.value)
def plot_pred_results(Ticker, target_close, pred_close_rs, pred_close_adj_rs, pred_high_ad_rs, pred_low_ad_rs, pred_high_rs, pred_low_rs, data_date, y_pred_rs=None):
    # Create a dataframe of the data for plt_test_usd, with a datetime index...
    df_plt_test_usd = pd.DataFrame()
    df_plt_test_usd.insert(0, 'Date', data_date[backcandles + splitlimit:])
    df_plt_test_usd.reset_index()
    ohlcv = data_orig.iloc[backcandles + splitlimit:, [0,1,2,3,4,5]].copy()
    ohlcv.reset_index()
    df_plt_test_usd = pd.concat([df_plt_test_usd, ohlcv.set_axis(df_plt_test_usd.index)], axis=1)

    # Add a place holder for the prediction...
    next_date = df_plt_test_usd.iloc[-1, 0] + pd.DateOffset(days=1)
    next_open      = df_plt_test_usd.iloc[-1, 4]
    next_high      = df_plt_test_usd.iloc[-1, 4]
    next_low       = df_plt_test_usd.iloc[-1, 4]
    next_close     = df_plt_test_usd.iloc[-1, 4]
    next_adj_close = df_plt_test_usd.iloc[-1, 4]
    next_volume    = 0
    
    # Prepare the prediction data for the plot...
    c_pred_rs       = pd.DataFrame(pred_close_rs[:]).rename({0:'pred_rs'}, axis=1)
    c_pred_adj_rs   = pd.DataFrame(pred_close_adj_rs[:]).rename({0:'pred_adj_rs'}, axis=1)
    c_pred_h_adj_rs = pd.DataFrame(pred_high_ad_rs[:]).rename({0:'pred_h_adj_rs'}, axis=1)
    c_pred_l_adj_rs = pd.DataFrame(pred_low_ad_rs[:]).rename({0:'pred_l_adj_rs'}, axis=1)
    
    if pred_high_rs is not None and pred_low_rs is not None:
        c_pred_h_rs = pd.DataFrame(pred_high_rs).rename({0:'pred_h_rs'},  axis=1)
        c_pred_l_rs = pd.DataFrame(pred_low_rs).rename({0:'pred_l_rs'}, axis=1)
    
    # Append a new row to the dataframe so it extends out to the prediction...
    new_row = { 'Date': next_date, "Open": next_open, "High": next_high, "Low": next_low, "Close": next_close,
                "Adj Close": next_adj_close, "Volume": next_volume,
                "pred_rs": c_pred_rs.iloc[-1], "pred_adj_rs": c_pred_adj_rs.iloc[-1],
                "pred_h_adj_rs": c_pred_h_adj_rs.iloc[-1], "pred_l_adj_rs": c_pred_l_adj_rs.iloc[-1] }
    # Include the unadjusted high/low predictions only when they were supplied.
    if pred_high_rs is not None and pred_low_rs is not None:
        new_row["pred_h_rs"] = c_pred_h_rs.iloc[-1]
        new_row["pred_l_rs"] = c_pred_l_rs.iloc[-1]


    new_row = pd.DataFrame(new_row, index=[6])
    df_plt_test_usd = pd.concat([df_plt_test_usd[1:], new_row], axis=0, ignore_index=True)
        
    # Add the prediction data to the OHLCV plot data...
    df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_rs[:len(df_plt_test_usd)].set_axis(df_plt_test_usd.index)],     axis=1)
    if pred_high_rs is not None and pred_low_rs is not None:
        df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_h_rs[:len(df_plt_test_usd)].set_axis(df_plt_test_usd.index)], axis=1)
        df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_l_rs[:len(df_plt_test_usd)].set_axis(df_plt_test_usd.index)], axis=1)
                                    
    df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_adj_rs[:len(df_plt_test_usd)].set_axis(df_plt_test_usd.index)], axis=1)
    df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_h_adj_rs[:len(df_plt_test_usd)].set_axis(df_plt_test_usd.index)], axis=1)
    df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_l_adj_rs[:len(df_plt_test_usd)].set_axis(df_plt_test_usd.index)], axis=1)
    
    df_plt_test_usd.ffill(inplace=True)
    df_plt_test_usd.set_index('Date', inplace=True, drop=True)
    df_plt_test_usd.ffill(inplace=True)
    df_plt_test_usd.rename(index={6:'pred_rs', 7:'pred_adj_rs'}, inplace=True)
    # Add a day row to df_plt_test_usd as a place holder for the prediction...
    print("df_plt_test_usd:", df_plt_test_usd.shape)
    
    kwargs = dict(type='candle', volume=True, figratio=(11,6), figscale=1.25, warn_too_much_data=10000, title='Ticker: '+ticker.value)
    if pred_high_rs is not None and pred_low_rs is not None:
        preds = [ mpf.make_addplot(df_plt_test_usd['pred_adj_rs'],     type='line', panel=0, color='g', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_h_adj_rs'],   type='line', panel=0, color='b', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_l_adj_rs'],   type='line', panel=0, color='r', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_rs'],         type='line', linestyle='-.', panel=0, color='y', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_h_rs'],       type='line', linestyle='-.', panel=0, color='b', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_l_rs'],       type='line', linestyle='-.', panel=0, color='r', secondary_y=False),
                  ]
    else:
        preds = [ mpf.make_addplot(df_plt_test_usd['pred_adj_rs'],     type='line', panel=0, color='g', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_h_adj_rs'],   type='line', panel=0, color='b', secondary_y=False),
                  mpf.make_addplot(df_plt_test_usd['pred_l_adj_rs'],   type='line', panel=0, color='r', secondary_y=False),
                  ]

    mpf.plot(df_plt_test_usd,**kwargs,style='binance', addplot=preds)
    return mpf

mpf = plot_pred_results(Ticker, target_close, pred_close[1:], pred_adj_close[1:], pred_adj_high[1:], pred_adj_low[1:], pred_high[1:], pred_low[1:], data_date)


Ticker: MOH
df_plt_test_usd: (480, 18)


# Analysis of the Prediction
 - Rescaled data (Prediction vs Target)

In [None]:
# We convert our data back to dollars...
# predval = y_pred[i] * scalefac

def analyze_pred(target_close, target_high, target_low, pred_close, pred_adj_close):
    # Convert back to dollar $values
    elements = len(target_close)
    print(elements)
    tot_deltas = 0
    tot_adj_deltas = 0
    tot_tradrng = 0
    for i in range(1, -elements, -1):
        actual = target_close[i - 1]
        
        predval = pred_close[i - 1]
        pred_delta = abs(predval - actual)
        tot_deltas += pred_delta

        # Accumulate the adjusted-prediction delta separately from the raw-prediction delta.
        predval = pred_adj_close[i - 1]
        pred_delta = abs(predval - actual)
        tot_adj_deltas += pred_delta

        trd_rng = abs(target_high[i] - target_low[i])
        tot_tradrng += trd_rng
        print(f"i:{i}", "Close", actual.round(2), "Predicted: $", predval.round(2), "  Delta:$", pred_delta.round(6), "  Trade Rng: $", trd_rng.round(2))

    print("Mean Trading Range: $", round(tot_tradrng / elements, 2))
    print("Mean Delta (Actual vs Prediction): $", round((tot_deltas / elements), 2))
    print("Mean Delta (Actual vs Adj Prediction): $", round((tot_adj_deltas / elements), 2))
    return tot_deltas, tot_tradrng, elements

tot_deltas, tot_tradrng, elements = analyze_pred(target_close, target_high, target_low, pred_close, pred_adj_close)
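
The same summary can also be computed without an explicit loop. The sketch below is only an illustration: `summarize_deltas` is a hypothetical helper that is not part of this notebook, and it assumes all five inputs are 1-D, of equal length, and aligned index-for-index (the loop above offsets the close by one bar relative to the high/low).

```python
import numpy as np

def summarize_deltas(target_close, target_high, target_low, pred_close, pred_adj_close):
    """Hypothetical helper: mean trading range and mean absolute prediction deltas."""
    target_close = np.asarray(target_close, dtype=float)
    # Daily trading range (High - Low).
    trade_rng = np.abs(np.asarray(target_high, dtype=float) - np.asarray(target_low, dtype=float))
    # Absolute error of the raw and the adjusted close predictions.
    delta = np.abs(np.asarray(pred_close, dtype=float) - target_close)
    adj_delta = np.abs(np.asarray(pred_adj_close, dtype=float) - target_close)
    return {
        "mean_range": trade_rng.mean(),
        "mean_delta": delta.mean(),
        "mean_adj_delta": adj_delta.mean(),
    }

# Toy values, not real market data.
stats = summarize_deltas(
    target_close=[10.0, 12.0],
    target_high=[11.0, 13.0],
    target_low=[9.0, 11.0],
    pred_close=[9.0, 14.0],
    pred_adj_close=[10.5, 12.5],
)
print(stats)
```

Comparing `mean_range` against `mean_delta` gives a quick read on whether the prediction error is small relative to the instrument's typical daily range.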


## - Rescaled data (0..1) (Adjusted Prediction vs Target)

In [None]:

def analyze_pred(target_close, target_high, target_low, pred_close):
    # Convert back to dollar $values
    elements = len(target_close)
    print(elements)
    tot_deltas = 0
    tot_tradrng = 0
    for i in range(1, -elements, -1):
        actual = target_close[i - 1]
        scalefac = target_close[i - 1] / target_close[i - 1]
        # print("Scaling Factor", scalefac)
        predval = pred_close[i] * scalefac
        predval = pred_close[i - 1] if np.isinf(predval) else predval
        pred_delta = abs(predval - actual)
        tot_deltas += pred_delta
        trd_rng = abs(target_high[i] - target_low[i])
        tot_tradrng += trd_rng
        print(f"i:{i}", "Close", actual.round(2), "Predicted: $", predval.round(2), "  Delta:$", pred_delta.round(6), "  Trade Rng: $", trd_rng.round(2))

    print("Mean Trading Range: $", round(tot_tradrng / elements, 2))
    print("Mean Delta (Actual vs Prediction): $", round((tot_deltas / elements), 2))
    return tot_deltas, tot_tradrng, elements

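# Pass the adjusted close predictions (not the adjust_prediction function itself).
tot_deltas, tot_tradrng, elements = analyze_pred(target_close, target_high, target_low, pred_adj_close)

print("Mean Trading Range: $", round(tot_tradrng / elements, 2))
# np.ravel handles tot_deltas being either a scalar or a one-element array.
print("Mean Delta (Actual vs Prediction): $", round((np.ravel(tot_deltas)[0] / elements), 2))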

### - Deltas between predicted price and actual price

In [None]:
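# Calculate arrays of absolute deltas between the predicted price and the actual price.
# Assumes target_close, pred_close and pred_adj_close are aligned and of equal length.
y_p_delta  = abs(target_close - pred_close)
y_pa_delta = abs(target_close - pred_adj_close)
print("Std Dev -     Pred: ", y_p_delta.std())
print("Std Dev - Pred Adj: ", y_pa_delta.std())
print("Std Dev ratio (Pred / Pred Adj):", (y_p_delta.std() / y_pa_delta.std()))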

## - Trend Prediction Stats using Unscaled (v < 1) Adjusted 
** The stats are not entirely accurate, as the adjusted prediction will change when the new close actually occurs
   and the adjusting process includes the new close in the calculation
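
One way to measure these stats without that look-ahead is to adjust each prediction using only the deltas from earlier bars. The sketch below is a minimal illustration under that assumption and is not the notebook's `apply_deltas` logic; `lagged_adjust` and its `window` parameter are hypothetical names introduced here.

```python
import numpy as np

def lagged_adjust(actual, pred, window=2):
    """Shift each prediction by the mean of the previous `window` deltas.

    Hypothetical helper: only deltas from bars *before* index i are used,
    so the adjusted value at i never depends on actual[i].
    """
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    delta = actual - pred              # signed error at each bar
    adjusted = pred.copy()             # bar 0 has no history, so it stays unadjusted
    for i in range(1, len(pred)):
        start = max(0, i - window)
        adjusted[i] = pred[i] + delta[start:i].mean()
    return adjusted

# Toy data: the prediction runs a constant 2.0 below the price.
actual = np.array([10.0, 11.0, 12.0, 13.0])
pred = actual - 2.0
adj = lagged_adjust(actual, pred, window=2)
print(adj)
```

Because the value at bar `i` is built from bars `i - window` through `i - 1` only, it does not change once the new close arrives, so accuracy stats computed on it reflect what would have been known at prediction time.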


In [None]:
# How often is the prediction within pct (5%) of the actual price?
pct = 0.05
pct_delta = abs(y_test * pct)
delta = abs(y_test - y_p_adj)
within_range = delta[delta < pct_delta]
# Get average price range for the instrument
avg_price_rng = data_set['High'] - data_set['Low']
# Get pct of average price range
pcnt_avg_price_rng = avg_price_rng.mean() * pct

print("Within",pct * 100,"% of Actual Adj Close","[ Trading Range:$", round(tot_tradrng / elements, 2), 
      " Within Avg Price:", avg_price_rng.mean().round(2), " Average Difference to Actual: $", pcnt_avg_price_rng.round(2),"]", 
      " Pct within Range:", round((len(within_range) / len(y_test)),2) * 100,"%")

# How often is the predicted trend the same as the actual trend?
test_trend = y_test[1:] - y_test[:-1]
pred_trend = y_p_adj[1:] - y_p_adj[:-1]
delta = [1 if (test_trend < 0) and (pred_trend < 0) or (test_trend > 0) and (pred_trend > 0) else 0 for test_trend, pred_trend in zip(test_trend, pred_trend)]
# Print the number of elements in delta and the percent of delta that are 1.
# Count trend matches from delta itself (not from within_range, which measures price proximity).
right = sum(delta)
print("Trend Samples:", len(delta), "Wrong:", len(delta) - right, "Right", right, "%Right", round(right / len(delta) * 100, 2))
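
The trend comparison can also be written with `np.diff` and `np.sign`. This is a sketch under the same rule as the cell above (a flat move on either side counts as a miss); `trend_hit_rate` is a hypothetical helper, and it assumes both inputs are 1-D and of equal length.

```python
import numpy as np

def trend_hit_rate(actual, pred):
    """Hypothetical helper: count bars where actual and predicted moves share a direction."""
    actual_dir = np.sign(np.diff(np.asarray(actual, dtype=float)))
    pred_dir = np.sign(np.diff(np.asarray(pred, dtype=float)))
    # A hit requires the same non-zero direction on both series.
    hits = (actual_dir == pred_dir) & (actual_dir != 0)
    return int(hits.sum()), len(hits)

# Toy series, not real market data.
right, total = trend_hit_rate([1.0, 2.0, 1.5, 1.5, 3.0],
                              [1.0, 1.2, 1.1, 1.3, 1.0])
print("Right:", right, "of", total, "->", round(right / total * 100, 2), "%")
```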


# Model Details...

In [None]:
# Review feature importance
# importance = model.coef_

import shap
import pydot

print(X_train.shape)

print(model.summary())
print("model.outputs", model.layers[3].output.shape[1])
from keras.utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
from IPython.display import Image

pass

## - Model Plot
<img src="model_plot.png" alt="Drawing" style="width: 200px;"/>