# Price ML Prediction

## Price Prediction's Caveats

  When the characteristics of the instrument are far from stable, price prediction can be quite off.
  - Examples...
    - SERV
    - MEDS
    - XCUR
  
  Price prediction should be combined with prediction of market highs and lows: if you enter the market based on predicted price alone, you could get washed out by manipulative market whales hitting your stops.
  
  The high, low, market open, and market close should each be predicted by a separate model, all using the same model parameters.
  
  Before trading an instrument, one should review the median market price range versus the median prediction accuracy delta, to verify whether the instrument is worth trading. I have a hunch that the smaller the trading range compared to the prediction delta, the less likely the instrument is worth engaging with.
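
The range-versus-delta check described above can be sketched as a single ratio. This is a minimal illustration, assuming the daily range is High minus Low and the prediction delta is the absolute difference between the actual and predicted close. The function name `range_to_error_ratio` and the sample numbers are hypothetical and not part of the notebook.

```python
from statistics import median

def range_to_error_ratio(highs, lows, actual_closes, predicted_closes):
    """Median daily range divided by median absolute prediction error."""
    median_range = median(h - l for h, l in zip(highs, lows))
    median_error = median(abs(a - p) for a, p in zip(actual_closes, predicted_closes))
    if median_error == 0:
        # A perfect prediction makes the ratio unbounded.
        return float("inf")
    return median_range / median_error

# Illustrative numbers only (not real market data).
highs  = [10.5, 10.8, 11.0, 10.9]
lows   = [10.0, 10.2, 10.4, 10.5]
actual = [10.3, 10.6, 10.8, 10.7]
pred   = [10.2, 10.7, 10.6, 10.8]

ratio = range_to_error_ratio(highs, lows, actual, pred)
print(f"range/error ratio: {ratio:.2f}")
```

A ratio well above 1 suggests the typical daily range is wide relative to the typical prediction miss. What threshold makes an instrument worth trading is left open here, as in the hunch above.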
<p>

## Current Enhancements

From a visual review, although the prediction strays from the price,
   the trend of the prediction seems quite accurate.
   Pulling the prediction closer to the price by an average of the delta
   between the price and the prediction makes the new prediction considerably more accurate.
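
The idea of pulling the prediction toward the price can be sketched in a few lines. This is a simplified variant and not the notebook's exact method: it assumes the correction is the mean signed error of the previous `window` days, so only already-known prices are used. The function name `adjust_with_trailing_delta` and the sample values are hypothetical.

```python
def adjust_with_trailing_delta(actual, predicted, window=3):
    """Shift each prediction by the mean signed error of the previous `window` days."""
    adjusted = []
    for i, pred in enumerate(predicted):
        start = max(0, i - window)
        # Errors from earlier days only; day i itself is not yet known.
        past_errors = [actual[k] - predicted[k] for k in range(start, i)]
        if past_errors:
            pred = pred + sum(past_errors) / len(past_errors)
        adjusted.append(pred)
    return adjusted

# Illustrative values: the raw prediction tracks the trend but sits 2.0 too low.
actual    = [100.0, 101.0, 102.0, 103.0, 104.0]
predicted = [98.0, 99.0, 100.0, 101.0, 102.0]

adjusted = adjust_with_trailing_delta(actual, predicted, window=3)
print(adjusted)
```

The first value is left unchanged because no earlier error exists to average.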
<p>

## Enhancements (to add)

We should be able to add prediction targets for high and low for the day by 
   adding the target high-difference and low-difference to the data set,
   similar to the way we added the target close-difference.
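
A minimal sketch of the proposed high-difference and low-difference targets, assuming they mirror the existing close-difference target: the value minus the Open, shifted up by one day. The helper `next_day_diffs`, the names `target_high` and `target_low`, and the sample bars are hypothetical. Plain lists are used here so the idea stands apart from pandas.

```python
def next_day_diffs(bars, field):
    """For each day, return the next day's (field - Open); None for the last day."""
    diffs = [bar[field] - bar["Open"] for bar in bars]
    # Shift up by one day: row t holds the difference observed on day t+1.
    return diffs[1:] + [None]

# Illustrative bars only (not real market data).
bars = [
    {"Open": 10.0, "High": 11.0,  "Low": 9.5,  "Close": 10.5},
    {"Open": 10.5, "High": 12.0,  "Low": 10.0, "Close": 11.5},
    {"Open": 11.5, "High": 11.75, "Low": 10.5, "Close": 11.0},
]

target_high = next_day_diffs(bars, "High")
target_low = next_day_diffs(bars, "Low")
print(target_high, target_low)
```

The last row has no next day, which corresponds to the NaN left by a shift in a DataFrame.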
   
We can also apparently pull minute bars from Yahoo. And with that, we can
   perform more granular predictions, to the hour, 15min, 5min, and
   1min period.

# Choose an existing model -or- Create a New Model


In [1]:
# Create a list of available .keras models from model_dir...
# Clear the interactive namespace (no confirmation prompt)
%reset -f
%gui asyncio

import ipywidgets as widgets
from ipywidgets import Output  
import re
import datetime
import os
import asyncio
import yfinance as yf

global Ticker, DateStart, DateEnd, ticker, dateStart, dateEnd, CreateNewModel, model_dir, model_files, Selected_Model

Ticker    = 'AMZN'

DateStart = '2015-06-03'
DateEnd   = '2024-07-30'
CreateNewModel = "Create New Model"

# Get the list of files from the model directory...
model_dir = 'models/'
model_files = os.listdir(model_dir)
Selected_Model = None

def wait_for_change(widget, value):
    future = asyncio.Future()
    def getvalue(change):
        # make the new value available
        future.set_result(change.new)
        widget.unobserve(getvalue, value)
    widget.observe(getvalue, value)
    return future

def model_choices():
    global Ticker, DateStart, DateEnd, ticker, dateStart, dateEnd, CreateNewModel, model_dir, model_files, Selected_Model
    
    # Filter the list to only include .keras files...
    models_lst = [file for file in model_files if re.search(r'\.keras$', file)]
    models_lst.sort()
    model_files = [CreateNewModel] + models_lst
    # Display the list of files...
    Selected_Model = widgets.Select ( options=model_files,
                                      value=model_files[0],
                                      description='Model File: ',
                                      disabled=False )
    ticker     = widgets.Text(value=Ticker, description='TICKER: ')
    DateStart = "2015-01-01"
    dateStart = widgets.Text(value=DateStart, description='Start Date: ')
    # Set DateEnd to today's date in 'YYYY-MM-DD' format, as determined by the system date.
    if Selected_Model.value == CreateNewModel:
        DateEnd = datetime.datetime.now().strftime('%Y-%m-%d')
    dateEnd   = widgets.Text(value=DateEnd, description='End Date: ')
    out = Output()
    
    async def f0():
        for i in range(10):
            out.append_stdout('did work ' + str(i) + '\n')
            x = await wait_for_change(Selected_Model, 'value')
            if Selected_Model.value != CreateNewModel:
                ticker.value = Selected_Model.value.split('_')[0]
            out.append_stdout('async function continued with value ' + str(x) + '\n')
        asyncio.ensure_future(f0())
    
    async def f1():
        for i in range(10):
            out.append_stdout('did work ' + str(i) + '\n')
            x = await wait_for_change(ticker, 'value')
            tkr = ticker.value
            tkr = re.sub(r'Invalid: ', '', tkr)
            if Selected_Model.value == CreateNewModel:
                # Check yahoo if symbol is valid...
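                try:
                    ticker_ = yf.Ticker(tkr).history(period='5d',interval='1d')
                except Exception:
                    # Treat any lookup failure as an invalid symbol.
                    ticker_ = []
                if len(ticker_) == 0:
                    ticker.value = 'Invalid: ' + tkr
                else:
                    # Normalize the validated symbol to upper case.
                    ticker.value = tkr.upper()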
            out.append_stdout('async function continued with value ' + str(x) + '\n')
        asyncio.ensure_future(f1())
    
    asyncio.ensure_future(f0())
    asyncio.ensure_future(f1())
    
    inst0 = """
        +-------------------------------+
        | Choose or Create a New Model  |
        +-------------------------------+
    """
    inst1 = """Make your model selection and update the Start and End Dates.
    Or, create a new model by leaving the Model File: as
    "Create New Model", and entering the Ticker, Start Date and End Date.
    """
    print(inst0, inst1)
    display(Selected_Model)
    print()
    display(ticker, dateStart, dateEnd)
    print()
    # display(out)

    # if Selected_Model.value == CreateNewModel:
    # - Create a new model using the ticker.value, dateStart.value, dateEnd.value values.
    return(Selected_Model, ticker, dateStart, dateEnd)

Selected_Model, ticker, dateStart, dateEnd = model_choices()



        +-------------------------------+" +
        | Choose or Create a New Model  |
        +-------------------------------+
     Make you Model Selection and Updates the Start and End Dates.
    Or, Create a New Model by entering by leave the Model File: as
    "Create New Model", and entering the Ticker, Start Date and End Date.
    


Select(description='Model File: ', options=('Create New Model', 'AAPL_2015-01-01_2024-07-30.keras', 'AMZN_2015…




Text(value='AMZN', description='TICKER: ')

Text(value='2015-01-01', description='Start Date: ')

Text(value='2024-08-07', description='End Date: ')




In [3]:
def print_model_choices(Selected_Model, ticker, dateStart, dateEnd):
    print( "Selected_Model:", Selected_Model.value)
    print("Ticker: ", ticker.value)
    print("DateStart: ", dateStart.value)
    print("DateEnd: ", dateEnd.value)

print_model_choices(Selected_Model, ticker, dateStart, dateEnd)

Selected_Model: ^N225_2015-01-01_2024-07-30.keras
Ticker:  ^N225
DateStart:  2015-01-01
DateEnd:  2024-08-07


#  Load the latest data for the selected Ticker...

In [4]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn
import pandas as pd
import yfinance as yf

global data, feature_cnt, feature_cnt_dl, model_path

def load_data():
    global Ticker, DateStart, DateEnd, ticker, dateStart, dateEnd, CreateNewModel, model_dir, model_files, model_path, Selected_Model
    global data_orig, data, feature_cnt, feature_cnt_dl

    model_dir = 'models/'

    if Selected_Model.value == CreateNewModel:
        modelDateStart = DateStart
        modelDateEnd = DateEnd
        model_file = ticker.value + "_" + dateStart.value + "_" + dateEnd.value + ".keras"
        model_path = model_dir + model_file
    else:
        model_path = model_dir + Selected_Model.value

    # If we are using an existing model, we still have to download the data as we need the
    # actual price data to adjust the prediction.

    # If the predicted value swings about too much relative to the price,
    # the training period may be changed to exclude the earlier period
    # that does not share present-day price-movement characteristics.
    data_orig = yf.download(tickers = ticker.value, start = dateStart.value, end = dateEnd.value)
    data=data_orig.copy(deep=True)
    offset = pd.Timedelta(days=-30)
    # Resample to 'W'eekly or 'ME'(Month End)
    # logic = {'Open'  : 'first',
    #          'High'  : 'max',
    #          'Low'   : 'min',
    #          'Close' : 'last',
    #          'Adj Close': 'last',
    #          'Volume': 'sum'}
    # data = data.resample('W', offset=offset).apply(logic)
    print("Pulling Data for:", ticker.value, "from", dateStart.value, "to", dateEnd.value)
    print("data.len():", len(data), "data.shape:", data.shape)

    feature_cnt_dl = data.shape[1]
    feature_cnt = feature_cnt_dl + 1  # Actual count vs index count.

load_data()
# data.tail(10)
data.head(5), data.tail(5)


[*********************100%%**********************]  1 of 1 completed

Pulling Data for: ^N225 from 2015-01-01 to 2024-08-07
data.len(): 2346 data.shape: (2346, 6)





(                    Open          High           Low         Close  \
 Date                                                                 
 2015-01-05  17325.679688  17540.919922  17219.220703  17408.710938   
 2015-01-06  17101.580078  17111.359375  16881.730469  16883.189453   
 2015-01-07  16808.259766  16974.609375  16808.259766  16885.330078   
 2015-01-08  17067.400391  17243.710938  17016.089844  17167.099609   
 2015-01-09  17318.740234  17342.650391  17129.529297  17197.730469   
 
                Adj Close     Volume  
 Date                                 
 2015-01-05  17408.710938  116500000  
 2015-01-06  16883.189453  166000000  
 2015-01-07  16885.330078  138600000  
 2015-01-08  17167.099609  140600000  
 2015-01-09  17197.730469  155200000  ,
                     Open          High           Low         Close  \
 Date                                                                 
 2024-07-31  38140.769531  39188.371094  37954.378906  39101.820312   
 2024-08-01  3

##  - Add indicators to the data


In [5]:
# Add indicators to the data...

import pandas_ta as ta

def add_indicators(data, feature_cnt_dl):

    feature_cnt = feature_cnt_dl

    data = data_orig.copy(deep=True)

    # print("= Before Adding Indicators ========================================================")
    # print(data.tail(10))

    data['RSI']=ta.rsi(data.Close, length=3); feature_cnt += 1
    # data['EMAF']=ta.ema(data.Close, length=3); feature_cnt += 1
    # data['EMAM']=ta.ema(data.Close, length=6); feature_cnt += 1
    data['EMAS']=ta.ema(data.Close, length=9); feature_cnt += 1
    data['DPO3']=ta.dpo(data.Close, length=3, centered=True); feature_cnt += 1
    data['DPO6']=ta.dpo(data.Close, length=6, centered=True); feature_cnt += 1
    data['DPO9']=ta.dpo(data.Close, length=9, centered=True); feature_cnt += 1

    # print("= After Adding DPO2 ========================================================")
    # print(data.tail(10))
    #
    # On Balance Volume
    if data['Volume'].iloc[-1] > 0:
        data = data.join(ta.aobv(data.Close, data.Volume, fast=True, min_lookback=3, max_lookback=9))
        feature_cnt += 7 # ta.aobv adds 7 columns

    # print("= After Adding APBV ========================================================")
    # print(data.tail(10))

    # Target is the difference between the adjusted close and the open price.
    data['Target'] = data['Adj Close']-data.Open
    # Our model will predict the target for the next day,
    # so we shift the target up by one day.
    data['Target'] = data['Target'].shift(-1)
    feature_cnt += 1

    # 1 if the price goes up, 0 if the price goes down.
    # Not a feature: Needed to test prediction accuracy.
    data['TargetClass'] = [1 if data['Target'].iloc[i]>0 else 0 for i in range(len(data))]

    # The TargetNextClose is the adjusted close price for the next day.
    # This is the value we want to predict.
    # Not a feature: Needed to test prediction accuracy.
    data['TargetNextClose'] = data['Adj Close'].shift(-1)

    # Before scaling the data, we need to use the last good value for rows that have NaN values.
    data.ffill(inplace=True)

    # print("= After Adding Targets___ and ForwardFill ========================================================")
    # print(data.tail(10))

    # Reset the index of the dataframe.
    data.reset_index(inplace = True)
    data_date = data['Date'].copy()
    data.drop(['Date'], axis=1, inplace=True);   feature_cnt -= 1
    data.drop(['Close'], axis=1, inplace=True);  feature_cnt -= 1
    # data.drop(['Volume'], axis=1, inplace=True); feature_cnt -= 1

    # Add one more row to the data file, this will be our next day's prediction.
    data = pd.concat([data,data[-1:]])
    # And, reindex the dataframe.
    data.reset_index(inplace=True)

    # print("feature_cnt_d1:", feature_cnt_dl, " feature_cnt:", feature_cnt)
    return data, feature_cnt, data_date

data, feature_cnt, data_date = add_indicators(data, feature_cnt_dl)

data_set = data.copy(deep=True)
print("Len data:", len(data), "Len data_set", len(data_set))
# data_set.tail(10)

Len data: 2347 Len data_set 2347


## - Scale the data

  The data is scaled to a range of 0 to 1.
  This normalizes the features so that the model can be trained on them.
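
The 0-to-1 scaling can be written out by hand to show what happens to each column. This is a plain-Python sketch of min-max scaling and its inverse for a single column. The function names and sample values are hypothetical and only illustrate the formula; the notebook itself relies on `MinMaxScaler`.

```python
def min_max_fit(column):
    """Return the (min, max) pair that defines the scaling for one column."""
    return min(column), max(column)

def min_max_scale(column, lo, hi):
    """Map values linearly so that lo -> 0.0 and hi -> 1.0."""
    span = hi - lo
    # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
    return [(v - lo) / span if span else 0.0 for v in column]

def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]

closes = [100.0, 150.0, 200.0]  # illustrative prices
lo, hi = min_max_fit(closes)
scaled = min_max_scale(closes, lo, hi)
restored = min_max_inverse(scaled, lo, hi)
print(scaled, restored)
```

Note that the smallest value in every column always maps to exactly 0, which matters for any later step that treats 0 as a missing value.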

In [6]:
# Scale the data for training...

import sklearn
from sklearn.preprocessing import MinMaxScaler


def scale_data(data_set):
    sc = MinMaxScaler(feature_range=(0,1))
    data_set_scaled = sc.fit_transform(data_set)
    return sc, data_set_scaled

sc, data_set_scaled = scale_data(data_set)

print("data_set.shape:", data_set.shape, "data_set_scaled.shape", data_set_scaled.shape)
# print(data_set_scaled)


data_set.shape: (2347, 21) data_set_scaled.shape (2347, 21)


In [7]:
# Check for NaN and/or 0 values...
# After scaling the data, we need to check for NaN and 0 values,
# and forward fill the data with the last known value.
# Note: as written, the fill value is taken from the adjacent column of the same row.
# In the case of a series of 0 values beginning at index 0,
# we have to back fill those values with the first non-NaN/non-zero value available.
# *** Currently, we are not handling this case. ***
def scaled_data_cleanup(data_set):
    nan_indices = np.argwhere(np.isnan(data_set_scaled))
    zer_indices = np.argwhere(data_set_scaled == 0)

    # print("nan_indices:", nan_indices)
    # print('zer_indices:', zer_indices)

    for i in range(len(nan_indices)):
        j = nan_indices[i][1] - 1
        if j < 0:
            j = nan_indices[i][1] + 1
        data_set_scaled[nan_indices[i][0], nan_indices[i][1]] = data_set_scaled[nan_indices[i][0], j]
    for i in range(len(zer_indices)):
        j = zer_indices[i][1] - 1
        if j < 0:
            j = zer_indices[i][1] + 1
        data_set_scaled[zer_indices[i][0], zer_indices[i][1]] = data_set_scaled[zer_indices[i][0], j]

    nan_indices = np.argwhere(np.isnan(data_set_scaled))
    zer_indices = np.argwhere(data_set_scaled == 0)

    return data_set_scaled, nan_indices, zer_indices

data_set_scaled, nan_indices, zer_indices = scaled_data_cleanup(data_set)
# print("nan_indices:", nan_indices)
# print('zer_indices:', zer_indices)


## - Prepare the scaled data for the model

  - The data is prepared for the model by creating a 3D array of the data.
  - The data is split into training and test data.
  - The training data is used to train the model.
  - The test data is used to test the model.
  - The model is then used to predict the next day's price.
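
The windowing and split steps listed above can be sketched with plain lists. This is a minimal illustration, assuming each sample holds the `backcandles` rows before period `i` and the target is taken from row `i`, followed by an 80/20 chronological split. The function name `make_windows` and the toy data are hypothetical.

```python
def make_windows(rows, targets, backcandles):
    """Build X[sample][candle][feature] and y[sample] from per-period rows."""
    X, y = [], []
    for i in range(backcandles, len(rows)):
        # The window covers the `backcandles` periods before i, excluding i itself.
        X.append(rows[i - backcandles:i])
        y.append(targets[i])
    return X, y

# Toy data: 6 periods, 2 features per period.
rows = [[float(t), float(t) * 10] for t in range(6)]
targets = [float(t) for t in range(6)]

X, y = make_windows(rows, targets, backcandles=3)

# Chronological split: the first 80% of samples train, the rest test.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
print(len(X), len(X_train), len(X_test))
```

The split is not shuffled, so the test samples are always later in time than the training samples.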

In [8]:
# LSTM needs a rolling period of data for each feature to make predictions.
# multiple feature from data provided to the model
global X, backcandles, y, yi, data_set
backcandles = 15 # Set the rolling window size.

def prep_model_inputs(data_set_scaled, backcandles):
    X = []
    # print(data_set_scaled[0].size)
    # data_set_scaled=data_set.values
    # print(data_set_scaled.shape[0])

    # Create a 3D array of the data. X[features][periods][candles]
    # Where candles is the number of candles that rolls by 1 period for each period.
    for j in range(feature_cnt): # data_set_scaled[0].size):# last 2 columns are target not X
        X.append([])
        for i in range(backcandles, data_set_scaled.shape[0]): # backcandles+2
            X[j].append(data_set_scaled[i-backcandles:i, j])

    # print("X.shape:", np.array(X).shape)
    X = np.array(X)
    # Move axis from 0 to position 2
    X=np.moveaxis(X, [0], [2])
    # print("X.shape:", X.shape)

    # Erase first elements of y because of backcandles to match X length
    # del(yi[0:backcandles])
    #X, yi = np.array(X), np.array(yi)
    # Choose -1 for last column, classification else -2...
    X, yi = np.array(X), np.array(data_set_scaled[backcandles:,-1])
    y = np.reshape(yi,(len(yi),1))
    #y=sc.fit_transform(yi)
    #X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
    
    # print("data_set.shape:",data_set.shape,"X.shape:",X.shape)
    # print(X)
    # print("=========================================================")
    # print(y.shape)
    # print(y)

    return X, y, yi

X, y, yi = prep_model_inputs(data_set_scaled, backcandles)


In [9]:
# split data into train test sets

global splitlimit
splitlimit = int(len(X)*0.8)

def split_data(X, y, splitlimit):
    # print("lenX:",len(X), "splitLimit:",splitlimit)
    X_train, X_test = X[:splitlimit], X[splitlimit:] # Training data, Test Data
    y_train, y_test = y[:splitlimit], y[splitlimit:] # Training data, Test Data

    adj_close_train, adj_close_test = data_set.loc[:splitlimit, 'Adj Close':'Adj Close'], data_set.loc[splitlimit:, 'Adj Close':'Adj Close'] # Training data, Test Data
    
    # print("X_train.shape:", X_train.shape)
    # print("y_train.shape:", y_train.shape)
    # print("X_test.shape:", X_test.shape)
    # print("y_test.shape:", y_test.shape)
    # print("adj_close_test.shape:", adj_close_test.shape)
    # print("== X_train ===========================================")
    # print(X_train[-20:-10])
    # print("== _train ===========================================")
    # print(y_train[-20:-10])    

    return X_train, X_test, y_train, y_test, adj_close_test

X_train, X_test, y_train, y_test, adj_close_test = split_data(X, y, splitlimit)


# Load an Existing Model -or- Model Training
 - Using LSTM (Long Short Term Memory) Model
 - Using Keras (Tensorflow) Library
 - LSTM --> Dense Layer --> Activation Layer


In [10]:

import tensorflow as tf
import keras
import numpy as np
import os
from keras import optimizers
from keras.callbacks import History
from keras.models import Model, Sequential
from keras.layers import Dense, Dropout, LSTM, Input, Activation, TimeDistributed, concatenate

def create_fit_model(X_train, y_train, backcandles, model_path):
    # If the Keras model file exists, load it. Otherwise, create a new model.
    if os.path.exists(model_path) and Selected_Model.value != CreateNewModel:
        # Load the model...
        model = keras.models.load_model(model_path)
        print("Model Loaded:", model_path)
    else:
        # Create a new model...

        #tf.random.set_seed(20)
        np.random.seed(10)

        lstm_input = Input(shape=(backcandles, feature_cnt), name='lstm_input')
        inputs = LSTM(200, name='first_layer')(lstm_input)
        inputs = Dense(1, name='dense_layer')(inputs)
        output = Activation('linear', name='output')(inputs)
        model = Model(inputs=lstm_input, outputs=output)
        adam = optimizers.Adam()
        model.compile(optimizer=adam, loss='mse')
        # model.fit(x=X_train, y=y_train, batch_size=15, epochs=30, shuffle=True, validation_split = 0.1)
        model.fit(x=X_train, y=y_train, batch_size=30, epochs=50, shuffle=True, validation_split = 0.1)
        print("adj_close_test.shape:", adj_close_test.shape)

        # Save the model...
        model.save(model_path)
    
    return model

model = create_fit_model(X_train, y_train, backcandles, model_path)


2024-08-07 14:31:59.611389: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Model Loaded: models/^N225_2015-01-01_2024-07-30.keras


2024-08-07 14:32:00.867040: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.


# Model Testing

In [11]:
# Show some y_pred values. If the prediction is no good, we might see NaN values.
y_pred = model.predict(X_test)
# for i in range(10):
#     print(y_pred[i], y_test[i], )

15/15 ━━━━━━━━━━━━━━━━━━━━ 1s 30ms/step


## - Adjust the prediction
  Ideally, the prediction would be as close to the actual price as possible.
  In practice, the prediction is often off by a delta,
  where the delta is the difference between the actual price and the predicted price.
  After applying a moving average of the delta to the prediction, the prediction moves closer to the actual price,
  usually to within 1% of it, which is a substantial improvement.  
  
  The resulting numpy array y_p_adj is the adjusted prediction, which we will use as the prediction for the next day.
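
One way to check a claim such as "within 1% of the actual price" is the mean absolute percentage error. This is a minimal sketch. The function name `mean_abs_pct_error` and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mean_abs_pct_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|, expressed in percent."""
    # Skip zero prices, for which a percentage error is undefined.
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(errors) / len(errors)

# Illustrative values only (not results from the notebook).
actual   = [100.0, 200.0, 400.0]
raw_pred = [96.0, 208.0, 384.0]   # each value is 4% away from actual
adj_pred = [99.0, 202.0, 396.0]   # each value is 1% away from actual

raw_mape = mean_abs_pct_error(actual, raw_pred)
adj_mape = mean_abs_pct_error(actual, adj_pred)
print(f"raw: {raw_mape:.2f}%  adjusted: {adj_mape:.2f}%")
```

Comparing the two figures on rescaled prices gives a single number for how much the adjustment helps.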

In [12]:
# Calculate the delta between actual price and prediction
# Bring the prediction closer to the price based on the delta
def mov_avg(x, w):
    return abs(np.convolve(x, np.ones(w), 'valid') / w)

def apply_deltas(a,p,d, w):
    al = list(a) # Actual
    pl = list(p) # Predicted
    dl = list(d) # Delta
    rl = [] # Return List
    if len(dl) < len(pl):
        for i in range(len(pl) - len(dl)):
            # Duplicate the first value in the list to make the list the same size as pl.
            dl.insert(0, dl[0])
    # We don't apply the average to our first predictions
    # for the size of the moving average window.
    # For this area, we just return the predicted value.
    i = w - 1
    for j in range(0, i):
        rl.append(pl[j]) 
    # print("al.len", len(al), "pl.len", len(pl), "dl.len", len(dl))
    # Beyond the window, we apply the average delta to the prediction.
    delta_factor = 1
    for delta in dl[:-1]:
        if i >= len(dl):
            break
        # print("i:", i, "dl:", len(dl), "al:", len(al), "pl:", len(pl))
        if al[i] > pl[i]:
            # If the actual price is greater than the predicted price, add the delta.
            val = pl[i] + (delta * delta_factor)
        else:
            # If the actual price is less than or equal to the predicted price, subtract the delta.
            val = pl[i] - (delta * delta_factor)
        rl.append(val)
        i += 1
    return rl

def adjust_prediction(y_test, y_pred, avg_win=5):
    """
    Calculate the delta between actual price and prediction
    Bring the prediction closer to the price based on the delta
    """

    y_p_delta = y_pred - y_test
    # Shift delta to the left by 1 day so that the adjusted prediction
    # only uses known data, and duplicate the delta for the last day (which we don't know).
    y_p_delta = np.append(y_p_delta[1:], y_p_delta[-1])
    # Calculate the average delta for the prediction.
    y_p_davg = mov_avg(y_p_delta, avg_win)

    # print(y_test.shape, y_pred.shape, y_p_adj.shape, y_p_delta.shape)
    y_p_adj = apply_deltas(y_test, y_pred, y_p_davg, avg_win)
    # We don't know the actual price for today, so we don't know the delta and average.
    # For today, we use the last known average delta (yesterday's) to adjust today's prediction.
    if y_test[-2] > y_pred[-1]:
        y_p_adj[-1] = y_pred[-1] + y_p_davg[-2]
    else:
        y_p_adj[-1] = y_pred[-1] - y_p_davg[-2]
    # Convert y_p_adj to a numpy array for plotting.
    y_p_adj = np.array(y_p_adj)

    return y_p_adj, y_p_delta

y_p_adj, y_p_delta = adjust_prediction(y_test, y_pred, avg_win=3)
# print("adj_close_test.shape:", adj_close_test.shape)
# print("y_pred:", y_pred.shape[0])


# - Plot the Test Results

### - Scaled Data (0..1) (Prediction vs Target)

In [13]:
# Use an interactive plot to see the results
%matplotlib notebook

def plot_pred_results(y_test, y_pred, y_p_adj):
    # Plot the scaled test and predicted data.
    y_test_plot = y_test[:-2]
    plt.figure(figsize=(16,8))
    # Adj Close
    plt.plot(y_test_plot, color = 'black', label = 'Test',     marker='.')
    # Original Prediction
    plt.plot(y_pred[1:],      color = 'red',   label = 'Pred',     marker='1')
    # Adjusted Prediction
    plt.plot(y_p_adj[1:],     color = 'orange', label = 'Adj Pred', marker='1')
    plt.legend()
    plt.grid()
    plt.show()
    return plt

plt = plot_pred_results(y_test, y_pred, y_p_adj)
# pd.DataFrame(y_test_plot).tail(5)


### - Rescaled Data ($USD) (Prediction vs Target)

In [14]:
# Plot using candlesticks using matplotlib...

# Use an interactive plot to see the results
%matplotlib notebook

def plot_candlesticks(plt, prices, title, width=1, width2=.25, clr_up='green', clr_dn='red', clr_wick='black'):
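    """
    Plot candlesticks as overlaid bars on the current figure.
    :param plt:      matplotlib.pyplot handle
    :param prices:   DataFrame of OHLC prices to plot (columns Open, High, Low, Close)
    :param title:    A title for the plot (currently unused)
    :param width:    Bar width of the candle bodies
    :param width2:   Bar width of the wicks (high-low range)
    :param clr_up:   Body color for up candles (Close >= Open)
    :param clr_dn:   Body color for down candles (Close < Open)
    :param clr_wick: Color of the wicks
    :return:         None
    """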
    #define up and down prices
    up = prices[prices.Close>=prices.Open]
    down = prices[prices.Close<prices.Open]

    #plot up prices
    plt.bar(up.index,up.Close-up.Open,width,bottom=up.Open,color=clr_up, alpha=.30)
    # plt.bar(up.index,up.High-up.Close,width2,bottom=up.Close,color=clr_up, alpha=.25)
    # plt.bar(up.index,up.Low-up.Open,width2,bottom=up.Open,color=clr_up, alpha=.25)
    plt.bar(up.index,up.High-up.Low,width2,bottom=up.Low,color=clr_wick, alpha=.30)

    #plot down prices
    plt.bar(down.index,down.Open-down.Close,width,bottom=down.Close,color=clr_dn, alpha=.25)
    # plt.bar(down.index,down.High-down.Open,width2,bottom=down.Open,color=clr_dn, alpha=.25)
    # plt.bar(down.index,down.Low-down.Close,width2,bottom=down.Close,color=clr_dn, alpha=.25)
    plt.bar(down.index,down.High-down.Low,width2,bottom=down.Low,color=clr_wick, alpha=.25)

    #rotate x-axis tick labels
    plt.xticks(rotation=45, ha='right')

import math
import mplfinance as mpf


def plot_pred_results(y_test, y_pred, y_p_adj, data_set, data_set_scaled, data_date):
    # Plot the original Adj Close data, which is in dollars...
    # Drop the last Adj Close value, because it is just a copy of the previous row.
    plt_test_usd = np.array(data_set[backcandles + splitlimit:]['Adj Close'])
    plt_test_plot =np.array(plt_test_usd[:-1].reshape(-1,1))

    # Use inverse_transform to rescale the predicted data to dollars...
    # To rescale the data, .inverse_transform needs input with the same structure
    # as the data_set_scaled array returned by the .fit_transform call that did
    # the original scaling.
    # So we feed in the "test" portion of data_set_scaled and replace the "TargetNextClose"
    # column, which is the last column, with the predicted values (which we want to rescale).
    # This is then fed to the .inverse_transform function, which returns the
    # predicted values rescaled to prices.
    data_set_scaled_y = data_set_scaled[backcandles + splitlimit:, :].copy()
    # Replace the last column in data_set_scaled_y with the predicted values...
    data_set_scaled_y[:, -1] = y_p_adj.reshape(-1)
    # Get the rescaled values for the adjusted prediction...
    y_pred_adj_rs = sc.inverse_transform(data_set_scaled_y)[:, -1]
    # Replace the last column in data_set_scaled_y with the original (unadjusted) predicted values...
    data_set_scaled_y[:, -1] = y_pred.reshape(-1)
    # Get the rescaled values for the original prediction...
    y_pred_rs = sc.inverse_transform(data_set_scaled_y)[:, -1]

    # Calculate the mean of the delta between the prediction and the actual price...
    # Used to create price range circles around the prediction...
    plt_delta = np.array(pd.DataFrame([abs(plt_test_usd[i] - y_pred_adj_rs[i]) for i in range(len(y_p_adj)) ]).rolling(window=6).mean())

    # Dates...
    data_date_str = [str(date) for date in data_date]
    plt_test_dates = np.array((data_date_str[backcandles + splitlimit - 1:-1]), ).reshape(-1,1)
    plt_pred_dates = np.array((data_date_str[backcandles + splitlimit - 1:]),   ).reshape(-1,1)

    # empty_arr = np.zeros(((len(plt_test_plot)+1, 18)))
    # plt_pred_adj = sc.inverse_transform(np.append(y_p_adj, empty_arr, axis=1))[0:, 1]

    # Plot the data...
    plt.figure(figsize=(16,8))
    plt.autoscale(tight=True)

    plt_test_plot = plt_test_plot.reshape(-1,1)

    # Plot the actual price data ($USD)...
    plt.plot(plt_test_plot, color = 'black', label = 'Test', marker='+')
    # Plot the rescaled price data ($USD)...
    # plt.plot(plt_test_rescaled, color = 'purple', label = 'Test', marker='+')
    # Plot the adj predicted price data ($USD)...
    # plt.plot(plt_pred_adj, color = 'green', label = 'Test', marker='.')
    # Plot the original predicted price data ($USD)...
    plt.plot(y_pred_rs, color = 'green', label = 'Pred', marker='.')
    # Plot the adjusted predicted price data ($USD)...
    plt.plot(y_pred_adj_rs, color = 'orange', label = 'Adj Pred', marker='.')

    # Plot the candlesticks...
    ds_tmp = data_set[backcandles + splitlimit:-1]
    cs_data = ds_tmp.loc[backcandles + splitlimit:, ['Open','High','Low','Adj Close']].copy().reset_index()
    cs_data.rename(columns={'Adj Close':'Close'}, inplace=True)
    plot_candlesticks (plt, cs_data, 'Candlesticks')

    # Plot the text of the predicted price...
    plt.text(len(y_pred_adj_rs) -1, y_pred_adj_rs[-1], y_pred_adj_rs[-1].round(2), fontsize=8, color='black', ha='left', va='center')

    # plt.xlabel(range(0,len(plt_pred_dates)), plt_pred_dates.tolist(), rotation=45)
    #  .set_major_formatter(matplotlib.dates.DateFormatter('%Y')))

    # Plot a Price Range Circle Around the current Prediction...
    Plot_Pred_Circle = True
    if Plot_Pred_Circle:
        import matplotlib.patches as ptch
        lst_pred = len(y_pred_adj_rs) -1
        # ax.set_xlim = ([0, len(plt_pred_adj)])
        # ax.set_ylim = ([plt_pred_adj.min(), plt_pred_adj.max()])
        cpos = np.array([lst_pred, y_pred_adj_rs[-1]])
        circ_pred0 = ptch.Circle(cpos, radius=plt_delta[-1][0]/1,   color='grey',   fill=True, alpha=0.20)
        circ_pred1 = ptch.Circle(cpos, radius=plt_delta[-1][0]/1.5, color='blue',   fill=True, alpha=0.20)
        circ_pred2 = ptch.Circle(cpos, radius=plt_delta[-1][0]/2,   color='yellow', fill=True, alpha=0.20)
        circ_pred3 = ptch.Circle(cpos, radius=plt_delta[-1][0]/3,   color='red',    fill=True, alpha=0.20)
        ax = plt.gca()
        # ax.set_aspect("equal")
        ax.add_patch(circ_pred0)
        ax.add_patch(circ_pred1)
        ax.add_patch(circ_pred2)
        ax.add_patch(circ_pred3)

    plt.grid(visible=True, which='both')
    plt.legend()
    plt.show()
    return plt, y_pred_rs, y_pred_adj_rs

plt, y_pred_rs, y_pred_adj_rs = plot_pred_results(y_test, y_pred, y_p_adj, data_set, data_set_scaled, data_date)



### - Plot the Prediction Results (Using mplfinance)

In [22]:
# Plot using mplfinance...
import mplfinance as mpf
print('Ticker:', ticker.value)
def plot_pred_results(Ticker, y_test, y_pred, y_p_adj, y_pred_adj_rs, data_set, data_set_scaled, data_date):
    # Create a dataframe of the data for plt_test_usd, with a datetime index...
    df_plt_test_usd = pd.DataFrame()
    df_plt_test_usd.insert(0, 'Date', data_date[backcandles + splitlimit + 4:])
    df_plt_test_usd.reset_index()
    ohlcv = data_orig.iloc[backcandles + splitlimit + 4:, [0,1,2,3,4,5]].copy()
    ohlcv.reset_index()
    df_plt_test_usd = pd.concat([df_plt_test_usd, ohlcv.set_axis(df_plt_test_usd.index)], axis=1)
    
    # Add a place holder for the prediction...
    next_date = df_plt_test_usd.iloc[-1, 0] + pd.DateOffset(days=1)
    next_open      = df_plt_test_usd.iloc[-1, 4]
    next_high      = df_plt_test_usd.iloc[-1, 4]
    next_low       = df_plt_test_usd.iloc[-1, 4]
    next_close     = df_plt_test_usd.iloc[-1, 4]
    next_adj_close = df_plt_test_usd.iloc[-1, 4]
    next_volume    = 0
    
    # Prepare the prediction data for the plot...
    c_pred_rs     = pd.DataFrame(y_pred_rs[4:]).rename({0:'pred_rs'}, axis=1)
    c_pred_adj_rs = pd.DataFrame(y_pred_adj_rs[4:]).rename({0:'pred_adj_rs'}, axis=1)
    
    # Append a new row to the dataframe so it extends out to the prediction...
    new_row = { 'Date': next_date, "Open": next_open, "High": next_high, "Low": next_low, "Close": next_close,
                                   "Adj Close": next_adj_close, "Volume": next_volume,
                                   "pred_rs": c_pred_rs.iloc[-1], "pred_adj_rs": c_pred_adj_rs.iloc[-1] }
    new_row = pd.DataFrame(new_row, index=[6])
    df_plt_test_usd = pd.concat([df_plt_test_usd, new_row], axis=0, ignore_index=True)
    
    # Add the prediction data to the OHLCV plot data...
    df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_rs.set_axis(df_plt_test_usd.index)],     axis=1)
    df_plt_test_usd = pd.concat([df_plt_test_usd, c_pred_adj_rs.set_axis(df_plt_test_usd.index)], axis=1)
    
    df_plt_test_usd.ffill(inplace=True)
    df_plt_test_usd.set_index('Date', inplace=True, drop=True)
    df_plt_test_usd.ffill(inplace=True)
    df_plt_test_usd.rename(index={6:'pred_rs', 7:'pred_adj_rs'}, inplace=True)
    # Report the final shape (includes the extra place-holder row for the prediction)...
    print("df_plt_test_usd:", df_plt_test_usd.shape)
    
    kwargs = dict(type='candle', volume=True, figratio=(11,6), figscale=1.25, warn_too_much_data=10000, title='Ticker: '+ticker.value)
    preds = [ mpf.make_addplot(df_plt_test_usd['pred_rs'],     type='line', panel=0, color='g', secondary_y=False),
              mpf.make_addplot(df_plt_test_usd['pred_adj_rs'], type='line', panel=0, color='r', secondary_y=False)  ]
    mpf.plot(df_plt_test_usd,**kwargs,style='binance', addplot=preds)
    return mpf

mpf = plot_pred_results(Ticker, y_test, y_pred, y_p_adj, y_pred_adj_rs, data_set, data_set_scaled, data_date)


Ticker: ^N225
df_plt_test_usd: (463, 10)



# Analysis of the Prediction
 - Rescaled data (0..1) (Prediction vs Target)

In [16]:
# We convert our data back to dollars...
# predval = y_pred[i] * scalefac

def analyze_pred(y_test, y_pred, y_p_adj, data_set, data_set_scaled, data_date):
    # Convert back to dollar values
    elements = len(y_pred)
    print(elements)
    tot_deltas = 0
    tot_tradrng = 0
    for i in range(-1, -elements, -1):
        actual = data_set['Adj Close'].iloc[i - 1]
        scalefac = data_set['Adj Close'].iloc[i - 1] / y_test[i - 1]
        # print("Scaling Factor", scalefac)
        predval = y_pred[i] * scalefac
        predval = y_pred[i - 1] if np.isinf(predval) else predval
        pred_delta = abs(predval - actual)
        tot_deltas += pred_delta
        trd_rng = abs(data_set['High'].iloc[i] - data_set['Low'].iloc[i])
        tot_tradrng += trd_rng
        print("Close", actual.round(2), "Predicted: $", predval.round(2), "  Actual: $", actual.round(2), "  Delta:$", pred_delta.round(6), "  Trade Rng: $", trd_rng.round(2))

    print("Mean Trading Range: $", round(tot_tradrng / elements, 2))
    print("Mean Delta (Actual vs Prediction): $", round((tot_deltas[0] / elements), 2))
    return tot_deltas, tot_tradrng, elements

tot_deltas, tot_tradrng, elements = analyze_pred(y_test, y_pred, y_p_adj, data_set, data_set_scaled, data_date)


467
Close 34675.46 Predicted: $ [6033.81]   Actual: $ 34675.46   Delta:$ [28641.646682]   Trade Rng: $ 2834.47
Close 31458.42 Predicted: $ [5627.38]   Actual: $ 31458.42   Delta:$ [25831.035403]   Trade Rng: $ 2834.47
Close 35909.7 Predicted: $ [8959.46]   Actual: $ 35909.7   Delta:$ [26950.237142]   Trade Rng: $ 4145.06
Close 38126.33 Predicted: $ [8490.14]   Actual: $ 38126.33   Delta:$ [29636.187105]   Trade Rng: $ 1591.37
Close 39101.82 Predicted: $ [10041.71]   Actual: $ 39101.82   Delta:$ [29060.107424]   Trade Rng: $ 1043.68
Close 38525.95 Predicted: $ [10533.39]   Actual: $ 38525.95   Delta:$ [27992.559652]   Trade Rng: $ 1233.99
Close 38468.63 Predicted: $ [11078.25]   Actual: $ 38468.63   Delta:$ [27390.378605]   Trade Rng: $ 454.46
Close 37667.41 Predicted: $ [10807.99]   Actual: $ 37667.41   Delta:$ [26859.416216]   Trade Rng: $ 709.69
Close 37869.51 Predicted: $ [10921.73]   Actual: $ 37869.51   Delta:$ [26947.779692]   Trade Rng: $ 494.77
Close 39154.85 Predicted: $ [1063
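
The loop above accumulates a mean prediction delta and a mean trading range. The same comparison can be sketched as a small function that also reports medians, since the notes at the top of the notebook suggest comparing the median price range against the median prediction delta. `range_vs_error` and its dictionary keys are hypothetical names, and the input values are made up for illustration, not taken from the notebook.

```python
from statistics import mean, median

def range_vs_error(highs, lows, actual, predicted):
    """Compare the typical daily trading range with the typical prediction error.

    All four arguments are equal-length sequences of prices in dollars.
    """
    if not (len(highs) == len(lows) == len(actual) == len(predicted)):
        raise ValueError("all inputs must have the same length")
    ranges = [abs(h - l) for h, l in zip(highs, lows)]
    deltas = [abs(p - a) for p, a in zip(predicted, actual)]
    med_range, med_delta = median(ranges), median(deltas)
    return {
        "mean_range": mean(ranges),
        "median_range": med_range,
        "mean_delta": mean(deltas),
        "median_delta": med_delta,
        # Above 1: the typical range is wider than the typical error.
        "range_to_delta": med_range / med_delta if med_delta else float("inf"),
    }

# Made-up illustrative values, not data from the notebook.
stats = range_vs_error(
    highs=[102.0, 104.0, 103.0],
    lows=[100.0, 101.0, 101.0],
    actual=[101.0, 103.0, 102.0],
    predicted=[100.5, 104.0, 102.5],
)
print(stats)
```

A `range_to_delta` well below 1 would suggest that the prediction error swamps the daily range, which is the situation the caveats section warns about.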

## - Rescaled data (0..1) (Adjusted Prediction vs Target)

In [17]:

def analyze_pred_adj(y_test, y_pred, y_p_adj):
    elements = len(y_p_adj)
    # print(elements)
    tot_deltas = 0
    tot_tradrng = 0
    for i in range(-1, -elements, -1):
        actual = data_set['Adj Close'].iloc[i - 1]
        scalefac = data_set['Adj Close'].iloc[i - 1] / y_test[i - 1]
        # print("Scaling Factor", scalefac)
        predval = y_p_adj[i] * scalefac
        predval = y_pred[i - 1] if np.isinf(predval) else predval
        pred_delta = abs(predval - actual)
        tot_deltas += pred_delta
        trd_rng = abs(data_set['High'].iloc[i] - data_set['Low'].iloc[i])
        tot_tradrng += trd_rng
        print("Close", actual.round(2), "Predicted: $", predval.round(2), "  Actual: $", actual.round(2), "  Delta:$", pred_delta.round(6), "  Trade Rng: $", trd_rng.round(2))

    return pred_delta, tot_deltas, tot_tradrng

pred_delta, tot_deltas, tot_tradrng = analyze_pred_adj(y_test, y_pred, y_p_adj)

print("Mean Trading Range: $", round(tot_tradrng / elements, 2))
print("Mean Delta (Actual vs Prediction): $", round((tot_deltas[0] / elements), 2))

Close 34675.46 Predicted: $ [34216.92]   Actual: $ 34675.46   Delta:$ [458.539773]   Trade Rng: $ 2834.47
Close 31458.42 Predicted: $ [28196.75]   Actual: $ 31458.42   Delta:$ [3261.672204]   Trade Rng: $ 2834.47
Close 35909.7 Predicted: $ [40444.23]   Actual: $ 35909.7   Delta:$ [4534.533982]   Trade Rng: $ 4145.06
Close 38126.33 Predicted: $ [38166.69]   Actual: $ 38126.33   Delta:$ [40.365109]   Trade Rng: $ 1591.37
Close 39101.82 Predicted: $ [38589.96]   Actual: $ 39101.82   Delta:$ [511.856373]   Trade Rng: $ 1043.68
Close 38525.95 Predicted: $ [37734.47]   Actual: $ 38525.95   Delta:$ [791.475768]   Trade Rng: $ 1233.99
Close 38468.63 Predicted: $ [38427.72]   Actual: $ 38468.63   Delta:$ [40.909565]   Trade Rng: $ 454.46
Close 37667.41 Predicted: $ [37369.79]   Actual: $ 37667.41   Delta:$ [297.622011]   Trade Rng: $ 709.69
Close 37869.51 Predicted: $ [38626.53]   Actual: $ 37869.51   Delta:$ [757.021521]   Trade Rng: $ 494.77
Close 39154.85 Predicted: $ [39473.69]   Actual: $ 

### - Deltas between predicted price and actual price

In [18]:
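# Compare the spread of the prediction errors before and after the adjustment
# Note: pred_delta is the single value left over from the last loop iteration of
#       analyze_pred_adj, so its standard deviation is always 0.0
print("Std Dev -     Pred: ", pred_delta.std())
y_pa_delta = abs(y_test - y_p_adj)
print("Std Dev - Pred Adj: ", y_pa_delta.std())
# y_p_delta is not defined in this cell; it is assumed to come from an earlier cell.
# The value printed is the ratio of the two standard deviations.
print("Better by", (y_p_delta.std() / y_pa_delta.std()), "Standard Deviations")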

Std Dev -     Pred:  0.0
Std Dev - Pred Adj:  0.020635273407375147
Better by 6.361402744896186 Standard Deviations
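
The cell above compares how widely the errors are spread for the raw and the adjusted predictions. A minimal sketch of that comparison, building both delta arrays explicitly so that neither depends on a variable left over from another cell. `abs_deltas` and `error_spread_ratio` are hypothetical names and the inputs are made-up scaled values. `pstdev` is used because it matches NumPy's default (population) standard deviation.

```python
from statistics import pstdev

def abs_deltas(actual, predicted):
    """Element-wise absolute difference between target and prediction."""
    return [abs(a - p) for a, p in zip(actual, predicted)]

def error_spread_ratio(actual, raw_pred, adj_pred):
    """Return (raw std dev, adjusted std dev, raw / adjusted ratio)."""
    raw_sd = pstdev(abs_deltas(actual, raw_pred))
    adj_sd = pstdev(abs_deltas(actual, adj_pred))
    ratio = raw_sd / adj_sd if adj_sd else float("inf")
    return raw_sd, adj_sd, ratio

# Made-up illustrative values in the scaled (0..1) space.
actual   = [0.50, 0.52, 0.55, 0.53]
raw_pred = [0.40, 0.48, 0.45, 0.50]
adj_pred = [0.49, 0.54, 0.54, 0.55]

raw_sd, adj_sd, ratio = error_spread_ratio(actual, raw_pred, adj_pred)
print("Std Dev -     Pred:", raw_sd)
print("Std Dev - Pred Adj:", adj_sd)
print("Spread ratio (raw / adjusted):", ratio)
```

The third number is a ratio of two standard deviations, so it reads as "the raw errors are spread N times as widely", not as a count of standard deviations.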


## - Trend Prediction Stats using Unscaled (v < 1) Adjusted Prediction
** These stats are not entirely accurate, as the adjusted prediction will change when the new close actually occurs,
   because the adjusting process includes the new close in the calculation.
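
One way to keep the new close out of its own adjustment, sketched under assumptions: shift each prediction by the mean error of the previous `window` bars only. `adjust_without_lookahead` and `window` are hypothetical and this is not the notebook's adjustment step, just an illustration of a variant that avoids the look-ahead described in the note above.

```python
def adjust_without_lookahead(actual, predicted, window=5):
    """Shift each prediction by the mean error of the previous `window` bars.

    adjusted[i] only uses actual[j] and predicted[j] for j < i, so the close
    being predicted never enters its own adjustment.
    """
    adjusted = []
    for i, pred in enumerate(predicted):
        start = max(0, i - window)
        past_errors = [a - p for a, p in zip(actual[start:i], predicted[start:i])]
        # With no history yet (i == 0) the prediction is left unchanged.
        offset = sum(past_errors) / len(past_errors) if past_errors else 0.0
        adjusted.append(pred + offset)
    return adjusted

# Made-up values: the raw prediction is consistently 10 too low.
actual    = [100.0, 102.0, 101.0, 103.0]
predicted = [ 90.0,  92.0,  91.0,  93.0]
print(adjust_without_lookahead(actual, predicted, window=2))
# → [90.0, 102.0, 101.0, 103.0]
```

Because only past bars are used, `predicted` may be one element longer than `actual`, which covers the case of a prediction for a bar that has not closed yet.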


In [20]:
# How often is the prediction within pct (here 5%) of the actual price?
pct = 0.05
pct_delta = abs(y_test * pct)
delta = abs(y_test - y_p_adj)
within_range = delta[delta < pct_delta]
# Get the daily price range (High - Low) for the instrument
avg_price_rng = data_set['High'] - data_set['Low']
# Get pct of the mean daily price range
pcnt_avg_price_rng = avg_price_rng.mean() * pct

print("Within",pct * 100,"% of Actual Adj Close","[ Trading Range:$", round(tot_tradrng / elements, 2), 
      " Within Avg Price:", avg_price_rng.mean().round(2), " Average Difference to Actual: $", pcnt_avg_price_rng.round(2),"]", 
      " Pct within Range:", round((len(within_range) / len(y_test)),2) * 100,"%")

# How often is the predicted trend the same as the actual trend?
test_trend = y_test[1:] - y_test[:-1]
pred_trend = y_p_adj[1:] - y_p_adj[:-1]
# 1 when both series moved in the same direction, otherwise 0 (this reuses the name delta from above)
delta = [1 if (t < 0) and (p < 0) or (t > 0) and (p > 0) else 0 for t, p in zip(test_trend, pred_trend)]
# Print the number of elements in delta.
# Note: the Wrong / Right / %Right figures below are computed from within_range (the pct band above),
#       not from the trend flags in delta.
print("Same Trend:", len(delta), "Wrong:", len(delta) - len(within_range)+1, "Right", np.count_nonzero(within_range+1), "%Right", round((len(within_range)-1) / len(delta) * 100,2))


Within 5.0 % of Actual Adj Close [ Trading Range:$ 364.47  Within Avg Price: 270.32  Average Difference to Actual: $ 13.52 ]  Pct within Range: 96.0 %
Same Trend: 466 Wrong: 19 Right 448 %Right 95.92
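
The trend check in the cell above builds a list of same-direction flags, but the printed counts are taken from `within_range`. A minimal sketch of counting directional agreement directly from the flags. `trend_agreement` is a hypothetical helper and the inputs are made-up values.

```python
def trend_agreement(actual, predicted):
    """Count bars where actual and predicted moved in the same direction.

    A flat move (zero change) in either series counts as a miss, matching
    the strict < 0 / > 0 test used in the cell above.
    """
    actual_moves = [b - a for a, b in zip(actual, actual[1:])]
    pred_moves = [b - a for a, b in zip(predicted, predicted[1:])]
    flags = [1 if (am < 0 and pm < 0) or (am > 0 and pm > 0) else 0
             for am, pm in zip(actual_moves, pred_moves)]
    right = sum(flags)
    total = len(flags)
    pct_right = round(right / total * 100, 2) if total else 0.0
    return right, total - right, pct_right

# Made-up illustrative values.
actual    = [1.0, 2.0, 1.5, 1.8, 1.7]
predicted = [1.1, 1.9, 1.6, 1.5, 1.4]
right, wrong, pct_right = trend_agreement(actual, predicted)
print("Right:", right, "Wrong:", wrong, "%Right:", pct_right)
# → Right: 3 Wrong: 1 %Right: 75.0
```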


# Model Details...

In [21]:
# Review feature importance
# importance = model.coef_

import shap
import pydot

print(X_train.shape)

# Note: model.summary() prints the table itself and returns None, so print() adds a stray "None"
print(model.summary())

from keras.utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
from IPython.display import Image

pass

(1865, 15, 17)


None


AttributeError: 'Functional' object has no attribute '_is_graph_network'

## - Model Plot
<img src="model_plot.png" alt="Drawing" style="width: 200px;"/>