## Time Series Forecasting Using an LSTM Recurrent Neural Network
#### In contrast to a feedforward network, a recurrent neural network (RNN) feeds its state from the previous time step back in alongside the current input. This memory lets the network draw on earlier observations when evaluating the present one, so it can exploit information carried by the order of the sequence itself. For this reason, RNNs are commonly used to analyze sequential data such as time series.
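
To make the feedback idea concrete, here is a minimal sketch of a plain (non-LSTM) recurrent step in NumPy. The names `rnn_forward`, `W_x`, `W_h`, `b` and all sizes are illustrative assumptions and not part of this notebook's model; an LSTM adds gating on top of the same principle.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a simple tanh RNN over a sequence and return every hidden state."""
    hidden = np.zeros(W_h.shape[0])   # initial memory: all zeros
    states = []
    for x_t in inputs:
        # the new state depends on the current input AND the previous state
        hidden = np.tanh(W_x @ x_t + W_h @ hidden + b)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 3))   # input (3 features) -> hidden (4 units)
W_h = rng.normal(size=(4, 4))   # hidden -> hidden (the feedback path)
b = np.zeros(4)

sequence = rng.normal(size=(5, 3))   # 5 time steps, 3 features each
states = rnn_forward(sequence, W_x, W_h, b)
print(states.shape)  # (5, 4)
```

Because `hidden` is reused on every iteration, the state at step 5 carries information from all earlier steps.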

##### Imports

In [1]:
import pandas as pd
import os
import plotly.graph_objects as go
import numpy as np
from sklearn import preprocessing


### Load and plot the Time Series:

In [2]:
# get path of datasets
cur_path = os.getcwd() 
base_data_path = os.path.abspath(os.path.join(cur_path, os.pardir, 'service/data'))
file_path = os.path.join(base_data_path, 'BTCUSDT.csv')

# load data set
df = pd.read_csv(file_path, sep=',', header=None)
df.columns = ['symbol', 'timestamp', 'open', 'high', 'low', 'close', 'volume']
# add the percentage price change of each candle (open to close)
df['%_change'] = ((df.close / df.open) - 1) * 100
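
As a quick sanity check of the percentage-change formula, the sketch below applies it to two made-up candles (the values are chosen for easy arithmetic and are not real market data).

```python
import pandas as pd

# toy candles: one rises from 100 to 110, one falls from 200 to 190
candles = pd.DataFrame({"open": [100.0, 200.0], "close": [110.0, 190.0]})
candles["%_change"] = ((candles["close"] / candles["open"]) - 1) * 100

print(candles["%_change"].round(2).tolist())  # [10.0, -5.0]
```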

In [66]:
# function to plot the time series using plotly
def plot_series(dataframe):
    fig = go.Figure()
    fig.add_trace(go.Scatter(x = pd.to_datetime(dataframe['timestamp'], unit='s'), y = dataframe['close']))
    fig.update_layout(
        title=str(dataframe['symbol'].iloc[0]),
        xaxis_rangeslider_visible=True,
        xaxis_title="Date",
        yaxis_title="Closing Price",
        font=dict(
            family="Courier New, monospace",
            size=18,
            color="#7f7f7f"
        )
    )
    fig.show()

In [67]:
plot_series(df)

### Preparing the dataset for our model step by step:
##### 1. Define a sequence length and drop unimportant features

In [105]:
# define the number of data points the model bases each prediction on (e.g. the last 50 days)
# in this case the model needs the data of the last 50 days to make a prediction for the next day
SEQ_LENGTH = 50

# since our dataset is in sequential order, we can drop the timestamp column
# we also don't need the symbol name
model_data = df.drop(['symbol', 'timestamp'], axis=1)

##### 2. Normalize the data with MinMax-Scaling

In [106]:
# normalize with min-max scaling to bring all features onto a common scale,
# which can help the model converge faster
min_max_scaler = preprocessing.MinMaxScaler()
normalized_model_data = min_max_scaler.fit_transform(model_data)
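
Min-max scaling maps each column to the range [0, 1] via `(x - min) / (max - min)`, which is what `MinMaxScaler` computes with its default `feature_range`. The cell above fits the scaler on the full dataset, so statistics from the later test period also shape the training inputs. One possible alternative, sketched here under assumptions, is to take the minimum and maximum from the training rows only. The toy array and the three-row training period are made up for illustration.

```python
import numpy as np

# toy feature matrix: 4 time steps x 2 features, made-up numbers
data = np.array([[10.0, 1.0],
                 [20.0, 3.0],
                 [30.0, 2.0],
                 [50.0, 5.0]])

n_train = 3                            # first 3 rows act as the training period
col_min = data[:n_train].min(axis=0)   # per-column minimum of the training rows
col_max = data[:n_train].max(axis=0)   # per-column maximum of the training rows

# apply the training statistics to every row, including the held-out one
scaled = (data - col_min) / (col_max - col_min)

print(scaled[:n_train].tolist())  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
print(scaled[n_train].tolist())   # [2.0, 2.0]
```

With this approach, held-out values can fall outside [0, 1], which simply reflects that the model never saw that range during training.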

##### 3. Construct the sequences and labels

In [109]:
# now we create the sequences of 50 days, each with a corresponding label
# the label/target is the closing price of the day that follows each 50-day sequence

# build an array in which each item is a sequence of SEQ_LENGTH rows (e.g. the stock data of 50 consecutive days)
data_sequencies = np.array([normalized_model_data[i : i + SEQ_LENGTH].copy() for i in range(len(normalized_model_data) - SEQ_LENGTH)])

# for each sequence we want to predict the closing price of the following day (column 3 is 'close')
next_day_close = np.array([normalized_model_data[:, 3][i + SEQ_LENGTH].copy() for i in range(len(normalized_model_data) - SEQ_LENGTH)])
# add another dimension, since next_day_close has to be 2-dimensional
next_day_close = np.expand_dims(next_day_close, -1)

# --> the first item in next_day_close is the price we want to predict from the data of the first 50 days (= data_sequencies[0])


# keep the unnormalized labels for plotting, and fit a separate scaler on them to revert predicted labels to prices
next_day_close_unnormalized = model_data['close'].to_numpy()[SEQ_LENGTH:].copy()
next_day_close_unnormalized = np.expand_dims(next_day_close_unnormalized, -1)

label_min_max_scaler = preprocessing.MinMaxScaler()
label_min_max_scaler.fit(next_day_close_unnormalized)

MinMaxScaler(copy=True, feature_range=(0, 1))
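
The windowing step is easier to follow on a tiny array. The sketch below uses a hypothetical window of 3 (instead of 50) and a made-up 6 x 2 matrix; `WINDOW`, `TARGET_COL` and `toy` are illustrative names, not part of the notebook.

```python
import numpy as np

WINDOW = 3                                       # short window for readability
toy = np.arange(12, dtype=float).reshape(6, 2)   # 6 time steps, 2 features
TARGET_COL = 1                                   # stands in for the 'close' column

n_samples = len(toy) - WINDOW
# each sample is a block of WINDOW consecutive rows
windows = np.array([toy[i:i + WINDOW] for i in range(n_samples)])
# each label is the target column of the row right after its window
labels = toy[WINDOW:, TARGET_COL].reshape(-1, 1)

print(windows.shape)            # (3, 3, 2)
print(labels.ravel().tolist())  # [7.0, 9.0, 11.0]
```

The first window covers rows 0 to 2 and its label comes from row 3, so no window ever contains its own target.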

##### 4. Split the dataset

In [114]:
# define the ratio for splitting (e.g. 90% train, 10% test)
train_test_ratio = 0.9
split_idx = round(data_sequencies.shape[0] * train_test_ratio)

# split data into train and test data
train_data = data_sequencies[:split_idx]
train_labels = next_day_close[:split_idx]

test_data = data_sequencies[split_idx:]
test_labels = next_day_close[split_idx:]

# also keep the unnormalized labels of our test set
test_labels_unnormalized = next_day_close_unnormalized[split_idx:]
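
The unnormalized test labels become useful once predictions exist: model outputs live in the scaled [0, 1] range and have to be mapped back to prices before they can be compared. Below is a minimal sketch of that inversion with made-up prices. `predicted_scaled` is a hypothetical model output, since no model has been trained at this point. The sketch inverts with the same minimum and maximum that were used for scaling. In the notebook, `label_min_max_scaler` is fitted on labels starting at index `SEQ_LENGTH`, so it matches the feature scaler only if the extreme closing prices do not occur within the first `SEQ_LENGTH` rows.

```python
import numpy as np

close = np.array([100.0, 120.0, 110.0, 150.0, 140.0])   # made-up prices
close_min, close_max = close.min(), close.max()

# forward min-max scaling to [0, 1]
scaled = (close - close_min) / (close_max - close_min)

# inverse: x = x_scaled * (max - min) + min
restored = scaled * (close_max - close_min) + close_min

# hypothetical model outputs in scaled space
predicted_scaled = np.array([0.5, 0.25])
predicted_price = predicted_scaled * (close_max - close_min) + close_min

print(predicted_price.tolist())  # [125.0, 112.5]
```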

### Constructing the Model: