### Predicting future Bitcoin prices
----------------------------------
Bitcoin is one of the best-known cryptocurrencies. It reached the mainstream media when its market value skyrocketed, almost reaching \$20,000. Its value has dropped since early 2018 due to a range of circumstances, and as of 5/3/2018 its market value is \$9,232.19 according to https://finance.yahoo.com/quote/BTC-USD . Because its price fluctuates so much, we will try to see whether a recurrent neural network can predict future prices. But first, let's load the required libraries. As you progress through the demo, you will see how each library is used.

In [None]:
# importing relevant modules
import matplotlib.pyplot   as plt

from pandas                import read_csv, to_datetime
from numpy                 import reshape, array
from datetime              import datetime
from sklearn.preprocessing import MinMaxScaler
from keras.layers          import Dense, Dropout, LSTM, Input, TimeDistributed
from keras.models          import Model
from keras_tqdm            import TQDMNotebookCallback
from IPython.display       import SVG, display
from keras.utils.vis_utils import model_to_dot
from keras.callbacks       import EarlyStopping

%matplotlib inline

The sample data used for this demo contains Bitcoin prices from 12/4/2016 to 3/27/2018. The input attributes, in order, are: Timestamp (in Unix time), starting price (open), highest price (high), lowest price (low), closing price (close), volume of bitcoin exchanged (volume_BTC), volume of currency exchanged for the bitcoins (volume_currency), and weighted price. All prices are in US dollars. A sample of the dataset is shown below.
All datasets were taken from https://www.kaggle.com/mczielinski/bitcoin-historical-data

In [None]:
# For demonstration purposes only.
rawData = read_csv('sample_data.csv')
rawData[0:5]  #first five records in the dataset

Now, let's define a function that preprocesses the sample dataset. In the first line we use the **read_csv()** function imported from **pandas**. It returns a **DataFrame**, which allows us to manipulate the sample dataset. If you have experience with databases, a **DataFrame** is similar to a table. Once we have read the file, we use another function imported from **pandas**, **to_datetime()**, which converts the Unix timestamp to a human-readable date. The result is appended to the **DataFrame** as a new column called (wait for it....) *date*.
The *date* column is used to group the values in the **DataFrame** and calculate the mean price of Bitcoin per day.
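
To make the grouping step concrete, here is a minimal plain-Python sketch of the same idea: convert each Unix timestamp to a calendar date, then average the prices that fall on the same date. The rows and the `daily_mean` helper are made up for illustration; the notebook itself does this with **to_datetime()** and **groupby()**.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical (timestamp, weighted price) rows; real rows come from sample_data.csv.
rows = [
    (1480809600, 760.0),  # 2016-12-04 00:00:00 UTC
    (1480809660, 770.0),  # 2016-12-04 00:01:00 UTC
    (1480896000, 750.0),  # 2016-12-05 00:00:00 UTC
]

def daily_mean(rows):
    """Group prices by UTC calendar date and return {date: mean price}."""
    buckets = defaultdict(list)
    for timestamp, price in rows:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        buckets[day].append(price)
    return {day: sum(prices) / len(prices) for day, prices in sorted(buckets.items())}

daily = daily_mean(rows)
for day, mean_price in daily.items():
    print(day, mean_price)
```

The two rows from 2016-12-04 collapse into a single daily value, which is what shrinks minute-level data down to one point per day.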

In [None]:
def Read_File(file: int = 0, split: float = 0.05, reduce: int = 0):          #File size (in MB)
  
    # Read input sample data into a data frame.
    df = read_csv('sample_data.csv')
    
    # Append 'date' column to the sample dataframe.
    # Uses the Unix timestamp to create equivalent date
    # This attribute will be used to calculate the daily weighted average
    df['date'] = to_datetime(df['Timestamp'],unit='s').dt.date
    
    # Store the average weighted price per day (in USD).
    daily_average = df.groupby('date')['Weighted_Price'].mean().values
    
    # Keep only the last reduce + 1 daily values, if desired.
    if reduce and reduce < len(daily_average):
        daily_average = daily_average[len(daily_average) - 1 - reduce:]
        
    # Guarantees at least one value in the testing set.
    partition     = min(-int(split*len(daily_average)), -1)
    
    training_set, testing_set = daily_average[:partition], daily_average[partition:]
    
    # Create instance of min-max scaler.
    sc                        = MinMaxScaler(feature_range = (0, 1))
    
    # Reshape training and testing sets and then perform min-max scaling.
    training_set, testing_set = sc.fit_transform(reshape(training_set, (len(training_set), 1))), \
                                sc.transform(reshape(testing_set, (len(testing_set), 1)))
    
    
    return training_set[:-1], training_set[1:], testing_set[:-1], testing_set[1:], sc
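
The split, scale, and shift steps at the end of `Read_File` can be sketched on a toy series. This is a plain-Python illustration, not the notebook's code. The `split_scale_shift` name and the prices are hypothetical, and the hand-written formula `(x - min) / (max - min)` stands in for `MinMaxScaler` with its `(0, 1)` feature range.

```python
def split_scale_shift(series, split=0.25):
    # Negative index so the test set is the tail of the series;
    # min(..., -1) keeps at least one value in the test set.
    partition = min(-int(split * len(series)), -1)
    train, test = series[:partition], series[partition:]

    # The scaling parameters come from the training values only.
    lo, hi = min(train), max(train)

    def scale(values):
        return [(v - lo) / (hi - lo) for v in values]

    train, test = scale(train), scale(test)

    # Inputs are all values but the last; targets are the same values shifted by one day.
    return train[:-1], train[1:], test[:-1], test[1:]

# Hypothetical daily prices that keep rising after the training window.
prices = [100.0, 120.0, 140.0, 160.0, 180.0, 200.0, 220.0, 240.0]
x_train, y_train, x_test, y_test = split_scale_shift(prices)

print(x_train, y_train)
print(x_test, y_test)
```

Because the scaler only sees the training values, the later test prices scale to values above 1. That is expected, and it matters when choosing the activation of the output layer.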

#### The function below sets the color and thickness of the lines used to plot the graphs later.

In [None]:
#Plot results.
def Plot(data1, data2, title: str = '', label1: str = '', label2: str = '', ylabel: str = '', scatter: bool = True):
    plt.figure(figsize   = (25,15),
               dpi       = 80,
               facecolor = 'w',
               edgecolor = 'k')
    
    ax = plt.gca()

    x  = list(range(len(data1)))
    
    if scatter:
        plt.plot(data1, 
                 color = '#AA00FF',
                 ls    = 'dashed')
        plt.plot(data2,
                 color = '#000000',
                 ls    = 'dashed')
        plt.scatter(x,
                   data1,
                   label = label1,
                   color = '#AA00FF',
                   s     = 50)
        plt.scatter(x,
                   data2,
                   label = label2,
                   color = '#000000',
                   s     = 50)
    else:
        plt.bar(x,
                reshape(data1, (len(data1))),
                label = label1,
                color = '#AA00FF')
        plt.bar(x,
                reshape(data2, (len(data2))),
                label = label2,
                color = '#000000',
                alpha = 0.5)

    plt.title(title,
              fontsize = 40)

    for tick in ax.xaxis.get_major_ticks():
        tick.label1.set_fontsize(18)
    for tick in ax.yaxis.get_major_ticks():
        tick.label1.set_fontsize(18)

    plt.xlabel('Time (days)',
               fontsize = 40)
    plt.ylabel(ylabel,
               fontsize = 40)
    plt.legend(loc  = 'best',
               prop = {'size': 25})

    plt.show()

In [None]:
# Get percent change results for plotting.
Get_Percent_Change = lambda p: [100*(e2 - e1)/e1 for e1, e2 in zip(reshape(p[:-1], p[:-1].shape), reshape(p[1:], p[1:].shape))]
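
As a quick sanity check of what the percent-change helper computes, here is an equivalent sketch on a plain list. The `percent_change` name and the prices are hypothetical.

```python
def percent_change(prices):
    """Return the percent change between each pair of consecutive prices."""
    return [100 * (nxt - prev) / prev for prev, nxt in zip(prices[:-1], prices[1:])]

changes = percent_change([100.0, 110.0, 99.0])
print(changes)  # [10.0, -10.0]
```

A series of n prices yields n - 1 changes, since each change needs a previous value to compare against.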

In [None]:
# Create training/testing arrays and store min-max scaler so we can perform inverse transform later.
x_train, y_train, x_test, y_test, sc = Read_File()

#Reshape the training and testing input so that it meets the specifications of the net input.
x_train                              = reshape(x_train, (len(x_train), 1, 1))
x_test                               = reshape(x_test, (len(x_test), 1, 1))

#### Let's Build the Network
The model uses a Long Short-Term Memory (LSTM) layer as its first hidden layer, followed by dropout and finally a dense layer. It is crucial for the dense layer to have a linear activation so that the output is not squashed into a limited range.

Try changing the activation for the LSTM layers to **softmax** or **sigmoid** and see the difference in the behavior. Another thing to look at is how the network behaves when the dropout percentage is changed.
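
To see why a linear output matters, recall that the scaler is fitted on the training prices only, so a later price above the training maximum scales to a value greater than 1. The sketch below uses hypothetical pre-activation values to compare a linear output with a sigmoid one. It illustrates the two functions only, not the trained network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear(z):
    return z

# Hypothetical pre-activation values of the output unit.
for z in (-2.0, 0.5, 1.2, 5.0):
    print(f"z={z:5.1f}  linear={linear(z):5.2f}  sigmoid={sigmoid(z):.3f}")

# A scaled price of 1.2 (20% above the training range) can only be produced
# by the linear output; the sigmoid never exceeds 1.
target = 1.2
reachable_by_sigmoid = any(sigmoid(z) >= target for z in range(-50, 51))
print(reachable_by_sigmoid)
```

With a bounded activation on the output, the network could never predict a new all-time high, however well it was trained.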

In [None]:
# Building the model
lstm_units  = 100
inputs      = Input(shape = x_train[0].shape)

layer       = LSTM(units            = lstm_units,
                   return_sequences = False,
                   activation       = 'selu')(inputs)

layer       = Dropout(0.2)(layer)

predictions = Dense(units      = 1,
                    activation = 'linear')(layer)

model       = Model(inputs, predictions)

model.compile(loss      = 'mse',
              optimizer = 'adadelta',
              metrics   = ['accuracy'])

# Visual representation of net.
SVG(model_to_dot(model).create(prog='dot', format='svg'))

#### Training time
The number of epochs should be somewhere between 500 and 1500 (depending on which activation you choose). The learning curve plateaus at around 1500 epochs, so there is no need to train beyond that point. Also, you don't want to shuffle while training, because order matters in a time series.
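
An alternative to picking the epoch count by hand is to stop once the validation loss stops improving. This is the idea behind the **EarlyStopping** callback imported at the top. The sketch below shows only the patience logic in plain Python. The `stopping_epoch` helper and the loss values are made up, and the real callback has more options than this.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best loss: remember it and reset the patience counter.
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None

# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.50, 0.30, 0.25, 0.26, 0.25, 0.27, 0.24]
print(stopping_epoch(losses, patience=3))
print(stopping_epoch(losses, patience=5))
```

A small patience can stop too early on a noisy curve. In this toy series the loss improves again at the last epoch, which only the larger patience value gets to see.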

In [None]:
# Fitting the RNN to the Training set

epoch_size= 500
history = model.fit(x                = x_train,
                    y                = y_train,
                    batch_size       = len(x_train),
                    epochs           = epoch_size,
                    validation_split = 0.2,
                    verbose          = 0,
                    shuffle          = False,
                    callbacks        = [TQDMNotebookCallback() ])#EarlyStopping(patience = 3)])

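#### Plotting the loss and accuracy
Because we used a validation split of 0.2 (meaning the network trains on the first 80% of the data we provide and is validated on the last 20%), we can plot how it performed on both parts. Keep in mind that accuracy is a classification metric, so for a continuous target such as price the loss curves are the more informative of the two plots. We draw the plots using the **matplotlib** package.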
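
How the validation split carves up an ordered dataset can be sketched in plain Python. This is an illustration under the assumption that the held-out part is the trailing fraction of the samples, taken before any shuffling. The `holdout_split` helper is hypothetical and not part of Keras.

```python
def holdout_split(samples, validation_split=0.2):
    """Split off the trailing fraction of samples for validation, keeping order."""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]

days = list(range(10))  # stand-in for ten consecutive training samples
train_part, val_part = holdout_split(days)

print(train_part)
print(val_part)
```

For a time series this means the validation curve measures performance on the most recent stretch of the training data.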

In [None]:
plt.figure()
# summarize history for accuracy
plt.subplot(211)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
# summarize history for loss
plt.subplot(212)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.tight_layout()
plt.show()

#### Checking the actual values
It looks like the network didn't perform well, but let's investigate further. Let's apply **inverse_transform** to the actual prices and to the prices predicted by our model for the training dataset. Let's plot that graph and see if it tells us anything.

In [None]:
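# Comparing our model's output, given the training data, to the expected output.
# Note: the [:-1] and [1:] slices offset the two series by one step, so prediction i + 1 is paired with actual value i.
actual_prices_train       = sc.inverse_transform(y_train)[:-1]
predicted_BTC_price_train = sc.inverse_transform(model.predict(x_train))[1:]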

In [None]:
Plot(actual_prices_train, predicted_BTC_price_train, \
     "BTC Price Prediction for Training Data", "Real BTC Price", "Predicted BTC Price", "BTC Price (USD)")

#### The Test Dataset
It looks like the network did a decent job with the training dataset. Now let's do the same thing with data the network has never seen before. The actual values are retrieved from the preprocessed dataset, and the predicted values are computed using the model. We then plot both to compare them.

In [None]:
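# Predicting the future.
# Note: the [:-1] and [1:] slices offset the two series by one step, so prediction i + 1 is paired with actual value i.
actual_prices_test       = sc.inverse_transform(y_test)[:-1]
predicted_BTC_price_test = sc.inverse_transform(model.predict(x_test))[1:]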

In [None]:
Plot(actual_prices_test, predicted_BTC_price_test, "BTC Price Prediction for Testing Data", "Real BTC Price", "Predicted BTC Price", "BTC Price (USD)")

### Conclusion
Although the accuracy graph didn't show an outstanding result, checking the actual points gave us insight into why it wasn't accurate. It looks like our predictions were overshooting or undershooting the actual price, but not by much. When we look at the bigger picture, the network performed surprisingly well. In fact, one might say too well. This raises the question of whether there was some unknown bias or whether the network memorized the dataset; that requires more investigation. But for now, the network did predict the future price of Bitcoin to reasonable accuracy.
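
One way to start that investigation is to compare the model against a naive baseline that predicts tomorrow's price as today's price. If the model's error is not clearly lower than the baseline's, the network has mostly learned to echo its input. The sketch below uses made-up prices and made-up model outputs purely to show the comparison. None of the numbers come from the trained network.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical prices; inputs[i] is the price on day i, targets[i] the price on day i + 1.
prices = [9000.0, 9100.0, 9050.0, 9200.0, 9150.0]
inputs, targets = prices[:-1], prices[1:]

# Hypothetical model outputs, one per input, aligned with targets (no extra shift).
model_predictions = [9020.0, 9080.0, 9070.0, 9180.0]

# Persistence baseline: predict that tomorrow's price equals today's.
baseline_predictions = inputs

model_mae = mean_absolute_error(targets, model_predictions)
baseline_mae = mean_absolute_error(targets, baseline_predictions)

print(f"model MAE:    {model_mae:.1f}")
print(f"baseline MAE: {baseline_mae:.1f}")
```

To run the same check on the real data, `model_predictions` would be the inverse-transformed output of `model.predict(x_test)` and `targets` the inverse-transformed `y_test`, both used without the extra one-step slices.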