# Cryptocurrency prediction deep learning LSTM

### Cryptocurrency price prediction with deep learning

"Project" : "Análisis_criptomonedas"  
"Title" : "Cryptocurrency price prediction with deep learning"  
"Author" : "Cristian García Díaz"  
"Created" : "20180821"  
"Last modified" : "20180826"  
"Sources":  
>https://medium.com/activewizards-machine-learning-company/bitcoin-price-forecasting-with-deep-learning-algorithms-eb578a2387a3

## Index
[1. Environment setup and data retrieval](#1)  
[2. Bitcoin market indicators](#2)  
[3. Data transformation](#3)  
[4. LSTM neural network](#4) 

## <a name="1"></a> 1. Environment setup and data retrieval

   - Install Anaconda.  
   - Install the required libraries, dependencies, and packages.  
   - Create a working environment.
   - Set up the folder structure for storing the data.  
   - Configure the API key.  
   - Define a function to fetch data from the APIs. 

In [176]:
# Import the required libraries, dependencies, and packages
import numpy as np
import pandas as pd
import pickle
import quandl
from datetime import datetime
import os
from time import time
from math import sqrt

# Import Plotly (the offline module is the one bound to the name py)
import plotly.offline as py
import plotly.graph_objs as go
import plotly.figure_factory as ff
# Enable offline mode inside the notebook
py.init_notebook_mode(connected=True)

from sklearn.preprocessing import MinMaxScaler

In [177]:
# Quandl API key
quandl.ApiConfig.api_key = "gjodR_eNGkrTQq24cufg"
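
A key written directly in a notebook is visible to anyone who can read the file. One possible alternative is sketched below, assuming the key has been exported as an environment variable before Jupyter starts. The variable name `QUANDL_API_KEY` and the helper `load_api_key` are hypothetical; neither comes from the original notebook or from the quandl package.

```python
import os


def load_api_key(var_name="QUANDL_API_KEY"):
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(var_name)
    if not key:
        # Fail early with a clear message instead of sending an empty key
        raise RuntimeError("Environment variable {} is not set".format(var_name))
    return key

# Possible usage in the cell above (left commented out on purpose):
# quandl.ApiConfig.api_key = load_api_key()
```

Failing early on a missing variable keeps a misconfigured environment from surfacing later as a confusing download error.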

In [178]:
# Create the folder used to store the data files if it does not exist yet
if not os.path.exists("cryptocurrency_indicators_files"):
    os.mkdir("cryptocurrency_indicators_files")

In [179]:
# Define a helper that loads the data from Quandl.
# pickle is used as a local cache so the same data is not downloaded twice.
# The function returns a pandas DataFrame.

def get_quandl_data(quandl_id):
    """Load a Quandl dataset, caching it locally as a .pkl file."""
    # Build a portable cache path; '/' in the dataset id is replaced with '-'
    cache_path = os.path.join('cryptocurrency_indicators_files',
                              '{}.pkl'.format(quandl_id.replace('/', '-')))
    try:
        with open(cache_path, 'rb') as f:
            df = pickle.load(f)
        print('Dataset {} loaded from cache'.format(quandl_id))
    except (OSError, IOError):
        print('Downloading {} from Quandl'.format(quandl_id))
        df = quandl.get(quandl_id, returns="pandas")
        df.to_pickle(cache_path)
        print('Cached {} at {}'.format(quandl_id, cache_path))
    return df
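
The caching idea in the function above can be tried without network access. The sketch below is only an illustration under stated assumptions: `load_or_fetch` and `fake_download` are hypothetical helpers, `BCHAIN-EXAMPLE` is a made-up file name, and the values are invented. `fake_download` stands in for the real Quandl call.

```python
import os
import tempfile

import pandas as pd


def load_or_fetch(cache_path, fetch):
    """Return (DataFrame, from_cache); fetch() is only called on a cache miss."""
    if os.path.exists(cache_path):
        return pd.read_pickle(cache_path), True
    df = fetch()
    df.to_pickle(cache_path)
    return df, False


def fake_download():
    # Stand-in for the remote call: two days of made-up values
    index = pd.to_datetime(["2018-08-24", "2018-08-25"])
    return pd.DataFrame({"Value": [1.0, 2.0]}, index=index)


cache_dir = tempfile.mkdtemp()
cache_file = os.path.join(cache_dir, "BCHAIN-EXAMPLE.pkl")

first, first_cached = load_or_fetch(cache_file, fake_download)
second, second_cached = load_or_fetch(cache_file, fake_download)
print(first_cached, second_cached)
```

The first call writes the pickle and the second reads it back, which is the behaviour the recorded "loaded from cache" messages reflect.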

In [180]:
# Define the function used to plot the data
def df_scatter(df, title, seperate_y_axis=False, y_axis_label='', scale='linear', initial_hide=False):
    # Column names of the DataFrame, e.g. ['BITSTAMP', 'COINBASE', 'ITBIT', 'KRAKEN']
    label_arr = list(df)
    # One pandas Series per column, in the same order as label_arr
    series_arr = [df[col] for col in label_arr]

    # Layout of the figure
    layout = go.Layout(
        title = title,
        legend = dict(orientation='h'),
        xaxis = dict(type='date'),
        yaxis = dict(
            title = y_axis_label,
            showticklabels = not seperate_y_axis,
            type = scale
        )
    )

    # Configuration shared by the additional y axes
    y_axis_config = dict(
        overlaying = 'y',
        showticklabels = False,
        type = scale
    )

    # Initial visibility of the traces (True, False or 'legendonly')
    visibility = True
    if initial_hide:
        visibility = 'legendonly'

    # Build one trace per data series
    trace_arr = []
    for index, series in enumerate(series_arr):
        trace = go.Scatter(
            x = series.index,
            y = series,
            name = label_arr[index],
            visible = visibility
        )

        # Add a separate y axis for each series
        if seperate_y_axis:
            trace['yaxis'] = 'y{}'.format(index + 1)
            layout['yaxis{}'.format(index + 1)] = y_axis_config
        trace_arr.append(trace)

    fig = go.Figure(data = trace_arr, layout = layout)
    py.iplot(fig)

## <a name="2"></a> 2. Bitcoin market indicators
   - [2.1 Bitcoin price](#2.1)
   - [2.2 Total number of bitcoins](#2.2)
   - [2.3 Market capitalization](#2.3) 
   - [2.4 Bitcoin addresses](#2.4) 
   - [2.5 Bitcoin to US dollar exchange volume](#2.5) 
   - [2.6 Number of Bitcoin transactions](#2.6) 
   - [2.7 Cumulative number of Bitcoin transactions](#2.7) 
   - [2.8 Bitcoin hash rate](#2.8) 
   - [2.9 Bitcoin difficulty](#2.9) 
   - [2.10 Bitcoin miners' revenue](#2.10) 
   
   "Sources":  
>https://www.quandl.com/data/BCHAIN-Blockchain

### <a name="2.1"></a> 2.1 Bitcoin price

In [181]:
# Fetch the Bitcoin price data
price_btc = get_quandl_data("BCHAIN/MKPRU")

Dataset BCHAIN/MKPRU loaded from cache


### <a name="2.2"></a> 2.2 Total number of bitcoins

In [182]:
# Fetch the historical total number of bitcoins
total_number_btc = get_quandl_data("BCHAIN/TOTBC")

Dataset BCHAIN/TOTBC loaded from cache


### <a name="2.3"></a> 2.3 Market capitalization

In [183]:
# Fetch the historical Bitcoin market capitalization in USD, i.e. Bitcoin's market value
market_capitalization_btc = get_quandl_data("BCHAIN/MKTCP")

Dataset BCHAIN/MKTCP loaded from cache


### <a name="2.4"></a> 2.4 Number of Bitcoin addresses

In [184]:
# Fetch the historical number of Bitcoin addresses used per day
address_btc = get_quandl_data("BCHAIN/NADDU")

Dataset BCHAIN/NADDU loaded from cache


### <a name="2.5"></a> 2.5 USD/BTC exchange volume

In [185]:
# Fetch the historical USD/BTC exchange volume
exchange_trade_btc = get_quandl_data("BCHAIN/TRVOU")

Dataset BCHAIN/TRVOU loaded from cache


### <a name="2.6"></a> 2.6 Number of Bitcoin transactions

In [186]:
# Fetch the historical number of BTC transactions
transactions_btc = get_quandl_data("BCHAIN/NTRAN")

Dataset BCHAIN/NTRAN loaded from cache


### <a name="2.8"></a> 2.8 Bitcoin hash rate 

In [187]:
# Fetch the historical Bitcoin hash rate
# This is the estimated hash rate of the Bitcoin network, measured in terahashes per second (TH/s).
# 1 TH/s = 10^12 hashes/s = 1,000,000,000,000 hashes per second (one trillion).
hash_rate_btc = get_quandl_data("BCHAIN/HRATE")

Dataset BCHAIN/HRATE loaded from cache


### <a name="2.9"></a> 2.9 Bitcoin difficulty

In [188]:
# Fetch the Bitcoin difficulty, Bitcoin's own measure of how hard it is to mine a block.
# The difficulty is recalculated every 2,016 blocks so that blocks are added to the chain every 10 minutes on average.
difficulty_btc = get_quandl_data("BCHAIN/DIFF")

Dataset BCHAIN/DIFF loaded from cache
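
The retargeting rule behind this indicator can be written as a small calculation: every 2,016 blocks the difficulty is rescaled by the ratio between the expected two weeks and the time those blocks actually took, with the change limited to a factor of four in either direction. The sketch below is a simplified floating-point illustration; the function name `next_difficulty` is hypothetical, and the real client works on an integer target encoding rather than on the difficulty value directly.

```python
TARGET_BLOCK_TIME = 600      # seconds, i.e. 10 minutes per block
RETARGET_INTERVAL = 2016     # blocks between two difficulty adjustments
TARGET_TIMESPAN = TARGET_BLOCK_TIME * RETARGET_INTERVAL  # 1,209,600 s = 14 days


def next_difficulty(current_difficulty, actual_timespan):
    """Rescale the difficulty so the next interval takes about 14 days."""
    # The measured timespan is clamped to [target / 4, target * 4]
    clamped = min(max(actual_timespan, TARGET_TIMESPAN / 4), TARGET_TIMESPAN * 4)
    return current_difficulty * TARGET_TIMESPAN / clamped


# Blocks found twice as fast as intended: the difficulty doubles
print(next_difficulty(100.0, TARGET_TIMESPAN / 2))
```

This is why the difficulty series moves in steps and tends to follow the hash rate with a lag of up to two weeks.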


### <a name="2.10"></a> 2.10 Bitcoin miners' revenue 

In [189]:
# Fetch the historical miners' revenue
miners_revenue_btc = get_quandl_data("BCHAIN/MIREV")

Dataset BCHAIN/MIREV loaded from cache


## <a name="3"></a> 3. Data transformation

In [190]:
# Collect the indicators so they can be merged into a single DataFrame for the correlation analysis.
mesures_name= ["price_btc",
        "total_number_btc",
        "market_capitalization_btc",
        "address_btc",
        "exchange_trade_btc",
        "transactions_btc",
        "hash_rate_btc",
        "difficulty_btc",
        "miners_revenue_btc"]

mesures_data= [price_btc,
        total_number_btc,
        market_capitalization_btc,
        address_btc,
        exchange_trade_btc,
        transactions_btc,
        hash_rate_btc,
        difficulty_btc,
        miners_revenue_btc]

In [191]:
# Align all indicators on the same last day before building a single DataFrame:
# drop the most recent row (2018-08-26) from any indicator that already includes it
fecha = pd.Timestamp(2018, 8, 26)

for i in range(len(mesures_name)):
    if mesures_data[i].index.max() == fecha:
        mesures_data[i] = mesures_data[i].drop(mesures_data[i].index[-1])
    print(i, mesures_name[i], mesures_data[i].index.max() - mesures_data[i].index.min(), mesures_data[i].index.max())

0 price_btc 3521 days 00:00:00 2018-08-25 00:00:00
1 total_number_btc 3521 days 00:00:00 2018-08-25 00:00:00
2 market_capitalization_btc 3521 days 00:00:00 2018-08-25 00:00:00
3 address_btc 3521 days 00:00:00 2018-08-25 00:00:00
4 exchange_trade_btc 3521 days 00:00:00 2018-08-25 00:00:00
5 transactions_btc 3521 days 00:00:00 2018-08-25 00:00:00
6 hash_rate_btc 3520 days 00:00:00 2018-08-25 00:00:00
7 difficulty_btc 3521 days 00:00:00 2018-08-25 00:00:00
8 miners_revenue_btc 3521 days 00:00:00 2018-08-25 00:00:00


In [237]:
# Merge the individual DataFrames into one, aligned on the date index
mesures_bitcoin = pd.concat(mesures_data, axis=1)

In [238]:
# Rename the columns (assigning the whole list avoids mutating the index values in place)
mesures_bitcoin.columns = mesures_name
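
Because `pd.concat(..., axis=1)` aligns on the index, an indicator that starts one day later (as the hash rate does in the output above) produces a missing value rather than shifted rows. The sketch below shows this on two tiny made-up frames; the values and the column names `price` and `hash_rate` are illustrative assumptions only.

```python
import pandas as pd

idx_a = pd.date_range("2018-08-23", periods=3, freq="D")  # 23, 24 and 25 August
idx_b = pd.date_range("2018-08-24", periods=2, freq="D")  # 24 and 25 August

# Both frames use the same column name, as the downloaded indicators do
price = pd.DataFrame({"Value": [10.0, 11.0, 12.0]}, index=idx_a)
hash_rate = pd.DataFrame({"Value": [5.0, 6.0]}, index=idx_b)

# Column-wise concatenation aligns rows by date
merged = pd.concat([price, hash_rate], axis=1)
# Replace the duplicated "Value" labels with one name per indicator
merged.columns = ["price", "hash_rate"]
print(merged)
```

The first row of `hash_rate` comes out as NaN, which is worth checking for before any correlation or training step.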

In [239]:
# mesures_bitcoin = mesures_bitcoin.drop(['market_capitalization_btc'],axis=1)

In [240]:
# Build the training, test, and evaluation sets
# Training
train_from_date = '2016-01-01'
train_end_date =  '2018-06-22'
# Test
test_from_date = '2018-06-23'
test_end_date = '2018-08-16'
# Evaluation
evaluation_from_date = '2018-08-17'
evaluation_end_date = '2018-08-22'

df_train = mesures_bitcoin.loc[train_from_date:train_end_date]

df_test = mesures_bitcoin.loc[test_from_date:test_end_date]

df_evaluation = mesures_bitcoin.loc[evaluation_from_date:evaluation_end_date]

# len() counts rows (days); .size would count rows x columns
print(len(df_train)," training days\n",len(df_test)," test days\n",len(df_evaluation)," evaluation days\n")

train_days = mesures_bitcoin.loc[train_from_date:train_end_date].count()
test_days = mesures_bitcoin.loc[test_from_date:test_end_date].count()
evalutacion_days = mesures_bitcoin.loc[evaluation_from_date:evaluation_end_date].count()
print(train_days, " from ",train_from_date," to ",train_end_date )
print(test_days, " from ",test_from_date," to ",test_end_date )
print(evalutacion_days, " from ",evaluation_from_date," to ",evaluation_end_date )
print("Total days ",train_days.price_btc+test_days.price_btc+evalutacion_days.price_btc)

904  training days
 55  test days
 6  evaluation days

price_btc                    904
total_number_btc             904
market_capitalization_btc    904
address_btc                  904
exchange_trade_btc           904
transactions_btc             904
hash_rate_btc                904
difficulty_btc               904
miners_revenue_btc           904
dtype: int64  from  2016-01-01  to  2018-06-22
price_btc                    55
total_number_btc             55
market_capitalization_btc    55
address_btc                  55
exchange_trade_btc           55
transactions_btc             55
hash_rate_btc                55
difficulty_btc               55
miners_revenue_btc           55
dtype: int64  from  2018-06-23  to  2018-08-16
price_btc                    6
total_number_btc             6
market_capitalization_btc    6
address_btc                  6
exchange_trade_btc           6
transactions_btc             6
hash_rate_btc                6
difficulty_btc               6
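
The per-column counts above can be cross-checked against the calendar, since label-based slicing with `.loc` includes both end dates. The sketch below uses a toy daily frame with invented values; only the three date ranges are taken from the notebook.

```python
import pandas as pd

# One row per calendar day, covering all three periods
days = pd.date_range("2016-01-01", "2018-08-25", freq="D")
toy = pd.DataFrame({"price_btc": range(len(days))}, index=days)

# .loc slicing on a DatetimeIndex is inclusive on both ends
train = toy.loc["2016-01-01":"2018-06-22"]
test = toy.loc["2018-06-23":"2018-08-16"]
evaluation = toy.loc["2018-08-17":"2018-08-22"]

print(len(train), len(test), len(evaluation))  # prints: 904 55 6
```

The three lengths match the 904, 55 and 6 rows reported for every indicator, so no day is missing inside the chosen periods.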

## <a name="4"></a> 4. LSTM neural network


In [196]:
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
from time import time
from math import sqrt

from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import MinMaxScaler

import statsmodels.api as sm
from sklearn import linear_model
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error

from keras.models import Sequential
from keras.layers import Activation
from keras.layers import Dropout
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import GRU
from keras.callbacks import EarlyStopping
from keras import initializers


In [197]:
# Independent (predictor) variable
df_train_x = df_train.loc[:,"price_btc"]
# Dependent (target) variable
df_train_y = df_train.loc[:,"price_btc"]
observated_train = df_train.loc[:,"price_btc"]

# Independent (predictor) variable
df_test_x = df_test.loc[:,"price_btc"]
# Dependent (target) variable
df_test_y = df_test.loc[:,"price_btc"]
observed_test = df_test.loc[:,"price_btc"]

# Independent (predictor) variable
df_evaluation_x = df_evaluation.loc[:,"price_btc"]
# Dependent (target) variable
df_evaluation_y = df_evaluation.loc[:,"price_btc"]
observed_evaluation = df_evaluation.loc[:,"price_btc"]

In [198]:
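# Convert an array of values into a supervised-learning dataset
def create_dataset(dataset, look_back=1):
    """Return (inputs, outputs): each input holds look_back consecutive values
    and the matching output is the value that follows them."""
    LSTM_training_inputs, LSTM_training_outputs = [], []
    # Note: the extra -1 leaves the last available window unused
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        LSTM_training_inputs.append(a)
        LSTM_training_outputs.append(dataset[i + look_back, 0])
    return np.array(LSTM_training_inputs), np.array(LSTM_training_outputs)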
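
A small worked example makes the windowing concrete. The sketch below repeats the same loop on five made-up values; the shortened variable names are the only change from the function above.

```python
import numpy as np


def create_dataset(dataset, look_back=1):
    inputs, outputs = [], []
    # Same loop bounds as in the notebook, including the extra -1
    for i in range(len(dataset) - look_back - 1):
        inputs.append(dataset[i:(i + look_back), 0])
        outputs.append(dataset[i + look_back, 0])
    return np.array(inputs), np.array(outputs)


series = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])
x, y = create_dataset(series, look_back=1)
print(x.ravel())  # [10. 20. 30.]
print(y)          # [20. 30. 40.]
```

Five values yield three pairs, and the last value (50) is never used as a target. The same arithmetic explains why 904 training days become 902 samples further down.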

In [199]:
df_train = df_train_x.values
df_train = df_train.astype('float32')
df_test = df_test_x.values
df_test = df_test.astype('float32')
df_evaluation = df_evaluation_x.values
df_evaluation = df_evaluation.astype('float32')

In [200]:
df_train = np.reshape(df_train, (len(df_train), 1))
df_test = np.reshape(df_test, (len(df_test), 1))
df_evaluation = np.reshape(df_evaluation, (len(df_evaluation), 1))

In [201]:
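# Fit the scaler on the training data only, then reuse that same scaling for
# the test and evaluation sets so that all three share one scale
scaler = MinMaxScaler(feature_range=(0, 1))
df_train = scaler.fit_transform(df_train)
df_test = scaler.transform(df_test)
df_evaluation = scaler.transform(df_evaluation)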
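
A common convention with `MinMaxScaler` is to learn the minimum and maximum from the training data and apply that same mapping elsewhere, so a given scaled value always means the same price. The sketch below illustrates this on made-up prices and also shows `inverse_transform`, which maps scaled values back to the original units. All numbers are invented for the example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([[100.0], [200.0], [300.0]])
test = np.array([[150.0], [330.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)  # learns min=100 and max=300
test_scaled = scaler.transform(test)        # reuses the training min and max

# Map scaled values back to the original price units
restored = scaler.inverse_transform(test_scaled)

print(train_scaled.ravel())
print(test_scaled.ravel())
print(restored.ravel())
```

Test values outside the training range fall outside [0, 1], as 330 does here. That matters for a network whose last layer is a sigmoid, since its output cannot exceed 1.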

In [202]:
# reshape into X=t and Y=t+1
look_back = 1
df_train_x, df_train_y = create_dataset(df_train, look_back)
df_test_x, df_test_y = create_dataset(df_test, look_back)
df_evaluation_x, df_evaluation_y = create_dataset(df_evaluation, look_back)

In [203]:
df_train_x.shape

(902, 1)

In [204]:
df_train_y.size

902

In [205]:
df_test_x.size

53

In [206]:
df_test_y.size

53

In [207]:
df_evaluation_x.size

4

In [208]:
df_evaluation_y.size

4

In [209]:
# Reshape the datasets into the structure that the Keras LSTM layer requires
df_train_x = np.reshape(df_train_x, (len(df_train_x), 1))
df_test_x = np.reshape(df_test_x, (len(df_test_x), 1))

In [210]:
test = df_train_x.reshape(1,len(df_train_x),1)

In [211]:
df_train_x = np.reshape(df_train_x, (df_train_x.shape[0], 1, df_train_x.shape[1]))


In [212]:
df_test_x = np.reshape(df_test_x, (df_test_x.shape[0], 1, df_test_x.shape[1]))
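
The reshapes above turn a 2-D array of windows into the 3-D layout `(samples, timesteps, features)`. In this notebook each sample is a single timestep whose features are the `look_back` values. A NumPy-only sketch with made-up numbers:

```python
import numpy as np

look_back = 1
# Five windows of look_back values each: shape (samples, look_back)
windows = np.arange(5, dtype="float32").reshape(5, look_back)

# Insert a timestep axis of length 1 between samples and features
lstm_input = np.reshape(windows, (windows.shape[0], 1, windows.shape[1]))
print(lstm_input.shape)  # (5, 1, 1)
```

The values themselves are unchanged; only the axes that the layer reads as time and as features are made explicit.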

### Training

In [279]:
# LSTM implementation
# Define a function that builds the neural network model.
# An empty Sequential model is created and an LSTM layer is added to it.
# The model is configured to accept an n x m input (timesteps x features).
# An activation function is applied to the output.
# Variables used by the LSTM model
# A seed for the pseudo-random number generator is set before training


# Build the model
def build_model(inputs, output_size, neurons, activ_func="sigmoid",
                dropout=0.25, loss="mae", optimizer="adam"):
    model = Sequential()

    # inputs has shape (samples, timesteps, features)
    model.add(LSTM(neurons, input_shape=(inputs.shape[1], inputs.shape[2])))
    model.add(Dropout(dropout))
    model.add(Dense(units=output_size))
    model.add(Activation(activ_func))

    model.compile(loss=loss, optimizer=optimizer)
    return model
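
To get a feel for how small this network is, its trainable parameters can be counted by hand. The sketch below assumes the standard LSTM parameterization (four gates, each with input weights, recurrent weights and one bias vector) and a fully connected output layer. The helper names are hypothetical and no Keras call is involved.

```python
def lstm_param_count(units, input_dim):
    """Parameters of one LSTM layer: 4 gates x (input + recurrent weights + bias)."""
    return 4 * (units * (input_dim + units) + units)


def dense_param_count(units, input_dim):
    """Parameters of a fully connected layer: weights plus biases."""
    return units * input_dim + units


# One LSTM unit fed one feature, followed by a single output neuron;
# Dropout and Activation layers add no trainable parameters
total = lstm_param_count(units=1, input_dim=1) + dense_param_count(units=1, input_dim=1)
print(total)  # 14
```

With `neurons = 1` and `look_back = 1` the whole model therefore has 14 trainable weights, which limits what it can learn beyond following the previous day's price.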

In [280]:
# Initialize variables
np.random.seed(14)

# Initialize the model
model_btc = build_model(df_train_x, output_size=1, neurons = 1)

# Start the timer
start_time = time()

# Train the model. model_btc_history stores the training loss for each epoch
model_btc_history = model_btc.fit(df_train_x, df_train_y, 
                            epochs=50, batch_size=1, verbose=2, shuffle=True)
# Elapsed training time
final_time = time() - start_time

Epoch 1/50
 - 3s - loss: 0.2939
Epoch 2/50
 - 2s - loss: 0.2138
Epoch 3/50
 - 2s - loss: 0.1798
Epoch 4/50
 - 2s - loss: 0.1640
Epoch 5/50
 - 2s - loss: 0.1596
Epoch 6/50
 - 2s - loss: 0.1517
Epoch 7/50
 - 2s - loss: 0.1435
Epoch 8/50
 - 2s - loss: 0.1229
Epoch 9/50
 - 2s - loss: 0.0799
Epoch 10/50
 - 2s - loss: 0.0749
Epoch 11/50
 - 2s - loss: 0.0699
Epoch 12/50
 - 2s - loss: 0.0751
Epoch 13/50
 - 2s - loss: 0.0692
Epoch 14/50
 - 2s - loss: 0.0660
Epoch 15/50
 - 2s - loss: 0.0677
Epoch 16/50
 - 2s - loss: 0.0700
Epoch 17/50
 - 2s - loss: 0.0668
Epoch 18/50
 - 2s - loss: 0.0670
Epoch 19/50
 - 2s - loss: 0.0644
Epoch 20/50
 - 2s - loss: 0.0672
Epoch 21/50
 - 2s - loss: 0.0678
Epoch 22/50
 - 2s - loss: 0.0615
Epoch 23/50
 - 2s - loss: 0.0656
Epoch 24/50
 - 2s - loss: 0.0675
Epoch 25/50
 - 2s - loss: 0.0646
Epoch 26/50
 - 2s - loss: 0.0641
Epoch 27/50
 - 2s - loss: 0.0626
Epoch 28/50
 - 2s - loss: 0.0593
Epoch 29/50
 - 2s - loss: 0.0658
Epoch 30/50
 - 2s - loss: 0.0622
Epoch 31/50
 - 2s -

In [281]:
# Execution time
print('Neural network training time: {0:.3f} s'.format(final_time))

Neural network training time: 99.903 s


In [282]:
# Plot of the MAE loss per epoch
history_error_btc = go.Scatter(x=model_btc_history.epoch, y=model_btc_history.history['loss'])
py.iplot([history_error_btc])

In [283]:
test = model_btc.predict(df_train_x)
test= test.reshape(-1)

In [284]:

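# Note: by operator precedence this evaluates as test + (1 * observed), i.e. the
# scaled network output (between 0 and 1) added to the observed USD price, so the
# errors computed from these two series reflect the size of the scaled output
predicted = test+1* observated_train[look_back:-look_back]
observated = observated_train[look_back:-look_back]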
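
How the expression `test+1* observed` is grouped matters for what the error metrics mean. The sketch below uses made-up numbers to compare the expression as written with the parenthesised form `(test + 1) * observed`. All values are illustrative assumptions.

```python
import numpy as np

scaled_output = np.array([0.10, 0.20])   # made-up network outputs in [0, 1]
observed = np.array([100.0, 200.0])      # made-up observed prices

# Multiplication binds tighter than addition
as_written = scaled_output + 1 * observed   # same as scaled_output + observed
grouped = (scaled_output + 1) * observed    # a different quantity

# The gap between as_written and observed is just the scaled output itself
residual = as_written - observed
print(as_written, grouped, residual)
```

With the expression as written, the difference between "predicted" and observed equals the network output itself, so an MAE computed on that pair is the mean scaled output rather than a price error in USD.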

In [285]:
# Plot the predictions against the observations
trace1 = go.Scatter(
    x = np.arange(0, len(predicted), 1),
    y = predicted,
    mode = 'lines',
    name = 'Predicted',
    line = dict(color=('rgb(244, 146, 65)'), width=2)
)
trace2 = go.Scatter(
    x = np.arange(0, len(predicted), 1),
    y = observated,
    mode = 'lines',
    name = 'Observed',
    line = dict(color=('rgb(66, 244, 155)'), width=2)
)

data = [trace1, trace2]
layout = dict(title = 'Comparison of true prices (on the training dataset) with prices our model predicted',
             xaxis = dict(title = 'Day number'), yaxis = dict(title = 'Price, USD'))
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='results_demonstrating0')

In [286]:
# MSE
print("MSE: %.3f" % mean_squared_error(observated, predicted))
# RMSE Root Mean Square Error
RMSE = sqrt(mean_squared_error(observated, predicted))
print('RMSE: %.3f' % RMSE)
from sklearn.metrics import mean_absolute_error
# MAE
print("MAE: %.3f" % mean_absolute_error(observated, predicted))

MSE: 0.034
RMSE: 0.185
MAE: 0.128
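
The three metrics printed above can also be written out directly, which makes their relationship explicit: RMSE is the square root of MSE, and MAE averages absolute rather than squared differences. A NumPy sketch with made-up values follows; the helper name `regression_errors` is hypothetical.

```python
import numpy as np


def regression_errors(y_true, y_pred):
    """Return MSE, RMSE and MAE for two equally long sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mse = np.mean(diff ** 2)
    return {"MSE": mse, "RMSE": np.sqrt(mse), "MAE": np.mean(np.abs(diff))}


errors = regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(errors)
```

A single error of 2 out of three points gives an MSE of 4/3 and an MAE of 2/3. The squared metric weights large misses more heavily, which is why RMSE is never smaller than MAE.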


In [287]:
observated_train.loc[observated_train.index[:]> '2018-06-14'].index

DatetimeIndex(['2018-06-15', '2018-06-16', '2018-06-17', '2018-06-18',
               '2018-06-19', '2018-06-20', '2018-06-21', '2018-06-22'],
              dtype='datetime64[ns]', name='Date', freq='D')

In [288]:
# Use the dates of the observed series as the x axis, then plot
Test_Dates = observated.index

trace1 = go.Scatter(x=Test_Dates, y=observated, name= 'Actual Price',
                   line = dict(color = ('rgb(66, 244, 155)'),width = 2))
trace2 = go.Scatter(x=Test_Dates, y=predicted, name= 'Predicted Price',
                   line = dict(color = ('rgb(244, 146, 65)'),width = 2))
data = [trace1, trace2]
layout = dict(title = 'Comparison of true prices (on the training dataset) with prices our model predicted, by dates',
             xaxis = dict(title = 'Date'), yaxis = dict(title = 'Price, USD'))
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='results_demonstrating1')

In [259]:
observated_train.loc['2018-07-03':,]

IndexingError: Too many indexers

### Test

In [None]:
# Predict on the test inputs; predictions and targets are both in the scaled space
predicted_test = model_btc.predict(df_test_x).reshape(-1)
observated_test = df_test_y

In [None]:
# Plot the predictions against the observations
trace1 = go.Scatter(
    x = np.arange(0, len(predicted_test), 1),
    y = predicted_test,
    mode = 'lines',
    name = 'Predicted',
    line = dict(color=('rgb(244, 146, 65)'), width=2)
)
trace2 = go.Scatter(
    x = np.arange(0, len(observated_test), 1),
    y = observated_test,
    mode = 'lines',
    name = 'Observed',
    line = dict(color=('rgb(66, 244, 155)'), width=2)
)

data = [trace1, trace2]
layout = dict(title = 'Comparison of true prices (on the test dataset) with prices our model predicted',
             xaxis = dict(title = 'Day number'), yaxis = dict(title = 'Scaled price'))
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='results_demonstrating0')

In [None]:
# MSE
print("MSE: %.3f" % mean_squared_error(observated_test, predicted_test))
# RMSE Root Mean Square Error
RMSE = sqrt(mean_squared_error(observated_test, predicted_test))
print('RMSE: %.3f' % RMSE)
# MAE
print("MAE: %.3f" % mean_absolute_error(observated_test, predicted_test))


# TRAIN
# MSE: 102208.495
# RMSE: 319.701
# MAE: 134.233
# TEST
# MSE: 106684.829
# RMSE: 326.626
# MAE: 264.080

In [None]:
# Extract the dates that match the targets, then plot against them
# The targets built by create_dataset start look_back days into the test period
Test_Dates = observed_test.index[look_back:-1]

trace1 = go.Scatter(x=Test_Dates, y=observated_test, name= 'Actual Price',
                   line = dict(color = ('rgb(66, 244, 155)'),width = 2))
trace2 = go.Scatter(x=Test_Dates, y=predicted_test, name= 'Predicted Price',
                   line = dict(color = ('rgb(244, 146, 65)'),width = 2))
data = [trace1, trace2]
layout = dict(title = 'Comparison of true prices (on the test dataset) with prices our model predicted, by dates',
             xaxis = dict(title = 'Date'), yaxis = dict(title = 'Scaled price'))
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='results_demonstrating1')

### Evaluation

In [None]:
# The evaluation inputs need the same (samples, timesteps, features) shape used for training
evaluation_inputs = np.reshape(df_evaluation_x, (df_evaluation_x.shape[0], 1, df_evaluation_x.shape[1]))
# Predictions and targets are both in the scaled space
predicted_evaluation = model_btc.predict(evaluation_inputs).reshape(-1)
observated_evaluation = df_evaluation_y

In [None]:
# Plot the predictions against the observations
trace1 = go.Scatter(
    x = np.arange(0, len(predicted_evaluation), 1),
    y = predicted_evaluation,
    mode = 'lines',
    name = 'Predicted',
    line = dict(color=('rgb(244, 146, 65)'), width=2)
)
trace2 = go.Scatter(
    x = np.arange(0, len(observated_evaluation), 1),
    y = observated_evaluation,
    mode = 'lines',
    name = 'Observed',
    line = dict(color=('rgb(66, 244, 155)'), width=2)
)

data = [trace1, trace2]
layout = dict(title = 'Comparison of true prices (on the evaluation dataset) with prices our model predicted',
             xaxis = dict(title = 'Day number'), yaxis = dict(title = 'Scaled price'))
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='results_demonstrating0')

In [None]:
# MSE
print("MSE: %.3f" % mean_squared_error(observated_evaluation, predicted_evaluation))
# RMSE Root Mean Square Error
RMSE = sqrt(mean_squared_error(observated_evaluation, predicted_evaluation))
print('RMSE: %.3f' % RMSE)
# MAE
print("MAE: %.3f" % mean_absolute_error(observated_evaluation, predicted_evaluation))

In [None]:
# Extract the dates that match the targets, then plot against them
Test_Dates = observed_evaluation.index[look_back:-1]

trace1 = go.Scatter(x=Test_Dates, y=observated_evaluation, name= 'Actual Price',
                   line = dict(color = ('rgb(66, 244, 155)'),width = 2))
trace2 = go.Scatter(x=Test_Dates, y=predicted_evaluation, name= 'Predicted Price',
                   line = dict(color = ('rgb(244, 146, 65)'),width = 2))
data = [trace1, trace2]
layout = dict(title = 'Comparison of true prices (on the evaluation dataset) with prices our model predicted, by dates',
             xaxis = dict(title = 'Date'), yaxis = dict(title = 'Scaled price'))
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='results_demonstrating1')