In this project I put several time series forecasting techniques into practice. Part of the inspiration comes from the following article, which I recommend reading:
https://ai.googleblog.com/2021/12/interpretable-deep-learning-for-time.html
In summary, we first use CryptoWatch, a portal that offers an API for cryptocurrency market data, and write code that evaluates the profitability of every cryptocurrency on Coinbase Pro over the last week.

However, to keep this code useful in the long term and to cover the case that interests most readers, we will forecast the price of Bitcoin in USD.

We will use PyCaret as the main library to compare different forecasting models, such as ARIMA, and then blend the best-performing ones (four in the code below) to obtain a more reliable forecast.

Note: Still in Progress

In [1]:
%%capture
!pip install cryptowatch-sdk
!pip install sktime
!mkdir -p $HOME/.cw
!echo "apikey: XNS40GIEET88LR35UE2S" > $HOME/.cw/credentials.yml
!cat  $HOME/.cw/credentials.yml

In [2]:
import cryptowatch as cw
import pandas as pd
from datetime import datetime, timedelta
from matplotlib import pyplot
import logging
from sktime.utils.plotting import plot_series

TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
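
The traceback above lists two workarounds. A minimal sketch of the second one follows, assuming it runs in a cell before any protobuf-dependent package (such as `cryptowatch`) is imported for the first time; after a failed import, the kernel probably needs a restart first. The first workaround would instead pin protobuf to 3.20.x or lower with pip.

```python
import os

# Workaround 2 from the error message: force the pure-Python protobuf parser.
# This has to be set before the first import of a protobuf-dependent package,
# and it trades parsing speed for compatibility.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```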

In [None]:
# Reduce the logging verbosity of cryptowatch
logging.basicConfig()
logging.getLogger("cryptowatch").setLevel(logging.WARNING)

In [None]:
# To find the most profitable cryptocurrencies of the last week:
umbral_benef = 25  # Profit threshold: a return of at least 25% over the last week
MARKET = "coinbase-pro"  # We use the markets of the well-known Coinbase Pro exchange

coinbase = cw.markets.list(MARKET)
for market in coinbase.markets:
    try:
        ticker = "{}:{}".format(market.exchange, market.pair).upper()
        # Request only the weekly candles for this market
        candles = cw.markets.get(ticker, ohlc=True, periods=["1w"])

        # Most recent weekly candle: close timestamp, open price and close price
        close_ts, wkly_open, wkly_close = (
            candles.of_1w[-1][0],
            candles.of_1w[-1][1],
            candles.of_1w[-1][4],
        )

        # Skip markets whose open price is 0 to avoid dividing by zero
        if wkly_open == 0:
            continue
        # Percent change from open to close (positive means a gain)
        perf = (wkly_close - wkly_open) * 100 / wkly_open

        if perf >= umbral_benef:
            open_ts = datetime.utcfromtimestamp(close_ts) - timedelta(days=7)
            print("{} changed {:.2f}% since {}".format(ticker, perf, open_ts))
    except Exception as exc:
        # Report the failure and move on to the next market
        print("Exception captured, continuing: {}".format(exc))
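
The filtering step can be tried without calling the API. The following is a minimal sketch, assuming each candle has the layout `[close_ts, open, high, low, close, volume_base, volume_quote]` used elsewhere in this notebook, and computing the change as (close - open) / open so that gains are positive. The function name, tickers and prices are hypothetical, chosen only for illustration.

```python
def weekly_change_pct(candle):
    """Return the percent change from open to close, or None if open is 0."""
    open_price, close_price = candle[1], candle[4]
    if open_price == 0:
        return None
    return (close_price - open_price) * 100 / open_price


# Hypothetical weekly candles, one per made-up ticker.
sample_candles = {
    "COINBASE-PRO:AAAUSD": [1640995200, 100.0, 140.0, 95.0, 130.0, 10.0, 1200.0],
    "COINBASE-PRO:BBBUSD": [1640995200, 50.0, 55.0, 40.0, 45.0, 20.0, 950.0],
    "COINBASE-PRO:CCCUSD": [1640995200, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}

threshold = 25  # same role as umbral_benef in the notebook
winners = {}
for ticker, candle in sample_candles.items():
    change = weekly_change_pct(candle)
    # Keep only markets with a defined change at or above the threshold
    if change is not None and change >= threshold:
        winners[ticker] = change

print(winners)
```

Only the first ticker passes the filter: the second lost value over the week and the third is skipped because its open price is 0.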

In [None]:
# We pick a single market: Bitcoin priced in US dollars (BTCUSD).
TICKET="BTCUSD"
TABLE = "candles_15min_"+TICKET
candles = cw.markets.get(MARKET+":"+TICKET, ohlc=True)

In [None]:
# We create a DataFrame with the chosen cryptocurrency
rows_list = []
# Iterate over the 15-minute candles returned by the API
for x in candles.of_15m:
    close_ts = datetime.utcfromtimestamp(x[0])
    open_value = x[1]
    high_value = x[2]
    low_value = x[3]
    close_value = x[4]
    volume_base = x[5]
    volume_quote = x[6]
    rows_list.append([TICKET, close_ts, open_value, high_value, low_value, close_value, volume_base, volume_quote])
df = pd.DataFrame(rows_list, columns=["ticket", "close_ts", "open_value", "high_value", "low_value", "close_value", "volume_base", "volume_quote"])
df.head()

# We are going to use the PyCaret framework, which lets us compare several forecasting models and keep the best ones.

In [None]:
# Install the dependencies first, then import them
!pip install pycaret-ts-alpha &> /dev/null
!pip install pyyaml
!pip install --upgrade pandas
from pycaret.datasets import get_data
from pycaret.time_series import *
import pandas as pd
import numpy as np

In [None]:
index = pd.DatetimeIndex(df["close_ts"])
data = df["close_value"].to_numpy()
# Build a regular 15-minute index with exactly one timestamp per observation
rng = pd.date_range(index[0], periods=len(data), freq="15min")
# PyCaret's time series module works with a pandas Series instead of a DataFrame, so we convert it.
df_series = pd.Series(data=data, index=rng)

In [None]:
!pip install matplotlib &> /dev/null
!pip install scipy
import matplotlib.pyplot as plt
_ = plot_series(df_series)
plt.grid()

In [None]:
# For better performance, we resample to hourly data by averaging the four 15-minute samples in each hour
df_series_t = df_series.resample("H").mean()
_ = plot_series(df_series_t)
plt.grid()
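
To make the resampling step concrete, here is a minimal sketch on a small hypothetical series (the prices and dates are made up). It uses the `"60min"` frequency string, which should be equivalent to the hourly alias used in the cell above.

```python
import pandas as pd

# Hypothetical 15-minute closing prices covering two full hours.
idx = pd.date_range("2022-01-01 00:00", periods=8, freq="15min")
prices = pd.Series([10.0, 12.0, 14.0, 16.0, 20.0, 20.0, 24.0, 24.0], index=idx)

# Each hourly value is the mean of the four 15-minute samples in that hour.
hourly = prices.resample("60min").mean()
print(hourly)
```

Averaging smooths out short-lived spikes and reduces the series to a quarter of its length, which shortens model training.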

In [None]:
exp = TimeSeriesExperiment()
# fh is the forecast horizon in periods; the series is hourly, so fh=48 means 48 hours
exp.setup(data=df_series_t, fh=48, use_gpu = True)
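
As far as I understand, the forecast horizon also determines the hold-out set: the last `fh` observations are kept aside for testing and the rest is used for training. The sketch below illustrates that idea in plain Python; it is an assumption about the behaviour, not PyCaret's actual code, and `holdout_split` is a hypothetical helper.

```python
def holdout_split(values, fh):
    """Split a sequence into (train, test), keeping the last `fh` points as test."""
    if not 0 < fh < len(values):
        raise ValueError("fh must be between 1 and len(values) - 1")
    return values[:-fh], values[-fh:]


hourly_values = list(range(200))  # hypothetical: 200 hourly observations
train, test = holdout_split(hourly_values, fh=48)
print(len(train), len(test))
```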

In [None]:
best_baseline_models = exp.compare_models(n_select=4, sort="MAE")
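
The models are ranked by MAE (mean absolute error), the average absolute difference between the actual and the forecast values, so lower is better. A minimal sketch of the metric and the ranking follows; the function, model names and numbers are hypothetical and only illustrate the idea behind `sort="MAE"`.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equally long sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 102.0, 101.0, 105.0]
# Hypothetical forecasts from two candidate models.
forecasts = {
    "model_a": [101.0, 101.0, 103.0, 104.0],
    "model_b": [98.0, 105.0, 101.0, 109.0],
}

scores = {name: mean_absolute_error(actual, f) for name, f in forecasts.items()}
# Sort ascending: the model with the lowest error comes first.
ranked = sorted(scores, key=scores.get)
print(scores, ranked)
```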

In [None]:
# Here we have the 4 best models, but they are not yet tuned to reach their best performance
best_baseline_models

In [None]:
# Here we run hyperparameter tuning on each of the 4 best models
best_tuned_models = [exp.tune_model(model) for model in best_baseline_models]

In [None]:
best_tuned_models

In [None]:
# This line combines the work so far: it blends the 4 tuned models by averaging their forecasts
mean_blender = exp.blend_models(best_tuned_models, method="mean")
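
Conceptually, a mean blend averages the forecasts of the individual models at every step of the horizon. The sketch below shows that idea in plain Python with hypothetical forecasts; `blend_mean` is an illustrative helper, not part of PyCaret.

```python
def blend_mean(forecasts):
    """Average several equally long forecasts step by step."""
    if not forecasts:
        raise ValueError("at least one forecast is required")
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecasts must cover the same horizon")
    return [sum(step) / len(step) for step in zip(*forecasts)]


# Hypothetical 3-step forecasts from four tuned models.
model_forecasts = [
    [100.0, 102.0, 104.0],
    [102.0, 104.0, 106.0],
    [98.0, 100.0, 102.0],
    [104.0, 106.0, 108.0],
]
blended = blend_mean(model_forecasts)
print(blended)
```

Averaging tends to cancel out the individual errors of the models when they err in different directions, which is the motivation for blending instead of relying on a single model.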

In [None]:
y_train = exp.get_config("y_train")
y_predict = exp.predict_model(mean_blender)

plot_series(y_train, y_predict, labels=["Train", "Test Predictions"])
plt.grid()