In [None]:
# Imports (External)
import numpy as np
import pandas as pd

import datetime as dt
import matplotlib as mpl
import matplotlib.pyplot as plt

import sklearn
import tensorflow as tf
import keras

import pywt

In [None]:
# Imports (Internal)

Model inputs:

1. Microeconomic variables
    - OHLC data (Panel A)
    - Technical indicators (Panel B)
2. Macroeconomic variables
    - US Dollar Index, HIBOR (Panel C)

Fig 7. Continuous dataset arrangement for training, validating and testing during the whole sample period:

![fig%207%20dataset%20arrangement%20for%20train%20validate%20test.PNG](attachment:fig%207%20dataset%20arrangement%20for%20train%20validate%20test.PNG)

Table 2. Time interval of the six prediction years:
![table%202%20time%20interval%20of%206%20six%20year%20prediction.PNG](attachment:table%202%20time%20interval%20of%206%20six%20year%20prediction.PNG)

Panel A variables: (OHLC, H/L price, volume)

Panel B variables: (MACD, CCI, ATR, BOLL, EMA20, MA5/MA10, MTM5/MTM12, ROC, SMI, WVAD)

Panel A variables (OHLC) and Panel B variables (technical indicators) can all be calculated for Bitcoin and other cryptocurrencies (or for stocks and forex), since they depend only on price and volume data.
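As a rough illustration of recreating a few Panel B indicators, the sketch below derives MA5/MA10, EMA20, MACD, MTM5/MTM12, and ROC from a closing-price series with pandas. The function name `add_basic_indicators`, the `close` column name, the 12/26 MACD spans, and the 10-period ROC are assumptions chosen for illustration (they are common conventions, not necessarily the exact definitions used in the paper), and the price series is synthetic.

```python
import numpy as np
import pandas as pd


def add_basic_indicators(ohlc: pd.DataFrame) -> pd.DataFrame:
    """Compute a handful of Panel B style indicators from a 'close' column.

    Window lengths are conventional defaults and are assumptions here.
    """
    close = ohlc["close"]
    out = pd.DataFrame(index=ohlc.index)

    # Simple moving averages over 5 and 10 periods
    out["MA5"] = close.rolling(5).mean()
    out["MA10"] = close.rolling(10).mean()

    # Exponential moving average with a 20-period span
    out["EMA20"] = close.ewm(span=20, adjust=False).mean()

    # MACD line: fast EMA minus slow EMA (12/26 spans assumed)
    ema_fast = close.ewm(span=12, adjust=False).mean()
    ema_slow = close.ewm(span=26, adjust=False).mean()
    out["MACD"] = ema_fast - ema_slow

    # Momentum: price change over 5 and 12 periods
    out["MTM5"] = close.diff(5)
    out["MTM12"] = close.diff(12)

    # Rate of change in percent over 10 periods (period assumed)
    out["ROC"] = (close / close.shift(10) - 1.0) * 100.0
    return out


# Synthetic random-walk prices, used only to exercise the function
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {"close": 100 + rng.normal(0, 1, 60).cumsum()},
    index=pd.date_range("2020-01-01", periods=60, freq="D"),
)
indicators = add_basic_indicators(demo)
print(indicators.tail())
```

The leading rows of the rolling and differenced columns are NaN until enough history is available, so they would need to be dropped or handled before the data is split.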

Panel C variables (exchange rate and interest rates) are macroeconomic, so for crypto I would need to find a substitute or rough equivalent, if one exists. Possible candidates include the stablecoin exchange premium or the overall market cap per currency/ticker.

In [None]:
Splitting the data this way carries all earlier observations forward into each subsequent training set (an expanding window):

In [None]:
from sklearn.model_selection import TimeSeriesSplit

# `clean_data` is assumed to be a time-ordered DataFrame built in an earlier cell
X = clean_data.values

train_list = []
test_list = []

# Each fold trains on every row that precedes its test block
splits = TimeSeriesSplit(n_splits=6)
for fold, (train_index, test_index) in enumerate(splits.split(X), start=1):
    train = X[train_index]
    test = X[test_index]
    print('Fold %d' % fold)
    print('Observations: %d' % (len(train) + len(test)))
    print('Training Observations: %d' % (len(train)))
    print('Testing Observations: %d' % (len(test)))
    train_list.append(train)
    test_list.append(test)

In [None]:
len(train_list),len(test_list)
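
To see the expanding-window behavior on something small, the sketch below runs the same `TimeSeriesSplit(n_splits=6)` on a toy array of 14 integers. The array `toy` is a made-up stand-in for `clean_data`, so the fold sizes here say nothing about the real dataset.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy stand-in for the real data: 14 time-ordered observations
toy = np.arange(14)

folds = list(TimeSeriesSplit(n_splits=6).split(toy))
for k, (train_idx, test_idx) in enumerate(folds, start=1):
    # Training indices always start at 0 and grow; the test block follows directly
    print(f"fold {k}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

Every training set begins at index 0 and ends right before its test block, so each fold's training data is the previous fold's training data plus the previous fold's test data.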

Fig 1. The flowchart of the proposed deep learning framework for financial time series:

![fig%201%20flowchart%20of%20dl%20framework%20model.PNG](attachment:fig%201%20flowchart%20of%20dl%20framework%20model.PNG)

As defined by Bao, Yue, and Rao, the WSAE-LSTM model has three primary components:

1. A wavelet transform, applied to denoise the data as part of preprocessing
2. Stacked autoencoders, to generate high-level features
3. An LSTM network, to forecast the next day's closing price

Wavelet Transform

"As a result, the two-level wavelet is applied twice in this study for data preprocessing as suggested in [23]"
"First, the denoised time series is generated via discrete wavelet transform using the Haar wavelet"

In [None]:
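# Single-level wavelet transform
    # https://pywavelets.readthedocs.io/en/latest/ref/dwt-discrete-wavelet-transform.html
# axis=0 transforms along time (rows); the default axis=-1 would run across the feature columns
cA, cD = pywt.dwt(clean_data.values, 'haar', axis=0)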

In [None]:
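# Multi-level wavelet transform
    # https://pywavelets.readthedocs.io/en/latest/ref/dwt-discrete-wavelet-transform.html#pywt.wavedec
from pywt import wavedec
# axis=0 decomposes each column along time; the default axis=-1 would run across the feature columns
coeffs = wavedec(clean_data.values, 'haar', level=2, axis=0)
# Level-2 approximation, then detail coefficients from coarsest (level 2) to finest (level 1)
cA2, cD2, cD1 = coeffs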
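
A decomposition alone does not denoise anything: the detail coefficients still have to be shrunk and the series reconstructed. The sketch below shows one common way to do that for a single 1-D series with a two-level Haar transform. The helper name `haar_denoise`, the soft-thresholding mode, and the universal threshold (with the noise level estimated from the finest detail coefficients) are assumptions made for illustration; the quoted passages do not say which thresholding rule the paper uses. The test signal is synthetic.

```python
import numpy as np
import pywt


def haar_denoise(series, level=2):
    """Denoise a 1-D series with a Haar DWT and soft thresholding.

    The threshold rule is an assumption (universal threshold), not
    something confirmed by the source paper.
    """
    series = np.asarray(series, dtype=float)
    coeffs = pywt.wavedec(series, "haar", level=level)

    # Estimate the noise scale from the finest detail coefficients (MAD estimator)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(len(series)))

    # Keep the approximation untouched, shrink every detail band
    shrunk = [coeffs[0]] + [
        pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]
    ]

    # waverec can return one extra sample for odd-length input, so trim
    return pywt.waverec(shrunk, "haar")[: len(series)]


# Synthetic example: a slow sine wave plus Gaussian noise
rng = np.random.default_rng(0)
t = np.arange(256)
clean_signal = np.sin(2 * np.pi * t / 64)
noisy = clean_signal + rng.normal(0, 0.3, t.size)
denoised = haar_denoise(noisy, level=2)

print("MSE noisy   :", np.mean((noisy - clean_signal) ** 2))
print("MSE denoised:", np.mean((denoised - clean_signal) ** 2))
```

For the multi-column `clean_data`, a helper like this would be applied column by column. To avoid look-ahead, it would also make sense to denoise each training window separately rather than the full sample at once.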