* check performance with sliding window on/off
* check performance with binary vs float labels
* check performance with different ticker price windows and corresponding feature windows

Other idea for catagorical label generation
* predict on x intervals for movement in the next y interval of time. e.g. using intervals of 6 hours predict the movement in the next 12-24 hours 


# Check that GPU is listed for tensorflow

In [3]:
import keras

Using TensorFlow backend.


In [4]:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13447409937165879144
]


# Load Ticker Data

In [5]:
import pandas as pd

In [6]:
eth_ticker_raw = pd.read_csv("data/ticker_data/USDT_ETH.csv",index_col=0).rename(columns={"Timestamp":"timestamp"})
btc_ticker_raw = pd.read_csv("data/ticker_data/USDT_BTC.csv",index_col=0).rename(columns={"Timestamp":"timestamp"})          

In [7]:
eth_ticker_raw[eth_ticker_raw.timestamp == 1439014500]

Unnamed: 0,Close,timestamp,High,Low,Open
0,1.75,1439014500,0.33,1.61,0.33


In [8]:
btc_ticker_raw[btc_ticker_raw.timestamp == 1439014500]

Unnamed: 0,Close,timestamp,High,Low,Open
48805,273.947811,1439014500,275.603572,273.947811,275.603572


In [9]:
btc_ticker_raw.head()

Unnamed: 0,Close,timestamp,High,Low,Open
0,225.0,1424373000,0.33,225.0,0.33
1,225.0,1424373300,225.0,225.0,225.0
2,225.0,1424373600,225.0,225.0,225.0
3,225.0,1424373900,225.0,225.0,225.0
4,225.0,1424374200,225.0,225.0,225.0


In [10]:
# sync the times of the two dataframes

# Shape ticker data for features

* align the btc and eth data
* write function that can create data point windows - 5 minutes, 20 minutes, 6 hours
* create features and outputs

## Align Data

In [11]:
ticker_data_merged = eth_ticker_raw.set_index("timestamp")\
                .join(
                        btc_ticker_raw.set_index("timestamp"),
                        on="timestamp",
                        how="inner",
                        lsuffix="_eth",
                        rsuffix="_btc")

In [12]:
ticker_data_merged.head()

Unnamed: 0_level_0,Close_eth,High_eth,Low_eth,Open_eth,Close_btc,High_btc,Low_btc,Open_btc
timestamp,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
1439014500,1.75,0.33,1.61,0.33,273.947811,275.603572,273.947811,275.603572
1439014800,1.85,1.85,1.85,1.85,273.905543,273.905543,273.626238,273.901814
1439015100,1.85,1.85,1.85,1.85,273.905543,273.905543,273.905543,273.905543
1439015400,1.85,1.85,1.85,1.85,273.917572,273.917572,273.917572,273.917572
1439015700,1.85,1.85,1.85,1.85,273.917572,273.917572,273.917572,273.917572


## Modify Time Spans

In [13]:
ticker_data_merged.dtypes

Close_eth    float64
High_eth     float64
Low_eth      float64
Open_eth     float64
Close_btc    float64
High_btc     float64
Low_btc      float64
Open_btc     float64
dtype: object

In [14]:
import numpy as np

# in minutes 
minutes = 10
data_point_bucket_size = str(minutes) + "T"

datetime = pd.to_datetime(ticker_data_merged.index,unit='s') 


agg_method = {'Close_eth': "last",
                "High_eth": np.max, 
                "Low_eth": np.min,
                "Open_eth": "first",
                "Close_btc": "last",
                "High_btc": np.max, 
                "Low_btc": np.min,
                "Open_btc": "first", 
                 }

ticker_data = ticker_data_merged.set_index(datetime)\
                                    .resample(data_point_bucket_size)\
                                    .agg(agg_method)

print("Shape of reshaped data: " + str(ticker_data.shape))
print("Shape of original data: " + str(ticker_data_merged.shape))

Shape of reshaped data: (150216, 8)
Shape of original data: (300430, 8)


In [15]:
ticker_data.head()

Unnamed: 0_level_0,Close_eth,High_eth,Low_eth,Open_eth,Close_btc,High_btc,Low_btc,Open_btc
timestamp,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
2015-08-08 06:10:00,1.75,0.33,1.61,0.33,273.947811,275.603572,273.947811,275.603572
2015-08-08 06:20:00,1.85,1.85,1.85,1.85,273.905543,273.905543,273.626238,273.901814
2015-08-08 06:30:00,1.85,1.85,1.85,1.85,273.917572,273.917572,273.917572,273.917572
2015-08-08 06:40:00,1.85,1.85,1.85,1.85,273.917572,273.917572,273.917572,273.917572
2015-08-08 06:50:00,1.71,1.71,1.71,1.71,274.15505,274.15505,274.15505,274.15505


## * Adding Sentiment information

From the research it looked like sentiments from 4-2 days ago yielded the best results.
* I need to consider different time intervals and how i will slide the data?

In [16]:
import pandas as pd

In [17]:
sentiment = pd.read_parquet("data/features/sentiment_features")
sentiment.head(5000).dropna()

Unnamed: 0,avg_reddit_eth_compound_vader,avg_reddit_eth_pos_vader,avg_reddit_eth_neg_vader,avg_reddit_eth_polarity_textblob,avg_reddit_eth_subjectivity_textblob,avg_reddit_btc_compound_vader,avg_reddit_btc_pos_vader,avg_reddit_btc_neg_vader,avg_reddit_btc_polarity_textblob,avg_reddit_btc_subjectivity_textblob,...,avg_4day_twitter_btc_compound_vader,avg_4day_twitter_btc_pos_vader,avg_4day_twitter_btc_neg_vader,avg_4day_twitter_btc_polarity_textblob,avg_4day_twitter_btc_subjectivity_textblob,avg_4day_twitter_compound_vader,avg_4day_twitter_pos_vader,avg_4day_twitter_neg_vader,avg_4day_twitter_polarity_textblob,avg_4day_twitter_subjectivity_textblob
2016-01-05 00:00:00,0.921400,0.802000,0.288000,0.629557,0.533333,0.539011,0.205667,0.085625,0.127882,0.417544,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
2016-01-05 00:10:00,0.222350,0.566000,0.314500,0.089779,0.641667,0.488052,0.182302,0.081590,0.140953,0.501798,...,0.609873,0.293585,0.153375,0.167216,0.262328,0.609355,0.293298,0.153375,0.182638,0.264452
2016-01-05 00:20:00,-0.476700,0.330000,0.341000,-0.450000,0.750000,0.437094,0.158937,0.077556,0.154024,0.586052,...,0.592523,0.292128,0.155138,0.182643,0.277187,-0.058188,0.288767,0.151118,0.250259,0.547621
2016-01-05 00:30:00,0.659700,0.094000,0.270250,0.104167,0.245833,0.109513,0.126000,0.172467,0.078175,0.506584,...,0.575172,0.290671,0.156902,0.198070,0.292047,-0.027525,0.229029,0.152429,0.285345,0.534792
2016-01-05 00:40:00,0.704967,0.097000,0.199500,0.093287,0.369213,0.155755,0.138048,0.152111,0.100183,0.483497,...,0.557821,0.289215,0.158665,0.213497,0.306906,0.351197,0.196920,0.198200,0.385520,0.537183
2016-01-05 00:50:00,0.750233,0.100000,0.128750,0.082407,0.492593,0.201996,0.150095,0.131756,0.122192,0.460409,...,0.540471,0.287758,0.160429,0.228924,0.321766,0.474006,0.252839,0.154000,0.296414,0.531020
2016-01-05 01:00:00,0.795500,0.103000,0.058000,0.071528,0.615972,0.248237,0.162143,0.111400,0.144201,0.437322,...,0.523120,0.286301,0.162192,0.244350,0.336625,0.405348,0.223650,0.150333,0.256415,0.445263
2016-01-05 01:10:00,0.819950,0.161000,0.016000,0.288907,0.627958,0.246671,0.154571,0.099600,0.082043,0.458287,...,0.505770,0.284844,0.163956,0.259777,0.351484,0.592418,0.272787,0.190800,0.163267,0.279567
2016-01-05 01:20:00,0.825612,0.183500,0.043800,0.288555,0.589719,0.013919,0.129429,0.133800,0.107683,0.475771,...,0.488419,0.283387,0.165719,0.275204,0.366344,0.315339,0.226346,0.313600,0.136164,0.496957
2016-01-05 01:30:00,0.831275,0.206000,0.071600,0.288203,0.551479,-0.218834,0.104286,0.168000,0.133324,0.493255,...,0.471068,0.281930,0.167483,0.290631,0.381203,0.383664,0.246414,0.162000,0.094220,0.507014


In [18]:
sentiment.index.name = "timestamp"

In [19]:
sentiment.head()

Unnamed: 0_level_0,avg_reddit_eth_compound_vader,avg_reddit_eth_pos_vader,avg_reddit_eth_neg_vader,avg_reddit_eth_polarity_textblob,avg_reddit_eth_subjectivity_textblob,avg_reddit_btc_compound_vader,avg_reddit_btc_pos_vader,avg_reddit_btc_neg_vader,avg_reddit_btc_polarity_textblob,avg_reddit_btc_subjectivity_textblob,...,avg_4day_twitter_btc_compound_vader,avg_4day_twitter_btc_pos_vader,avg_4day_twitter_btc_neg_vader,avg_4day_twitter_btc_polarity_textblob,avg_4day_twitter_btc_subjectivity_textblob,avg_4day_twitter_compound_vader,avg_4day_twitter_pos_vader,avg_4day_twitter_neg_vader,avg_4day_twitter_polarity_textblob,avg_4day_twitter_subjectivity_textblob
timestamp,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2016-01-05 00:00:00,0.9214,0.802,0.288,0.629557,0.533333,0.539011,0.205667,0.085625,0.127882,0.417544,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2016-01-05 00:10:00,0.22235,0.566,0.3145,0.089779,0.641667,0.488052,0.182302,0.08159,0.140953,0.501798,...,0.609873,0.293585,0.153375,0.167216,0.262328,0.609355,0.293298,0.153375,0.182638,0.264452
2016-01-05 00:20:00,-0.4767,0.33,0.341,-0.45,0.75,0.437094,0.158937,0.077556,0.154024,0.586052,...,0.592523,0.292128,0.155138,0.182643,0.277187,-0.058188,0.288767,0.151118,0.250259,0.547621
2016-01-05 00:30:00,0.6597,0.094,0.27025,0.104167,0.245833,0.109513,0.126,0.172467,0.078175,0.506584,...,0.575172,0.290671,0.156902,0.19807,0.292047,-0.027525,0.229029,0.152429,0.285345,0.534792
2016-01-05 00:40:00,0.704967,0.097,0.1995,0.093287,0.369213,0.155755,0.138048,0.152111,0.100183,0.483497,...,0.557821,0.289215,0.158665,0.213497,0.306906,0.351197,0.19692,0.1982,0.38552,0.537183


## Construct % price change label

In [20]:
eth_close_percent_change = ticker_data.Close_btc.pct_change()
ticker_data["eth_close_percent_change"] = eth_close_percent_change

In [21]:
ticker_data.dtypes

Close_eth                   float64
High_eth                    float64
Low_eth                     float64
Open_eth                    float64
Close_btc                   float64
High_btc                    float64
Low_btc                     float64
Open_btc                    float64
eth_close_percent_change    float64
dtype: object

## * Construct Binary label to capture up or down movement between days

# specify the output
#close_ethb

In [22]:
nothing_changed = ticker_data.eth_close_percent_change.round(decimals=6) == 0
negative_change = ticker_data.eth_close_percent_change.round(decimals=6) < 0
positive_change = ticker_data.eth_close_percent_change.round(decimals=6) > 0



In [23]:
ticker_data["eth_close_movement"] = -9

ticker_data["eth_close_movement"][positive_change] = 1 
ticker_data["eth_close_movement"][nothing_changed] = 0
ticker_data["eth_close_movement"][negative_change] = -1

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  This is separate from the ipykernel package so we can avoid doing imports until
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  after removing the cwd from sys.path.
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """


In [24]:
ticker_data = ticker_data[~ticker_data.eth_close_percent_change.isnull()]

In [25]:
ticker_data = ticker_data.join(sentiment,how="inner")

**Key**
* -1 went down
* 0 stayed the same
* 1 went up

# Construction of Features & Labels

The features I care about:
* eth closing
* btc closing
* eth closing 3 days ago
* eth closing 4 days ago
* btc closing 3 days ago
* btc closing 4 days ago
* sentiment for 3 days ago
* sentiment for 4 days ago

The ratio of features to labels will be 16. And 6 days worth of data needs to be read at a time. This is in line with the research on sentiment analysis. 

For example:
* If the 5 minute intervals are used then the number of features need to be +- 1728 (8640 minutes) and the vector size of the label will be 108 (540 minutes or 9 hours)

**Temporal Golden Rule 1:**
* Temporal order must be preserved. Your features can not be further in time then your labels. 

**NOTE** the above should be doubled as the btc and eth values will be in the input layer

In [26]:
data_point_window = 5
days = 6
feature_vector_size = 6*24*60/data_point_window
output_vector_size = feature_vector_size/16

output_vector_minutes_span = output_vector_size*5
output_vector_hour_span = output_vector_minutes_span/60

print("Number of days feature vector will cover: " + str(days))
print("Data Point Window Size: " + str(data_point_window) + " minutes")
print("Size of feature vector: " + str(feature_vector_size))
print()
print("Number of minutes output vector will cover: " + str(output_vector_minutes_span))
print("Number of hours output vector will cover: " + str(output_vector_hour_span))
print("Size of output vector: " + str(output_vector_size))


Number of days feature vector will cover: 6
Data Point Window Size: 5 minutes
Size of feature vector: 1728.0

Number of minutes output vector will cover: 540.0
Number of hours output vector will cover: 9.0
Size of output vector: 108.0


The following class was obtained from [the following blog](https://nicholastsmith.wordpress.com/2017/11/13/cryptocurrency-price-prediction-using-deep-learning-in-tensorflow/)

In [27]:
##QUESTION!!!!???? bias introduced in the label if there is overlap with the next training row?

import numpy as np
import pandas as pd
 
class PastSampler:
    '''
    Forms training samples for predicting future values from past value
    '''
     
    def __init__(self, N, K, sliding_window = True):
        '''
        Predict K future sample using N previous samples
        '''
        self.K = K
        self.N = N
        self.sliding_window = sliding_window
 
    def transform(self, A):
        M = self.N + self.K     #Number of samples per row (sample + target)
        #indexes
        if self.sliding_window:
            slide_windows_size = 6
            I = np.arange(M) + np.arange(A.shape[0] - M,step=slide_windows_size).reshape(-1, 1)
        else:
            if A.shape[0]%M == 0:
                I = np.arange(M)+np.arange(0,A.shape[0],M).reshape(-1,1)
                
            else:
                I = np.arange(M)+np.arange(0,A.shape[0] -M,M).reshape(-1,1)
            
        B = A[I].reshape(-1, M * A.shape[1], A.shape[2])
        ci = self.N * A.shape[1]    #Number of features per sample
        return B[:, :ci], B[:, ci:] #Sample matrix, Target matrix



In [28]:
K = NPS 
N = NFS
M = N + K
slide_windows_size = 6
I = np.arange(M) + np.arange(A.shape[0] - M,step=slide_windows_size).reshape(-1, 1)

NameError: name 'NPS' is not defined

In [29]:
np.arange(A.shape[0] - M).reshape(-1, 1)

NameError: name 'A' is not defined

In [96]:
np.arange(M,step=slide_windows_size)

array([  0,   6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72,
        78,  84,  90,  96, 102, 108, 114, 120, 126, 132, 138, 144, 150,
       156, 162, 168, 174, 180, 186, 192, 198, 204, 210, 216, 222, 228,
       234, 240, 246, 252, 258, 264, 270, 276, 282, 288])

In [99]:
A[I]

array([[[[ 1.94673865e-05,  5.62830013e-03,  5.33333346e-01,
           8.01999986e-01,  6.52000010e-01,  1.83333332e-01]],

        [[ 1.94673865e-05,  5.62830013e-03,  6.41666673e-01,
           5.65999990e-01,  7.32100010e-01,  1.32666667e-01]],

        [[ 1.94673865e-05,  5.62830013e-03,  7.50000000e-01,
           3.29999993e-01,  8.12200010e-01,  8.20000023e-02]],

        ...,

        [[ 2.42678436e-06,  5.49338314e-03,  4.84609514e-01,
           1.21833333e-01,  6.59699976e-01,  9.39999968e-02]],

        [[ 2.42678436e-06,  5.49338314e-03,  4.83681671e-01,
           1.23250000e-01,  7.04966644e-01,  9.69999979e-02]],

        [[ 2.42678436e-06,  5.49338314e-03,  4.82753828e-01,
           1.24666667e-01,  7.50233312e-01,  9.99999990e-02]]],


       [[[ 1.94673865e-05,  5.62830013e-03,  6.15972221e-01,
           1.03000000e-01,  9.81899977e-01,  3.98000002e-01]],

        [[ 1.94673865e-05,  5.62830013e-03,  6.27958149e-01,
           1.61000002e-01,  8.74733314e-01,  3.6

In [98]:
import numpy as np
np.arange(6).reshape((-1, 1))

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5]])

In [101]:
list(ticker_data.columns)

['Close_eth',
 'High_eth',
 'Low_eth',
 'Open_eth',
 'Close_btc',
 'High_btc',
 'Low_btc',
 'Open_btc',
 'eth_close_percent_change',
 'eth_close_movement',
 'avg_reddit_eth_compound_vader',
 'avg_reddit_eth_pos_vader',
 'avg_reddit_eth_neg_vader',
 'avg_reddit_eth_polarity_textblob',
 'avg_reddit_eth_subjectivity_textblob',
 'avg_reddit_btc_compound_vader',
 'avg_reddit_btc_pos_vader',
 'avg_reddit_btc_neg_vader',
 'avg_reddit_btc_polarity_textblob',
 'avg_reddit_btc_subjectivity_textblob',
 'avg_reddit_compound_vader',
 'avg_reddit_pos_vader',
 'avg_reddit_neg_vader',
 'avg_reddit_polarity_textblob',
 'avg_reddit_subjectivity_textblob',
 'avg_twitter_eth_compound_vader',
 'avg_twitter_eth_pos_vader',
 'avg_twitter_eth_neg_vader',
 'avg_twitter_eth_polarity_textblob',
 'avg_twitter_eth_subjectivity_textblob',
 'avg_twitter_btc_compound_vader',
 'avg_twitter_btc_pos_vader',
 'avg_twitter_btc_neg_vader',
 'avg_twitter_btc_polarity_textblob',
 'avg_twitter_btc_subjectivity_textblob',
 'av

## Define Features

In [320]:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
# normalization

df = ticker_data[["Close_eth","Close_btc",
                  "avg_reddit_eth_subjectivity_textblob",
                  "avg_reddit_eth_pos_vader",
                 'avg_4day_reddit_eth_compound_vader',
                 "avg_4day_reddit_eth_pos_vader"]].copy()
time_stamps_index = df.index

original_df = ticker_data.copy()

columns = ["Close_eth","Close_btc"]

for c in columns:
    df[c] = scaler.fit_transform(df[c].values.reshape(-1,1))

 


In [None]:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
# normalization

df = ticker_data[["Close_eth",
                  "Close_btc",
                  "avg_twitter_eth_subjectivity_textblob",
                  "avg_twitter_eth_pos_vader",
                  "avg_twitter_eth_subjectivity_textblob",
                 'avg_4day_twitter_eth_compound_vader',
                  "avg_4day_twitter_eth_subjectivity_textblob",
                 'avg_4day_twitter_eth_compound_vader',
                 ]].copy()

time_stamps_index = df.index

original_df = ticker_data.copy()

columns = ["Close_eth","Close_btc"]

for c in columns:
    df[c] = scaler.fit_transform(df[c].values.reshape(-1,1))

 


In [288]:
#Features are input sample dimensions(channels)
A = np.array(df)[:,None,:]
original_A = np.array(original_df)[:,None,:]
time_stamps = np.array(time_stamps_index)[:,None,None]

##Make samples of temporal sequences of pricing data (channel)
#Number of past samples
NPS = 288 # 2 days

#Number of future samples
NFS = 6 #1 hours of movment         

ps = PastSampler(NPS, NFS, sliding_window=False)

X, Y = ps.transform(A)
original_X, original_Y = ps.transform(original_A)

input_times, output_times = ps.transform(time_stamps)

In [289]:
A.shape

(126277, 1, 6)

In [290]:
Y_eth = Y[:,:,0]

In [291]:
X.shape

(429, 288, 6)

In [292]:
### For market movement
#
##Features are input sample dimensions(channels)
#A = np.array(df)[:,None,:]
#original_A = np.array(original_df)[:,None,:]
#time_stamps = np.array(time_stamps_index)[:,None,None]
#
##Make samples of temporal sequences of pricing data (channel)
##Number of past samples
#NPS = 576 #(4 days)#144 #(24 hours)#10
#
##Number of future samples
#NFS = 36 #(6 hours)#2
#
#ps = PastSampler(NPS, NFS, sliding_window=False)
#
#X, Y = ps.transform(A)
#original_X, original_Y = ps.transform(original_A)
#
#input_times, output_times = ps.transform(time_stamps)

In [293]:
print("Shape of original_A" + str(original_A.shape))
print("Shape of time_stamps" + str(time_stamps.shape))
print("Shape of original_X" + str(original_X.shape))
print("Shape of original_Y" + str(original_Y.shape))
print("Shape of X" + str(X.shape))
print("Shape of Y" + str(Y.shape))

Shape of original_A(126277, 1, 100)
Shape of time_stamps(126277, 1, 1)
Shape of original_X(429, 288, 100)
Shape of original_Y(429, 6, 100)
Shape of X(429, 288, 6)
Shape of Y(429, 6, 6)


# Build CNN

In [321]:
# set sizes
training_p = 0.6

training_size = int(training_p* X.shape[0])
remaining_size = X.shape[0] - training_size
test_size = int(remaining_size/2) + training_size
validation_size = int(remaining_size/2) + test_size


#split training validation
training_features = X[:training_size,:]
training_labels = Y_eth[:training_size,:]

# test set
test_features = X[training_size:test_size,:]
test_labels = Y_eth[training_size:test_size,:]

# validation set
validation_features = X[test_size:validation_size,:]
validation_labels = Y_eth[test_size:validation_size,:]


In [322]:
#build model
from keras import Sequential
from keras.layers import Conv1D, Conv2D, Dropout, Dense, Flatten, Reshape, LeakyReLU

epochs = 100
deep_epochs = 200
step_size = X.shape[1]
batch_size= 8
nb_features = X.shape[2]

In [323]:
# 
model0 = Sequential()

model0.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model0.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))


model0.add(Flatten())
model0.add(Dense(6))

model0.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model0.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_360 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_361 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
flatten_97 (Flatten)         (None, 1120)              0         
_________________________________________________________________
dense_115 (Dense)            (None, 6)                 6726      
Total params: 7,390
Trainable params: 7,390
Non-trainable params: 0
_________________________________________________________________


In [324]:
# 
model1 = Sequential()

model1.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model1.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))

model1.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))


model1.add(Flatten())
model1.add(Dense(6))

model1.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_362 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_363 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
conv1d_364 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
flatten_98 (Flatten)         (None, 1120)              0         
_________________________________________________________________
dense_116 (Dense)            (None, 6)                 6726      
Total params: 8,446
Trainable params: 8,446
Non-trainable params: 0
_________________________________________________________________


In [325]:
# 
model2 = Sequential()

model2.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model2.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))

model2.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))

model2.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=64, 
                 kernel_size=2))
#model.add(Dropout(0.1))

model2.add(Flatten())
model2.add(Dense(6))

model2.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model2.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_365 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_366 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
conv1d_367 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
conv1d_368 (Conv1D)          (None, 17, 64)            4160      
_________________________________________________________________
flatten_99 (Flatten)         (None, 1088)              0         
_________________________________________________________________
dense_117 (Dense)            (None, 6)                 6534      
Total params: 12,414
Trainable params: 12,414
Non-trainable params: 0
_________________________________________________________________


In [326]:
# 
model3 = Sequential()

model3.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model3.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))

model3.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))

model3.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=64, 
                 kernel_size=2))

model3.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=128, 
                 kernel_size=2))


model3.add(Flatten())
model3.add(Dense(6))

model3.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model3.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_369 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_370 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
conv1d_371 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
conv1d_372 (Conv1D)          (None, 17, 64)            4160      
_________________________________________________________________
conv1d_373 (Conv1D)          (None, 8, 128)            16512     
_________________________________________________________________
flatten_100 (Flatten)        (None, 1024)              0         
_________________________________________________________________
dense_118 (Dense)            (None, 6)                 6150      
Total para

In [327]:
# 
model4 = Sequential()

model4.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model4.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))

model4.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))

model4.add(Dropout(0.25))

model4.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=64, 
                 kernel_size=2))

model4.add(Dropout(0.25))

model4.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=128, 
                 kernel_size=2))


model4.add(Flatten())
model4.add(Dense(6))

model4.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model4.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_374 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_375 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
conv1d_376 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
dropout_41 (Dropout)         (None, 35, 32)            0         
_________________________________________________________________
conv1d_377 (Conv1D)          (None, 17, 64)            4160      
_________________________________________________________________
dropout_42 (Dropout)         (None, 17, 64)            0         
_________________________________________________________________
conv1d_378 (Conv1D)          (None, 8, 128)            16512     
__________

In [328]:
# 
model5 = Sequential()

model5.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model4.add(Dropout(0.01))


model5.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))


model5.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))


model5.add(Flatten())
model5.add(Dense(6))

model5.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model5.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_379 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_380 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
conv1d_381 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
flatten_102 (Flatten)        (None, 1120)              0         
_________________________________________________________________
dense_120 (Dense)            (None, 6)                 6726      
Total params: 8,446
Trainable params: 8,446
Non-trainable params: 0
_________________________________________________________________


In [329]:
# 
model6 = Sequential()

model6.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model6.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=16, 
                 kernel_size=2))


model6.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))

model6.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=32, 
                 kernel_size=2))


model6.add(Flatten())
model6.add(Dense(6))

model6.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model6.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_382 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
conv1d_383 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
conv1d_384 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
conv1d_385 (Conv1D)          (None, 17, 32)            2080      
_________________________________________________________________
flatten_103 (Flatten)        (None, 544)               0         
_________________________________________________________________
dense_121 (Dense)            (None, 6)                 3270      
Total params: 7,070
Trainable params: 7,070
Non-trainable params: 0
_________________________________________________________________


In [330]:
# 
model7 = Sequential()

model7.add(Conv1D(activation='relu', 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=6, 
                 kernel_size=2))

model7.add(Conv1D(activation='relu', 
                 strides=2, 
                 filters=6, 
                 kernel_size=2))

model7.add(Flatten())
model7.add(Dense(48))
model4.add(Dropout(0.01))
model7.add(Dense(24))
model4.add(Dropout(0.01))
model7.add(Dense(12))
model7.add(Dense(6))

model7.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model7.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_386 (Conv1D)          (None, 144, 6)            78        
_________________________________________________________________
conv1d_387 (Conv1D)          (None, 72, 6)             78        
_________________________________________________________________
flatten_104 (Flatten)        (None, 432)               0         
_________________________________________________________________
dense_122 (Dense)            (None, 48)                20784     
_________________________________________________________________
dense_123 (Dense)            (None, 24)                1176      
_________________________________________________________________
dense_124 (Dense)            (None, 12)                300       
_________________________________________________________________
dense_125 (Dense)            (None, 6)                 78        
Total para

In [344]:
from keras.layers import LeakyReLU

model8 = Sequential()

model8.add(Conv1D(
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

model8.add(LeakyReLU(alpha=0.1))



model8.add(Conv1D(
                 strides=2, 
                 filters=16, 
                 kernel_size=2))
model8.add(LeakyReLU(alpha=0.1))

model8.add(Conv1D(
                 strides=2, 
                 filters=32, 
                 kernel_size=2))
model8.add(LeakyReLU(alpha=0.1))
model8.add(Conv1D( 
                 strides=2, 
                 filters=64, 
                 kernel_size=2))
#model.add(Dropout(0.1))
model8.add(LeakyReLU(alpha=0.1))


model8.add(Flatten())
model8.add(Dense(6))

model8.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model8.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_402 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
leaky_re_lu_119 (LeakyReLU)  (None, 141, 8)            0         
_________________________________________________________________
conv1d_403 (Conv1D)          (None, 70, 16)            272       
_________________________________________________________________
leaky_re_lu_120 (LeakyReLU)  (None, 70, 16)            0         
_________________________________________________________________
conv1d_404 (Conv1D)          (None, 35, 32)            1056      
_________________________________________________________________
leaky_re_lu_121 (LeakyReLU)  (None, 35, 32)            0         
_________________________________________________________________
conv1d_405 (Conv1D)          (None, 17, 64)            4160      
__________

In [366]:
# 
model9 = Sequential()

model9.add(Conv1D( 
                 input_shape=(step_size, 
                            nb_features), 
                 strides=2, 
                 filters=8, 
                 kernel_size=8))

#model9.add(LeakyReLU(alpha=0.1))

#model9.add(Conv1D( 
#                 strides=2, 
#                 filters=16, 
#                 kernel_size=2))
#
##model9.add(LeakyReLU(alpha=0.1))
#
#model9.add(Conv1D(
#                 strides=2, 
#                 filters=32, 
#                 kernel_size=2))
#
##model9.add(LeakyReLU(alpha=0.1))
#
#model9.add(Conv1D( 
#                 strides=2, 
#                 filters=64, 
#                 kernel_size=2))
#
##model9.add(LeakyReLU(alpha=0.1))
#
#model9.add(Conv1D( 
#                 strides=2, 
#                 filters=128, 
#                 kernel_size=2))
##model9.add(LeakyReLU(alpha=0.1))

model9.add(Flatten())
model9.add(Dense(6))

model9.compile(loss='mse', optimizer='adam',metrics=['mape','acc'])
model9.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_416 (Conv1D)          (None, 141, 8)            392       
_________________________________________________________________
flatten_111 (Flatten)        (None, 1128)              0         
_________________________________________________________________
dense_132 (Dense)            (None, 6)                 6774      
Total params: 7,166
Trainable params: 7,166
Non-trainable params: 0
_________________________________________________________________


## Train

**Temporal Golden Rule 2:**
* Temporal Training Order: It can not train and predict on future data and then train and predict on past data.

In [370]:
from keras.callbacks import ModelCheckpoint  

checkpointer0 = ModelCheckpoint(filepath='model_weights/cnn0.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer1 = ModelCheckpoint(filepath='model_weights/cnn1.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer2 = ModelCheckpoint(filepath='model_weights/cnn2.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer3 = ModelCheckpoint(filepath='model_weights/cnn3.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer4 = ModelCheckpoint(filepath='model_weights/cnn4.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer5 = ModelCheckpoint(filepath='model_weights/cnn5.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer6 = ModelCheckpoint(filepath='model_weights/cnn6.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer7 = ModelCheckpoint(filepath='model_weights/cnn7.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer8 = ModelCheckpoint(filepath='model_weights/cnn8.weights.hdf5', 
                               verbose=1, save_best_only=True)

checkpointer9 = ModelCheckpoint(filepath='model_weights/cnn9.weights.hdf5', 
                               verbose=1, save_best_only=True)



In [334]:
trained_model0 = model0.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer0]
                         )


trained_model1 = model1.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer1]
                         )

trained_model2 = model2.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer2]
                         )

trained_model3 = model3.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer3]
                         )




Epoch 00001: val_loss improved from inf to 0.03442, saving model to model_weights/cnn0.weights.hdf5

Epoch 00002: val_loss improved from 0.03442 to 0.00487, saving model to model_weights/cnn0.weights.hdf5

Epoch 00003: val_loss did not improve from 0.00487

Epoch 00004: val_loss did not improve from 0.00487

Epoch 00005: val_loss did not improve from 0.00487

Epoch 00006: val_loss did not improve from 0.00487

Epoch 00007: val_loss did not improve from 0.00487

Epoch 00008: val_loss did not improve from 0.00487

Epoch 00009: val_loss did not improve from 0.00487

Epoch 00010: val_loss did not improve from 0.00487

Epoch 00011: val_loss did not improve from 0.00487

Epoch 00012: val_loss did not improve from 0.00487

Epoch 00013: val_loss did not improve from 0.00487

Epoch 00014: val_loss did not improve from 0.00487

Epoch 00015: val_loss did not improve from 0.00487

Epoch 00016: val_loss did not improve from 0.00487

Epoch 00017: val_loss did not improve from 0.00487

Epoch 00018: 


Epoch 00049: val_loss did not improve from 0.00044

Epoch 00050: val_loss did not improve from 0.00044

Epoch 00051: val_loss did not improve from 0.00044

Epoch 00052: val_loss did not improve from 0.00044

Epoch 00053: val_loss did not improve from 0.00044

Epoch 00054: val_loss did not improve from 0.00044

Epoch 00055: val_loss did not improve from 0.00044

Epoch 00056: val_loss did not improve from 0.00044

Epoch 00057: val_loss did not improve from 0.00044

Epoch 00058: val_loss did not improve from 0.00044

Epoch 00059: val_loss did not improve from 0.00044

Epoch 00060: val_loss did not improve from 0.00044

Epoch 00061: val_loss did not improve from 0.00044

Epoch 00062: val_loss did not improve from 0.00044

Epoch 00063: val_loss did not improve from 0.00044

Epoch 00064: val_loss did not improve from 0.00044

Epoch 00065: val_loss did not improve from 0.00044

Epoch 00066: val_loss did not improve from 0.00044

Epoch 00067: val_loss did not improve from 0.00044

Epoch 00068


Epoch 00098: val_loss did not improve from 0.00026

Epoch 00099: val_loss did not improve from 0.00026

Epoch 00100: val_loss did not improve from 0.00026

Epoch 00001: val_loss improved from inf to 0.01866, saving model to model_weights/cnn3.weights.hdf5

Epoch 00002: val_loss improved from 0.01866 to 0.00180, saving model to model_weights/cnn3.weights.hdf5

Epoch 00003: val_loss improved from 0.00180 to 0.00161, saving model to model_weights/cnn3.weights.hdf5

Epoch 00004: val_loss improved from 0.00161 to 0.00045, saving model to model_weights/cnn3.weights.hdf5

Epoch 00005: val_loss improved from 0.00045 to 0.00043, saving model to model_weights/cnn3.weights.hdf5

Epoch 00006: val_loss did not improve from 0.00043

Epoch 00007: val_loss did not improve from 0.00043

Epoch 00008: val_loss did not improve from 0.00043

Epoch 00009: val_loss did not improve from 0.00043

Epoch 00010: val_loss did not improve from 0.00043

Epoch 00011: val_loss did not improve from 0.00043

Epoch 0001

In [335]:
trained_model4 = model4.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer4]
                         )


Epoch 00001: val_loss improved from inf to 0.03831, saving model to model_weights/cnn4.weights.hdf5

Epoch 00002: val_loss improved from 0.03831 to 0.01302, saving model to model_weights/cnn4.weights.hdf5

Epoch 00003: val_loss improved from 0.01302 to 0.00074, saving model to model_weights/cnn4.weights.hdf5

Epoch 00004: val_loss did not improve from 0.00074

Epoch 00005: val_loss did not improve from 0.00074

Epoch 00006: val_loss did not improve from 0.00074

Epoch 00007: val_loss did not improve from 0.00074

Epoch 00008: val_loss did not improve from 0.00074

Epoch 00009: val_loss did not improve from 0.00074

Epoch 00010: val_loss did not improve from 0.00074

Epoch 00011: val_loss did not improve from 0.00074

Epoch 00012: val_loss did not improve from 0.00074

Epoch 00013: val_loss did not improve from 0.00074

Epoch 00014: val_loss did not improve from 0.00074

Epoch 00015: val_loss did not improve from 0.00074

Epoch 00016: val_loss did not improve from 0.00074

Epoch 00017:

In [336]:
trained_model5 = model5.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer5]
                         )


Epoch 00001: val_loss improved from inf to 0.03664, saving model to model_weights/cnn5.weights.hdf5

Epoch 00002: val_loss improved from 0.03664 to 0.02929, saving model to model_weights/cnn5.weights.hdf5

Epoch 00003: val_loss improved from 0.02929 to 0.01479, saving model to model_weights/cnn5.weights.hdf5

Epoch 00004: val_loss improved from 0.01479 to 0.00640, saving model to model_weights/cnn5.weights.hdf5

Epoch 00005: val_loss improved from 0.00640 to 0.00431, saving model to model_weights/cnn5.weights.hdf5

Epoch 00006: val_loss did not improve from 0.00431

Epoch 00007: val_loss improved from 0.00431 to 0.00259, saving model to model_weights/cnn5.weights.hdf5

Epoch 00008: val_loss improved from 0.00259 to 0.00257, saving model to model_weights/cnn5.weights.hdf5

Epoch 00009: val_loss improved from 0.00257 to 0.00174, saving model to model_weights/cnn5.weights.hdf5

Epoch 00010: val_loss improved from 0.00174 to 0.00171, saving model to model_weights/cnn5.weights.hdf5

Epoch 

In [337]:
trained_model6 = model6.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer6]
                         )


Epoch 00001: val_loss improved from inf to 0.04371, saving model to model_weights/cnn6.weights.hdf5

Epoch 00002: val_loss improved from 0.04371 to 0.03894, saving model to model_weights/cnn6.weights.hdf5

Epoch 00003: val_loss improved from 0.03894 to 0.03612, saving model to model_weights/cnn6.weights.hdf5

Epoch 00004: val_loss improved from 0.03612 to 0.02498, saving model to model_weights/cnn6.weights.hdf5

Epoch 00005: val_loss improved from 0.02498 to 0.01360, saving model to model_weights/cnn6.weights.hdf5

Epoch 00006: val_loss improved from 0.01360 to 0.00581, saving model to model_weights/cnn6.weights.hdf5

Epoch 00007: val_loss improved from 0.00581 to 0.00227, saving model to model_weights/cnn6.weights.hdf5

Epoch 00008: val_loss improved from 0.00227 to 0.00221, saving model to model_weights/cnn6.weights.hdf5

Epoch 00009: val_loss improved from 0.00221 to 0.00149, saving model to model_weights/cnn6.weights.hdf5

Epoch 00010: val_loss improved from 0.00149 to 0.00099, sa

In [338]:
trained_model7 = model7.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = deep_epochs,
                            callbacks=[checkpointer7]
                         )


Epoch 00001: val_loss improved from inf to 0.04591, saving model to model_weights/cnn7.weights.hdf5

Epoch 00002: val_loss improved from 0.04591 to 0.03087, saving model to model_weights/cnn7.weights.hdf5

Epoch 00003: val_loss did not improve from 0.03087

Epoch 00004: val_loss improved from 0.03087 to 0.03000, saving model to model_weights/cnn7.weights.hdf5

Epoch 00005: val_loss improved from 0.03000 to 0.02328, saving model to model_weights/cnn7.weights.hdf5

Epoch 00006: val_loss improved from 0.02328 to 0.02226, saving model to model_weights/cnn7.weights.hdf5

Epoch 00007: val_loss improved from 0.02226 to 0.01613, saving model to model_weights/cnn7.weights.hdf5

Epoch 00008: val_loss improved from 0.01613 to 0.01551, saving model to model_weights/cnn7.weights.hdf5

Epoch 00009: val_loss improved from 0.01551 to 0.00961, saving model to model_weights/cnn7.weights.hdf5

Epoch 00010: val_loss improved from 0.00961 to 0.00707, saving model to model_weights/cnn7.weights.hdf5

Epoch 


Epoch 00146: val_loss did not improve from 0.00251

Epoch 00147: val_loss did not improve from 0.00251

Epoch 00148: val_loss did not improve from 0.00251

Epoch 00149: val_loss did not improve from 0.00251

Epoch 00150: val_loss did not improve from 0.00251

Epoch 00151: val_loss did not improve from 0.00251

Epoch 00152: val_loss did not improve from 0.00251

Epoch 00153: val_loss did not improve from 0.00251

Epoch 00154: val_loss did not improve from 0.00251

Epoch 00155: val_loss did not improve from 0.00251

Epoch 00156: val_loss did not improve from 0.00251

Epoch 00157: val_loss did not improve from 0.00251

Epoch 00158: val_loss did not improve from 0.00251

Epoch 00159: val_loss did not improve from 0.00251

Epoch 00160: val_loss did not improve from 0.00251

Epoch 00161: val_loss did not improve from 0.00251

Epoch 00162: val_loss did not improve from 0.00251

Epoch 00163: val_loss did not improve from 0.00251

Epoch 00164: val_loss did not improve from 0.00251

Epoch 00165

In [339]:
trained_model8 = model8.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer8]
                         )


Epoch 00001: val_loss improved from inf to 0.04663, saving model to model_weights/cnn8.weights.hdf5

Epoch 00002: val_loss did not improve from 0.04663

Epoch 00003: val_loss did not improve from 0.04663

Epoch 00004: val_loss did not improve from 0.04663

Epoch 00005: val_loss did not improve from 0.04663

Epoch 00006: val_loss did not improve from 0.04663

Epoch 00007: val_loss did not improve from 0.04663

Epoch 00008: val_loss did not improve from 0.04663

Epoch 00009: val_loss did not improve from 0.04663

Epoch 00010: val_loss did not improve from 0.04663

Epoch 00011: val_loss improved from 0.04663 to 0.04663, saving model to model_weights/cnn8.weights.hdf5

Epoch 00012: val_loss improved from 0.04663 to 0.04663, saving model to model_weights/cnn8.weights.hdf5

Epoch 00013: val_loss improved from 0.04663 to 0.04663, saving model to model_weights/cnn8.weights.hdf5

Epoch 00014: val_loss improved from 0.04663 to 0.04663, saving model to model_weights/cnn8.weights.hdf5

Epoch 0001


Epoch 00095: val_loss did not improve from 0.04656

Epoch 00096: val_loss did not improve from 0.04656

Epoch 00097: val_loss did not improve from 0.04656

Epoch 00098: val_loss did not improve from 0.04656

Epoch 00099: val_loss did not improve from 0.04656

Epoch 00100: val_loss did not improve from 0.04656


In [371]:

trained_model9 = model9.fit(training_features, 
                            training_labels,
                            verbose=0, 
                            batch_size=batch_size,
                            validation_data=(test_features,
                                           test_labels), 
                            epochs = epochs,
                            callbacks=[checkpointer9]
                         )


Epoch 00001: val_loss improved from inf to 0.00231, saving model to model_weights/cnn9.weights.hdf5

Epoch 00002: val_loss did not improve from 0.00231

Epoch 00003: val_loss did not improve from 0.00231

Epoch 00004: val_loss did not improve from 0.00231

Epoch 00005: val_loss did not improve from 0.00231

Epoch 00006: val_loss did not improve from 0.00231

Epoch 00007: val_loss did not improve from 0.00231

Epoch 00008: val_loss improved from 0.00231 to 0.00231, saving model to model_weights/cnn9.weights.hdf5

Epoch 00009: val_loss improved from 0.00231 to 0.00231, saving model to model_weights/cnn9.weights.hdf5

Epoch 00010: val_loss did not improve from 0.00231

Epoch 00011: val_loss did not improve from 0.00231

Epoch 00012: val_loss improved from 0.00231 to 0.00228, saving model to model_weights/cnn9.weights.hdf5

Epoch 00013: val_loss did not improve from 0.00228

Epoch 00014: val_loss did not improve from 0.00228

Epoch 00015: val_loss did not improve from 0.00228

Epoch 00016

## Results

In [372]:
model0.load_weights('model_weights/cnn0.weights.hdf5')
model1.load_weights('model_weights/cnn1.weights.hdf5')
model2.load_weights('model_weights/cnn2.weights.hdf5')
model3.load_weights('model_weights/cnn3.weights.hdf5')
model4.load_weights('model_weights/cnn4.weights.hdf5')
model5.load_weights('model_weights/cnn5.weights.hdf5')
model6.load_weights('model_weights/cnn6.weights.hdf5')
model7.load_weights('model_weights/cnn7.weights.hdf5')
model8.load_weights('model_weights/cnn8.weights.hdf5')
model9.load_weights('model_weights/cnn9.weights.hdf5')

In [373]:
for i in model9.model.layers:
    try:
        print(i.activation)
    except:
        print()

<function linear at 0x7f38c93406a8>

<function linear at 0x7f38c93406a8>




In [374]:
import pandas as pd
df = pd.DataFrame(columns=["model","number of convolution layers","filters at each layer","activation function",
                           "number of dense layers","number of paramaters","drop out","mse test score",
                           "mse cv score"])


t0 = trained_model0.model.evaluate(test_features, test_labels, verbose=0)
t1 = trained_model1.model.evaluate(test_features, test_labels, verbose=0)
t2 = trained_model2.model.evaluate(test_features, test_labels, verbose=0)
t3 = trained_model3.model.evaluate(test_features, test_labels, verbose=0)
t4 = trained_model4.model.evaluate(test_features, test_labels, verbose=0)
t5 = trained_model5.model.evaluate(test_features, test_labels, verbose=0)
t6 = trained_model6.model.evaluate(test_features, test_labels, verbose=0)
t7 = trained_model7.model.evaluate(test_features, test_labels, verbose=0)
t8 = trained_model8.model.evaluate(test_features, test_labels, verbose=0)
t9 = trained_model9.model.evaluate(test_features, test_labels, verbose=0)


cv0 = trained_model0.model.evaluate(validation_features, validation_labels, verbose=0)
cv1 = trained_model1.model.evaluate(validation_features, validation_labels, verbose=0)
cv2 = trained_model2.model.evaluate(validation_features, validation_labels, verbose=0)
cv3 = trained_model3.model.evaluate(validation_features, validation_labels, verbose=0)
cv4 = trained_model4.model.evaluate(validation_features, validation_labels, verbose=0)
cv5 = trained_model5.model.evaluate(validation_features, validation_labels, verbose=0)
cv6 = trained_model6.model.evaluate(validation_features, validation_labels, verbose=0)
cv7 = trained_model7.model.evaluate(validation_features, validation_labels, verbose=0)
cv8 = trained_model8.model.evaluate(validation_features, validation_labels, verbose=0)
cv9 = trained_model9.model.evaluate(validation_features, validation_labels, verbose=0)


cv = [cv0,cv1,cv2,cv3,cv4,cv5,cv6,cv7,cv8,cv9]
ts = [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9]
cv_name = ["cnn0","cnn1","cnn2","cnn3","cnn4","cnn5","cnn6","cnn7","cnn8","cnn9"]
layers = ["2","3","4","5","5","3","4","2","4","5"]
drop_out = ["false","false","false","false","true","true","false","false","false","false"]
filter_sizing = ["8,16","8,16,32","8,16,32,64","8,16,32,64,128","8,16,32,64,128","8,16,32","8,16,32,32","6,6","8,16,32,64","8,16,32,64,128"]
no_dense = ["1","1","1","1","1","1","1","4","1","1"]
paramaters = ["7,390","8,446","12,414","28,542","28,542","8,446","7,070","22,494","12,414","28,542"]
activation = ["relu","relu","relu","relu","relu","relu","relu","relu","leakyrelu","linear"]


row = zip(cv,ts,cv_name,layers,drop_out,filter_sizing,no_dense,paramaters,activation)


for cv_scores,test_score,model_name,layers,drop_out,filter_sizing,no_dense,paramaters,activation_function in row :
    #metric_scores
    
    df = df.append({
         "model":model_name,
        "number of convolution layers":layers,
        "drop out":drop_out,
        "mse test score":test_score[0],
        "mse cv score":cv_scores[0],
        "filters at each layer":filter_sizing,
        "number of dense layers":no_dense,
        "number of paramaters":paramaters,
        "activation function":activation_function
          }, ignore_index=True)

df

Unnamed: 0,model,number of convolution layers,filters at each layer,activation function,number of dense layers,number of paramaters,drop out,mse test score,mse cv score
0,cnn0,2,816,relu,1,7390,False,0.004535,0.021905
1,cnn1,3,81632,relu,1,8446,False,0.000425,0.021027
2,cnn2,4,8163264,relu,1,12414,False,0.00026,0.034203
3,cnn3,5,8163264128,relu,1,28542,False,0.000432,0.039088
4,cnn4,5,8163264128,relu,1,28542,True,0.000733,0.036853
5,cnn5,3,81632,relu,1,8446,True,0.000728,0.088048
6,cnn6,4,8163232,relu,1,7070,False,0.000162,0.020722
7,cnn7,2,66,relu,4,22494,False,0.002514,0.015095
8,cnn8,4,8163264,leakyrelu,1,12414,False,0.046555,0.291989
9,cnn9,5,8163264128,linear,1,28542,False,0.001021,0.002456


In [369]:
df #reddit 2 days

Unnamed: 0,model,number of convolution layers,filters at each layer,activation function,number of dense layers,number of paramaters,drop out,mse test score,mse cv score
0,cnn0,2,816,relu,1,7390,False,0.004535,0.021905
1,cnn1,3,81632,relu,1,8446,False,0.000425,0.021027
2,cnn2,4,8163264,relu,1,12414,False,0.00026,0.034203
3,cnn3,5,8163264128,relu,1,28542,False,0.000432,0.039088
4,cnn4,5,8163264128,relu,1,28542,True,0.000733,0.036853
5,cnn5,3,81632,relu,1,8446,True,0.000728,0.088048
6,cnn6,4,8163232,relu,1,7070,False,0.000162,0.020722
7,cnn7,2,66,relu,4,22494,False,0.002514,0.015095
8,cnn8,4,8163264,leakyrelu,1,12414,False,0.046555,0.291989
9,cnn9,5,8163264128,leakyrelu,1,28542,False,0.000189,0.001155


In [285]:
df

Unnamed: 0,model,number of convolution layers,filters at each layer,activation function,number of dense layers,number of paramaters,drop out,mse test score,mse cv score
0,cnn0,2,816,relu,1,7390,False,0.001882,0.046002
1,cnn1,3,81632,relu,1,8446,False,0.001585,0.024897
2,cnn2,4,8163264,relu,1,12414,False,0.001786,0.004645
3,cnn3,5,8163264128,relu,1,28542,False,0.000586,0.044695
4,cnn4,5,8163264128,relu,1,28542,True,0.000648,0.004002
5,cnn5,3,81632,relu,1,8446,True,0.000342,0.010436
6,cnn6,4,8163232,relu,1,7070,False,0.000621,0.002787
7,cnn7,2,66,relu,4,22494,False,0.002092,0.004656
8,cnn8,4,8163264,leakyrelu,1,12414,False,0.001139,0.004477
9,cnn9,5,8163264128,leakyrelu,1,28542,False,0.000814,0.005574


In [274]:
df

Unnamed: 0_level_0,Close_eth,Close_btc,avg_twitter_eth_subjectivity_textblob,avg_twitter_eth_pos_vader,avg_2day_twitter_eth_compound_vader,avg_2day_twitter_eth_pos_vader
timestamp,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2016-01-05 00:00:00,0.000019,0.005628,0.783333,0.201333,0.322758,0.134250
2016-01-05 00:10:00,0.000019,0.005628,0.730556,0.200000,0.294200,0.128000
2016-01-05 00:20:00,0.000019,0.005628,0.677778,0.198667,0.260650,0.124375
2016-01-05 00:30:00,0.000019,0.005628,0.625000,0.197333,0.227100,0.120750
2016-01-05 00:40:00,0.000019,0.005628,0.572222,0.196000,0.193550,0.117125
2016-01-05 00:50:00,0.000019,0.005628,0.519444,0.194667,0.160000,0.113500
2016-01-05 01:00:00,0.000019,0.005628,0.466667,0.193333,0.126450,0.109875
2016-01-05 01:10:00,0.000019,0.005628,0.413889,0.192000,0.092900,0.106250
2016-01-05 01:20:00,0.000019,0.005628,0.361111,0.190667,0.059350,0.102625
2016-01-05 01:30:00,0.000019,0.005628,0.308333,0.189333,0.025800,0.099000


In [231]:
df

Unnamed: 0,model,number of convolution layers,filters at each layer,activation function,number of dense layers,number of paramaters,drop out,mse test score,mse cv score
0,cnn0,2,816,relu,1,7390,False,0.001422,0.012461
1,cnn1,3,81632,relu,1,8446,False,0.001042,0.101277
2,cnn2,4,8163264,relu,1,12414,False,0.000338,0.023004
3,cnn3,5,8163264128,relu,1,28542,False,0.000556,0.050852
4,cnn4,5,8163264128,relu,1,28542,True,0.000424,0.037896
5,cnn5,3,81632,relu,1,8446,True,0.000564,0.018048
6,cnn6,4,8163232,relu,1,7070,False,0.000409,0.046118
7,cnn7,2,66,relu,4,22494,False,0.002841,0.168737
8,cnn8,4,8163264,leakyrelu,1,12414,False,0.045787,0.290047
9,cnn9,5,8163264128,leakyrelu,1,28542,False,0.000609,0.003464


In [154]:
df #2nd one

Unnamed: 0,model,number of convolution layers,filters at each layer,activation function,number of dense layers,number of paramaters,drop out,mse test score,mse cv score
0,cnn0,2,816,relu,1,7390,False,0.001357,0.006503
1,cnn1,3,81632,relu,1,8446,False,0.003279,0.013855
2,cnn2,4,8163264,relu,1,12414,False,0.001031,0.005121
3,cnn3,5,8163264128,relu,1,28542,False,0.002715,0.005289
4,cnn4,5,8163264128,relu,1,28542,True,0.001078,0.113907
5,cnn5,3,81632,relu,1,8446,True,0.004947,0.023261
6,cnn6,4,8163232,relu,1,7070,False,0.003029,0.013237
7,cnn7,2,66,relu,4,22494,False,0.011664,0.028382
8,cnn8,4,8163264,leakyrelu,1,12414,False,0.046672,0.29213
9,cnn9,5,8163264128,leakyrelu,1,28542,False,0.000624,0.002338


In [133]:
df # 1st one

Unnamed: 0,model,number of convolution layers,filters at each layer,activation function,number of dense layers,number of paramaters,drop out,mse test score,mse cv score
0,cnn0,2,816,relu,1,7390,False,0.003485,0.921374
1,cnn1,3,81632,relu,1,8446,False,0.004851,0.028609
2,cnn2,4,8163264,relu,1,12414,False,0.002859,0.008542
3,cnn3,5,8163264128,relu,1,28542,False,0.000685,0.014294
4,cnn4,5,8163264128,relu,1,28542,True,0.000487,0.012543
5,cnn5,3,81632,relu,1,8446,True,0.00198,0.009537
6,cnn6,4,8163232,relu,1,7070,False,0.002311,0.006138
7,cnn7,2,66,relu,4,22494,False,0.000989,0.006568
8,cnn8,4,8163264,leakyrelu,1,12414,False,0.046626,0.292069
9,cnn9,5,8163264128,leakyrelu,1,28542,False,0.001142,0.005818


Do diffent activation functions. Then finished. 

In [365]:
r = trained_model9.model.predict(test_features)

x = test_labels.flatten()
y = r.flatten()
d = np.column_stack((x,y))
%matplotlib notebook
cv_time = output_times[training_size:test_size,:].flatten()
pd.DataFrame(index=cv_time,data=d).plot(kind=("line"))

<IPython.core.display.Javascript object>

<matplotlib.axes._subplots.AxesSubplot at 0x7f37f65ed6a0>

**note to self: adding 2 day lag helps alot**

In [176]:
test_labels

array([[0.23852057, 0.23641018, 0.23500325, 0.24126399, 0.2412561 ,
        0.24442647],
       [0.24767729, 0.24836901, 0.24766561, 0.24713274, 0.24625868,
        0.24365587],
       [0.24534263, 0.24309309, 0.24282056, 0.24245998, 0.2399275 ,
        0.24133443],
       [0.24091235, 0.23869644, 0.23852057, 0.23753572, 0.23852052,
        0.23922404],
       [0.22061036, 0.22250965, 0.22163742, 0.22163742, 0.22163742,
        0.22212985],
       [0.21038199, 0.21108545, 0.21038199, 0.20756813, 0.20820124,
        0.20686471],
       [0.17840248, 0.17661568, 0.17942954, 0.18069577, 0.17412649,
        0.17626395],
       [0.217392  , 0.21861252, 0.21699455, 0.21952696, 0.21790712,
        0.21685386],
       [0.19258433, 0.19168839, 0.19163465, 0.18716765, 0.18777421,
        0.18716468],
       [0.18892631, 0.18997158, 0.18927804, 0.18853138, 0.18948205,
        0.19131814],
       [0.18477587, 0.18540794, 0.18646207, 0.1859393 , 0.18646419,
        0.18716765],
       [0.18372067, 0

In [82]:
import seaborn as sns

cv_time = output_times[test_size:validation_size,:].flatten()

validation_data = trained_model8.model.predict(validation_features)

x = validation_labels.flatten()
y = validation_data.flatten()
d = np.column_stack((x,y))

%matplotlib notebook
pd.DataFrame(index=cv_time,data=d).plot(kind=("line"))

<IPython.core.display.Javascript object>

<matplotlib.axes._subplots.AxesSubplot at 0x7f381242e3c8>

In [91]:
import seaborn as sns

cv_time = output_times[test_size:validation_size,:].flatten()

validation_data = trained_model9.model.predict(validation_features)

x = validation_labels.flatten()
y = validation_data.flatten()
d = np.column_stack((x,y))

%matplotlib notebook
pd.DataFrame(index=cv_time,data=d).plot(kind=("line"))

<IPython.core.display.Javascript object>

<matplotlib.axes._subplots.AxesSubplot at 0x7f381184fef0>

In [96]:
import seaborn as sns

cv_time = output_times[test_size:validation_size,:].flatten()

validation_data = trained_model4.model.predict(validation_features)

x = validation_labels.flatten()
y = validation_data.flatten()
d = np.column_stack((x,y))

%matplotlib notebook
pd.DataFrame(index=cv_time,data=d).plot(kind=("line"))

<IPython.core.display.Javascript object>

<matplotlib.axes._subplots.AxesSubplot at 0x7f38744bb630>

# Plot Results

In [93]:
test_labels.shape

(74, 36, 10)

# Messing Around

In [32]:
Y_closing_eth_price = Y[:,0,0]
Y_market_movement = Y[:,0,9]
Y.shape

(245, 36, 10)

In [27]:
Y[:,0,0]

array([0.00098069, 0.00098069, 0.00098069, ..., 0.34931876, 0.34929688,
       0.34917204])

In [28]:
Y[:,35,0]

IndexError: index 35 is out of bounds for axis 1 with size 1

In [None]:
0.00082603/0.00062919

In [33]:
np.seterr(divide='ignore', invalid='ignore')
np.divide(Y[:,0,0],Y[:,35,0]) 

array([0.76170337, 0.99103138, 0.72173916, 0.92831543, 1.        ,
       1.        , 0.95933611, 1.        , 0.84570564, 1.        ,
       1.        , 1.13685732, 1.05023926, 1.        , 1.        ,
       0.96272123, 1.08987612, 0.99156113, 1.04733728, 0.94960618,
       0.92252675, 0.94071232, 0.95480225, 1.04692387, 1.00000046,
       0.93842781, 1.01155952, 0.99677818, 0.97314251, 0.98384047,
       0.98369361, 0.96399787, 1.00000001, 1.        , 1.01469298,
       0.99125992, 0.95898438, 0.95862624, 1.07647969, 1.01511382,
       1.00474268, 0.98181818, 0.99587629, 0.97036283, 1.03707137,
       1.07389749, 0.9747245 , 0.99832432, 0.88248892, 0.91174036,
       1.01297355, 0.9607204 , 1.00386742, 0.96307526, 0.9888395 ,
       1.00628617, 0.95584706, 1.01216344, 0.97194116, 0.97675893,
       1.01976349, 0.99021012, 0.97725981, 1.00443143, 1.0085701 ,
       0.96059384, 0.96871602, 0.99842379, 1.06438467, 0.95325248,
       1.00640741, 1.        , 0.98798595, 1.13367361, 0.91161

In [31]:
Y.shape

(149960, 1, 10)

In [126]:
np.average(np.divide(Y[0:100000,0,0],Y[0:100000,35,0]) < 1)

0.49387755102040815

In [125]:
np.average(np.divide(Y[0:100000,0,0],Y[0:100000,35,0]) > 1)

0.47346938775510206

In [96]:
r_1 = Y[0:100000,0,0]
r_2 = Y[0:100000,35,0]
r_3 = np.divide(r_1,r_2)

r_c_lt = r_3 < 1
r_c_gt = r_3 > 1


  This is separate from the ipykernel package so we can avoid doing imports until


In [97]:
np.average(r_3[r_c_lt])

0.9713363119724743

In [101]:
(np.average(r_3[r_c_gt][1:7054]) + np.average(r_3[r_c_gt][7055:]))/2

1.0422773815639128

In [90]:
r_3[r_c][7053:7055]
# point 7055 has an inf for what ever reason.....

array([1.00766735,        inf])

In [54]:
np.average(r_3[r_c])

inf

In [44]:
np.divide(Y[0:100000,0,0],Y[0:100000,35,0])

  """Entry point for launching an IPython kernel.


array([0.76170337, 0.76170337, 0.73662672, ..., 1.00539378, 0.99584584,
       0.99255389])

In [23]:
# 6 hours interval eth closing price % change
%matplotlib notebook
pd.DataFrame(np.divide(Y[:,0,0],Y[:,35,0])).plot(kind="line")

  This is separate from the ipykernel package so we can avoid doing imports until


<IPython.core.display.Javascript object>

<matplotlib.axes._subplots.AxesSubplot at 0x7f5b342f9668>

In [39]:

xx = Y[:,0,0] 

xxx = xx == 0
xx[xxx]

array([0.])

In [22]:
X.shape

(149605, 576, 8)

In [24]:
# labels for up/down behavior 
Y[:,0,0].shape

(149605,)