<a href="https://colab.research.google.com/github/ChristophRaab/NASDAQ-Dataset/blob/master/SentQs_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1>SentQs Demo</h1>

---

This file shows how to download the SentQs dataset and train a 1D CNN-LSTM network on it.
Remark: Set the runtime to GPU for GPU acceleration.

More information about the dataset can be found at:

https://github.com/ChristophRaab/NASDAQ-Dataset

Author: Christoph Raab 

<h2>Import modules and load data into program</h2>

In [1]:
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras.layers import Embedding
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Conv2D, BatchNormalization
from tensorflow.keras.datasets import imdb
import requests
import sys
from tensorflow.keras.utils import to_categorical
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn import preprocessing

In [2]:
# Download data
!wget --no-check-certificate https://cloud.fhws.de/index.php/s/4sJ69ocZW8epAke/download -O sentqs_dataset.npz

--2020-12-10 09:02:31--  https://cloud.fhws.de/index.php/s/4sJ69ocZW8epAke/download
Resolving cloud.fhws.de (cloud.fhws.de)... 193.174.83.161
Connecting to cloud.fhws.de (cloud.fhws.de)|193.174.83.161|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 98939622 (94M) [application/octet-stream]
Saving to: ‘sentqs_dataset.npz’


2020-12-10 09:03:03 (3.00 MB/s) - ‘sentqs_dataset.npz’ saved [98939622/98939622]



In [3]:
# Load data into program
file_name = "sentqs_dataset.npz"
data = np.load(file_name, allow_pickle=True)
Xs = data["arr_0"]
Ys = data["arr_1"]
Xt = data["arr_2"]
Yt = data["arr_3"]
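
The keys `arr_0` to `arr_3` are the default names NumPy assigns when arrays are saved positionally with `np.savez`. Before relying on that order, it can help to list what an archive actually contains. The following is a minimal sketch: `describe_npz` is a hypothetical helper, not part of the original notebook, and the demo runs on a tiny synthetic archive rather than the real dataset.

```python
import os
import tempfile

import numpy as np


def describe_npz(path):
    """Return {key: (shape, dtype)} for every array stored in an .npz archive."""
    # allow_pickle=True mirrors the notebook's own np.load call.
    with np.load(path, allow_pickle=True) as archive:
        return {
            key: (archive[key].shape, str(archive[key].dtype))
            for key in archive.files
        }


# Tiny synthetic archive standing in for sentqs_dataset.npz (illustration only).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.npz")
    np.savez(demo_path, np.zeros((4, 200)), np.zeros(4, dtype=np.int64))
    summary = describe_npz(demo_path)

for key, (shape, dtype) in summary.items():
    print(key, shape, dtype)
```

Calling `describe_npz("sentqs_dataset.npz")` after the download step would list the four arrays loaded above.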

<h2>Preprocess (run only once per runtime)</h2>

In [4]:
# Standardize data
Xs = (Xs - Xs.mean(0)) / Xs.std(0)
Xt = (Xt - Xt.mean(0)) / Xt.std(0)

# Make data compatible with conv1d layers
Xs = np.expand_dims(Xs, 2)
Xt = np.expand_dims(Xt, 2)

# Make labels compatible with categorical cross-entropy
Ys = to_categorical(Ys, 3)
Yt = to_categorical(Yt, 3)
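
The preprocessing cell does three things: it z-scores each feature column, appends a channel axis so that `Conv1D` receives input of shape `(samples, steps, channels)`, and one-hot encodes the labels. The sketch below reproduces these steps in plain NumPy on synthetic data. The helpers `standardize` and `one_hot` are hypothetical, and the guard against zero-variance columns is an added assumption that the original cell does not include.

```python
import numpy as np


def standardize(X, eps=1e-8):
    """Column-wise z-score; columns with (near-)zero std are left unscaled."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    safe_std = np.where(std < eps, 1.0, std)  # assumption: avoid division by zero
    return (X - mean) / safe_std


def one_hot(labels, num_classes):
    """Rough NumPy counterpart of to_categorical for integer class labels."""
    return np.eye(num_classes, dtype=np.float32)[np.asarray(labels, dtype=int)]


rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(6, 4))
X[:, 1] = 5.0  # constant column: unguarded division would produce NaN here
y = np.array([0, 2, 1, 1, 0, 2])

X_std = np.expand_dims(standardize(X), 2)  # (6, 4) -> (6, 4, 1)
Y = one_hot(y, 3)                          # (6,)   -> (6, 3)

print(X_std.shape, Y.shape)
```

The shape change also illustrates why the cell should run only once per runtime: a second call would append another axis to the features and one-hot encode labels that are already encoded.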



<h2>Model parameters and definition</h2>

In [5]:
# Model Parameters
# Convolution
kernel_size = 5
filters = 64
pool_size = 4
# LSTM
lstm_output_size = 70
# Training
batch_size = 128
epochs = 30
num_classes = Ys.shape[1]

# Define and compile CNN-LSTM-Network
model = Sequential()
model.add(Conv1D(filters,
                 kernel_size,
                 padding='valid',
                 activation='relu',
                 strides=1))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(lstm_output_size))
model.add(Dense(100))
model.add(Dense(35))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
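
The output shapes and parameter counts that `model.summary()` reports for this network follow directly from the hyperparameters. The sketch below works through the arithmetic in plain Python. The helper names are hypothetical, and the input length of 200 with a single channel is taken from the shape assertions in the training cell.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of a 'valid' (unpadded) 1D convolution."""
    return (length - kernel_size) // stride + 1


def pool_out_len(length, pool_size):
    """Output length of max pooling whose stride equals pool_size."""
    return (length - pool_size) // pool_size + 1


def conv1d_params(in_channels, filters, kernel_size):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (input_dim + units + 1) * units


def dense_params(input_dim, units):
    """Weight matrix plus one bias per unit."""
    return input_dim * units + units


seq_len, channels = 200, 1
kernel_size, filters, pool_size, lstm_units = 5, 64, 4, 70

after_conv = conv1d_out_len(seq_len, kernel_size)  # 196 time steps
after_pool = pool_out_len(after_conv, pool_size)   # 49 time steps

print("conv1d:", (after_conv, filters), conv1d_params(channels, filters, kernel_size))
print("max_pooling1d:", (after_pool, filters), 0)
print("lstm:", (lstm_units,), lstm_params(filters, lstm_units))
print("dense:", (100,), dense_params(lstm_units, 100))
```

These values match the `conv1d`, `max_pooling1d`, `lstm` and first `dense` rows of the model summary printed after training.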

<h2>Train and evaluate network</h2>

In [6]:
# Test if data has valid shapes
assert Xs.shape == (21395, 200, 1)
assert Ys.shape == (21395, 3)
assert Xt.shape == (40134, 200, 1)
assert Yt.shape == (40134, 3)
model.fit(Xs, Ys,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(Xt, Yt))
score, acc = model.evaluate(Xt, Yt, batch_size=batch_size)
model.summary()
print("Evaluated CNN-LSTM Neural Network")
print('Test score:', score)
print('Test accuracy:', acc)


Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d (Conv1D)              (None, 196, 64)           384       
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 49, 64)            0         
_________________________________________________________________
lstm (LSTM)                  (None, 70)                37800     
_________________________________________________________________
dense (Dense)                (None, 100)               7100      
___________________________________