# Neural Network Application: Predicting SP500

### We first import the Python libraries NumPy and Pandas.

"NumPy is the fundamental package for scientific computing with Python."

"Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language."

In [25]:
import numpy as np
import pandas as pd

### We then import the Keras "Sequential" model and the "Dense" layer.

The "Sequential" model is a linear stack of layers.

The "Dense" class implements a fully connected layer. In a fully connected layer, each neuron receives input from every element of the previous layer.
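
To make "fully connected" concrete, here is a minimal NumPy sketch of what one such layer computes. It is an illustration only, not Keras's implementation: the function name `dense_forward`, the random weights, and the 3 sample rows are all made up for the example.

```python
import numpy as np

def dense_forward(x, weights, bias):
    """One fully connected layer with ReLU: every neuron sees every input."""
    z = x @ weights + bias       # weighted sum of all inputs plus a bias, per neuron
    return np.maximum(z, 0.0)    # ReLU activation

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 5))          # 3 hypothetical samples with 5 input features
weights = rng.normal(size=(5, 8))    # one weight for each (input, neuron) pair
bias = np.zeros(8)                   # one bias per neuron

out = dense_forward(x, weights, bias)
print(out.shape)  # (3, 8): 8 neuron outputs for each of the 3 samples
```

The weight matrix has one entry for every input-neuron pair, which is exactly what "each neuron receives input from every element of the previous layer" means.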

In [26]:
from keras.models import Sequential
from keras.layers import Dense

### Now we import the dataset "SP500dailyinputdata.csv". 
This dataset includes a variable that indicates whether the S&P 500 index goes up or down on a given day, together with the daily returns on each of the previous 5 days. The sample period is from January 2015 to December 2019.

### YOUR TURN
To import "SP500dailyinputdata.csv", complete the next line by replacing "?".

In [27]:
dataset = pd.read_csv('SP500dailyinputdata.csv')
dataset.head(5) #confirm csv read

|   | UpDown | lag1return | lag2return | lag3return | lag4return | lag5return |
|---|--------|------------|------------|------------|------------|------------|
| 0 | 0 | -0.008094 | -0.008404 | 0.017888 | 0.01163 | -0.008893 |
| 1 | 0 | -0.002579 | -0.008094 | -0.008404 | 0.017888 | 0.01163 |
| 2 | 0 | -0.005813 | -0.002579 | -0.008094 | -0.008404 | 0.017888 |
| 3 | 1 | -0.009248 | -0.005813 | -0.002579 | -0.008094 | -0.008404 |
| 4 | 1 | 0.013424 | -0.009248 | -0.005813 | -0.002579 | -0.008094 |


Variable description:

UpDown = indicates whether the S&P 500 index goes up or down on day t (0: down, 1: up).

lag1return = the return of the S&P 500 index on day t-1.

lag2return = the return of the S&P 500 index on day t-2.

lag3return = the return of the S&P 500 index on day t-3.

lag4return = the return of the S&P 500 index on day t-4.

lag5return = the return of the S&P 500 index on day t-5.
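
For intuition, here is a rough sketch of how variables like these could be built from a series of closing prices. The prices are invented, and two details are assumptions because the dataset description does not state them: that returns are simple (not log) returns, and that a zero return counts as "down".

```python
import numpy as np

# Hypothetical closing prices, oldest first (not actual S&P 500 data).
prices = np.array([100.0, 101.0, 99.0, 99.5, 100.5, 100.0, 102.0, 101.0])

# Simple daily returns (assumption); returns[i] compares day i+1 with day i.
returns = prices[1:] / prices[:-1] - 1.0
n_lags = 5

rows = []
for t in range(n_lags, len(returns)):
    up_down = int(returns[t] > 0)                           # 1 if day t closed up, else 0
    lags = [returns[t - k] for k in range(1, n_lags + 1)]   # returns on t-1, ..., t-5
    rows.append([up_down] + lags)

table = np.array(rows)   # columns: UpDown, lag1return, ..., lag5return
print(table.shape)       # (2, 6)
```

Each row's lag1return reappears as the next row's lag2return, which is the same diagonal pattern visible in the dataset preview.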


### We generate two datasets X and y from the original dataset.  

X is the dataset that contains the independent (predictive) variables. 

y is the dataset that contains the outcome variable "UpDown". 

### YOUR TURN
To generate two datasets X and y, complete the next line by replacing "?".

In [28]:
X = dataset.drop(['UpDown'],axis=1)
y = dataset['UpDown']

### Now we define the neural network.  

In [29]:
model = Sequential()

### YOUR TURN

To add a layer with 8 neurons to your neural network, complete the next line by replacing "?".

In [30]:
model.add(Dense(8, input_dim=5, activation='relu'))

To add another layer with 6 neurons to your neural network, complete the next line by replacing "?".

In [31]:
model.add(Dense(6, activation='relu'))

In [32]:
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
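
With the output layer added, the network maps 5 inputs through layers of 8, 6, 4 and 1 neurons. As a quick sanity check, the number of trainable parameters can be counted by hand, assuming each Dense layer keeps its default bias term. The variable names below are only for illustration.

```python
layer_sizes = [5, 8, 6, 4, 1]   # number of inputs, then neurons in each Dense layer

params_per_layer = [
    n_in * n_out + n_out         # one weight per connection plus one bias per neuron
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)
print(params_per_layer, total_params)  # [48, 54, 28, 5] 135
```

Calling `model.summary()` on the model is one way to compare this count with what Keras reports.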

In [33]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

"input_dim" is the number of variables in the X dataset (here, the 5 lagged returns).

The two activation functions used here are "ReLU" and "sigmoid". We have learned "sigmoid" in our class.

If you are interested in learning "ReLU", here is a good website:

https://www.kaggle.com/dansbecker/rectified-linear-units-relu-in-deep-learning

("ReLU" is not required by this class).

A sigmoid works well as the output of a binary classifier because it maps any input to a value between 0 and 1, which can be read as the probability that the index goes up. ReLU is less computationally expensive than sigmoid because it only takes the maximum of 0 and its input, whereas sigmoid requires an exponential.
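
The two functions are short enough to write out directly. This is a plain NumPy sketch of the standard definitions, not the Keras code itself:

```python
import numpy as np

def sigmoid(z):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Passes positive values through unchanged and sets negatives to 0."""
    return np.maximum(z, 0.0)

z = np.array([-2.0, 0.0, 2.0])
sig = sigmoid(z)   # values strictly between 0 and 1, with sigmoid(0) = 0.5
rel = relu(z)      # [0., 0., 2.]
print(sig, rel)
```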

The loss function is "binary_crossentropy".  The optimizer is "Adam".
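
Binary cross-entropy penalizes a prediction more the further the predicted probability is from the true label. The sketch below computes it by hand on the first five UpDown values from the dataset preview. The predicted probabilities are invented, and the clipping constant `eps` is a choice made for this example to avoid taking the log of 0.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    p = np.clip(p_pred, eps, 1.0 - eps)   # keep log() away from 0 and 1
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([0, 0, 0, 1, 1])                 # UpDown for the first five rows

coin_flip = np.full(5, 0.5)                        # always predict a 50% chance of "up"
confident = np.array([0.1, 0.1, 0.1, 0.9, 0.9])    # made-up, mostly correct predictions

loss_coin = binary_crossentropy(y_true, coin_flip)   # equals ln(2), about 0.693
loss_good = binary_crossentropy(y_true, confident)   # equals -ln(0.9), about 0.105
print(loss_coin, loss_good)
```

A loss close to ln(2) therefore means the model is doing no better than always answering "50/50".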

### Fit the neural network model on the dataset

In [34]:
model.fit(X, y, epochs=150, verbose=0)

<keras.callbacks.callbacks.History at 0x2891ef10d08>

epochs: Number of epochs to train the model. An epoch is one full pass over the entire X and y data provided.

verbose: display option. 0 = silent, 1 = progress bar, 2 = one line per epoch.
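
Within each epoch, Keras processes the data in batches and updates the weights once per batch. Since `fit` is called above without `batch_size`, the Keras default of 32 applies. The arithmetic below shows how epochs translate into weight updates; the row count is hypothetical, not the size of the actual dataset.

```python
import math

n_samples = 1000   # hypothetical number of rows (assumption for illustration)
batch_size = 32    # Keras default when batch_size is not passed to fit()
epochs = 150       # as in the call above

updates_per_epoch = math.ceil(n_samples / batch_size)   # the last batch may be smaller
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, total_updates)  # 32 4800
```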

### Evaluate the neural network model

In [35]:
loss, accuracy = model.evaluate(X, y)



In [36]:
print(accuracy)

0.5379696488380432
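
Accuracy here is the share of days on which the predicted class matches UpDown, where a sigmoid output above 0.5 counts as a prediction of "up". Note that `model.evaluate(X, y)` scores the same rows the model was fitted on, so this is in-sample accuracy. The sketch below reproduces the calculation on the first five labels from the dataset preview, with invented probabilities, and compares it with a baseline that always predicts the most common class.

```python
import numpy as np

def accuracy(y_true, p_pred, threshold=0.5):
    """Share of samples whose thresholded prediction equals the true label."""
    y_hat = (p_pred > threshold).astype(int)
    return float(np.mean(y_hat == y_true))

y_true = np.array([0, 0, 0, 1, 1])                   # UpDown for the first five rows
p_pred = np.array([0.48, 0.55, 0.51, 0.52, 0.47])    # made-up sigmoid outputs

model_acc = accuracy(y_true, p_pred)                 # 2 of 5 correct -> 0.4

majority = int(y_true.mean() >= 0.5)                 # most common label in this sample (0)
baseline_acc = float(np.mean(y_true == majority))    # always predicting 0 -> 0.6
print(model_acc, baseline_acc)
```

Comparing the model's accuracy with such a baseline on the full dataset is a useful check before reading much into a number slightly above 50%.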


### YOUR TURN
Just double click the text below and you will be able to fill in the blanks. 

The accuracy of the predictions is __0.537969__. This means that __53.7969__% of cases are predicted correctly.

Is the market efficient? (You don't have to answer it here. Just give it some thought...)

Ans-> I think that, yes, the market is mostly efficient, because much of the available information on securities is already priced in. The outcome of our neural network model is consistent with this: it predicted the direction correctly only about 54% of the time, and that figure is measured on the same data the model was trained on, so we might almost as well flip a coin. The model mainly shows that historical returns are a poor predictor of future performance. To connect the model to market efficiency more firmly, though, I think we would have to incorporate more variables, such as investor sentiment, expectations for future dividend yields, and the expected financial activity of institutional investors.

### Congratulations! Now you have successfully built a neural network model!