# Stock prices dataset
The data contains a stock exchange's stock listings for each trading day from 2010 to 2016.

## Description
A brief description of the columns:
- open: The opening market price of the equity symbol on the date
- high: The highest market price of the equity symbol on the date
- low: The lowest recorded market price of the equity symbol on the date
- close: The closing recorded price of the equity symbol on the date
- symbol: Symbol of the listed company
- volume: Total traded volume of the equity symbol on the date
- date: Date of record

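In this assignment, we will work with the stock prices dataset named "prices.csv". The task is to create a neural network that predicts the closing price of a stock from some of the other columns. Because the target is a continuous value, this is a regression problem rather than a classification problem.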

In [40]:
# Initialize the random number generator
import random
random.seed(0)

# Ignore the warnings
import warnings
warnings.filterwarnings("ignore")

import pandas as pd

## Question 1

### Load the data
- load the csv file and read it using pandas
- file name is prices.csv

In [41]:
# run this cell to upload file using GUI if you are using google colab

from google.colab import files
files.upload()

{}

In [7]:
# run this cell to mount Google Drive if you are using Google Colab

from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [42]:
path = '/content/drive/My Drive/ColabNotebooks/prices.csv'
df = pd.read_csv(path)

## Question 2

### Drop null
- Drop null values if any
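
One detail worth checking here: by default `DataFrame.dropna()` returns a new DataFrame and leaves the original untouched, so the result has to be assigned back to take effect. A minimal sketch on a hypothetical three-row frame (the values are made up for illustration and are not taken from prices.csv):

```python
import pandas as pd

# Hypothetical toy frame with one missing value (not from prices.csv)
toy = pd.DataFrame({"open": [10.0, None, 12.0], "close": [10.5, 11.5, 12.5]})

toy.dropna()             # returns a new frame; `toy` itself is unchanged
rows_before = len(toy)

toy = toy.dropna()       # assign the result to keep the cleaned frame
rows_after = len(toy)

print(rows_before, rows_after)
```

The first call has no lasting effect, which is why the row count only changes after the assignment.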

In [43]:
df = df.dropna()  # dropna() returns a new DataFrame, so assign the result back
df

Unnamed: 0,date,symbol,open,close,low,high,volume
0,2016-01-05 00:00:00,WLTW,123.430000,125.839996,122.309998,126.250000,2163600.0
1,2016-01-06 00:00:00,WLTW,125.239998,119.980003,119.940002,125.540001,2386400.0
2,2016-01-07 00:00:00,WLTW,116.379997,114.949997,114.930000,119.739998,2489500.0
3,2016-01-08 00:00:00,WLTW,115.480003,116.620003,113.500000,117.440002,2006300.0
4,2016-01-11 00:00:00,WLTW,117.010002,114.970001,114.089996,117.330002,1408600.0
...,...,...,...,...,...,...,...
851259,2016-12-30,ZBH,103.309998,103.199997,102.849998,103.930000,973800.0
851260,2016-12-30,ZION,43.070000,43.040001,42.689999,43.310001,1938100.0
851261,2016-12-30,ZTS,53.639999,53.529999,53.270000,53.740002,1701200.0
851262,2016-12-30 00:00:00,AIV,44.730000,45.450001,44.410000,45.590000,1380900.0


### Drop columns
- We don't need the "date", "volume" and "symbol" columns
- Drop the "date", "volume" and "symbol" columns from the data


In [44]:
df.drop(['date','volume','symbol'],axis=1,inplace=True)

## Question 3

### Print the dataframe
- print the modified dataframe

In [45]:
df.head(20)

Unnamed: 0,open,close,low,high
0,123.43,125.839996,122.309998,126.25
1,125.239998,119.980003,119.940002,125.540001
2,116.379997,114.949997,114.93,119.739998
3,115.480003,116.620003,113.5,117.440002
4,117.010002,114.970001,114.089996,117.330002
5,115.510002,115.550003,114.5,116.059998
6,116.459999,112.849998,112.589996,117.07
7,113.510002,114.379997,110.050003,115.029999
8,113.330002,112.529999,111.919998,114.879997
9,113.660004,110.379997,109.870003,115.870003


### Get features and label from the dataset in separate variables
- Let's separate labels and features now. We are going to predict the value for "close" column so that will be our label. Our features will be "open", "low", "high"
- Take "open" "low", "high" columns as features
- Take "close" column as label

In [46]:
y = df['close']

In [47]:
X=df.drop('close',axis=1)

## Question 4

### Create train and test sets
- Split the data into training and testing
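
To illustrate what a seeded split does, here is a minimal pure-Python sketch. It is not scikit-learn's implementation; the helper name `simple_train_test_split` and the toy rows are assumptions made for illustration only.

```python
import random

def simple_train_test_split(rows, test_size=0.2, seed=7):
    """Shuffle row indices with a fixed seed and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)        # same seed -> same shuffle
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

rows = list(range(10))  # hypothetical stand-in for dataset rows
train_rows, test_rows = simple_train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Fixing the seed makes the split reproducible, which is the role `random_state` plays in `train_test_split`.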

In [57]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 7)

## Question 5

### Scaling
- Scale the data (features only)
- Use StandardScaler
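
Standardization rescales each feature to zero mean and unit variance, and the statistics should be learned from the training data only and then reused for the test data. The sketch below shows the idea in plain Python for a single column; the helper names and values are hypothetical, and it uses the population standard deviation, which is what `StandardScaler` uses.

```python
import statistics

def fit_standardizer(column):
    """Return the mean and population standard deviation of a training column."""
    return statistics.mean(column), statistics.pstdev(column)

def standardize(column, mean, std):
    """Apply z = (x - mean) / std using previously learned statistics."""
    return [(x - mean) / std for x in column]

train_open = [100.0, 110.0, 120.0]   # hypothetical training values
test_open = [130.0]                  # hypothetical test value

mean, std = fit_standardizer(train_open)          # learned from training data only
scaled_train = standardize(train_open, mean, std)
scaled_test = standardize(test_open, mean, std)   # reuses the training statistics
```

Reusing the training statistics for the test set avoids leaking information about the test data into preprocessing.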

In [58]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

In [59]:
scaler.fit(X_train)
scaledX_train = scaler.transform(X_train)
scaledX_test = scaler.transform(X_test)

## Question 6

### Convert data to NumPy array
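- Convert features and labels to NumPy arrays (the scaled features returned by `scaler.transform` are already NumPy arrays, so only the labels need converting)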

In [60]:
import numpy as np
y_train = np.array(y_train)

In [61]:
y_test = np.array(y_test)

## Question 7

### Define Model
- Initialize a Sequential model
- Add a Flatten layer
- Add a Dense layer with one neuron as output
  - add 'linear' as activation function
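
A Flatten layer followed by a single Dense neuron with linear activation computes a weighted sum of the input features plus a bias, which is ordinary linear regression. The following plain-Python sketch mirrors that forward pass; the weights and the sample are hypothetical values, not learned ones.

```python
def flatten(sample):
    """Collapse a nested (features x 1) sample into a flat list of features."""
    return [value for row in sample for value in row]

def dense_linear(features, weights, bias):
    """One neuron with linear (identity) activation: weighted sum plus bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

sample = [[0.5], [-1.0], [2.0]]   # hypothetical scaled open, low, high; shape (3, 1)
weights = [0.2, 0.3, 0.5]         # hypothetical weights, not learned values
bias = 1.0

prediction = dense_linear(flatten(sample), weights, bias)
print(prediction)
```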


In [66]:
X_train = scaledX_train.reshape(scaledX_train.shape[0], scaledX_train.shape[1], 1)
X_test = scaledX_test.reshape(scaledX_test.shape[0], scaledX_test.shape[1], 1)

In [67]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dropout, BatchNormalization, Reshape, Flatten, Dense


model = Sequential()
#model.add(Reshape((X_train.shape[0],X_train.shape[1],1)))

model.add(Flatten())
model.add(Dense(1, activation='linear'))

## Question 8

### Compile the model
- Compile the model
- Use "sgd" optimizer
- for calculating loss, use mean squared error
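
To make the combination of SGD and mean squared error concrete, here is a minimal sketch of one gradient-descent update for a linear model, written in plain Python. It is an illustration under simplifying assumptions (full-batch updates, hypothetical data following y = 2x + 1, a hand-picked learning rate), not Keras's implementation.

```python
def mse(predictions, targets):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def sgd_step(weights, bias, batch_x, batch_y, learning_rate=0.01):
    """One gradient-descent update of a linear model on a batch under MSE loss."""
    n = len(batch_y)
    preds = [sum(w * x for w, x in zip(weights, row)) + bias for row in batch_x]
    errors = [p - t for p, t in zip(preds, batch_y)]
    # d(MSE)/dw_j = (2/n) * sum(error_i * x_ij); d(MSE)/db = (2/n) * sum(error_i)
    grad_w = [2.0 / n * sum(e * row[j] for e, row in zip(errors, batch_x))
              for j in range(len(weights))]
    grad_b = 2.0 / n * sum(errors)
    new_weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - learning_rate * grad_b

# Hypothetical one-feature data following y = 2x + 1
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]

weights, bias = [0.0], 0.0
initial_loss = mse([bias] * len(ys), ys)
for _ in range(2000):
    weights, bias = sgd_step(weights, bias, xs, ys, learning_rate=0.05)
final_loss = mse([weights[0] * x[0] + bias for x in xs], ys)
```

After enough updates the weight and bias approach 2 and 1, and the loss falls close to zero.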

In [68]:
from tensorflow.keras import optimizers
model.compile(optimizer='sgd', loss='mean_squared_error', metrics=['mse'])

## Question 9

### Fit the model
- epochs: 50
- batch size: 128
- specify validation data

In [69]:
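# No separate validation set was created, so the test split is passed as validation data
model.fit(X_train, y_train, batch_size=128, epochs=50, validation_data=(X_test, y_test))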

Epoch 1/50
...
Epoch 50/50


<tensorflow.python.keras.callbacks.History at 0x7feed0271390>

## Question 10

### Evaluate the model
- Evaluate the model on test data

In [70]:
results = model.evaluate(X_test, y_test)



In [71]:
print(model.metrics_names)
print(results)   

['loss', 'mse']
[0.7005979418754578, 0.7005979418754578]
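
Since the labels were not scaled, the test MSE reported above is in squared price units. Taking the square root gives a root mean squared error in the same units as the closing price, which is easier to interpret. A small sketch of that arithmetic:

```python
import math

test_mse = 0.7005979418754578   # loss reported by model.evaluate above
test_rmse = math.sqrt(test_mse)
print(round(test_rmse, 3))      # prints 0.837
```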


### Manual predictions
- Test the predictions on manual inputs
- We have scaled our training data, so we need to transform our custom inputs using the same scaler object
- Example of manual input: [123.430000, 122.30999, 116.250000]

In [75]:
manual_test = scaler.transform([[123.430000, 122.30999, 116.250000]])

y_pred = model.predict(manual_test)
print(y_pred)

[[119.75714]]


# Build a DNN

### Collect Fashion mnist data from tf.keras.datasets 

### Change train and test labels into one-hot vectors
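
As a reminder of what one-hot encoding means, the sketch below converts integer class indices into vectors with a single 1 at the class position. It is written in plain Python with a hypothetical helper `one_hot`; in Keras, `tf.keras.utils.to_categorical` serves the same purpose. Fashion MNIST has 10 classes.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1.0 at the index of the class."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

labels = [9, 0, 3]  # hypothetical class indices
encoded = [one_hot(label) for label in labels]
print(encoded[0])
```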

### Build the Graph

### Initialize model, reshape & normalize data
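
Fashion MNIST images are 28x28 grids of pixel intensities from 0 to 255. For a fully connected network, each image is typically flattened into a vector of 784 values and scaled to the range [0, 1]. A minimal plain-Python sketch of that step, using a hypothetical image and helper name:

```python
def flatten_and_normalize(image):
    """Flatten a 2-D grid of 0-255 pixel values into a 1-D list scaled to [0, 1]."""
    return [pixel / 255.0 for row in image for pixel in row]

# Hypothetical 28x28 image: all black except one white pixel
image = [[0] * 28 for _ in range(28)]
image[0][0] = 255

vector = flatten_and_normalize(image)
print(len(vector))
```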

### Add two fully connected layers with 200 and 100 neurons respectively with `relu` activations. Add a dropout layer with `p=0.25`
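
During training, a dropout layer with rate 0.25 sets roughly a quarter of its inputs to zero and scales the remaining ones by 1 / (1 - 0.25) so that the expected sum stays the same. The sketch below illustrates this in plain Python; the helper `inverted_dropout` and the activations are hypothetical, and this is not Keras's implementation.

```python
import random

def inverted_dropout(activations, rate=0.25, rng=None):
    """Zero each activation with probability `rate`; scale survivors by 1/(1-rate)."""
    rng = rng or random.Random(0)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else a * keep_scale for a in activations]

activations = [1.0] * 1000   # hypothetical layer outputs
dropped = inverted_dropout(activations, rate=0.25, rng=random.Random(0))
zero_fraction = dropped.count(0.0) / len(dropped)
print(zero_fraction)
```

At inference time dropout is disabled, so activations pass through unchanged.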

### Add the output layer: a fully connected layer with 10 neurons and `softmax` activation. Use `categorical_crossentropy` loss and the `adam` optimizer, train the network, and report the final validation metrics.
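
The softmax activation turns raw output scores into probabilities that sum to 1, and categorical cross-entropy penalizes a low probability on the true class. A minimal plain-Python sketch of both, using three hypothetical classes for brevity (the assignment uses 10):

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probabilities, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot_target, probabilities))

logits = [2.0, 1.0, 0.1]     # hypothetical raw outputs for three classes
target = [1.0, 0.0, 0.0]     # one-hot label for class 0

probs = softmax(logits)
loss = categorical_crossentropy(target, probs)
print(probs, loss)
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1.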