# The Iris Dataset
The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.

The dataset contains a set of 150 records under five attributes - petal length, petal width, sepal length, sepal width and species.

First, let's select TensorFlow version 2.x in Colab.

In [7]:
pip install tensorflow==2.0


In [0]:
%tensorflow_version 2.x
import tensorflow as tf

In [0]:
# Initialize the random number generators (Python, NumPy and TensorFlow)
import random
import numpy as np
random.seed(0)
np.random.seed(0)
tf.random.set_seed(0)

# Ignore the warnings
import warnings
warnings.filterwarnings("ignore")

## Question 1

### Import dataset
- Import iris dataset
- Import the dataset using sklearn library

In [0]:
import pandas as pd
import numpy as np

In [0]:
from sklearn.datasets import load_iris
df= load_iris()
type(df)

In [0]:
print(df)

## Question 2

### Get features and label from the dataset in separate variable
- you can get the features using the .data attribute
- you can get the labels using the .target attribute

In [0]:
print (df.feature_names)

In [0]:
print (df.target)

## Question 3

### Create train and test data
- use train_test_split to get train and test set
- set a random_state: 1
- test_size: 0.25
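
As a rough illustration of what these settings imply for 150 samples, the sketch below works out the split sizes and builds a shuffled index split with NumPy. It assumes scikit-learn rounds the test count up for a fractional `test_size`; the shuffling shown here is illustrative and will not reproduce the exact rows that `train_test_split` selects with `random_state=1`.

```python
import math
import numpy as np

n_samples, test_size = 150, 0.25

# Assumption: the test count is rounded up and the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Illustrative shuffled split (not scikit-learn's exact shuffling).
rng = np.random.default_rng(1)
indices = rng.permutation(n_samples)
test_idx, train_idx = indices[:n_test], indices[n_test:]

print(n_train, n_test)  # 112 38
```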

In [0]:
from sklearn.model_selection import train_test_split
X=df.data
y=df.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

## Question 4

### One-hot encode the labels
- convert class vectors (integers) to binary class matrix
- convert labels
- number of classes: 3
- we are doing this to use categorical_crossentropy as loss
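
To make the "binary class matrix" concrete, here is a minimal NumPy sketch of one-hot encoding. The `labels` array is a hypothetical example, not taken from the dataset; `to_categorical` is expected to produce the same kind of matrix.

```python
import numpy as np

labels = np.array([0, 2, 1, 2])  # hypothetical integer class labels
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[labels]
print(one_hot)
```

Each row contains a single 1 in the column of its class, which is the target format `categorical_crossentropy` expects.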

In [0]:
from tensorflow.keras.utils import to_categorical

In [0]:
encoded_y=tf.keras.utils.to_categorical(y,num_classes=3,dtype='float32')

## Question 5

### Initialize a sequential model
- Define a sequential model

In [0]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

In [0]:
model = Sequential()

## Question 6

### Add a layer
- Use a Dense layer with an input shape of 4 (matching the feature set) and the number of outputs set to 3
- Apply Softmax on Dense Layer outputs
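
The following NumPy sketch shows, under simplified assumptions, what a Dense layer followed by softmax computes: a matrix product plus bias, then a normalization that turns each row into probabilities. The inputs and weights are random placeholders, not values learned by the model.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))   # two hypothetical samples with 4 features
W = rng.normal(size=(4, 3))   # placeholder weights: 4 inputs -> 3 classes
b = np.zeros(3)               # placeholder bias

probs = softmax(x @ W + b)
print(probs.sum(axis=1))      # each row sums to 1
```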

In [0]:
model.add(Dense(30, activation='relu', input_shape=(4,)))

In [0]:
model.add(Dense(3, activation='softmax'))

## Question 7

### Compile the model
- Use SGD as Optimizer
- Use categorical_crossentropy as loss function
- Use accuracy as metrics
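
For intuition about the loss and metric named above, here is a small NumPy sketch of categorical cross-entropy and accuracy. The function names and the sample values are illustrative assumptions; Keras applies its own implementations internally.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses.
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    # A prediction counts as correct when the most probable class matches.
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

y_true = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)   # hypothetical labels
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])    # hypothetical outputs

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss, acc)
```

Only the probability assigned to the true class contributes to the loss, so the second sample (0.3 on the correct class) is penalized more than the first (0.7).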

In [0]:
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',  # the optimizer choice is a hyperparameter
              metrics=['accuracy'])

## Question 8

### Summarize the model
- Check model layers
- Understand number of trainable parameters
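
The trainable parameter count can be checked by hand. A Dense layer has one weight per input-unit pair plus one bias per unit; the sketch below applies that to the two layers added in this notebook (4 inputs to 30 units, then 30 to 3 units). The helper name `dense_params` is hypothetical.

```python
def dense_params(n_inputs, n_units):
    # Weights (n_inputs * n_units) plus one bias per unit.
    return n_inputs * n_units + n_units

hidden = dense_params(4, 30)
output = dense_params(30, 3)
total = hidden + output

print(hidden, output, total)  # 150 93 243
```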

In [0]:
model.summary()

## Question 9

### Fit the model
- Give train data as training features and labels
- Epochs: 100
- Give validation data as testing features and labels

In [0]:
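epochs = 100
batch_size = 150
from tensorflow.keras.utils import to_categorical

# One-hot encode the integer labels (3 classes) for categorical_crossentropy
y_binary = to_categorical(y_train, num_classes=3)
y_test = to_categorical(y_test, num_classes=3)

# Use the test features and labels as validation data, as the task asks
history = model.fit(X_train, y_binary, batch_size=batch_size, epochs=epochs,
                    validation_data=(X_test, y_test), verbose=True)
loss, accuracy = model.evaluate(X_test, y_test, verbose=False)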

## Question 10

### Make predictions
- Predict labels on one row
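
One way to turn softmax outputs into a class label is to take the index of the largest probability. The sketch below uses hypothetical probabilities to contrast this with rounding, which can yield an all-zero row when no class reaches 0.5.

```python
import numpy as np

probs = np.array([[0.45, 0.40, 0.15],
                  [0.05, 0.90, 0.05]])   # hypothetical softmax outputs

rounded = np.round(probs)        # first row becomes all zeros
labels = probs.argmax(axis=1)    # always picks exactly one class per row

print(rounded)
print(labels)
```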

In [0]:
print(history.history['val_accuracy'])

print(history.history['accuracy'])

ta = pd.DataFrame(history.history['accuracy'])
va = pd.DataFrame(history.history['val_accuracy'])

tva = pd.concat([ta,va] , axis=1)

tva.boxplot()

In [0]:
y_pred = np.round(model.predict(X_test))

In [0]:
y_pred[0:10]

In [0]:
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print('Test Accuracy: %.3f' % acc)

loss, acc = model.evaluate(X_train, y_binary, verbose=0)
print('Train Accuracy: %.3f' % acc)


### Compare the prediction with actual label
- Print the same row as done in the previous step but of actual labels

In [0]:
# y_pred was computed on X_test, so compare against the matching test labels
y_test[0:10]



---



# Stock prices dataset
The data contains a stock exchange's listings for each trading day from 2010 to 2016.

## Description
A brief description of columns.
- open: The opening market price of the equity symbol on the date
- high: The highest market price of the equity symbol on the date
- low: The lowest recorded market price of the equity symbol on the date
- close: The closing recorded price of the equity symbol on the date
- symbol: Symbol of the listed company
- volume: Total traded volume of the equity symbol on the date
- date: Date of record

In this assignment, we will work on the stock prices dataset named "prices.csv". The task is to create a neural network that predicts the closing price of a stock from a few input features.

First, let's select TensorFlow version 2.x in Colab.

In [0]:
%tensorflow_version 2.x
import tensorflow
tensorflow.__version__

In [0]:
# Initialize the random number generators (Python, NumPy and TensorFlow)
import random
import numpy as np
random.seed(0)
np.random.seed(0)
tensorflow.random.set_seed(0)

# Ignore the warnings
import warnings
warnings.filterwarnings("ignore")

## Question 1

### Load the data
- load the csv file and read it using pandas
- file name is prices.csv

In [0]:
# run this cell to mount Google Drive if you are using Google Colab

from google.colab import drive
drive.mount('/content/drive')

In [0]:
import pandas as pd
import numpy as np
df=pd.read_csv("/prices (1).csv")

## Question 2

### Drop null
- Drop null values if any

In [10]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 438305 entries, 0 to 438304
Data columns (total 7 columns):
date      438305 non-null object
symbol    438304 non-null object
open      438304 non-null float64
close     438304 non-null float64
low       438304 non-null float64
high      438304 non-null float64
volume    438304 non-null float64
dtypes: float64(5), object(2)
memory usage: 23.4+ MB


In [11]:
df.isnull().sum()

date      0
symbol    1
open      1
close     1
low       1
high      1
volume    1
dtype: int64

In [0]:
df.dropna(inplace=True)

### Drop columns
- We don't need the "date", "volume" and "symbol" columns
- Drop the "date", "volume" and "symbol" columns from the data


In [0]:
df1=df.drop(['date', 'volume','symbol'], axis=1)

## Question 3

### Print the dataframe
- print the modified dataframe

In [15]:
df1.head()

|   | open | close | low | high |
|---|------|-------|-----|------|
| 0 | 123.43 | 125.839996 | 122.309998 | 126.25 |
| 1 | 125.239998 | 119.980003 | 119.940002 | 125.540001 |
| 2 | 116.379997 | 114.949997 | 114.93 | 119.739998 |
| 3 | 115.480003 | 116.620003 | 113.5 | 117.440002 |
| 4 | 117.010002 | 114.970001 | 114.089996 | 117.330002 |


### Get features and label from the dataset in separate variable
- Let's separate labels and features now. We are going to predict the value for "close" column so that will be our label. Our features will be "open", "low", "high"
- Take "open", "low", "high" columns as features
- Take "close" column as label

In [0]:
X=df1[['open','low','high']]
y=df1[['close']]

## Question 4

### Create train and test sets
- Split the data into training and testing

In [0]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

## Question 5

### Scaling
- Scale the data (features only)
- Use StandardScaler
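
As a sketch of what standardization does, the NumPy code below fits the mean and standard deviation on training data only and reuses them for the test data. The arrays are random placeholders; the population standard deviation (`ddof=0`) is used because that is how `StandardScaler` computes its scale.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=100.0, scale=20.0, size=(8, 3))  # placeholder features
test = rng.normal(loc=100.0, scale=20.0, size=(2, 3))   # placeholder features

# Statistics come from the training set only, to avoid leaking test data.
mean = train.mean(axis=0)
std = train.std(axis=0)   # population std (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
```

After scaling, each training column has mean close to 0 and standard deviation close to 1; the test columns generally will not, since they are scaled with the training statistics.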

In [0]:
from sklearn.preprocessing import StandardScaler

# Define the scaler 
scaler = StandardScaler().fit(X_train)

# Scale the train set
X_train = scaler.transform(X_train)

# Scale the test set
X_test = scaler.transform(X_test)

## Question 6

### Convert data to NumPy array
- Convert features and labels to numpy array

In [0]:
X_train=np.array(X_train)
X_test=np.array(X_test)
y_train=np.array(y_train)
y_test=np.array(y_test)

### Reshape features
- Reshape the features to make it suitable for input in the model 
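
One detail worth keeping in mind when reshaping: NumPy's `reshape` returns a new array and leaves the original untouched, so the result has to be assigned to take effect. A minimal sketch with a placeholder array:

```python
import numpy as np

X = np.arange(6, dtype=float).reshape(2, 3)   # placeholder 2-D feature array

X.reshape(X.shape[0], X.shape[1], 1)          # result discarded; X is unchanged
print(X.shape)                                # (2, 3)

X3d = X.reshape(X.shape[0], X.shape[1], 1)    # keep the result explicitly
print(X3d.shape)                              # (2, 3, 1)
```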

In [21]:
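# Note: reshape returns a new array and does not modify its input. The results
# are not assigned here, so the arrays passed to the model below remain 2-D.
X_train.reshape(X_train.shape[0],X_train.shape[1],1)
X_test.reshape(X_test.shape[0],X_test.shape[1],1)
y_train.reshape(y_train.shape[0],y_train.shape[1],1)
y_test.reshape(y_test.shape[0],y_test.shape[1],1)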

array([[[ 42.860001]],

       [[103.650002]],

       [[ 52.66    ]],

       ...,

       [[ 36.419998]],

       [[ 93.400002]],

       [[ 71.370003]]])

## Question 7

### Define Model
- Initialize a Sequential model
- Add a Flatten layer
- Add a Dense layer with one neuron as output
  - add 'linear' as activation function


In [22]:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Define the model architecture
model = Sequential()

# Hidden layer with 30 ReLU units
model.add(Dense(30, activation='relu'))
# Flatten leaves (batch, 30) activations unchanged; kept to follow the task
model.add(Flatten())
# Hidden layer with 20 sigmoid units
model.add(Dense(20, activation="sigmoid"))
# Output layer: one neuron with linear activation for regression
model.add(Dense(units=1, activation='linear'))


## Question 8

### Compile the model
- Compile the model
- Use "sgd" optimizer
- for calculating loss, use mean squared error
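
To illustrate how gradient descent reduces mean squared error, here is a small NumPy sketch on a linear model with synthetic data. It uses full-batch updates for brevity, whereas Keras's `sgd` optimizer updates on mini-batches; the data, weights and learning rate are all assumptions made for the example.

```python
import numpy as np

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))            # synthetic, already-scaled features
true_w = np.array([0.5, 0.2, 0.3])      # hypothetical target weights
y = X @ true_w

w = np.zeros(3)
lr = 0.1                                # assumed learning rate
loss_before = mse(y, X @ w)

for _ in range(100):
    # Gradient of the MSE with respect to w for a linear model.
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad

loss_after = mse(y, X @ w)
print(loss_before, loss_after)
```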

In [0]:
model.compile(loss='mean_squared_error',
              optimizer='sgd')  # the optimizer choice is a hyperparameter

## Question 9

### Fit the model
- epochs: 50
- batch size: 128
- specify validation data
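
The sample counts that Keras reports during fitting can be reproduced with a little arithmetic. The sketch below starts from the 438304 non-null rows shown by `df.info()` and assumes that scikit-learn rounds the test count up and that Keras truncates the training share of `validation_split`; both assumptions are consistent with the logged counts.

```python
import math

n_rows = 438304                        # rows left after dropna
n_test = math.ceil(0.25 * n_rows)      # held out by train_test_split
n_fit = n_rows - n_test                # rows passed to model.fit

n_train = int(n_fit * (1 - 0.3))       # kept for training
n_val = n_fit - n_train                # held out by validation_split=0.3
steps_per_epoch = math.ceil(n_train / 128)

print(n_fit, n_train, n_val, steps_per_epoch)  # 328728 230109 98619 1798
```

With a batch size of 128, each epoch therefore runs about 1798 weight updates.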

In [24]:
epochs = 50
batch_size = 128
history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=.3, verbose=True)

Train on 230109 samples, validate on 98619 samples
Epoch 1/50
...
Epoch 50/50


## Question 10

### Evaluate the model
- Evaluate the model on test data

In [0]:
loss= model.evaluate(X_test, y_test, verbose=False)

### Manual predictions
- Test the predictions on manual inputs
- We have scaled our training data, so we need to transform our custom inputs with the same fitted scaler object
- Example of manual input: [123.430000, 122.30999, 116.250000]

In [0]:
import numpy as np
y_pred=np.round(model.predict(X_test))

In [0]:
new_test=[123.430000, 122.30999, 116.250000]
ntest=np.array(new_test)

In [0]:
inp=scaler.transform(ntest.reshape(1,-1))

In [29]:
pred=model.predict(inp)
pred

array([[60.910503]], dtype=float32)