## Homework 1:

1- Build a Keras Model for linear regression (check: https://keras.io/activations/). Use Boston Housing Dataset to train and test your model

2- Build a Keras Model for logistic regression. Use diabetes.csv to train and test

Comments:

1- Build the **simplest model** for linear regression with Keras and compare your model performance with `from sklearn.linear_model import LinearRegression`

2- Build the **simplest model** for logistic regression with Keras and compare your model performance with `from sklearn.linear_model import LogisticRegression`

3- **Add more complexity to your models in (1) and (2)** and compare with previous results

## Imports

In [3]:
# Keras imports
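import tensorflow.keras as K
# take layers and models from tensorflow.keras too, so two Keras packages are not mixed
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model, Sequential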
# import the data, and function to split it
# (load_boston was removed in scikit-learn 1.2, so this cell needs an older release)
from sklearn.datasets import load_boston, load_diabetes
from sklearn.model_selection import train_test_split
# the models from sklearn
from sklearn.linear_model import LinearRegression, LogisticRegression
# metrics for evaluating regression and classification from sklearn
from sklearn.metrics import mean_squared_error, confusion_matrix
# Pandas and Numpy for data analysis, mathematical computation
import pandas as pd
import numpy as np

## 1 - Linear Regression in Keras (Boston Dataset)

In [4]:
# store the Boston data in variables
boston = load_boston()
boston_X, boston_y = boston.data, boston.target

# split the data
x_train, x_test, y_train, y_test = train_test_split(boston_X, boston_y, test_size=0.3, random_state=0)

# remind ourselves about the details of the dataset
print(boston.DESCR)

.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pu

### Implementation of the Deep Learning Model

In [7]:
# using the Functional API
inp = Input(shape=(13,))
x = Dense(64, activation='sigmoid')(inp)
# Output layer - one output neuron, and activation function is linear
out = Dense(1, activation='linear')(x)
dl_linreg = Model(inputs=inp, outputs=out)
# The loss function should be mse or mae
dl_linreg.compile(optimizer='adam', loss='mse', metrics=["mean_squared_error"])
dl_linreg.fit(x_train, y_train, epochs=100, batch_size=1, verbose=0);
loss, error = dl_linreg.evaluate(x_test, y_test, verbose=0)
print("MSE = {:.2f}".format(error))

MSE = 32.31
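
For reference, the network above is more than a linear regression: it has a hidden layer of 64 sigmoid units, whereas plain linear regression on 13 features corresponds to a single `Dense(1, activation='linear')` layer with 14 parameters. The NumPy sketch below mirrors the shapes of the model above to count its parameters and run one forward pass. The random weights and the input batch are placeholders for illustration, not trained values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden = 13, 64

# Placeholder weights with the same shapes as the two Dense layers above
W1, b1 = rng.normal(size=(n_features, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, 1)), np.zeros(1)

def forward(X):
    """Hidden sigmoid layer followed by a single linear output unit."""
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return hidden @ W2 + b2

mlp_params = W1.size + b1.size + W2.size + b2.size   # 13*64 + 64 + 64 + 1
linreg_params = n_features + 1                        # one weight per feature plus a bias

X_batch = rng.normal(size=(5, n_features))            # made-up batch of 5 rows
print(forward(X_batch).shape, mlp_params, linreg_params)
```

Dropping the hidden layer, so that the input feeds the linear output unit directly, would give the model that is directly comparable to `LinearRegression`.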


### Compared to SciKit-Learn

In [8]:
# Instantiation and Training
ml_linreg = LinearRegression().fit(x_train, y_train)
# Testing and Evaluating the Model
y_pred = ml_linreg.predict(x_test)
error = round(mean_squared_error(y_test, y_pred), 2)
print(f"MSE: {error}")

MSE: 27.2
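
As a reminder of what this number measures, `mean_squared_error` is the average of the squared residuals, so it is expressed in squared units of the target; taking its square root brings it back to the target's own units. A small sketch with made-up targets and predictions (the helper name `mse` is only for illustration):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared residuals between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up values, chosen only to keep the arithmetic easy to follow
y_true_demo = [24.0, 21.6, 34.7, 33.4]
y_pred_demo = [25.0, 20.6, 32.7, 35.4]
demo_mse = mse(y_true_demo, y_pred_demo)   # residuals -1, 1, 2, -2 -> (1 + 1 + 4 + 4) / 4
print(demo_mse)

# Root of the scikit-learn test MSE reported above, in the units of the target
rmse = float(np.sqrt(27.2))
print(round(rmse, 2))
```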


**Conclusion: Keras vs. Scikit-Learn for Regression?**

In the run shown above, the regression model implemented in Keras has a higher error (MSE = 32.31) than the one implemented in Scikit-learn (MSE = 27.2). However, I will confess that the Keras model also has a lot more *variance* between runs than the Scikit-learn model. The Keras result above is only one of several I got by running that cell multiple times: sometimes the MSE from the Keras model was lower than from Scikit-learn, and other times it was not, in a seemingly random pattern. This is expected to some degree, since the network starts from random weights and is trained with stochastic updates, while `LinearRegression` solves for its coefficients directly.
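
One way to see where the run-to-run variance comes from: a closed-form least-squares fit returns the same coefficients every time, while a model trained by stochastic gradient steps from a random start does not. The sketch below reproduces that contrast in NumPy. The synthetic data, the learning rate, the epoch count and the helper `sgd_fit` are all assumptions made for illustration, and plain SGD stands in for Adam.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + rng.normal(scale=0.5, size=200)
Xb = np.hstack([X, np.ones((200, 1))])        # append a bias column

def train_mse(w):
    return float(np.mean((Xb @ w - y) ** 2))

# Closed-form least squares: deterministic, and minimal training MSE by construction
w_ols, *_ = np.linalg.lstsq(Xb, y, rcond=None)

def sgd_fit(seed, epochs=2, lr=0.01):
    """Hypothetical helper: per-sample SGD (batch size 1) from a random start."""
    r = np.random.default_rng(seed)
    w = r.normal(size=Xb.shape[1])
    for _ in range(epochs):
        for i in r.permutation(len(y)):
            grad = 2.0 * (Xb[i] @ w - y[i]) * Xb[i]
            w -= lr * grad
    return w

sgd_mses = [train_mse(sgd_fit(seed)) for seed in range(5)]
print("closed form:", round(train_mse(w_ols), 4))
print("SGD runs:   ", [round(m, 4) for m in sgd_mses])
```

Each seed ends at a slightly different training error, and none can beat the closed-form solution on the training data. Fixing the random seed before building and fitting the Keras model is the usual way to make such a comparison repeatable.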

## 2 - Logistic Regression in Keras (Diabetes Dataset)

In [9]:
# store the diabetes data in variables
pima = pd.read_csv('diabetes.csv')

feature_cols = ['Pregnancies', 'Insulin', 'BMI', 'Age']

# X is a matrix, access the features we want in feature_cols
diabetes_X = pima[feature_cols]

# y is a vector, so we select the single 'Outcome' column
diabetes_y = pima['Outcome']

# split the data
x_train, x_test, y_train, y_test = train_test_split(diabetes_X, diabetes_y, test_size=0.3, random_state=0)

# remind ourselves about the details of the dataset
pima.head()

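| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |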


### Implementation in Keras

In [10]:
# Keras Sequential API
dl_logreg = Sequential()
# in binary classification, the activation function is sigmoidal
dl_logreg.add(Dense(1, input_shape=(4,), activation='sigmoid'))
# the loss function is binary_crossentropy
dl_logreg.compile(optimizer='adam', loss='binary_crossentropy', metrics=["accuracy"])
dl_logreg.fit(x_train, y_train, epochs=100, batch_size=1, verbose=1);
loss, accuracy = dl_logreg.evaluate(x_test, y_test, verbose=0)
print("Accuracy = {:.2f}".format(accuracy))

Epoch 1/100
...
Epoch 100/100
Accuracy = 0.63
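
A single `Dense(1, activation='sigmoid')` unit computes p = sigmoid(x·w + b), which is the logistic regression hypothesis, and `binary_crossentropy` is the negative log-likelihood of the labels averaged over the samples. A NumPy sketch of both pieces follows, using the first three rows shown by `pima.head()` and untrained (all-zero) weights. The clipping constant `eps` is an assumption for numerical safety, not a statement about Keras internals.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    """Probability of the positive class for each row of X."""
    return sigmoid(X @ w + b)

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean negative log-likelihood of the labels under probabilities p."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Pregnancies, Insulin, BMI, Age for the first three rows of the table above
X_demo = np.array([[6.0, 0.0, 33.6, 50.0],
                   [1.0, 0.0, 26.6, 31.0],
                   [8.0, 0.0, 23.3, 32.0]])
y_demo = np.array([1.0, 0.0, 1.0])
w, b = np.zeros(4), 0.0                  # untrained weights: every p is 0.5

p = predict_proba(X_demo, w, b)
loss = binary_crossentropy(y_demo, p)
print(p, loss)                           # the loss equals ln(2) when all p are 0.5
```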


### Compared to SciKit-Learn

In [11]:
# Instantiating and Training the Model
ml_logreg = LogisticRegression().fit(x_train, y_train)
# Testing the Model
y_pred = ml_logreg.predict(x_test)
# Evaluating Model Accuracy
confusion = confusion_matrix(y_test, y_pred)
TN, FP, FN, TP = confusion.ravel()
# accuracy = correct predictions / all predictions (note the parentheses around TP + TN)
accuracy = round((TP + TN) / len(y_pred), 2)
print(f'Accuracy: {accuracy}')


**Final Conclusion: Keras vs. Scikit-Learn for Classification?**

The Keras logistic regression model reached an accuracy of 0.63 on the test split of the Pima Indians Diabetes dataset, and this value is stable: it appeared not to change after multiple executions of the cell. To rank the two models, the scikit-learn accuracy has to be read on the same scale, as the fraction (TP + TN) / N of correct test predictions, and compared against that 0.63.
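
Accuracy is the share of correct predictions, (TP + TN) / N. Because division binds tighter than addition in Python, the parentheses matter: `TP + TN / N` adds the raw true-positive count to a fraction. A small sketch with made-up confusion-matrix counts (not taken from the run above):

```python
import numpy as np

# Hypothetical 2x2 confusion matrix in scikit-learn's layout: [[TN, FP], [FN, TP]]
confusion_demo = np.array([[50, 10],
                           [15, 25]])
TN, FP, FN, TP = confusion_demo.ravel()
n = confusion_demo.sum()

accuracy_demo = (TP + TN) / n      # fraction of correct predictions
unparenthesized = TP + TN / n      # parsed as TP + (TN / n): not an accuracy

print(f"accuracy    = {accuracy_demo:.2f}")
print(f"TP + TN / n = {unparenthesized:.2f}")
```

A value above 1 is a quick sign that the expression is not computing a fraction. `sklearn.metrics.accuracy_score(y_test, y_pred)` returns the same fraction without going through the confusion matrix.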

## 3 - Improving Results

### Improving the Sklearn model

In [None]:
# Normalizing the Boston data to a Z-distribution
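
A minimal sketch of what this step could look like, assuming "Z-distribution" means z-score standardization (zero mean and unit variance per feature). The statistics are computed on the training split only and reused for the test split. The synthetic arrays stand in for `x_train` and `x_test`, and the helper name `zscore_fit` is hypothetical. scikit-learn's `StandardScaler` performs the same transformation.

```python
import numpy as np

def zscore_fit(train):
    """Return the per-feature mean and standard deviation of the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0          # avoid dividing by zero for constant features
    return mean, std

rng = np.random.default_rng(0)
# Synthetic stand-ins for x_train / x_test: 13 features on very different scales
scales = np.linspace(1.0, 500.0, 13)
x_train_demo = rng.normal(loc=scales, scale=scales, size=(100, 13))
x_test_demo = rng.normal(loc=scales, scale=scales, size=(40, 13))

mean, std = zscore_fit(x_train_demo)
x_train_z = (x_train_demo - mean) / std
x_test_z = (x_test_demo - mean) / std    # reuse the training statistics

print(x_train_z.mean(axis=0).round(6))
print(x_train_z.std(axis=0).round(6))
```

The test split is deliberately not standardized with its own mean and standard deviation, so that no information from the test data leaks into preprocessing.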
