<center><img src="https://github.com/insaid2018/Term-1/blob/master/Images/INSAID_Full%20Logo.png?raw=true" width="240" height="110" /></center>

# <center>Keras Assignment</center>

<br> 
<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/image/keras-logo-2018-large-1200.png" width="500" height="150" /></center>

## Table of Contents


1. [Linear Regression in Keras](#section1)<br>
2. [Logistic Regression in Keras](#section2)<br>


<a id=section1></a>

## Linear Regression in Keras

### The Boston Housing Price dataset

- You’ll attempt to predict the **median price of homes** in a given Boston suburb in the mid-1970s, given data points about the suburb at the time, such as the crime rate, the local property tax rate, and so on. 


- The dataset has relatively few data points: only **506**, **split** between **404 training** samples and **102 test** samples. 


- Each **feature** in the input data (for example, the crime rate) has a **different scale**.

  - For instance, some values are proportions, which take values between 0 and 1; 

  - Others take values between 1 and 12, others between 0 and 100, and so on.

### 1. Loading the Boston housing dataset

In [None]:
# Import tensorflow 2.x
# This code block will only work in Google Colab.
try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass

TensorFlow 2.x selected.


In [None]:
import numpy as np

from tensorflow.keras.datasets import boston_housing

In [None]:
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()

- Let’s look at the data:

In [None]:
train_data.shape

In [None]:
test_data.shape

- As you can see, you have 404 training samples and 102 test samples, each with 13 numerical features, such as per capita crime rate, average number of rooms per dwelling, accessibility to highways, and so on.


- The targets are the median values of owner-occupied homes, in thousands of dollars:

In [None]:
train_targets

### 2. Preparing the Data

- Normalizing the data.

In [None]:
mean = train_data.mean(axis=0)

In [None]:
train_data -= mean

In [None]:
std = train_data.std(axis=0)

In [None]:
train_data /= std

In [None]:
test_data -= mean

In [None]:
test_data /= std
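The cells above standardize every feature with statistics computed on the training set only, then reuse those same statistics for the test set. A minimal sketch of that idea on a small made-up array (the values and the names `toy_train` and `toy_test` are illustrative assumptions, not taken from the Boston data):

```python
import numpy as np

# Hypothetical toy data: two features on very different scales.
toy_train = np.array([[0.1, 10.0], [0.2, 20.0], [0.3, 30.0], [0.4, 40.0]])
toy_test = np.array([[0.25, 25.0], [0.5, 50.0]])

# Statistics come from the training set only.
mean = toy_train.mean(axis=0)
std = toy_train.std(axis=0)

# Apply the same shift and scale to both splits.
train_scaled = (toy_train - mean) / std
test_scaled = (toy_test - mean) / std

print(train_scaled.mean(axis=0))  # approximately 0 for every feature
print(train_scaled.std(axis=0))   # approximately 1 for every feature
print(test_scaled)                # test values are generally not centred at 0
```

Using the training mean and standard deviation for the test set keeps information about the test data out of the preprocessing step.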

### 3. Build the Network

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

#### Question 1: Instantiate a Sequential model.

In [None]:
def init_model():
    model = # Instantiate your model here.

    return model

In [None]:
model = init_model()

#### Question 2: Add layers to the model.

- Add 3 Dense layers to the model.


- 1st layer should contain:
  - **64** hidden units
  - **relu** activation function
  - input shape in the form of a tuple **(num_of_columns_in_train_set,)**


- 2nd layer should contain:
  - **64** hidden units
  - **relu** activation function


- 3rd layer should contain:
  - Units equal to the number of outputs of a regression problem, i.e. **1**.
  - Use the **linear** activation function, or omit the activation argument altogether (a Dense layer without an activation applies no transformation, which is equivalent to linear).

In [None]:
def build_model(model):
    # Add layers to your model here.

    return model

In [None]:
model = build_model(model)

In [None]:
model.summary()
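As a cross-check on the summary, the number of trainable parameters in a fully connected layer is `inputs * units + units` (weights plus one bias per unit). A small sketch, assuming the architecture described in Question 2 and the 13 input features of the training data (the helper `dense_params` is a hypothetical name introduced here for illustration):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input features, followed by the units of the three Dense layers.
layer_sizes = [13, 64, 64, 1]
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts)       # [896, 4160, 65]
print(sum(counts))  # 5121
```

If the summary of your model reports different numbers, the layer sizes or the input shape probably differ from the specification.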

#### Question 3: Compile the model.

- Use **adam** `optimizer`, and **mse** as `loss`.

- Use **mae** in the `metrics` parameter.

In [None]:
def compile_model(model):
    # Compile your model here.

    return model

In [None]:
model = compile_model(model)

#### Question 4: Fit the Model

- Fit the model with the training data.

- Use **80 epochs** and a **batch size of 16** in the `fit` function.

- Don't specify any other hyperparameter.
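The two settings in Question 4 determine how many weight updates training performs. A quick arithmetic sketch, assuming `fit` keeps the final, smaller batch (its usual behaviour for in-memory arrays):

```python
import math

n_train = 404     # training samples, from train_data.shape
batch_size = 16
epochs = 80

# The last batch of each epoch holds the 4 leftover samples.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 26
print(total_updates)    # 2080
```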

In [None]:
def fit_model(model):
    # Fit your model here.

    return model

In [None]:
model = fit_model(model)

#### Question 5: Evaluate the Model on the Test data.

- Use the `evaluate` function to evaluate the model on the test data.

In [None]:
def evaluate_model(model):
    test_mse_score, test_mae_score = # Evaluate your Model on the Test data here.

    return test_mse_score, test_mae_score

In [None]:
test_mse_score, test_mae_score = evaluate_model(model)

- Here’s the final result on the test dataset:

In [None]:
print('Mean Absolute Error:', test_mae_score)
print('Mean Squared Error:', test_mse_score)

In [None]:
print('Root Mean Squared Error:', np.sqrt(test_mse_score))
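To make the three reported numbers concrete, here is a minimal sketch that computes them by hand with NumPy. The targets and predictions are made-up values (in thousands of dollars), not output from the model:

```python
import numpy as np

# Hypothetical targets and predictions, in thousands of dollars.
y_true = np.array([20.0, 25.0, 30.0, 35.0])
y_pred = np.array([22.0, 24.0, 27.0, 35.0])

errors = y_pred - y_true        # [2, -1, -3, 0]
mse = np.mean(errors ** 2)      # (4 + 1 + 9 + 0) / 4
mae = np.mean(np.abs(errors))   # (2 + 1 + 3 + 0) / 4
rmse = np.sqrt(mse)

print(mse)   # 3.5
print(mae)   # 1.5
print(rmse)  # about 1.87
```

MAE and RMSE are in the same units as the targets, so an MAE of 1.5 here would mean predictions are off by $1,500 on average. MSE is in squared units and penalizes large errors more heavily.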

<a id=section2></a>

## Logistic Regression in Keras

### Titanic Dataset

The goal is to __predict the survival__ of passengers travelling on the RMS __Titanic__ using __logistic regression__.

### 1. Loading the Titanic Dataset

In [None]:
import numpy as np
import pandas as pd

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/insaid2018/Term-1/master/Data/Casestudy/titanic_train.csv')
df.head()

In [None]:
df.info()

- The dataset contains information about the people who boarded the RMS Titanic. Its variables include age, sex, fare, ticket, and so on.


- The dataset comprises __891 observations and 12 columns__. The table below lists the columns and their descriptions.

| Column Name   | Description                                                      |
| ------------- |:----------------------------------------------------------------:|
| PassengerId   | Passenger identity                                               |
| Survived      | Whether the passenger survived or not                            |
| Pclass        | Class of ticket                                                  |
| Name          | Name of passenger                                                |
| Sex           | Sex of passenger                                                 |
| Age           | Age of passenger                                                 |
| SibSp         | Number of siblings and/or spouses travelling with the passenger  |
| Parch         | Number of parents and/or children travelling with the passenger  |
| Ticket        | Ticket number                                                    |
| Fare          | Price of ticket                                                  |
| Cabin         | Cabin number                                                     |
| Embarked      | Port of embarkation                                              |

### 2. Preparing the Data

- Dealing with missing values<br/>
    - Replacing missing entries of __Embarked__ with the most frequent value (the mode).
    - Replacing missing values of __Age__ and __Fare__ with their median values.
    - Dropping the column __'Cabin'__ as it has too many _null_ values.

In [None]:
df.Embarked = df.Embarked.fillna(df['Embarked'].mode()[0])

In [None]:
median_age = df.Age.median()
median_age

In [None]:
median_fare = df.Fare.median()
median_fare

In [None]:
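# Assign the filled columns back; fillna(inplace=True) on df.Age may not update df in recent pandas versions.
df['Age'] = df['Age'].fillna(median_age)
df['Fare'] = df['Fare'].fillna(median_fare)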

In [None]:
df.drop('Cabin', axis=1, inplace=True)
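The three missing-value strategies used above can be seen side by side on a tiny made-up frame. This is only a sketch: the frame `toy` and its values are hypothetical and merely mimic the kinds of gaps found in the Titanic data.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame with the same kinds of gaps as the Titanic data.
toy = pd.DataFrame({
    'Age': [22.0, np.nan, 30.0, 40.0],
    'Embarked': ['S', 'C', None, 'S'],
    'Cabin': [None, 'C85', None, None],
})

# Numeric gap: fill with the median of the observed values (22, 30, 40 -> 30).
toy['Age'] = toy['Age'].fillna(toy['Age'].median())

# Categorical gap: fill with the most frequent value ('S').
toy['Embarked'] = toy['Embarked'].fillna(toy['Embarked'].mode()[0])

# Mostly empty column: drop it.
toy = toy.drop(columns='Cabin')

print(toy)
```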

- Creating a new feature named __FamilySize__.

In [None]:
df['FamilySize'] = df['SibSp'] + df['Parch'] + 1
df.head()

- Segmenting the __Sex__ column by __Age__: passengers younger than 15 are labelled __child__, and passengers aged 15 or older keep their gender label (__male__ or __female__).

In [None]:
df['GenderClass'] = df.apply(lambda x: 'child' if x['Age'] < 15 else x['Sex'], axis=1)
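The row-wise `apply` above works, but the same rule can also be written as a vectorized operation. A minimal sketch on a made-up frame (the frame `toy` is hypothetical; `Series.mask` replaces values wherever the condition is true):

```python
import pandas as pd

# Hypothetical passengers around the age threshold.
toy = pd.DataFrame({
    'Age': [4.0, 15.0, 38.0, 14.0],
    'Sex': ['male', 'female', 'male', 'female'],
})

# Replace 'Sex' with 'child' where Age < 15; keep it everywhere else.
toy['GenderClass'] = toy['Sex'].mask(toy['Age'] < 15, 'child')

print(toy['GenderClass'].tolist())  # ['child', 'female', 'male', 'child']
```

On a dataset of this size the difference in speed is negligible, so this is a matter of style rather than necessity.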

In [None]:
df[df.Age < 15].head(2)

In [None]:
df[df.Age >= 15].head(2)

- __Dummification__ of __GenderClass__ & __Embarked__.

In [None]:
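# dtype=int keeps the dummy columns as 0/1 integers (recent pandas versions default to booleans).
df = pd.get_dummies(df, columns=['GenderClass', 'Embarked'], drop_first=True, dtype=int)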
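To see what `drop_first=True` does, consider a made-up frame with the same two categorical columns. The frame `toy` is hypothetical, and `dtype=int` is used here only so the output prints as 0/1:

```python
import pandas as pd

toy = pd.DataFrame({
    'GenderClass': ['child', 'female', 'male', 'male'],
    'Embarked': ['S', 'C', 'Q', 'S'],
})

# drop_first=True removes the first category of each column (alphabetical order here),
# which then acts as the implicit baseline.
encoded = pd.get_dummies(toy, columns=['GenderClass', 'Embarked'],
                         drop_first=True, dtype=int)

print(encoded.columns.tolist())
# ['GenderClass_female', 'GenderClass_male', 'Embarked_Q', 'Embarked_S']
print(encoded.iloc[0].tolist())  # [0, 0, 0, 1]: a child who embarked at 'S'
```

A row of all zeros within one group therefore means the dropped category: `child` for GenderClass and `C` for Embarked.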

- __Dropping__ the columns __'Name', 'Ticket', 'Sex', 'SibSp' and 'Parch'__.

In [None]:
df.drop(['Name','Ticket','Sex','SibSp','Parch'], axis = 1, inplace=True)
df.head()

- Preparing X and y using pandas.

In [None]:
X = df.loc[:, df.columns != 'Survived']
X.head()

In [None]:
y = df.Survived
y[:10]

- Splitting the dataset into train and test sets.

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1)

In [None]:
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

### 3. Build the Network

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

#### Question 6: Instantiate a Sequential model.

In [None]:
def init_model():
    model = # Instantiate your model here.

    return model

In [None]:
model = init_model()

#### Question 7: Add layers to the model.

- Add 3 Dense layers to the model.


- 1st layer should contain:
  - **64** hidden units
  - **relu** activation function
  - input shape in the form of a tuple **(num_of_columns_in_train_set,)**


- 2nd layer should contain:
  - **64** hidden units
  - **relu** activation function


- 3rd layer should contain:
  - Units equal to the number of outputs of a binary classification problem, i.e. **1**.
  - Use the **sigmoid** activation function (in **binary** classification problems, the last layer uses **sigmoid** so that its output can be read as a probability).
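The role of the sigmoid in the last layer can be sketched in a few lines of NumPy. The raw scores below are made-up values, and the 0.5 threshold is a common convention rather than a requirement:

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

scores = np.array([-2.0, 0.0, 3.0])  # hypothetical raw outputs of the last layer
probs = sigmoid(scores)
labels = (probs >= 0.5).astype(int)   # a common, but not mandatory, threshold

print(probs)   # roughly [0.119, 0.5, 0.953]
print(labels)  # [0 1 1]
```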

In [None]:
def build_model(model):
    # Add layers to your model here.

    return model

In [None]:
model = build_model(model)

In [None]:
model.summary()

#### Question 8: Compile the model.

- Use **adam** `optimizer`, and **binary_crossentropy** as `loss`.

- Use **accuracy** in the `metrics` parameter.
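For intuition about the loss and metric named above, here is a minimal NumPy sketch that computes both by hand. The labels and probabilities are made-up values, and the accuracy calculation assumes a 0.5 decision threshold:

```python
import numpy as np

y_true = np.array([1, 0, 1, 0])          # hypothetical labels
p_pred = np.array([0.9, 0.2, 0.4, 0.6])  # hypothetical predicted probabilities

# Binary cross-entropy: average negative log-likelihood of the true labels.
bce = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

# Accuracy: share of predictions on the correct side of the 0.5 threshold.
accuracy = np.mean((p_pred >= 0.5).astype(int) == y_true)

print(bce)       # about 0.54
print(accuracy)  # 0.5
```

The loss rewards confident correct predictions (0.9 for a true 1) and penalizes confident wrong ones, while accuracy only counts which side of the threshold each prediction falls on.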

In [None]:
def compile_model(model):
    # Compile your model here.

    return model

In [None]:
model = compile_model(model)

#### Question 9: Fit the Model

- Fit the model with the training data.

- Use **100 epochs** and a **batch size of 16** in the `fit` function.

- Don't specify any other hyperparameter.

In [None]:
def fit_model(model):
    history = # Fit your model here.

    return model, history

In [None]:
model, history = fit_model(model)

#### Question 10: Evaluate the Model on the Test data.

- Use the `evaluate` function to evaluate the model on the test data.

In [None]:
def evaluate_model(model):
    loss, accuracy = # Evaluate your Model on the Test data here.

    return loss, accuracy

In [None]:
loss, accuracy = evaluate_model(model)

- Here’s the final result:

In [None]:
print('Accuracy:', accuracy)