# Intro to Deep Learning with Keras

This Jupyter notebook contains code and explanations for the 2018 AIS Intro to Deep Learning workshop.

## How to Use This Notebook
This notebook has several cells, some with markdown and others with runnable Python code. To run a cell, click on the cell and then use the **SHIFT + ENTER** keyboard shortcut or navigate to **Cell** in the top menu bar and click on **Run Cells** in the dropdown menu.

## Software Prerequisites
Make sure to install the following software/libraries:

- **Anaconda** - Python distribution with many useful libraries
- **TensorFlow** - deep learning library, acts as a backend for Keras
- **Keras** - a high-level deep learning library that runs on top of TensorFlow

## Libraries Used

- **NumPy** - for handling linear algebra and numerical computations in machine learning.
- **pandas** - for reading in, preprocessing, and analyzing data.
- **scikit-learn** - a general-purpose ML library.
- **Keras** - features a simple API for deep learning.

## Importing NumPy and pandas

In [36]:
import numpy as np
import pandas as pd

## Reading in the Data

In [37]:
data = pd.read_csv('housing_data.csv')
data.head()

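|  | Id | MSSubClass | MSZoning | LotArea | Utilities | LotConfig | Neighborhood | Condition1 | Condition2 | BldgType | ... | KitchenAbvGr | Fireplaces | GarageCars | PoolArea | YrSold | SaleType | SaleCondition | SalePrice | Bathrooms | PorchSF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 60 | RL | 8450 | AllPub | Inside | CollgCr | Norm | Norm | 1Fam | ... | 1 | 0 | 2 | 0 | 2008 | WD | Normal | 208500 | 3 | 61 |
| 1 | 2 | 20 | RL | 9600 | AllPub | FR2 | Veenker | Feedr | Norm | 1Fam | ... | 1 | 1 | 2 | 0 | 2007 | WD | Normal | 181500 | 2 | 0 |
| 2 | 3 | 60 | RL | 11250 | AllPub | Inside | CollgCr | Norm | Norm | 1Fam | ... | 1 | 1 | 2 | 0 | 2008 | WD | Normal | 223500 | 3 | 42 |
| 3 | 4 | 70 | RL | 9550 | AllPub | Corner | Crawfor | Norm | Norm | 1Fam | ... | 1 | 1 | 3 | 0 | 2006 | WD | Abnorml | 140000 | 1 | 307 |
| 4 | 5 | 60 | RL | 14260 | AllPub | FR2 | NoRidge | Norm | Norm | 1Fam | ... | 1 | 1 | 3 | 0 | 2008 | WD | Normal | 250000 | 3 | 84 |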


In [38]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1460 entries, 0 to 1459
Data columns (total 36 columns):
Id               1460 non-null int64
MSSubClass       1460 non-null int64
MSZoning         1460 non-null object
LotArea          1460 non-null int64
Utilities        1460 non-null object
LotConfig        1460 non-null object
Neighborhood     1460 non-null object
Condition1       1460 non-null object
Condition2       1460 non-null object
BldgType         1460 non-null object
HouseStyle       1460 non-null object
OverallQual      1460 non-null int64
OverallCond      1460 non-null int64
YearBuilt        1460 non-null int64
YearRemodAdd     1460 non-null int64
RoofStyle        1460 non-null object
Exterior1st      1460 non-null object
ExterQual        1460 non-null object
ExterCond        1460 non-null object
Foundation       1460 non-null object
BsmtFinSF2       1460 non-null int64
HeatingQC        1460 non-null object
CentralAir       1460 non-null object
1stFlrSF         1460 non-n

### Label Encoding the Data

In [39]:
object_cols = list(data.select_dtypes(include=['object']).columns)

In [40]:
from sklearn.preprocessing import LabelEncoder

# Replace each string-valued column with integer codes, using a fresh encoder per column.
for col in object_cols:
    lbl = LabelEncoder()
    data[col] = lbl.fit_transform(data[col])
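To make the encoding step concrete, here is a minimal sketch of what `LabelEncoder` does to a single column. The `zoning` list is a hypothetical toy column, not data read from `housing_data.csv`.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy column; the values are illustrative only.
zoning = ['RL', 'RM', 'FV', 'RL', 'RM']

encoder = LabelEncoder()
codes = encoder.fit_transform(zoning)

# The fitted classes are sorted, and each string maps to its position in classes_.
print(list(encoder.classes_))  # ['FV', 'RL', 'RM']
print(list(codes))             # [1, 2, 0, 1, 2]

# The mapping can be reversed as long as the fitted encoder is kept.
print(list(encoder.inverse_transform(codes)))  # ['RL', 'RM', 'FV', 'RL', 'RM']
```

Note that the integer codes carry an arbitrary (alphabetical) order, and that the loop above overwrites `lbl` on each iteration, so the original strings cannot be recovered later unless each encoder is stored.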

### Scaling the Data

In [41]:
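from sklearn.preprocessing import StandardScaler

# Separate the features from the target column.
X = data.drop('SalePrice', axis=1)
y = data['SalePrice']

# Standardize every feature to zero mean and unit variance.
# StandardScaler works column by column, so one call covers the whole frame.
scaler = StandardScaler()
X = pd.DataFrame(scaler.fit_transform(X), columns=X.columns, index=X.index)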
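As a rough check of what standardization does, the sketch below scales the five `LotArea` values shown in the `data.head()` output and compares the result with the formula z = (x - mean) / std computed by hand. The variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# The five LotArea values from the data.head() output, as a single column.
lot_area = np.array([8450.0, 9600.0, 11250.0, 9550.0, 14260.0]).reshape(-1, 1)

scaler = StandardScaler()
scaled = scaler.fit_transform(lot_area)

# StandardScaler uses the population standard deviation (ddof=0), like np.std.
manual = (lot_area - lot_area.mean()) / lot_area.std()

print(np.allclose(scaled, manual))  # True
print(scaled.mean(), scaled.std())  # approximately 0 and 1
```

After scaling, every feature lives on a comparable range, which generally makes gradient-based training better behaved.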



### Splitting the Data into Training and Testing Sets

In [42]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
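The split sizes can be verified with a small sketch. It uses placeholder arrays with the same number of rows as the dataset (1460); the array contents are assumptions made only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: row i of X_demo is [2*i, 2*i + 1], and y_demo[i] is i.
X_demo = np.arange(1460 * 2).reshape(1460, 2)
y_demo = np.arange(1460)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# 30% of 1460 rows go to the test set, the rest to the training set.
print(len(X_tr), len(X_te))  # 1022 438

# Features and targets are shuffled together, so rows stay aligned.
print(bool((X_tr[:, 0] == 2 * y_tr).all()))  # True
```

Fixing `random_state` makes the shuffle reproducible, so rerunning the cell yields the same training and test rows.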

In [46]:
y_train

135     174000
1452    145000
762     215200
932     320000
435     212000
629     168500
1210    189000
1118    140000
1084    187500
158     254900
967     135000
1259    151000
551     112500
497     184000
1031    197000
1262    161500
1013     85000
1311    203000
566     325000
610     313000
1278    237000
1263    180500
816     137000
438      90350
940     150900
96      214000
560     121500
1182    745000
471     190000
1004    181000
         ...  
747     265979
252     173000
21      139400
1337     52500
459     110000
1184    186700
276     201000
955     145000
1215    125000
385     192000
805     227680
1437    394617
343     266000
769     538000
1332    100000
130     226000
871     200500
1123    118000
1396    160000
87      164500
330     119000
1238    142500
466     167000
121     100000
1044    278000
1095    176432
1130    135000
1294    115000
860     189950
1126    174000
Name: SalePrice, Length: 1022, dtype: int64

### Building the Structure of a Neural Network in Keras

In [44]:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(35, input_dim=35, activation='sigmoid'))
model.add(Dense(70, activation='sigmoid'))
model.add(Dense(35, activation='sigmoid'))
# The target is a house price, so the output layer is linear;
# a sigmoid output would cap every prediction at 1.
model.add(Dense(1))

# summary() prints the table itself and returns None, so it is not wrapped in print().
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_13 (Dense)             (None, 35)                1260      
_________________________________________________________________
dense_14 (Dense)             (None, 70)                2520      
_________________________________________________________________
dense_15 (Dense)             (None, 35)                2485      
_________________________________________________________________
dense_16 (Dense)             (None, 1)                 36        
Total params: 6,301
Trainable params: 6,301
Non-trainable params: 0
_________________________________________________________________
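The `Param #` column in the summary can be reproduced by hand: a `Dense` layer with bias has one weight per input-output pair plus one bias per output unit. The helper `dense_params` below is a hypothetical name introduced for this sketch, not part of Keras.

```python
# Layer widths from the model: 35 inputs, then Dense layers of 35, 70, 35, and 1 units.
layer_sizes = [35, 35, 70, 35, 1]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Pair each layer's input width with its output width.
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [1260, 2520, 2485, 36]
print(sum(counts))  # 6301
```

These values match the per-layer counts and the total of 6,301 trainable parameters reported by the summary.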


### Adding a Loss Function and Optimizer to the Neural Network

In [45]:
model.compile(loss='mse', optimizer='adam')
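For intuition about the `'mse'` loss, here is a minimal NumPy sketch of mean squared error. The function name and the predicted prices are assumptions for illustration; Keras computes the loss internally on batches of tensors.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Three sale prices from the data.head() output and hypothetical predictions.
actual = [208500, 181500, 223500]
predicted = [200000, 190000, 220000]

print(mean_squared_error(actual, predicted))  # 52250000.0
```

Because the errors are squared, the loss is expressed in squared dollars and large misses dominate it; taking the square root gives an error on the same scale as the prices.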

### Training the Model