# TensorFlow & Keras - Basics of Deep Learning

### Most importantly... resources

https://www.tensorflow.org/api_docs

https://keras.io/

https://www.tensorflow.org/tutorials/

https://www.google.com

## TF overview

* #### "End-to-end machine learning platform" 

    - Not the only one! Check out PyTorch, or the older Theano and Cognitive Toolkit (neither is actively developed anymore).
   
* #### Integrates with high-level APIs like Keras
* #### Plays nice with Pandas
* #### Makes deep learning *fast* and *easy* *
    *<sup>"easy"</sup>

## Tasks for TensorFlow:

* #### Regression
    - Predict house prices
    - Predict drug metabolic rates
    - Predict stock trends *
    
    *<sup>this is super hard</sup>
    
    

* #### Classification
    - Cat or dog?
    - Malignant or benign cancer from images
    ![](media/dr.png)
    <span style="font-size:0.75em;">Google AI Blog: Diabetic Retinopathy</span>



* #### Dimensionality reduction
    - Visualize high-dimensional data in 2 or 3-D space
    - Compress representations for successive ML



* #### Generative models
    - Create new molecules with desirable properties
    - Artificially enhance image resolution
    ![](media/molecular_gan.png)
    <span style="font-size:0.75em;">Kadurin et al., 2017</span>


* #### Reinforcement learning
    - Can't beat your friends at chess? Make your computer do it



* #### Much more...
    - Generic math
    - Probabilistic programming with TFP
    - Automatic differentiation
    - ...


## Let's Regress

### Imports!

In [None]:
import numpy as np
import pandas as pd

Name a more iconic duo, I'll wait

#### New imports -- TF and Keras

In [None]:
import keras
import tensorflow as tf

Check our versions for good measure -- these libraries can behave very differently from version to version

In [None]:
print(keras.__version__)
print(tf.__version__)

#### Loading in housing data as with SKLearn

In [None]:
data = pd.read_csv('kc_house_data.csv')
data

In [None]:
data["yr_built"].unique()

In [None]:
column_selection = ["bedrooms","bathrooms","sqft_living","sqft_lot",
                    "floors","condition","grade","sqft_above",
                    "sqft_basement","sqft_living15","sqft_lot15",
                    "lat", "long","yr_built","yr_renovated","waterfront"]

selected_feature = np.array(data[column_selection])
price = np.array(data["price"])
selected_feature_train = selected_feature[:20000]
price_train = price[:20000]

selected_feature_test = selected_feature[20000:]
price_test = price[20000:]

In [None]:
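def score(y, y_pred):
    # Mean absolute percentage error (as a fraction); y is the true target
    y, y_pred = np.ravel(y), np.ravel(y_pred)  # avoid (n,1) vs (n,) broadcasting
    return np.mean(np.abs(y - y_pred) / y)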
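
A quick sanity check of this metric on made-up numbers (the arrays and the helper name `mape` are illustrative, not part of the housing data). It also shows why the shape of the prediction array matters: `model.predict` returns shape `(n, 1)`, and subtracting that from an `(n,)` target silently broadcasts to an `(n, n)` matrix.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (hypothetical helper name)."""
    return np.mean(np.abs(y_true - y_pred) / y_true)

y_true = np.array([100.0, 200.0, 400.0])            # made-up targets
y_pred_col = np.array([[110.0], [180.0], [400.0]])  # shape (3, 1), like model.predict

diff_shape = (y_true - y_pred_col).shape            # broadcasts to (3, 3)
good = mape(y_true, y_pred_col.flatten())           # (0.10 + 0.10 + 0.0) / 3
print(diff_shape, good)
```

Flattening the predictions before scoring keeps the comparison element-wise.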

In [None]:
model = keras.Sequential()

In [None]:
input_len = len(column_selection)
model.add(keras.layers.Dense(50, input_dim=input_len, activation='relu'))
model.add(keras.layers.Dense(50, activation='relu'))
model.add(keras.layers.Dense(1))
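
To make concrete what the three `Dense` layers compute, here is a NumPy-only sketch of the same 16-50-50-1 architecture. The weights and the input batch are random placeholders (a real Keras model learns its weights). Each layer is a matrix multiply plus a bias, with ReLU applied on the hidden layers only.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [16, 50, 50, 1]  # 16 input features, two hidden layers, one output

# One (weights, bias) pair per Dense layer; values here are random placeholders
params = [(rng.normal(size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, params):
    """Forward pass: ReLU on hidden layers, linear output for regression."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU
    return x

batch = rng.normal(size=(5, 16))  # five fake houses
out = forward(batch, params)
n_params = sum(w.size + b.size for w, b in params)
print(out.shape, n_params)
```

The parameter count here (weights plus biases for each layer) is what `model.summary()` reports as trainable parameters for the Keras model.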

In [None]:
model.compile(loss='mean_squared_error', optimizer='adam')

In [None]:
history = model.fit(selected_feature_train, price_train,
                        epochs=50, batch_size=128)

In [None]:
preds = model.predict(selected_feature_test).flatten()  # (n, 1) -> (n,)
score(price_test, preds)  # true values first, predictions second

### Like SKLearn, it's easy to train and evaluate simple models.
#### ... but we should try to do better

## Practical Deep Learning -- What you need to know
### Train, Validation, Test:
   * Optimize parameters with Train (weights, biases)
   * Optimize hyperparameters with Validation (layer width & depth, activation functions, etc.)
   * Optimize NOTHING with Test
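
The three-way split can be sketched with a shuffled index, which keeps any ordering in the file from leaking into the split. The function name, sizes, seed, and stand-in arrays below are assumptions for illustration; the notebook itself slices the rows in file order.

```python
import numpy as np

def train_val_test_split(n_rows, n_val, n_test, seed=0):
    """Return disjoint index arrays for train, validation, and test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_rows)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return train_idx, val_idx, test_idx

X = np.arange(40).reshape(20, 2)  # stand-in features
y = np.arange(20)                 # stand-in targets
train_idx, val_idx, test_idx = train_val_test_split(len(X), n_val=2, n_test=2)
X_train, y_train = X[train_idx], y[train_idx]
print(len(train_idx), len(val_idx), len(test_idx))
```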

In [None]:
# Split out a validation set for hyperparameter optimization

selected_feature_train = selected_feature[:18000]
price_train = price[:18000]
selected_feature_val = selected_feature[18000:20000]
price_val = price[18000:20000]
selected_feature_test = selected_feature[20000:]
price_test = price[20000:]

### Try a hyperparameter optimization:

### Try three activation functions for the dense layers in the neural network above, and save the model that achieves the best validation loss.

#### Hint: [activation functions](http://letmegooglethat.com/?q=keras+activation+functions)

#### Hint: `model.fit` has argument "`validation_data`" which takes a tuple of features and targets

#### Hint: Use `model.save("filename.h5")` to save a model locally. If you want to use it later, just call `keras.models.load_model("filename.h5")`

In [None]:
# For easy looping, define neural network model as a function
def nn_model(optimizer='adam',
             activation='relu',
             layers=(50, 50),
             loss='mean_squared_error'):
    # `layers` gives the width of each hidden layer, in order
    model = keras.Sequential()
    model.add(keras.layers.Dense(layers[0], input_dim=input_len,
                                 activation=activation))
    for width in layers[1:]:
        model.add(keras.layers.Dense(width, activation=activation))
    model.add(keras.layers.Dense(1))  # single linear output for regression

    # Use the arguments rather than hard-coded values
    model.compile(loss=loss, optimizer=optimizer)

    return model

In [None]:
best_score = np.inf  # any real score beats this

# loop over chosen activation functions, train, evaluate on validation
for activ in ['sigmoid', 'tanh', 'relu']:
    model = nn_model(activation=activ)

    history = model.fit(selected_feature_train, price_train,
                epochs=50, batch_size=128,
                validation_data=(selected_feature_val, price_val))
    model_score = score(price_val, model.predict(selected_feature_val).flatten())

    if model_score < best_score:
        best_score = model_score
        best_activ = activ
        best_model = model
        best_train = history

print(f"BEST ACTIVATION FUNCTION {best_activ} WITH SCORE {best_score}")
best_model.save("awesome_model.h5")


### Visualize your training:

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

# plot loss during training
def plot_loss(hist):
    plt.title('Training Curve')
    plt.plot(hist.history['loss'], label='train')
    plt.plot(hist.history['val_loss'], label='validation')
    plt.xlabel("Epochs")
    plt.ylabel("Loss")  # whichever loss the model was compiled with
    plt.legend()
    plt.show()

plot_loss(best_train)

#### In the future, try better validation schemes like [k-fold cross validation](https://chrisalbon.com/deep_learning/keras/k-fold_cross-validating_neural_networks/), though 80/20 or 90/10 train/val like this works in a pinch
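
A minimal sketch of k-fold splitting in plain NumPy, assuming the model is rebuilt and retrained from scratch for each fold. The function name, fold count, and row count are illustrative; scikit-learn's `KFold` does the same job.

```python
import numpy as np

def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each row is in validation exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_rows), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx

splits = list(k_fold_indices(n_rows=10, k=5))
for train_idx, val_idx in splits:
    # In practice: build a fresh model, fit on train_idx, score on val_idx
    print(len(train_idx), len(val_idx))
```

Averaging the validation score over the folds gives a less noisy estimate than a single held-out slice, at the cost of training k models.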

### Standardize your features:
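* Standardization shifts each feature to mean 0 and scales it to standard deviation 1; it works best when features are roughly normally distributed
* In theory, it does not matter for neural networks, since the weights can learn to compensate for unscaled inputs
* In practice, it tends to matter for neural networks: gradient-based training usually converges faster and more reliably on scaled inputs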
* Scale if using:
    - Logistic regression
    - Support vector machines
    - Perceptrons
    - Neural networks
    - Principal component analysis
* Don't bother if using:
    - "Forest" methods
    - Naive Bayes

In [None]:
from sklearn.preprocessing import StandardScaler

# Instantiate StandardScaler
in_scaler = StandardScaler()

# Fit scaler to the training set and perform the transformation
selected_feature_train = in_scaler.fit_transform(selected_feature_train)

# Use the fitted scaler to transform validation and test features
selected_feature_val = in_scaler.transform(selected_feature_val)
selected_feature_test = in_scaler.transform(selected_feature_test)

# Check appropriate scaling
print(np.mean(selected_feature_train[:,0]))
print(np.std(selected_feature_train[:,0]))

print(np.mean(selected_feature_val[:,0]))
print(np.std(selected_feature_val[:,0]))

print(np.mean(selected_feature_test[:,0]))
print(np.std(selected_feature_test[:,0]))

In [None]:
model = nn_model()

model.compile(loss='mean_squared_error', optimizer='adam')

history = model.fit(selected_feature_train, price_train,
            epochs=200, batch_size=128,
            validation_data=(selected_feature_val, price_val))
model_score = score(price_val, model.predict(selected_feature_val).flatten())
print(model_score)

plot_loss(history)

#### In the future, consider standardizing outputs as well
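
One way to standardize the targets, sketched in NumPy with made-up prices: fit the mean and standard deviation on the training targets only, train on the scaled values, then map predictions back to dollars before scoring. All variable and function names here are assumptions.

```python
import numpy as np

price_train_demo = np.array([200_000.0, 350_000.0, 500_000.0, 950_000.0])  # made up

# Fit the scaling on training targets only
y_mean = price_train_demo.mean()
y_std = price_train_demo.std()
y_scaled = (price_train_demo - y_mean) / y_std  # what the network would train on

def to_dollars(scaled_pred):
    """Invert the target scaling so errors are reported in original units."""
    return scaled_pred * y_std + y_mean

# Pretend the network predicted the scaled targets perfectly
recovered = to_dollars(y_scaled)
print(y_scaled.mean(), y_scaled.std(), recovered)
```

Remember to apply the inverse transform before calling `score`, since a percentage error on standardized values (which can be zero or negative) is meaningless.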

### Regularize:
* Heavily parameterized models like neural networks are prone to overfitting
* Popular off-the-shelf tools exist to regularize models and prevent overfitting:
    - L2 regularization (weight decay)
    - Dropout
    - Batch normalization
    
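#### These tools come as standard Keras/TF layers!
`model.add(keras.layers.Dropout(rate))`
`model.add(keras.layers.ActivityRegularization(l1=0.0, l2=0.0))` (penalizes layer outputs; for L2 weight decay, pass `kernel_regularizer=keras.regularizers.l2(0.01)` to a `Dense` layer instead)
`model.add(keras.layers.BatchNormalization())`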
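
To show what two of these regularizers actually do, here is a NumPy sketch of inverted dropout (zero a random fraction of activations and rescale the survivors by `1 / (1 - rate)`, as the Keras `Dropout` layer does during training) and an L2 weight penalty. The rate, penalty strength, and arrays are made up, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate, rng):
    """Zero a random fraction `rate` of activations and rescale the rest."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

def l2_penalty(weights, strength):
    """Term added to the loss: strength * sum of squared weights."""
    return strength * np.sum(weights ** 2)

acts = np.ones((1000, 50))
dropped = dropout(acts, rate=0.2, rng=rng)
weights = np.array([[1.0, -2.0], [0.5, 0.0]])
penalty = l2_penalty(weights, strength=0.01)  # 0.01 * (1 + 4 + 0.25)
print(dropped.mean(), penalty)
```

Because of the rescaling, the average activation stays close to its original value, so nothing needs to change at prediction time when dropout is switched off.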

### Early stopping and model checkpointing:
#### It's unlikely the last iteration is the best, and who knows how long until the thing is converged. Just grab the best validation error.

In [None]:
# Set callback functions to early stop training and save the 
# best model so far
from keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [EarlyStopping(monitor='val_loss', patience=2),
            ModelCheckpoint(filepath='best_model.h5',
                            monitor='val_loss',
                            save_best_only=True,
                           verbose=1)]

model = nn_model(layers=[20,20,20])

model.compile(loss='mean_squared_error', optimizer='adam')

history = model.fit(selected_feature_train, price_train,
            epochs=400, callbacks=callbacks, batch_size=128,
            validation_data=(selected_feature_val, price_val))

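# Early stopping halts a few epochs past the best one, so score the checkpoint
best_model = keras.models.load_model('best_model.h5')
model_score = score(price_val, best_model.predict(selected_feature_val).flatten())
print(f"Model score: {model_score}")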
plot_loss(history)
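
The `patience` logic behind `EarlyStopping` and `save_best_only` can be sketched in plain Python. The validation losses below are invented, and this is a simplified illustration of the idea rather than the Keras implementation.

```python
def early_stop(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs fail to improve on
    the best loss seen so far; the best epoch is the one to checkpoint.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1

losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.60]  # made-up validation losses
best_epoch, stop_epoch = early_stop(losses, patience=2)
print(best_epoch, stop_epoch)
```

With these numbers training halts at epoch 4 and keeps the checkpoint from epoch 2, never seeing the lower loss at epoch 5. That is the trade-off of a small `patience`: less wasted compute, but noisy validation curves can trigger a premature stop.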

### You don't have to remember these resources because they're here when you need them
https://www.tensorflow.org/api_docs

https://keras.io/

https://www.tensorflow.org/tutorials/

https://www.google.com

### Don't trust me, trust your validation errors
### Don't look at your test set until you're actually going to test