<a href="https://colab.research.google.com/github/deltorobarba/machinelearning/blob/master/optimizer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Optimizer

In [0]:
from __future__ import absolute_import, division, print_function, unicode_literals
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd

mpl.rcParams['figure.figsize'] = (8, 6)
mpl.rcParams['axes.grid'] = False

## Summary

![Optimizer](https://raw.githubusercontent.com/deltorobarba/repo/master/optimizer_1.png)

# Activation Functions

Comparison of Activation Functions:
[Wikipedia](https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions)


[tf.keras Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)

## Stochastic Gradient Descent

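* Updates every parameter by a step against the gradient of the loss, estimated on a mini-batch and scaled by a single global learning rate.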
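A minimal pure-Python sketch of the plain gradient descent update may help here. The function name `sgd_step` and the toy quadratic objective are assumptions made for illustration, and the gradient is computed exactly rather than on a mini-batch, so this only shows the update rule itself, not the Keras implementation.

```python
def sgd_step(params, grads, learning_rate=0.01):
    """Return new parameters after one plain gradient descent step."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy problem (assumption): minimize f(w) = (w - 3)^2, gradient 2 * (w - 3).
w = [0.0]
for _ in range(100):
    grads = [2.0 * (w[0] - 3.0)]
    w = sgd_step(w, grads, learning_rate=0.1)

print(w[0])  # close to the minimizer w = 3
```

With a learning rate of 0.1 the distance to the minimum shrinks by a constant factor on every step; a learning rate that is too large would make the same loop diverge.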


## Adam

https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c

https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/

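* Keeps exponential moving averages of the gradient (first moment) and of the squared gradient (second moment), corrects both for their zero initialization, and scales each parameter's step by their ratio.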
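The Adam update can be sketched for a single scalar parameter as follows. The function name `adam_step` and the toy objective are hypothetical; the default values mirror the commonly used ones (`beta_1=0.9`, `beta_2=0.999`), and the sketch is not meant to reproduce the exact Keras implementation.

```python
import math


def adam_step(param, grad, m, v, t, learning_rate=0.001,
              beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta_1 * m + (1.0 - beta_1) * grad          # first-moment average
    v = beta_2 * v + (1.0 - beta_2) * grad * grad   # second-moment average
    m_hat = m / (1.0 - beta_1 ** t)                 # bias correction
    v_hat = v / (1.0 - beta_2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


# Toy problem (assumption): minimize f(w) = (w - 3)^2 starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
first_step = None
for t in range(1, 101):
    grad = 2.0 * (w - 3.0)
    new_w, m, v = adam_step(w, grad, m, v, t, learning_rate=0.01)
    if t == 1:
        first_step = new_w - w
    w = new_w

print(first_step, w)
```

Because the step is a ratio of the two moment estimates, the first update has a magnitude of roughly `learning_rate` regardless of how large the raw gradient is.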


## Adadelta

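* Extension of Adagrad that replaces the ever-growing sum of squared gradients with a decaying average, and scales each step by a decaying average of past squared updates.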
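As a rough sketch, the Adadelta update in its original formulation (without an explicit learning rate) could look like this for one scalar parameter. The name `adadelta_step`, the toy objective, and the chosen `rho` and `epsilon` values are assumptions for illustration; `tf.keras.optimizers.Adadelta` also exposes a `learning_rate` argument that this sketch leaves out.

```python
import math


def adadelta_step(param, grad, avg_sq_grad, avg_sq_delta,
                  rho=0.95, epsilon=1e-6):
    """One Adadelta update for a scalar parameter."""
    # Decaying average of squared gradients.
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
    # Step scaled by RMS of past updates over RMS of gradients.
    delta = -(math.sqrt(avg_sq_delta + epsilon)
              / math.sqrt(avg_sq_grad + epsilon)) * grad
    # Decaying average of squared updates.
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
    return param + delta, avg_sq_grad, avg_sq_delta


# Toy problem (assumption): minimize f(w) = (w - 3)^2 starting from w = 0.
w, avg_g, avg_d = 0.0, 0.0, 0.0
initial_loss = (w - 3.0) ** 2
for _ in range(50):
    grad = 2.0 * (w - 3.0)
    w, avg_g, avg_d = adadelta_step(w, grad, avg_g, avg_d)
final_loss = (w - 3.0) ** 2

print(w, final_loss)
```

The first steps are small because the average of past updates starts at zero; the step size then grows as that average builds up.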


## Adagrad

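* Accumulates the squared gradients of each parameter and divides the learning rate by the square root of that sum, so frequently updated parameters take smaller steps and the effective learning rate only shrinks over time.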
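The shrinking step size of Adagrad can be illustrated with a short sketch. The function name `adagrad_step` and the constant gradient are assumptions chosen to isolate the effect, and the accumulator starts at zero here, whereas `tf.keras.optimizers.Adagrad` exposes an `initial_accumulator_value` argument.

```python
import math


def adagrad_step(param, grad, accumulator, learning_rate=0.1, epsilon=1e-7):
    """One Adagrad update for a scalar parameter."""
    accumulator = accumulator + grad * grad  # sum of all squared gradients
    param = param - learning_rate * grad / (math.sqrt(accumulator) + epsilon)
    return param, accumulator


# Constant gradient of 1.0 (assumption) to expose the decaying step size.
w, acc = 0.0, 0.0
step_sizes = []
for _ in range(4):
    new_w, acc = adagrad_step(w, 1.0, acc)
    step_sizes.append(w - new_w)
    w = new_w

print(step_sizes)  # each step is smaller than the one before
```

Even though the gradient never changes, the k-th step is about `learning_rate / sqrt(k)`, which is why Adagrad can stall on long training runs.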


## RMSprop

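* Divides the gradient by the square root of an exponential moving average of recent squared gradients, which keeps the per-parameter step size from decaying to zero as it does in Adagrad.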
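A comparable sketch for RMSprop, again with a hypothetical function name (`rmsprop_step`) and a constant gradient chosen purely for illustration, shows that the step size settles near the learning rate instead of shrinking indefinitely. Momentum and the centered variant are left out.

```python
import math


def rmsprop_step(param, grad, avg_sq, learning_rate=0.001,
                 rho=0.9, epsilon=1e-7):
    """One RMSprop update for a scalar parameter (no momentum)."""
    avg_sq = rho * avg_sq + (1.0 - rho) * grad * grad  # moving average of g^2
    param = param - learning_rate * grad / (math.sqrt(avg_sq) + epsilon)
    return param, avg_sq


# Constant gradient of 1.0 (assumption) to show the steady-state step size.
w, avg_sq = 0.0, 0.0
step_sizes = []
for _ in range(200):
    new_w, avg_sq = rmsprop_step(w, 1.0, avg_sq)
    step_sizes.append(w - new_w)
    w = new_w

print(step_sizes[0], step_sizes[-1])
```

The first step is larger than the learning rate because the moving average starts at zero; after that the step converges to roughly `learning_rate`.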


# LSTM Model

## Import Data

This tutorial uses a <a href="https://www.bgc-jena.mpg.de/wetter/" class="external">weather time series dataset</a> recorded by the <a href="https://www.bgc-jena.mpg.de" class="external">Max Planck Institute for Biogeochemistry</a>.

In [3]:
# "https://www.bgc-jena.mpg.de/wetter/" (weather time series dataset)
zip_path = tf.keras.utils.get_file(
    origin='https://storage.googleapis.com/tensorflow/tf-keras-datasets/jena_climate_2009_2016.csv.zip',
    fname='jena_climate_2009_2016.csv.zip',
    extract=True)
csv_path, _ = os.path.splitext(zip_path)

# Read file
df = pd.read_csv(csv_path)

# Select Univariate Data
uni_data = df['T (degC)']
uni_data.index = df['Date Time']

# Build (history window, label) pairs from the series
def univariate_data(dataset, start_index, end_index, history_size, target_size):
  data = []
  labels = []

  start_index = start_index + history_size
  if end_index is None:
    end_index = len(dataset) - target_size

  for i in range(start_index, end_index):
    indices = range(i-history_size, i)
    # Reshape data from (history_size,) to (history_size, 1)
    data.append(np.reshape(dataset[indices], (history_size, 1)))
    labels.append(dataset[i+target_size])
  return np.array(data), np.array(labels)

# Train/validation split: the first 300,000 rows of the data are the training dataset,
# the remaining rows are the validation dataset.
# This amounts to ~2100 days' worth of training data.
TRAIN_SPLIT = 300000
tf.random.set_seed(13)
uni_data = uni_data.values

# Compute mean and standard deviation of training data
uni_train_mean = uni_data[:TRAIN_SPLIT].mean()
uni_train_std = uni_data[:TRAIN_SPLIT].std()

# Standardize the data
uni_data = (uni_data-uni_train_mean)/uni_train_std

# Create Data Pipeline (the model will be given the last 20 recorded temperature observations, and needs to learn to predict the temperature at the next time step.)
univariate_past_history = 20
univariate_future_target = 0

x_train_uni, y_train_uni = univariate_data(uni_data, 0, TRAIN_SPLIT,
                                           univariate_past_history,
                                           univariate_future_target)
x_val_uni, y_val_uni = univariate_data(uni_data, TRAIN_SPLIT, None,
                                       univariate_past_history,
                                       univariate_future_target)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/jena_climate_2009_2016.csv.zip
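To make the windowing performed by `univariate_data` concrete, here is a minimal pure-Python sketch of the same slicing logic on a tiny toy series. The helper name `make_windows` and the toy data are assumptions for illustration; unlike the notebook function, it returns plain lists and does not reshape the windows.

```python
def make_windows(series, start_index, end_index, history_size, target_size):
    """Return (windows, labels) using the same indexing as univariate_data."""
    windows, labels = [], []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(series) - target_size
    for i in range(start_index, end_index):
        windows.append(series[i - history_size:i])  # the past observations
        labels.append(series[i + target_size])      # the value to predict
    return windows, labels


# Toy series (assumption): the integers 0..9, with a history of 3 steps.
series = list(range(10))
windows, labels = make_windows(series, 0, None, history_size=3, target_size=0)

for window, label in zip(windows, labels):
    print(window, "->", label)
```

With `target_size=0` the label is the observation immediately after the window, which matches the notebook's setup of predicting the next time step from the last 20 values.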


## Choose Optimizer

In [0]:
optimizer = 'adam'
# optimizer = 'adamax'
# optimizer = 'adadelta'
# optimizer = 'adagrad'
# optimizer = 'ftrl'
# optimizer = 'nadam'
# optimizer = 'RMSprop'
# optimizer = 'sgd'

## Define Model

In [0]:
# tf.data to shuffle, batch, and cache the dataset. 
BATCH_SIZE = 256
BUFFER_SIZE = 10000

train_univariate = tf.data.Dataset.from_tensor_slices((x_train_uni, y_train_uni))
train_univariate = train_univariate.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()

val_univariate = tf.data.Dataset.from_tensor_slices((x_val_uni, y_val_uni))
val_univariate = val_univariate.batch(BATCH_SIZE).repeat()

# LSTM requires the input shape of the data it is being given.

simple_lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(8, input_shape=x_train_uni.shape[-2:]),
    tf.keras.layers.Dense(1)
])

simple_lstm_model.compile(optimizer=optimizer, loss='mae')
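The `'mae'` loss used above is the mean absolute error. A small pure-Python sketch (the function name and sample values are assumptions for illustration) shows what is being averaged:

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up targets and predictions, purely for illustration.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 1.0])
print(mae)  # (0.5 + 0.0 + 2.0) / 3
```

Because the temperature series was standardized, a loss reported during training is in units of the training standard deviation; multiplying it by `uni_train_std` would convert it back to degrees Celsius.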

## Train & Evaluate Loss

In [6]:
EVALUATION_INTERVAL = 200
EPOCHS = 10

simple_lstm_model.fit(train_univariate, epochs=EPOCHS,
                      steps_per_epoch=EVALUATION_INTERVAL,
                      validation_data=val_univariate, validation_steps=50)

Train for 200 steps, validate for 50 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fe25b4b5d30>
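When comparing optimizers with this setup, it helps to keep in mind how much data each run actually sees. The following arithmetic sketch uses only the constants defined in the notebook (variable names other than those constants are introduced here for illustration):

```python
TRAIN_SPLIT = 300000
HISTORY = 20            # univariate_past_history
BATCH_SIZE = 256
STEPS_PER_EPOCH = 200   # EVALUATION_INTERVAL
EPOCHS = 10
VALIDATION_STEPS = 50

train_windows = TRAIN_SPLIT - HISTORY               # windows in x_train_uni
samples_per_epoch = BATCH_SIZE * STEPS_PER_EPOCH    # samples drawn per "epoch"
total_samples = samples_per_epoch * EPOCHS          # samples over the whole run
passes_over_data = total_samples / train_windows    # full passes over the data
val_samples_per_epoch = BATCH_SIZE * VALIDATION_STEPS

print(train_windows, samples_per_epoch, total_samples)
print(round(passes_over_data, 2), val_samples_per_epoch)
```

Each "epoch" here covers only about a sixth of the training windows, so the ten epochs together amount to fewer than two full passes over the training data.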

The recurrent activation function 'relu' has a much lower evaluation loss than 'sigmoid', but it takes longer to compute.