# Learning Rate Scheduler

Run this notebook on Google Colab:

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AG-Peter/encodermap/blob/main/tutorials/notebooks_customization/learning_rate_schedulers.ipynb)

Find the documentation of EncoderMap:

https://ag-peter.github.io/encodermap

### For Google Colab only:

If you're on Google Colab, please uncomment these lines and install EncoderMap.

In [1]:
# !wget https://raw.githubusercontent.com/AG-Peter/encodermap/main/tutorials/install_encodermap_google_colab.sh
# !sudo bash install_encodermap_google_colab.sh

If you're on Google Colab, you also want to download the data we will use:

In [2]:
# !wget https://raw.githubusercontent.com/AG-Peter/encodermap/main/tutorials/notebooks_starter/asp7.csv

## Primer

In this tutorial, you will learn how to use the `LearningRateScheduler` to dynamically alter the learning rate of your EncoderMap training runs. As usual, we will begin by importing some modules.

In [3]:
import numpy as np
import encodermap as em
import tensorflow as tf
import pandas as pd

We will work in the directory `runs/lr_scheduler`, so we create it first and worry about TensorFlow later.

In [4]:
import os
os.makedirs('runs/lr_scheduler', exist_ok=True)

## Log the current learning rate to Tensorboard

Before we implement dynamic learning rates, we want a way to log the learning rate to TensorBoard. We will log the training to `runs/lr_scheduler`, so navigate to this directory with

```bash
$ cd runs/lr_scheduler
```

and start tensorboard.

```bash
$ tensorboard --logdir . --reload_multifile True
```
You should now be able to open TensorBoard in your web browser on port 6006 (`0.0.0.0:6006` or `127.0.0.1:6006`).

If you're on Google Colab, you can start TensorBoard by uncommenting the next cell:

In [5]:
# %load_ext tensorboard
# %tensorboard --logdir .

To write our current learning rate to tensorboard we will use `EncoderMap`'s `EncoderMapBaseCallback` and create a subclass of it.

We don't even need to define an `__init__()` method, because the subclass can reuse the parent class's `__init__()`. We only need to override either the `on_summary_step(self, batch, logs={})` or the `on_checkpoint_step(self, batch, logs={})` method. Which one you override depends on whether you want the method to run every `summary_step` steps or every `checkpoint_step` steps. The difference between the two is described in `EncoderMap`'s `Parameters` class.

In [6]:
print(em.Parameters.defaults_description())

    Parameter            | Default Value            | Description                                         
    ---------------------+--------------------------+---------------------------------------------------  
    n_neurons            | [128, 128, 2]            | List containing number of neurons for each layer    
                         |                          | up to the bottleneck layer. For example [128, 128,  
                         |                          | 2] stands for an autoencoder with the following     
                         |                          | architecture {i, 128, 128, 2, 128, 128, i} where i  
                         |                          | is the number of dimensions of the input data.      
                         |                          | These are Input/Output Layers that are not          
                         |                          | trained.                                            
    ---------------------+-----------

Here's the Logger:

In [7]:
class LearningRateLogger(em.EncoderMapBaseCallback):
    def on_summary_step(self, step, logs=None):
        with tf.name_scope("Learning Rate"):
            tf.summary.scalar('current learning rate', self.model.optimizer.lr, step=step)
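The idea of a hook that only fires every few steps can be illustrated without EncoderMap or TensorFlow. The sketch below is a simplified stand-in, not EncoderMap's actual implementation: the class names `StepCallback` and `RecordingCallback`, the `on_train_batch_end` dispatch, and the 1-based step counting are all assumptions made for illustration.

```python
class StepCallback:
    """Hypothetical stand-in for a step-based callback base class."""

    def __init__(self, summary_step, checkpoint_step):
        self.summary_step = summary_step
        self.checkpoint_step = checkpoint_step

    def on_train_batch_end(self, batch, logs=None):
        # Assumption: batches are 0-indexed, steps are counted from 1.
        step = batch + 1
        if step % self.summary_step == 0:
            self.on_summary_step(step, logs=logs)
        if step % self.checkpoint_step == 0:
            self.on_checkpoint_step(step, logs=logs)

    def on_summary_step(self, step, logs=None):
        pass  # subclasses override this hook

    def on_checkpoint_step(self, step, logs=None):
        pass  # subclasses override this hook


class RecordingCallback(StepCallback):
    """Records the steps at which each hook was called."""

    def __init__(self, summary_step, checkpoint_step):
        super().__init__(summary_step, checkpoint_step)
        self.summary_calls = []
        self.checkpoint_calls = []

    def on_summary_step(self, step, logs=None):
        self.summary_calls.append(step)

    def on_checkpoint_step(self, step, logs=None):
        self.checkpoint_calls.append(step)


recorder = RecordingCallback(summary_step=5, checkpoint_step=10)
for batch in range(20):
    recorder.on_train_batch_end(batch)

print(recorder.summary_calls)
print(recorder.checkpoint_calls)
```

In this toy version, the subclass only overrides the hooks it cares about, which mirrors why `LearningRateLogger` needs nothing more than `on_summary_step`.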

We can now create an `EncoderMap` instance and add our new callback. Note how we instantiate our subclass by providing an instance of the `Parameters` class. That's how the callback knows what `summary_step` is.

In [8]:
df = pd.read_csv('asp7.csv')
dihedrals = df.iloc[:,:-1].values.astype(np.float32)
cluster_ids = df.iloc[:,-1].values

parameters = em.Parameters(
    tensorboard=True,
    periodicity=2*np.pi,
    main_path=em.misc.run_path('runs/lr_scheduler'),
    n_steps=100,
    summary_step=5
)

# create an instance of EncoderMap
e_map = em.EncoderMap(parameters, dihedrals)

# Add an instance of the new Callback
e_map.callbacks.append(LearningRateLogger(parameters))

Output files are saved to runs/lr_scheduler/run0 as defined in 'main_path' in the parameters.


You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.


Saved a text-summary of the model and an image in runs/lr_scheduler/run0, as specified in 'main_path' in the parameters.



**train**

In [9]:
e_map.train()

100%|██████████| 100/100 [00:08<00:00, 11.13it/s, Loss after step 100=34.3]

Here's what Tensorboard should look like:

<img src="lr_scheduler_1.png" width="800">

The learning rate stays constant at 0.001.

## Write a learning rate scheduler

We can write a learning rate scheduler either by providing intervals of training steps and the associated learning rate:

```python
def lr_schedule(step):
    """
    Returns a custom learning rate that decreases as steps progress.
    """
    learning_rate = 0.2
    if step > 10:
        learning_rate = 0.02
    if step > 20:
        learning_rate = 0.01
    if step > 50:
        learning_rate = 0.005
    return learning_rate
```
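When the number of intervals grows, the same interval-based idea can be written as a lookup table instead of a chain of `if` statements. This is only a sketch of one possible approach: the names `BOUNDARIES`, `RATES`, and `piecewise_lr` are hypothetical, and the values are copied from the example above.

```python
from bisect import bisect_left

# Step boundaries and the learning rate used in each interval.
# RATES has one more entry than BOUNDARIES.
BOUNDARIES = [10, 20, 50]
RATES = [0.2, 0.02, 0.01, 0.005]


def piecewise_lr(step):
    """Return the learning rate of the interval that `step` falls into.

    bisect_left counts the boundaries strictly smaller than `step`,
    which matches the `step > boundary` comparisons of the if-chain.
    """
    return RATES[bisect_left(BOUNDARIES, step)]


for step in (0, 10, 11, 20, 21, 50, 51):
    print(step, piecewise_lr(step))
```

Keeping boundaries and rates in two lists makes it easy to change the schedule without touching the function body.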

Or by using a function that gives us a learning rate:

```python
def scheduler(step, lr=1, n_steps=1000):
    """
    Returns a custom learning rate that decreases based on an exp function as steps progress.
    """
    if step < 10:
        return lr
    else:
        return lr * tf.math.exp(-step / n_steps)
```
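To get a feeling for the numbers this kind of function produces, it can help to evaluate it at a few steps. The sketch below is a pure-Python analogue that uses `math.exp` instead of `tf.math.exp` so it runs without TensorFlow. The name `exp_decay` and the `warmup` argument are assumptions added for illustration.

```python
import math


def exp_decay(step, lr=1.0, n_steps=1000, warmup=10):
    """Hold `lr` for the first `warmup` steps, then decay it exponentially."""
    if step < warmup:
        return lr
    return lr * math.exp(-step / n_steps)


# Print the learning rate at a few representative steps.
for step in (0, 9, 10, 500, 1000):
    print(step, round(exp_decay(step), 4))
```

With the defaults, the rate stays at 1 up to step 9, drops to roughly 0.99 at step 10, and reaches about 0.37 (that is, `exp(-1)`) at step 1000. Note the small jump at the end of the warm-up phase, because the exponent is computed from the absolute step rather than from the steps elapsed since warm-up.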

Below is an example combining both:

In [10]:
def scheduler(step, lr=1):
    """
    Keeps the learning rate unchanged for the first 10 steps and afterwards
    multiplies the learning rate it receives by exp(-0.1) on every call.
    """
    if step < 10:
        return lr
    else:
        return lr * tf.math.exp(-0.1)
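Keras' `LearningRateScheduler` passes the current learning rate as the second argument of the schedule function, so a constant factor like `exp(-0.1)` compounds from call to call. The following pure-Python sketch simulates that behaviour under the assumption that the function is called once per training step. The names `scheduler_py` and `simulate` are hypothetical, and the starting value 0.001 is taken from the constant learning rate observed in the first run.

```python
import math


def scheduler_py(step, lr):
    """Pure-Python copy of the scheduler above (math.exp instead of tf.math.exp)."""
    if step < 10:
        return lr
    return lr * math.exp(-0.1)


def simulate(n_steps, initial_lr=0.001):
    """Feed each returned learning rate back in as the next call's input."""
    lr = initial_lr
    history = []
    for step in range(n_steps):
        lr = scheduler_py(step, lr)
        history.append(lr)
    return history


history = simulate(50)
print(history[9], history[10], history[49])
```

Under these assumptions, the rate stays at 0.001 for steps 0 to 9 and is then multiplied by `exp(-0.1)` forty times, ending near `0.001 * exp(-4)`, which is about 1.8e-5. The TensorFlow-based `scheduler` defined above remains the function used in the rest of this notebook.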

This scheduler function can simply be passed to the built-in `keras.callbacks.LearningRateScheduler` callback.

In [11]:
callback = tf.keras.callbacks.LearningRateScheduler(scheduler)

The callback is then appended to the `callbacks` list of the `EncoderMap` instance.

In [12]:
parameters = em.Parameters(
    tensorboard=True,
    periodicity=2*np.pi,
    main_path=em.misc.run_path('runs/lr_scheduler'),
    n_steps=50,
    summary_step=5
)

e_map = em.EncoderMap(parameters, dihedrals)
e_map.callbacks.append(LearningRateLogger(parameters))
e_map.callbacks.append(callback)

Output files are saved to runs/lr_scheduler/run1 as defined in 'main_path' in the parameters.
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.


Saved a text-summary of the model and an image in runs/lr_scheduler/run1, as specified in 'main_path' in the parameters.


In [13]:
e_map.train()

100%|██████████| 50/50 [00:08<00:00,  6.19it/s, Loss after step 50=41.5]

Here's what Tensorboard should look like:

<img src="lr_scheduler_2.png" width="800">

## Conclusion

Learning rate schedulers can help prevent overtraining while still slightly increasing the predictive power of your NN model. EncoderMap's modularity makes them simple plug-in solutions.