# Learning rate scheduler
In PyTorch, a "scheduler" refers to a learning rate scheduler, which adjusts the learning rate during training. It is a strategy for modifying the learning rate either on a predefined schedule as training progresses or in response to a monitored metric, such as a validation loss that has stopped improving. This can help fine-tune the training process and may lead to faster convergence or better final performance, for example by taking larger steps early on to move past poor local minima and smaller steps later to settle into a good solution.


## Types of Learning Rate Schedulers
PyTorch provides several built-in schedulers through its **torch.optim.lr_scheduler** module, including:

* **StepLR**: Decays the learning rate of each parameter group by a factor of **gamma** every **step_size** epochs.
* **MultiStepLR**: Decays the learning rate by a factor of **gamma** each time the epoch count reaches one of the specified **milestones**.
* **ExponentialLR**: Decays the learning rate of each parameter group by a factor of **gamma** every epoch.
* **ReduceLROnPlateau**: Reduces learning rate when a metric has stopped improving, which is particularly useful when you're monitoring validation loss.
* **CosineAnnealingLR**: Anneals the learning rate along a cosine curve, from the initial lr set in the optimizer down to **eta_min**, over **T_max** scheduler steps (half a cosine cycle).
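
To get a feel for how the epoch-based schedulers in this list differ, it can help to print the learning rate each one produces over a few epochs. The following is a minimal sketch, not part of the original notebook: the helper `trace_lr`, the dummy parameter, and all hyperparameter values are assumptions chosen only for illustration. **ReduceLROnPlateau** is left out because it needs a metric passed to `step()`.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import (
    StepLR, MultiStepLR, ExponentialLR, CosineAnnealingLR,
)


def trace_lr(make_scheduler, num_epochs=10, base_lr=0.1):
    """Return the learning rate in effect during each epoch (hypothetical helper)."""
    # A single dummy parameter is enough to construct an optimizer.
    param = torch.nn.Parameter(torch.zeros(1))
    optimizer = SGD([param], lr=base_lr)
    scheduler = make_scheduler(optimizer)
    rates = []
    for _ in range(num_epochs):
        rates.append(optimizer.param_groups[0]["lr"])
        optimizer.step()   # a real loop would compute gradients first
        scheduler.step()   # advance the schedule once per epoch
    return rates


# Illustrative settings only; none of these values come from the notebook.
schedules = {
    "StepLR": lambda opt: StepLR(opt, step_size=3, gamma=0.5),
    "MultiStepLR": lambda opt: MultiStepLR(opt, milestones=[2, 6], gamma=0.1),
    "ExponentialLR": lambda opt: ExponentialLR(opt, gamma=0.9),
    "CosineAnnealingLR": lambda opt: CosineAnnealingLR(opt, T_max=9, eta_min=0.001),
}

traces = {name: trace_lr(make) for name, make in schedules.items()}
for name, rates in traces.items():
    print(f"{name:18s}", [round(r, 4) for r in rates])
```

Each trace starts at the optimizer's initial rate; the schedulers differ only in when and how sharply they lower it.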

## Example: Using a StepLR Scheduler
Here's a simple example of how to use a learning rate scheduler in PyTorch, specifically the **StepLR** scheduler, which is straightforward and commonly used for demonstration purposes:

In [26]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR
import numpy as np
import pandas as pd
from torch.utils.data import Dataset, DataLoader

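np.random.seed(42)  # For reproducibility

# Build one year of hourly timestamps with random "consumption" values
# (freq='H' means hourly; recent pandas versions prefer the spelling 'h')
date_rng = pd.date_range(start='2020-01-01', end='2020-12-31', freq='H')
df = pd.DataFrame(date_rng, columns=['date'])
df['electricity_consumption'] = np.random.rand(len(df)) * 100
# Lagged copy: the previous hour's consumption
df['last_consumption'] = df['electricity_consumption'].shift(1)
df = df.set_index('date')
df.dropna(inplace=True)  # The first row has no previous value
df = df.iloc[:100]       # Keep the first 100 rows
df = df / df.iloc[0]     # Scale each column by its first row
df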

| date | electricity_consumption | last_consumption |
|---|---|---|
| 2020-01-01 01:00:00 | 1.000000 | 1.000000 |
| 2020-01-01 02:00:00 | 0.769941 | 2.538351 |
| 2020-01-01 03:00:00 | 0.629693 | 1.954381 |
| 2020-01-01 04:00:00 | 0.164107 | 1.598383 |
| 2020-01-01 05:00:00 | 0.164081 | 0.416561 |
| ... | ... | ... |
| 2020-01-05 00:00:00 | 0.549832 | 1.318405 |
| 2020-01-05 01:00:00 | 0.449705 | 1.395666 |
| 2020-01-05 02:00:00 | 0.026737 | 1.141509 |
| 2020-01-05 03:00:00 | 0.113485 | 0.067868 |


In [31]:
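class ElectricityDataset(Dataset):
    """Wraps the DataFrame so each item is an (input, label) pair of scalar tensors."""

    def __init__(self, df):
        self.df = df

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        # Column 0 (electricity_consumption) is the input, column 1 (last_consumption) the label
        x = torch.tensor(self.df.iloc[idx, 0], dtype=torch.float32)
        y = torch.tensor(self.df.iloc[idx, 1], dtype=torch.float32)
        return x, y

dataset = ElectricityDataset(df)
# 100 rows with batch_size=20 gives 5 batches per epoch
train_loader = DataLoader(dataset, batch_size=20, shuffle=True)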

In [32]:
# Define a simple model
model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1) # Start with a learning rate of 0.1
# optimizer = optim.SGD(model.parameters(), lr=0.01)

# Define the learning rate scheduler
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

# Number of epochs
num_epochs = 40

# Training loop
for epoch in range(num_epochs):
    for inputs, labels in train_loader:
        # Forward pass
        inputs = inputs.unsqueeze(1)
        labels = labels.unsqueeze(1)
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Step the scheduler
        # (note: this runs once per batch, so step_size counts batches, not epochs)
        scheduler.step()

        # Print learning rate and other details
        if (epoch+1) % 10 == 0:
            current_lr = optimizer.param_groups[0]['lr']
            print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}, Learning Rate: {current_lr:.6f}')

Epoch [10/40], Loss: 0.4685, Learning Rate: 0.010000
Epoch [10/40], Loss: 0.6227, Learning Rate: 0.010000
Epoch [10/40], Loss: 0.6729, Learning Rate: 0.010000
Epoch [10/40], Loss: 0.7129, Learning Rate: 0.010000
Epoch [10/40], Loss: 0.6510, Learning Rate: 0.010000
Epoch [20/40], Loss: 0.8358, Learning Rate: 0.000100
Epoch [20/40], Loss: 0.5540, Learning Rate: 0.000100
Epoch [20/40], Loss: 0.4726, Learning Rate: 0.000100
Epoch [20/40], Loss: 0.7488, Learning Rate: 0.000100
Epoch [20/40], Loss: 0.5084, Learning Rate: 0.000100
Epoch [30/40], Loss: 0.6440, Learning Rate: 0.000010
Epoch [30/40], Loss: 0.7428, Learning Rate: 0.000010
Epoch [30/40], Loss: 0.3615, Learning Rate: 0.000010
Epoch [30/40], Loss: 0.6602, Learning Rate: 0.000010
Epoch [30/40], Loss: 0.7111, Learning Rate: 0.000001
Epoch [40/40], Loss: 0.8687, Learning Rate: 0.000000
Epoch [40/40], Loss: 0.6064, Learning Rate: 0.000000
Epoch [40/40], Loss: 0.5072, Learning Rate: 0.000000
Epoch [40/40], Loss: 0.5385, Learning Rate: 0.

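In this example, the **StepLR** scheduler is initialized with a **step_size** of 30 and a **gamma** of 0.1. This configuration means that after every 30 calls to **scheduler.step()**, the learning rate is multiplied by 0.1 (i.e., reduced to 10% of its current value). Because the loop above calls **scheduler.step()** once per batch, and 100 rows with a batch size of 20 give 5 batches per epoch, the decay here happens every 6 epochs rather than every 30. That is why the output already shows a learning rate of 0.01 at epoch 10 and a rate that prints as 0.000000 by epoch 40. To decay every 30 epochs, call **scheduler.step()** once per epoch, after the inner batch loop has finished.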
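
**StepLR** counts calls to `scheduler.step()`, so for the rate to drop every 30 epochs the call has to sit outside the batch loop. The following is a minimal sketch of that placement, not the notebook's original code: the random tensors and `TensorDataset` are stand-ins for the electricity data, and names such as `lr_history` are introduced only for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in data: 100 random (input, target) pairs.
inputs = torch.rand(100, 1)
targets = torch.rand(100, 1)
loader = DataLoader(TensorDataset(inputs, targets), batch_size=20, shuffle=True)

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

num_epochs = 40
lr_history = []  # learning rate in effect during each epoch

for epoch in range(num_epochs):
    lr_history.append(optimizer.param_groups[0]["lr"])
    epoch_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * x.size(0)

    # Step once per epoch, after all optimizer updates for that epoch.
    scheduler.step()

    if (epoch + 1) % 10 == 0:
        mean_loss = epoch_loss / len(loader.dataset)
        print(f"Epoch [{epoch + 1}/{num_epochs}], "
              f"Loss: {mean_loss:.4f}, "
              f"Learning Rate: {lr_history[-1]:.6f}")
```

With this placement the rate stays at 0.1 for the first 30 epochs and is 0.01 for the remaining 10. Printing once per epoch with the mean loss also avoids one line per batch.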

## When to Use a Scheduler
The choice of when to adjust the learning rate, and by how much, depends largely on empirical evidence and the characteristics of the specific problem. Common strategies include reducing the learning rate when progress on a validation set stalls, or methodically reducing it after a certain number of epochs have passed. Experimenting with different schedulers and their parameters is often necessary to find the best training setup for your specific model and data.
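
The "reduce when validation progress stalls" strategy corresponds to **ReduceLROnPlateau**, which takes the monitored metric as an argument to `step()`. The sketch below is illustrative only: the validation losses are made-up numbers rather than results from a real model, and the `factor` and `patience` values are assumptions.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import ReduceLROnPlateau

# A dummy parameter is enough to build an optimizer for this demonstration.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = SGD([param], lr=0.1)

# Halve the rate once the metric has failed to improve for more than 2 epochs.
scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=2)

# Made-up validation losses: improve for three epochs, then stall.
val_losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7]

lrs = []
for epoch, val_loss in enumerate(val_losses, start=1):
    optimizer.step()           # a real loop would train for one epoch here
    scheduler.step(val_loss)   # pass the monitored metric to the scheduler
    lrs.append(optimizer.param_groups[0]["lr"])
    print(f"Epoch {epoch}: val_loss={val_loss:.2f}, lr={lrs[-1]:.4f}")
```

Unlike the epoch-based schedulers, the timing of each reduction here depends entirely on the metric, so the same settings can behave very differently from one run to the next.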