## param_groups

In this notebook we look at the `param_groups` attribute of a PyTorch optimizer (any subclass of `torch.optim.Optimizer`).

In [1]:
## Importing necessary packages ##

import torch
import torch.optim as optim

In [2]:
## Defining 3 tensors ## 

tensor_1 = torch.randn(3, 3 , requires_grad = True)
tensor_2 = torch.randn(3, 3 , requires_grad = True)
tensor_3 = torch.randn(3, 3 , requires_grad = True)

## Checking the values ##

print('-----------------------------------------------')
print('----------------Tensor 1-----------------------')
print(tensor_1)
print('-----------------------------------------------')
print('----------------Tensor 2-----------------------')
print(tensor_2)
print('-----------------------------------------------')
print('----------------Tensor 3-----------------------')
print(tensor_3)

-----------------------------------------------
----------------Tensor 1-----------------------
tensor([[ 0.8125,  0.1984,  0.9420],
        [-0.4620,  0.5406, -1.3698],
        [ 0.4793,  0.2847,  0.4569]], requires_grad=True)
-----------------------------------------------
----------------Tensor 2-----------------------
tensor([[-1.2722,  0.8374, -0.6923],
        [-1.0081,  0.3518, -1.0024],
        [ 0.6581, -0.1415,  0.2916]], requires_grad=True)
-----------------------------------------------
----------------Tensor 3-----------------------
tensor([[ 1.1640, -0.8308,  0.1520],
        [-0.7118,  1.5370, -0.1858],
        [ 1.0072, -1.2591, -2.5712]], requires_grad=True)


Now let's define our optimizer.

We will be looking at two examples.

**Example 1**

In [3]:
## Setting first optimizer ##

optimizer_1 = optim.SGD([tensor_1, tensor_2], lr = 3e-4)

## printing the parameter groups ##

print(optimizer_1.param_groups)

[{'params': [tensor([[ 0.8125,  0.1984,  0.9420],
        [-0.4620,  0.5406, -1.3698],
        [ 0.4793,  0.2847,  0.4569]], requires_grad=True), tensor([[-1.2722,  0.8374, -0.6923],
        [-1.0081,  0.3518, -1.0024],
        [ 0.6581, -0.1415,  0.2916]], requires_grad=True)], 'lr': 0.0003, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}]


So `param_groups` is a list of dictionaries.

Here the list holds only one dictionary, because both tensors were passed as a plain list and therefore share the same settings (`lr`, `momentum`, and so on).
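The single-group behaviour can be checked directly. This is a minimal sketch: `w1`, `w2`, and `opt` are hypothetical stand-ins for the tensors and optimizer above, not names from the notebook.

```python
import torch
import torch.optim as optim

# Hypothetical stand-ins for tensor_1 and tensor_2 above
w1 = torch.randn(3, 3, requires_grad=True)
w2 = torch.randn(3, 3, requires_grad=True)

opt = optim.SGD([w1, w2], lr=3e-4)

# A plain list of tensors ends up in a single parameter group
num_groups = len(opt.param_groups)
group = opt.param_groups[0]
num_params = len(group['params'])
shared_lr = group['lr']

print(num_groups, num_params, shared_lr)
```

Both tensors sit in `group['params']` and read the same `lr` from that one dictionary.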

**Example 2**

In [5]:
## Setting second optimizer ##

optimizer_2 = optim.Adam([{'params': tensor_1, 'lr': 2e-4},
                          {'params': tensor_2, 'lr': 3e-4},
                          {'params': tensor_3, 'lr': 1e-2}])

## printing the parameter groups ##

print(optimizer_2.param_groups)

[{'params': [tensor([[ 0.8125,  0.1984,  0.9420],
        [-0.4620,  0.5406, -1.3698],
        [ 0.4793,  0.2847,  0.4569]], requires_grad=True)], 'lr': 0.0002, 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}, {'params': [tensor([[-1.2722,  0.8374, -0.6923],
        [-1.0081,  0.3518, -1.0024],
        [ 0.6581, -0.1415,  0.2916]], requires_grad=True)], 'lr': 0.0003, 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}, {'params': [tensor([[ 1.1640, -0.8308,  0.1520],
        [-0.7118,  1.5370, -0.1858],
        [ 1.0072, -1.2591, -2.5712]], requires_grad=True)], 'lr': 0.01, 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}]


Now the list has 3 dictionaries, because we passed each tensor as its own group with its own `lr`.
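Printing the full list is hard to read, so a loop over the groups can help. The sketch below assumes hypothetical stand-in tensors in `params` and a hypothetical optimizer name `opt`; it only reads the keys that appear in the printed output above.

```python
import torch
import torch.optim as optim

# Hypothetical stand-ins for tensor_1, tensor_2 and tensor_3
params = [torch.randn(3, 3, requires_grad=True) for _ in range(3)]
lrs = [2e-4, 3e-4, 1e-2]

# One dictionary per tensor, each with its own learning rate
opt = optim.Adam([{'params': p, 'lr': lr} for p, lr in zip(params, lrs)])

# Each dictionary becomes its own group; options we did not set
# (betas, eps, ...) are filled in with Adam's defaults
group_lrs = [group['lr'] for group in opt.param_groups]
for i, group in enumerate(opt.param_groups):
    print(i, group['lr'], group['betas'], len(group['params']))
```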

To get the settings for `tensor_2`, we index into the list at its position.

In [6]:
## indexing to Tensor_2 attributes ##

optimizer_2.param_groups[1]

{'params': [tensor([[-1.2722,  0.8374, -0.6923],
          [-1.0081,  0.3518, -1.0024],
          [ 0.6581, -0.1415,  0.2916]], requires_grad=True)],
 'lr': 0.0003,
 'betas': (0.9, 0.999),
 'eps': 1e-08,
 'weight_decay': 0,
 'amsgrad': False}

As we can see, each entry is a dictionary.

So we can read or change any setting of a group by indexing into its dictionary by key.

Let's change the learning rate.

In [7]:
## Changing the value of learning rate ##

optimizer_2.param_groups[1]['lr'] = 1

optimizer_2.param_groups[1]

{'params': [tensor([[-1.2722,  0.8374, -0.6923],
          [-1.0081,  0.3518, -1.0024],
          [ 0.6581, -0.1415,  0.2916]], requires_grad=True)],
 'lr': 1,
 'betas': (0.9, 0.999),
 'eps': 1e-08,
 'weight_decay': 0,
 'amsgrad': False}

And it has changed.
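To see that the optimizer actually uses the edited value on its next step, here is a small sketch with plain SGD, chosen because its update is easy to follow by hand. The names `w` and `opt` are hypothetical, and the gradient is set manually only to keep the arithmetic obvious.

```python
import torch
import torch.optim as optim

# Hypothetical parameter: two ones, with a fixed gradient of ones
w = torch.ones(2, requires_grad=True)
opt = optim.SGD([w], lr=0.1)
w.grad = torch.ones(2)

opt.step()  # w <- 1.0 - 0.1 * 1.0
after_first = w.detach().clone()

# Change the learning rate in place, then step again with the same gradient
opt.param_groups[0]['lr'] = 0.5
opt.step()  # w <- 0.9 - 0.5 * 1.0
after_second = w.detach().clone()

print(after_first, after_second)
```

The second update is five times larger than the first, so the value written into `param_groups` is the one the optimizer reads at `step()`.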

This concept is necessary when implementing a learning rate scheduler, which works by updating the `lr` entry of each parameter group during training.
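As an illustration, a hand-rolled scheduler could look like the sketch below. The helper `decay_lr`, the factor `gamma`, and the tensor names are all hypothetical. PyTorch also ships ready-made schedulers in `torch.optim.lr_scheduler`, so this is only meant to show the mechanism.

```python
import torch
import torch.optim as optim

def decay_lr(optimizer, gamma):
    """Hypothetical helper: multiply every group's lr by gamma."""
    for group in optimizer.param_groups:
        group['lr'] *= gamma

# Hypothetical parameters with per-group learning rates
w1 = torch.randn(3, 3, requires_grad=True)
w2 = torch.randn(3, 3, requires_grad=True)
opt = optim.Adam([{'params': w1, 'lr': 2e-4},
                  {'params': w2, 'lr': 1e-2}])

history = []
for epoch in range(3):
    # (a real training loop would compute a loss and call opt.step() here)
    history.append([group['lr'] for group in opt.param_groups])
    decay_lr(opt, gamma=0.5)  # halve every learning rate after each epoch

print(history)
```

Because the helper loops over all groups, each group keeps its own rate relative to the others while the whole schedule decays.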

In the adjacent Python file we implement a learning rate finder, much like the one in the `fast.ai` package.
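The core idea of such a finder can be sketched with `param_groups` alone: raise the learning rate a little at every step and record the loss. The code below is a rough sketch under stated assumptions, not the implementation from the adjacent file or from `fast.ai`. The toy quadratic loss, the sweep range, and all variable names are hypothetical.

```python
import torch
import torch.optim as optim

torch.manual_seed(0)
w = torch.randn(3, requires_grad=True)  # hypothetical toy parameter
opt = optim.SGD([w], lr=1e-5)

# Assumed sweep: grow lr exponentially from start_lr to end_lr
start_lr, end_lr, num_steps = 1e-5, 1.0, 20
mult = (end_lr / start_lr) ** (1 / (num_steps - 1))

lrs, losses = [], []
lr = start_lr
for step in range(num_steps):
    # Write the current learning rate into every parameter group
    for group in opt.param_groups:
        group['lr'] = lr

    loss = (w ** 2).sum()  # toy loss standing in for a real model's loss
    opt.zero_grad()
    loss.backward()
    opt.step()

    lrs.append(lr)
    losses.append(loss.item())
    lr *= mult

# Plotting losses against lrs (log scale) is the usual next step
```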