# L2-B - Linear Quantization I: Get the Scale and Zero Point

In this lesson, you will continue learning the fundamentals of linear quantization and implement your own linear quantizer.

Run the next cell to import the functions from the previous `Linear Quantization I` lesson(s) so that you can follow along with the video.

- To access the `helper.py` file, click `File --> Open...` in the top left.

In [3]:
import torch

from helper import linear_q_with_scale_and_zero_point, linear_dequantization, plot_quantization_errors

# A dummy tensor to test the implementation
test_tensor = torch.tensor(
    [[191.6, -13.5, 728.6],
     [92.14, 295.5,  -184],
     [0,     684.6, 245.5]]
)


## Finding `Scale` and `Zero Point` for Quantization

In [1]:
q_min = torch.iinfo(torch.int8).min
q_max = torch.iinfo(torch.int8).max


In [None]:
q_min

In [None]:
q_max

In [None]:
# test_tensor.min() returns a 0-dim tensor; .item() converts it to a Python float
r_min = test_tensor.min().item()

In [None]:
r_min

In [None]:
r_max = test_tensor.max().item()

In [None]:
r_max

In [None]:
scale = (r_max - r_min) / (q_max - q_min)

In [None]:
scale

In [None]:
zero_point = q_min - (r_min / scale)

In [None]:
zero_point

In [None]:
zero_point = int(round(zero_point))

In [None]:
zero_point
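The cells above can be checked by hand. The following is a minimal sketch in plain Python (no `torch`), assuming the int8 range and the smallest and largest entries of `test_tensor` typed in as literals. Because the notebook works on float32 values, the last digits there may differ slightly.

```python
# Worked arithmetic for test_tensor, in plain Python (no torch needed).
q_min, q_max = -128, 127        # range of a signed 8-bit integer
r_min, r_max = -184.0, 728.6    # smallest and largest entries of test_tensor

# One quantized step covers this much of the real-valued range.
scale = (r_max - r_min) / (q_max - q_min)

# The zero point is chosen so that r_min maps to q_min,
# and is then rounded to an integer.
zero_point_float = q_min - (r_min / scale)
zero_point = int(round(zero_point_float))

print(f"scale            = {scale:.4f}")             # about 3.5788
print(f"zero_point_float = {zero_point_float:.4f}")  # about -76.5865
print(f"zero_point       = {zero_point}")            # -77
```

The zero point has to be an integer because it lives in the quantized domain, which is why the unrounded value is not used directly.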

- Now, put all of this in a function.

In [None]:
def get_q_scale_and_zero_point(tensor, dtype=torch.int8):
    
    q_min, q_max = torch.iinfo(dtype).min, torch.iinfo(dtype).max
    r_min, r_max = tensor.min().item(), tensor.max().item()

    scale = (r_max - r_min) / (q_max - q_min)

    zero_point = q_min - (r_min / scale)

    # clip the zero_point to fall in [q_min, q_max]
    if zero_point < q_min:
        zero_point = q_min
    elif zero_point > q_max:
        zero_point = q_max
    else:
        # round and cast to int
        zero_point = int(round(zero_point))
    
    return scale, zero_point
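To see when the clipping branches in the function above are taken, here is a small sketch that repeats the same computation on plain floats. The name `scale_and_zero_point` and its int8 defaults are hypothetical and used only for illustration. With this formula, the unrounded zero point leaves `[q_min, q_max]` when the value range does not contain 0, that is, when all values are positive or all are negative.

```python
def scale_and_zero_point(r_min, r_max, q_min=-128, q_max=127):
    """Scalar version of the computation above (hypothetical helper, int8 defaults)."""
    scale = (r_max - r_min) / (q_max - q_min)
    zero_point = q_min - (r_min / scale)
    # Same clipping rule as the notebook function.
    if zero_point < q_min:
        return scale, q_min
    if zero_point > q_max:
        return scale, q_max
    return scale, int(round(zero_point))

# Range that straddles zero: the zero point lands inside [-128, 127].
_, zp_mixed = scale_and_zero_point(-184.0, 728.6)

# All-positive range: the raw zero point is about -383, so it is clipped to q_min.
_, zp_positive = scale_and_zero_point(10.0, 20.0)

# All-negative range: the raw zero point is about 382, so it is clipped to q_max.
_, zp_negative = scale_and_zero_point(-20.0, -10.0)

print(zp_mixed, zp_positive, zp_negative)  # -77 -128 127
```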

- Test the implementation using the `test_tensor` defined earlier.
```Python
[[191.6, -13.5, 728.6],
 [92.14, 295.5,  -184],
 [0,     684.6, 245.5]]
```

In [None]:
new_scale, new_zero_point = get_q_scale_and_zero_point(
    test_tensor)

In [None]:
new_scale

In [None]:
new_zero_point

## Quantization and Dequantization with Calculated `Scale` and `Zero Point`

- Use the calculated `scale` and `zero_point` with the functions `linear_q_with_scale_and_zero_point` and `linear_dequantization`.

In [None]:
quantized_tensor = linear_q_with_scale_and_zero_point(
    test_tensor, new_scale, new_zero_point)

In [None]:
dequantized_tensor = linear_dequantization(quantized_tensor,
                                           new_scale, new_zero_point)
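The two helpers come from `helper.py`, which is not reproduced in this notebook. As a rough sketch, assuming they implement the usual linear quantization formulas `q = clamp(round(r / scale + zero_point))` and `r ≈ scale * (q - zero_point)`, the round trip for a single value looks like this in plain Python. The function names `quantize_value` and `dequantize_value` are hypothetical, and the real helpers may differ in how they round or clamp.

```python
def quantize_value(r, scale, zero_point, q_min=-128, q_max=127):
    """q = clamp(round(r / scale + zero_point), q_min, q_max) for one float."""
    q = round(r / scale + zero_point)
    return max(q_min, min(q_max, q))

def dequantize_value(q, scale, zero_point):
    """Map a quantized integer back to an approximate real value."""
    return scale * (q - zero_point)

# Scale and zero point for test_tensor, assuming int8 (see the cells above).
scale, zero_point = (728.6 - (-184.0)) / 255, -77

for r in [191.6, 0.0, -184.0, 728.6]:
    q = quantize_value(r, scale, zero_point)
    r_hat = dequantize_value(q, scale, zero_point)
    print(f"{r:8.2f} -> {q:5d} -> {r_hat:8.2f}")
```

In this sketch the value 0 maps to the zero point and is recovered exactly, while the other values come back with a small rounding error.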

- Plot the result to see what the quantization error looks like with the calculated `scale` and `zero_point`.

In [None]:
plot_quantization_errors(test_tensor, quantized_tensor, 
                         dequantized_tensor)

In [None]:
(dequantized_tensor - test_tensor).square().mean()
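The last cell computes the mean squared error (MSE) between the dequantized and the original tensor. The sketch below redoes that calculation in plain Python on the nine values of `test_tensor`, assuming int8 and the standard quantize and dequantize formulas (the actual helpers may differ). Since rounding moves each value by at most half a quantization step, the MSE is expected to stay at or below `(scale / 2) ** 2` here.

```python
values = [191.6, -13.5, 728.6, 92.14, 295.5, -184.0, 0.0, 684.6, 245.5]
q_min, q_max = -128, 127

scale = (max(values) - min(values)) / (q_max - q_min)
zero_point = int(round(q_min - min(values) / scale))

# Quantize, then dequantize, each value (assumed formulas, see the text above).
quantized = [max(q_min, min(q_max, round(v / scale + zero_point))) for v in values]
dequantized = [scale * (q - zero_point) for q in quantized]

# Mean squared error: the average of the squared differences.
mse = sum((d - v) ** 2 for d, v in zip(dequantized, values)) / len(values)

print(f"mse = {mse:.3f}, (scale / 2) ** 2 = {(scale / 2) ** 2:.3f}")
```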

### Put Everything Together: Your Own Linear Quantizer

- Now, put everything together to make your own linear quantizer.

In [None]:
def linear_quantization(tensor, dtype=torch.int8):
    scale, zero_point = get_q_scale_and_zero_point(tensor, 
                                                   dtype=dtype)
    
    quantized_tensor = linear_q_with_scale_and_zero_point(tensor,
                                                          scale, 
                                                          zero_point, 
                                                          dtype=dtype)
    
    return quantized_tensor, scale, zero_point

- Test your implementation on a random matrix.

In [None]:
r_tensor = torch.randn((4, 4))

**Note:** Since the values are random, what you see in the video may differ from what you get.

In [None]:
r_tensor

In [None]:
quantized_tensor, scale, zero_point = linear_quantization(r_tensor)

In [None]:
quantized_tensor

In [None]:
scale

In [None]:
zero_point

In [None]:
dequantized_tensor = linear_dequantization(quantized_tensor,
                                           scale, zero_point)

In [None]:
plot_quantization_errors(r_tensor, quantized_tensor,
                         dequantized_tensor)

In [None]:
(dequantized_tensor - r_tensor).square().mean()