<a href="https://colab.research.google.com/github/Pigwen/hands-on-sft/blob/main/CHAPTER_2_Loading_a_Quantized_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Pre-Reqs

An approximate formula for converting a model's parameter count into its memory footprint:
```
model_size_in_mb = (num_params * (bits_per_param / 8)) / 1e6
```
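
As a quick illustration of the formula, the sketch below evaluates it at a few common precisions. The 7-billion-parameter count is a hypothetical example, not a model used in this notebook, and the result covers the weights only (no activations, optimizer state, or framework overhead).

```python
def model_size_in_mb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight memory in MB (1 MB = 1e6 bytes)."""
    return (num_params * (bits_per_param / 8)) / 1e6

num_params = 7e9  # hypothetical 7-billion-parameter model
for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {model_size_in_mb(num_params, bits):,.0f} MB")
# FP32: 28,000 MB
# FP16: 14,000 MB
# INT8: 7,000 MB
# INT4: 3,500 MB
```

Halving the bits per parameter halves the footprint, which is the motivation for quantization.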

# Quantization in a Nutshell

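Quantization is essentially binning, and the steps resemble building a histogram:
* Define the possible range of the FP32 values
* Split that range evenly into a number of bins
* For each value, find the interval (bin) it falls into and assign it that bin's id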
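
The three steps can also be written as formulas. This is a sketch under assumed notation ($w$ for a weight, $\Delta$ for the bin width, $q$ for the bin id, $\hat{w}$ for the value reconstructed from the bin's lower edge); values lying exactly on a bin edge may be assigned differently depending on the implementation.

$$
\Delta = \frac{w_{\max} - w_{\min}}{n_{\text{bins}}}, \qquad
q = \min\!\left(\left\lfloor \frac{w - w_{\min}}{\Delta} \right\rfloor,\; n_{\text{bins}} - 1\right), \qquad
\hat{w} = w_{\min} + q\,\Delta
$$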

In [1]:
import torch

torch.manual_seed(11)
weights = torch.randn(1000) * .07
weights.min(), weights.max()

(tensor(-0.2066), tensor(0.2097))

In [2]:
n_bins = 4
bins = torch.linspace(weights.min(), weights.max(), n_bins+1)
bin_width = bins[1] - bins[0]
bins, bin_width

(tensor([-0.2066, -0.1026,  0.0015,  0.1056,  0.2097]), tensor(0.1041))

In [24]:
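# Compare every weight with every bin edge: row i is True for each edge below weights[i].
# argmin returns the position of the first False, i.e. the number of edges below the weight;
# subtracting 1 turns that count into the id of the bin the weight falls in.
bin_indexs = ((weights.view(-1, 1) > bins).to(torch.int).argmin(dim=-1) - 1)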
print(weights[:20], bin_indexs[:20])

tensor([-0.0358,  0.0720, -0.0247,  0.0086, -0.0127, -0.1048,  0.0099, -0.0367,
        -0.0174, -0.0368,  0.2025, -0.0416,  0.0918,  0.0247, -0.0921, -0.0006,
         0.0174,  0.1101, -0.1148, -0.1115]) tensor([1, 2, 1, 2, 1, 0, 2, 1, 1, 1, 3, 1, 2, 2, 1, 1, 2, 3, 0, 0])
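
The broadcasting expression above is compact but hard to read. As a cross-check, here is a minimal pure-Python sketch of the same lookup using the standard library's `bisect_left`, which counts how many edges lie strictly below a value. The bin edges are copied from the printed output (rounded to 4 decimals), and `bin_index` is a hypothetical helper name. One difference: the tensor expression appears to assign -1 to the exact minimum weight, since no edge lies strictly below it, whereas this sketch clamps that case to bin 0.

```python
from bisect import bisect_left

# Bin edges copied from the printed output above (rounded to 4 decimals).
bins = [-0.2066, -0.1026, 0.0015, 0.1056, 0.2097]
n_bins = len(bins) - 1

def bin_index(w: float) -> int:
    # bisect_left counts the edges strictly below w; subtracting 1 gives the bin id.
    # Clamp so a value equal to the lowest edge maps to bin 0 rather than -1.
    return min(max(bisect_left(bins, w) - 1, 0), n_bins - 1)

sample = [-0.0358, 0.0720, 0.2025, -0.1048]  # four of the weights printed above
print([bin_index(w) for w in sample])  # [1, 2, 3, 0]
print(bin_index(-0.2066))              # 0 (the minimum lands in the first bin)
```

The four indices agree with the corresponding entries of the tensor output above.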


In [25]:
# Represent each bin by its lower edge (drop the final, rightmost edge).
bin_values = bins[:-1]
first_bin = bin_values[0]
bin_values

tensor([-0.2066, -0.1026,  0.0015,  0.1056])

In [26]:
# Dequantize: map each bin id back to the lower edge of its bin.
approx_values = bin_indexs * bin_width + first_bin
print(approx_values[:20])

tensor([-0.1026,  0.0015, -0.1026,  0.0015, -0.1026, -0.2066,  0.0015, -0.1026,
        -0.1026, -0.1026,  0.1056, -0.1026,  0.0015,  0.0015, -0.1026, -0.1026,
         0.0015,  0.1056, -0.2066, -0.2066])


In [27]:
print(weights[:20])

tensor([-0.0358,  0.0720, -0.0247,  0.0086, -0.0127, -0.1048,  0.0099, -0.0367,
        -0.0174, -0.0368,  0.2025, -0.0416,  0.0918,  0.0247, -0.0921, -0.0006,
         0.0174,  0.1101, -0.1148, -0.1115])


In [28]:
from torch import nn

# Root-mean-square error between the approximated and the original weights.
mse_fn = nn.MSELoss()
mse_fn(approx_values, weights).sqrt()

tensor(0.0615)
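
The error above is large relative to the weights themselves because 4 bins are very coarse. The following pure-Python sketch shows how the error shrinks as the number of bins grows, taking `n_bins = 2 ** n_bits`. It rests on several assumptions: `rmse_for_bins` is a hypothetical helper, the values are drawn with Python's `random` module rather than torch (so they do not reproduce the numbers above), and each value is reconstructed as the lower edge of its bin, as in the cells above.

```python
import math
import random

def rmse_for_bins(values, n_bins):
    """Floor each value to the lower edge of its bin and return the RMSE."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    sq_err = 0.0
    for v in values:
        # Bin id, clamped so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        approx = lo + idx * width
        sq_err += (v - approx) ** 2
    return math.sqrt(sq_err / len(values))

random.seed(11)  # arbitrary seed; does not match torch.manual_seed(11)
values = [random.gauss(0.0, 0.07) for _ in range(1000)]

for n_bits in (2, 4, 8):
    print(n_bits, rmse_for_bins(values, 2 ** n_bits))
```

Each extra bit halves the bin width, so the reconstruction error drops roughly in proportion.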

In [None]:
def quantize(weight: torch.Tensor, n_bits: int = 8):
  assert n_bits <= 16