# Quantization Modes


In [None]:
# define imports
import numpy as np
from pathlib import Path

from quantization_notes.utils.utils import plot_table

OUTPUT_DIR = Path("../output")

## Symmetric Quantization

In symmetric quantization, we map the floating-point range to a quantized range that is symmetric about $0$; think along the lines of $[-100,100]$. To do so, we choose $\alpha = \max|x_f|$, where $x_f$ is the floating-point tensor, and map the range $[-\alpha, \alpha]$.

Additionally, we choose $N_{bins} = 2^n$, where $n$ is the number of bits we want to quantize to. In order to derive our quantization range with respect to $0$, we simply place half the bins before $0$ and the other half after $0$.

Example: Let's say we wanted an 8-bit quantization range. Then the number of bins would be $N_{bins} = 2^8 = 256$. Shifting the bin indices $[0,255]$ so that half of them fall below $0$ gives $[-128,127]$. This is the "full range". In practice, however, the range is generally "restricted" to $[-127,127]$, which is exactly symmetric about $0$. We can derive scaling factors to map from floating point to quantized for both ranges:

$$
\def\arraystretch{1.5}
\begin{array}{c|c|c}
& \text{Full Range} & \text{Restricted Range} \\ \hline
\text{Quantized Range} & [-\frac{N_{bins}}{2},\frac{N_{bins}}{2}-1] & [-(\frac{N_{bins}}{2}-1),\frac{N_{bins}}{2}-1] \\ \hline
\text{8-bit Example} & [-128,127] & [-127,127] \\ \hline
\text{Scale Factor} & q_x=\frac{(2^n-1)/2}{\alpha} & q_x=\frac{2^{n-1}-1}{\alpha} \\
\end {array}
$$

Finally, we can compute our symmetric quantized tensor:

$$x_q = \text{round}(q_xx_f)$$


In [None]:
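def quantize_symmetric_full(n: int, x_f: np.ndarray) -> tuple[float, np.ndarray]:
    # scale factor maps [-alpha, alpha] onto [-(2^n - 1)/2, (2^n - 1)/2]
    q_x = ((2**n - 1) / 2) / np.max(np.abs(x_f))
    # rounding q_x * alpha can land on 2^(n-1), one past the top of the range, so clip
    x_q = np.clip(np.round(q_x * x_f), -(2 ** (n - 1)), 2 ** (n - 1) - 1)
    return q_x, x_q


def quantize_symmetric_restricted(n: int, x_f: np.ndarray) -> tuple[float, np.ndarray]:
    # scale factor maps [-alpha, alpha] onto [-(2^(n-1) - 1), 2^(n-1) - 1]
    q_x = (2 ** (n - 1) - 1) / np.max(np.abs(x_f))
    x_q = np.round(q_x * x_f)
    return q_x, x_q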


data = np.random.uniform(-1000, 1000, size=(5, 5))
print(f"unquantized data:\n {data} \n")

quantized_symmetric_full_scale, quantized_symmetric_full_data = quantize_symmetric_full(8, data)
print(f"quantize symmetric full\n {quantized_symmetric_full_data} \n")

quantized_symmetric_restricted_scale, quantized_symmetric_restricted_data = quantize_symmetric_restricted(8, data)
print(f"quantize symmetric restricted\n {quantized_symmetric_restricted_data} \n")

In [None]:
plot_table(data, str(OUTPUT_DIR / "unquantized-data.png"), "Oranges", "black", "black")
plot_table(
    quantized_symmetric_full_data,
    str(OUTPUT_DIR / "quantized-symmetric-full.png"),
    "Blues",
    "black",
    "black",
)
plot_table(
    quantized_symmetric_restricted_data,
    str(OUTPUT_DIR / "quantized-symmetric-restricted.png"),
    "Blues",
    "black",
    "black",
)

## Asymmetric Quantization

Compared to symmetric quantization, asymmetric quantization (also known as *affine* quantization) has a few key differences:

- Instead of mapping the floating-point range $[-\alpha, \alpha]$ to some quantized range symmetric about $0$, we map $[\alpha, \beta]$, where now $\alpha=\min(x_f)$ and $\beta=\max(x_f)$.
- Since the mapping is no longer symmetric about $0$, we have to introduce a **zero-point** $z_{x}$, also known as the *quantization bias*. This serves to actually represent $0$ in the quantized range.
- Additionally, the quantized range is now represented by $[0,2^n-1]$.

We can derive a scaling factor for asymmetric quantization:

$$
\def\arraystretch{1.5}
\begin{array}{c|c}
\text{Quantized Range} & [0,N_{bins}-1] \\ \hline
\text{8-bit Example} & [0, 255] \\ \hline
\text{Scale Factor} & q_x=\frac{2^n-1}{\beta-\alpha} \\
\end {array}
$$

To find the zero-point $z_{x}$, take the minimum value of $x_f$, i.e. $\alpha$, and scale it to its representation in the quantized range:

$$
z_x = \text{round}(q_x\alpha)
$$

Finally, we can compute our asymmetric quantized tensor $x_q$, adjusted for bias. Because $z_x$ is already expressed in quantized units, it is subtracted after scaling, so that $x_f=\alpha$ maps to $0$:

$$
x_q = \text{round}(q_xx_f) - z_x
$$
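
The asymmetric steps above can be sketched in NumPy as follows. This is a minimal sketch rather than a reference implementation: the function name `quantize_asymmetric` and the sample tensor are hypothetical, it assumes the zero-point is subtracted after scaling ($x_q = \text{round}(q_xx_f) - z_x$), it assumes $x_f$ is not constant (so $\beta-\alpha \neq 0$), and the final clip to $[0, 2^n-1]$ is an added safeguard against rounding at the range boundary.

```python
import numpy as np


def quantize_asymmetric(n: int, x_f: np.ndarray) -> tuple[float, int, np.ndarray]:
    # hypothetical helper: alpha/beta are the min/max of the tensor
    alpha, beta = np.min(x_f), np.max(x_f)
    # scale factor maps a span of (beta - alpha) onto 2^n - 1 steps
    q_x = float((2**n - 1) / (beta - alpha))
    # zero-point: the minimum value expressed in quantized units
    z_x = int(np.round(q_x * alpha))
    # scale, round, then shift so that alpha lands on 0
    x_q = np.round(q_x * x_f) - z_x
    # assumption: clip in case rounding pushes a value just outside [0, 2^n - 1]
    x_q = np.clip(x_q, 0, 2**n - 1)
    return q_x, z_x, x_q


# small hand-checkable tensor (illustrative values only)
sample = np.array([-1.0, 0.0, 1.0, 3.0])
q_x, z_x, x_q = quantize_asymmetric(8, sample)
print(q_x)  # 63.75
print(z_x)  # -64
print(x_q)  # [  0.  64. 128. 255.]
```

Note that with this convention the real value $0$ lands on $-z_x$ (here $64$), while the minimum of the tensor lands on $0$.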

## Dequantization and Error

Dequantizing a symmetrically quantized tensor is as simple as dividing the quantized tensor $x_q$ by the scale factor $q_x$:

$$\hat{x_f} = \frac{x_q}{q_x}$$ 

As expected, however, quantizing a tensor and then dequantizing it introduces some error. We can measure that error using the mean-squared error (MSE), where $N$ is the number of elements in the tensor (not the bit width $n$):

$$
\text{MSE} = \frac{1}{N}\displaystyle\sum_{i=1}^{N}(\hat{x}_{f,i}-x_{f,i})^2
$$


In [None]:
def dequantize_symmetric(q_x: float, x_q: np.ndarray) -> np.ndarray:
    # invert the scaling; the rounding error itself is not recoverable
    return x_q / q_x


def mse(a: np.ndarray, b: np.ndarray) -> float:
    # mean of the element-wise squared differences
    return float(np.mean((a - b) ** 2))


dequantized_symmetric_full_data = dequantize_symmetric(quantized_symmetric_full_scale, quantized_symmetric_full_data)
dequantized_symmetric_full_error = mse(dequantized_symmetric_full_data, data)

# MSE is in squared units of the data, not a percentage
print(f"dequantized symmetric full data: \n {dequantized_symmetric_full_data} \n")
print(f"dequantized symmetric full MSE: \n {dequantized_symmetric_full_error:.4f} \n")

dequantized_symmetric_restricted_data = dequantize_symmetric(quantized_symmetric_restricted_scale, quantized_symmetric_restricted_data)
dequantized_symmetric_restricted_error = mse(dequantized_symmetric_restricted_data, data)

print(f"dequantized symmetric restricted data: \n {dequantized_symmetric_restricted_data} \n")
print(f"dequantized symmetric restricted MSE: \n {dequantized_symmetric_restricted_error:.4f} \n")
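
The dequantization above covers only the symmetric case. For the asymmetric case, one possible inverse is sketched below, assuming the tensor was quantized as $x_q = \text{round}(q_xx_f) - z_x$: add the zero-point back, then divide by the scale factor. The name `dequantize_asymmetric` and the sample values are hypothetical and chosen so the result can be checked by hand.

```python
import numpy as np


def dequantize_asymmetric(q_x: float, z_x: int, x_q: np.ndarray) -> np.ndarray:
    # hypothetical helper: undo the zero-point shift, then undo the scaling
    return (x_q + z_x) / q_x


def mse(a: np.ndarray, b: np.ndarray) -> float:
    # mean of the element-wise squared differences
    return float(np.mean((a - b) ** 2))


# illustrative values: an 8-bit quantization of [-1, 0, 1, 3]
# with q_x = 255 / (3 - (-1)) and z_x = round(q_x * -1)
x_f = np.array([-1.0, 0.0, 1.0, 3.0])
q_x, z_x = 63.75, -64
x_q = np.array([0.0, 64.0, 128.0, 255.0])

x_hat = dequantize_asymmetric(q_x, z_x, x_q)
print(f"dequantized asymmetric data: \n {x_hat} \n")
print(f"dequantized asymmetric MSE: \n {mse(x_hat, x_f):.2e} \n")
```

When no value was clipped during quantization, each reconstructed element differs from the original by at most half a quantization step, i.e. $0.5 / q_x$.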