# Asymmetric and Symmetric Quantization

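## Notation:

- $\alpha$ (max value): The maximum value of the input data.
- $\beta$ (min value): The minimum value of the input data.
- $n$ (bit width): The number of bits in the quantized integer representation.
- $S$ (scale): The floating-point step size between adjacent integer levels; dividing by it maps floating-point values to integer values.
- $Z$ (zero-point): The integer that the floating-point value 0 maps to.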

---

## Asymmetric Quantization

1. **Compute Scale (S):**  
   $$ S = \frac{\alpha - \beta}{2^n - 1} $$

2. **Compute Zero-Point (Z):**  
   $$ Z = \text{round} \left(-\frac{\beta}{S} \right) $$

3. **Quantize Values:**  
   $$ x_q = \text{clip} \left( \text{round} \left(\frac{x}{S} \right) + Z, 0, 2^n - 1 \right) $$
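As a quick sanity check, the sketch below applies the three steps by hand to a single value. It assumes 8 bits and the range [-3.42, 2.34], which matches the example data used later in this notebook; the variable names are illustrative only.

```python
# Worked example of the three asymmetric steps for n = 8 bits.
# Assumed range: beta = -3.42 (min), alpha = 2.34 (max).
alpha, beta, n = 2.34, -3.42, 8

S = (alpha - beta) / (2**n - 1)                   # 5.76 / 255, about 0.0226
Z = round(-beta / S)                              # round(151.4...) = 151
x = 1.34
x_q = min(max(round(x / S) + Z, 0), 2**n - 1)     # 59 + 151 = 210

print(S, Z, x_q)
```

The value 1.34 lands on integer level 210, comfortably inside the unsigned range [0, 255], so the clip has no effect here.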



---

## Symmetric Quantization

### Notation:

- $\alpha$: The maximum absolute value of the input data.
- $S$ (scale): The floating-point step size between adjacent integer levels.
- $Z$ (zero-point): Fixed at 0, so it does not appear in the formulas below.

1. **Compute Scale (S):**  
   $$ S = \frac{\alpha}{2^{n-1} - 1} $$

2. **Quantize Values:**  
   $$ x_q = \text{clip} \left( \text{round} \left(\frac{x}{S} \right), - (2^{n-1} - 1), 2^{n-1} - 1 \right) $$
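The same kind of hand calculation for the symmetric case, again as an illustrative sketch. It assumes 8 bits and an example range of [-3.42, 2.34], so that alpha is the larger absolute value, 3.42; the helper `quantize` is a hypothetical name introduced for this example.

```python
# Worked example of symmetric quantization for n = 8 bits.
# Assumed data range [-3.42, 2.34], so alpha = max(|-3.42|, |2.34|) = 3.42.
alpha, n = 3.42, 8

q_limit = 2**(n - 1) - 1             # 127
S = alpha / q_limit                  # about 0.0269

def quantize(x):
    # Round to the nearest step, then clip to [-q_limit, q_limit]
    return min(max(round(x / S), -q_limit), q_limit)

print(quantize(-3.42), quantize(0.0), quantize(1.34))   # -127 0 50
```

Because alpha is the absolute maximum, the most negative value maps exactly to -127 and nothing inside the data range is clipped.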

In [1]:
import numpy as np

def asymmetric_quantization(x, alpha, beta, n_bits=8):
    qmin, qmax = 0, 2**n_bits - 1  # Range for n-bit representation

    # Compute scale and zero-point
    S = (alpha - beta) / (qmax - qmin)
    Z = int(round(-beta / S))

    # Quantize values
    x_q = np.clip(np.round(x / S) + Z, qmin, qmax).astype(int)

    return x_q, S, Z

def symmetric_quantization(x, alpha, n_bits=8):
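    # Full signed n-bit range [-2^(n-1), 2^(n-1) - 1]. Note that the lower bound
    # is one level below the -(2^(n-1) - 1) limit used in the formula above.
    qmin, qmax = -(2**(n_bits-1)), 2**(n_bits-1) - 1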

    # Compute scale
    S = alpha / qmax

    # Quantize values
    x_q = np.clip(np.round(x / S), qmin, qmax).astype(int)

    return x_q, S


In [2]:
import numpy as np

# Suppress scientific notation
np.set_printoptions(suppress=True)

# Generate randomly distributed parameters
params = np.random.randn(20)

# Make sure important values are at the beginning for better debugging
params[0] = params.max() + 1
params[1] = params.min() - 1
params[2] = 0

# Round each number to the second decimal place
params = np.round(params, 2)

# Print the parameters
print(params)



[ 2.34 -3.42  0.   -0.1  -0.32 -0.08 -0.47  1.34 -1.   -1.11  0.69  0.84
 -2.42  0.61 -0.4  -0.61  1.23 -0.18 -0.57 -0.14]


In [3]:
# Given vector and range
alpha = max(params)  # max value
beta = min(params)  # min value

# Perform asymmetric quantization
quantized_x_asym, scale_asym, zero_point_asym = asymmetric_quantization(params, alpha, beta)

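# Perform symmetric quantization.
# Note: alpha here is max(params), not the absolute maximum that the symmetric
# notation calls for, so values much below -alpha saturate at qmin.
quantized_x_sym, scale_sym = symmetric_quantization(params, alpha)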

# Print results
print("Asymmetric Quantization:")
print("Quantized values:", quantized_x_asym)
print("Scale (S):", scale_asym)
print("Zero-point (Z):", zero_point_asym)

print("\nSymmetric Quantization:")
print("Quantized values:", quantized_x_sym)
print("Scale (S):", scale_sym)

Asymmetric Quantization:
Quantized values: [255   0 151 147 137 147 130 210 107 102 182 188  44 178 133 124 205 143
 126 145]
Scale (S): 0.022588235294117645
Zero-point (Z): 151

Symmetric Quantization:
Quantized values: [ 127 -128    0   -5  -17   -4  -26   73  -54  -60   37   46 -128   33
  -22  -33   67  -10  -31   -8]
Scale (S): 0.0184251968503937


In [4]:
def asymmetric_d_quantization(x, S, Z):
    # Dequantize: remove the zero-point offset, then rescale to floating point
    x_dq = (x - Z) * S

    return x_dq

def symmetric_d_quantization(x, S):
    # Dequantize: the zero-point is 0, so only rescaling is needed
    x_dq = x * S

    return x_dq
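Written in the notation of the earlier sections, the two dequantization functions compute the following, where $\hat{x}$ is a symbol assumed here for the reconstructed value:

$$ \hat{x} = S \, (x_q - Z) \quad \text{(asymmetric)}, \qquad \hat{x} = S \, x_q \quad \text{(symmetric)} $$

For any $x$ that is not clipped, rounding shifts $x / S$ by at most $\tfrac{1}{2}$, so the round-trip error satisfies

$$ |x - \hat{x}| \le \frac{S}{2} $$

Clipped values fall outside this bound, and their error can be arbitrarily larger.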


## Quantization Errors

In [6]:
from sklearn.metrics import mean_squared_error

print(mean_squared_error(params, asymmetric_d_quantization(quantized_x_asym, scale_asym, zero_point_asym)))
print(mean_squared_error(params, symmetric_d_quantization(quantized_x_sym, scale_sym)))

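4.1563321799308393e-05
0.05656446493892984

The first value is the asymmetric error and the second is the symmetric error. The symmetric error is roughly three orders of magnitude larger, mainly because `symmetric_quantization` was called with `alpha = max(params)` (2.34) instead of the absolute maximum (3.42). The two values below -2.34, namely -3.42 and -2.42, therefore both saturate at -128 and dequantize to about -2.36.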
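To see how much of the symmetric error depends on the choice of alpha, the following sketch reruns symmetric quantization with alpha set to the absolute maximum of the data. It is a minimal, self-contained variant and not the notebook's own functions: the parameter vector is copied from the printed output above, the clip uses the restricted range from the symmetric formula, and the MSE is computed directly with NumPy.

```python
import numpy as np

# Parameter vector copied from the printed output earlier in this notebook.
params = np.array([
    2.34, -3.42, 0.0, -0.1, -0.32, -0.08, -0.47, 1.34, -1.0, -1.11,
    0.69, 0.84, -2.42, 0.61, -0.4, -0.61, 1.23, -0.18, -0.57, -0.14,
])

n_bits = 8
qmax = 2**(n_bits - 1) - 1                 # 127 for 8 bits

# Assumption: alpha is the absolute maximum, as the symmetric notation defines it.
alpha_abs = np.abs(params).max()           # 3.42
S = alpha_abs / qmax

# Quantize with the restricted range [-qmax, qmax], then dequantize.
x_q = np.clip(np.round(params / S), -qmax, qmax).astype(int)
x_dq = x_q * S

mse = np.mean((params - x_dq) ** 2)
print("Quantized range:", x_q.min(), "to", x_q.max())
print("Symmetric MSE with absolute-max alpha:", mse)
```

With no value clipped, every element is off by at most half a step, so the mean squared error cannot exceed $(S/2)^2$, which is far below the 0.0566 reported above.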
