#### About

> Model compression

Model compression is the process of reducing the size of a machine learning model while preserving as much of its predictive performance as possible. The goal is lower storage, memory usage, and inference latency, which matters most when the model is deployed in resource-constrained environments such as mobile devices or embedded systems.

A common compression method is quantization, which converts model parameters (e.g., weights and biases) from floating-point values to lower-precision fixed-point or integer values, such as 8-bit integers. This reduces the model's memory footprint and computational requirements.
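To make the idea concrete, here is a minimal sketch of uniform (affine) 8-bit quantization using NumPy. The helper names `quantize_uint8` and `dequantize_uint8` and the example weights are assumptions made for illustration; this is one simple scheme, not the only way to quantize a model.

```python
import numpy as np

def quantize_uint8(values):
    """Map a float array onto 256 evenly spaced levels between its min and max."""
    values = np.asarray(values, dtype=np.float64)
    lo, hi = values.min(), values.max()
    # Guard against a constant array, where hi == lo would give a zero step.
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    codes = np.round((values - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantize_uint8(codes, scale, lo):
    """Recover approximate float values from the 8-bit codes."""
    return codes.astype(np.float64) * scale + lo

# Hypothetical weights, chosen only for illustration.
weights = np.array([-0.42, 0.97, -2.35, 1.08])
codes, scale, lo = quantize_uint8(weights)
restored = dequantize_uint8(codes, scale, lo)

print(codes.dtype, codes.nbytes, "bytes vs", weights.nbytes, "bytes")
print("max abs error:", np.abs(weights - restored).max())
```

Each `float64` weight takes 8 bytes and each `uint8` code takes 1 byte, at the cost of also storing `scale` and `lo`. With rounding to the nearest level, the reconstruction error per weight is at most half a quantization step.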

In [1]:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import QuantileTransformer

In [2]:
iris = load_iris()
X = iris.data
y = iris.target

In [3]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [4]:
model = LogisticRegression()
model.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


In [5]:
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Original Model Accuracy:", accuracy)

Original Model Accuracy: 1.0


In [7]:
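# Quantize the model's parameters to 8-bit integer codes, one row of coefficients at a time.
# Note: QuantileTransformer maps values by rank (quantile), not onto an evenly spaced grid.
# n_quantiles cannot exceed the number of samples, so it is set to the coefficients per row.
quantizer = QuantileTransformer(output_distribution='uniform', n_quantiles=model.coef_.shape[1])
for i, param in enumerate(model.coef_):
    param = param.reshape(-1, 1)
    param_quantized = quantizer.fit_transform(param)
    # Round to the nearest code before casting; a bare astype('uint8') would truncate.
    param_quantized = (param_quantized * 255).round().astype('uint8')
    param_restored = quantizer.inverse_transform(param_quantized / 255).ravel()
    # Write the de-quantized values back into the model so that predict() uses them.
    model.coef_[i] = param_restored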



In [8]:

# Evaluate the accuracy of the quantized model
y_pred_quantized = model.predict(X_test)
accuracy_quantized = accuracy_score(y_test, y_pred_quantized)
print("Quantized Model Accuracy:", accuracy_quantized)

Quantized Model Accuracy: 1.0
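
Identical accuracy before and after quantization can mean either that the weights tolerated the rounding or that they were never actually changed, so it helps to inspect the weights directly. The sketch below is a rough illustration of such a check at several bit widths. The helper `roundtrip` is hypothetical, and a random 3x4 matrix stands in for `model.coef_`.

```python
import numpy as np

def roundtrip(weights, bits):
    """Quantize to 2**bits evenly spaced levels, then map back to floats."""
    levels = 2 ** bits - 1
    lo, hi = weights.min(), weights.max()
    # Guard against a constant array, where hi == lo would give a zero step.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((weights - lo) / scale)
    return codes * scale + lo, scale

# Hypothetical 3x4 coefficient matrix, standing in for model.coef_.
rng = np.random.default_rng(0)
coef = rng.normal(size=(3, 4))

results = {}
for bits in (8, 4, 2):
    restored, scale = roundtrip(coef, bits)
    max_err = np.abs(coef - restored).max()
    changed = not np.array_equal(coef, restored)
    results[bits] = (max_err, scale, changed)
    print(f"{bits}-bit: step={scale:.4f}, max error={max_err:.4f}, changed={changed}")
```

If `changed` is false for real model weights, the quantization step did not reach the model. If it is true and accuracy is still unchanged, the model tolerated that level of rounding on this test set.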
