# TensorFlow Model Optimization Toolkit (TMO)

In this notebook, we demonstrate how to use TMO to optimize a model for deployment. We train a model on the MNIST dataset and then optimize it with TMO. We then compare the size and accuracy of the optimized model with those of the original model.

## Set Up TMO

First, we install TMO and import the required packages.

In [1]:
%pip install -q tensorflow
%pip install -q tensorflow-model-optimization

In [2]:
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow import keras
import pathlib
import numpy as np


## Post-Training Quantization

Post-training quantization converts the weights of a trained model from 32-bit floating point to 8-bit precision. It is applied to an already-trained float TensorFlow model at the moment we convert it to the TensorFlow Lite format with the [TensorFlow Lite Converter](https://www.tensorflow.org/lite/models/convert/).
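
To make the 32-bit to 8-bit mapping concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration only and not the TensorFlow Lite Converter's actual implementation; the helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
# Illustrative sketch only: symmetric per-tensor int8 quantization.
# The function names and sample weights are hypothetical, not part of TFLite.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using a single scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.013, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("int8 values:", quantized)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored in one byte instead of four, which is where the roughly fourfold reduction in weight storage comes from.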

### Load the MNIST Dataset

We load the MNIST dataset from Keras and prepare it for training.

In [3]:
# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


### Train the Model

Next, we define a CNN model and train it on the MNIST dataset.

In [4]:
# Define the model architecture
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  keras.layers.MaxPooling2D(pool_size=(2, 2)),
  keras.layers.Flatten(),
  keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(
  train_images,
  train_labels,
  epochs=1,
  validation_data=(test_images, test_labels)
)



<tf_keras.src.callbacks.History at 0x7f19297a00a0>

### Convert the Model to TFLite

After training the model, we convert it to the [TFLite](https://www.tensorflow.org/lite/guide) format twice: once without quantization, and once with quantization applied during conversion.

In [5]:
tflite_models_dir = pathlib.Path("notebooks/Unit 9 - Model Optimization/models")
tflite_models_dir.mkdir(exist_ok=True, parents=True)
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# without quantization
tflite_model = converter.convert()
tflite_model_file = tflite_models_dir/"original_model.tflite"
tflite_model_file.write_bytes(tflite_model)

# with quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
tflite_model_quant_file = tflite_models_dir/"quantized_model.tflite"
tflite_model_quant_file.write_bytes(tflite_quant_model)

23968

### Check Model Size

The quantized model is much smaller than the original model; `write_bytes` above returned 23968 bytes for the quantized file.
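
A back-of-the-envelope estimate helps explain the file sizes. The sketch below counts the trainable parameters of the CNN defined earlier and multiplies by the bytes per weight. The variable names are hypothetical, and the estimate ignores the metadata and graph structure that a `.tflite` file also stores.

```python
# Rough size estimate for the CNN defined earlier (weights only).
# Variable names are hypothetical; .tflite files also contain metadata.

# Conv2D: 12 filters of size 3x3 over 1 input channel, plus one bias each.
conv_params = 3 * 3 * 1 * 12 + 12

# A 3x3 convolution without padding turns 28x28 into 26x26;
# 2x2 max pooling halves that to 13x13, with 12 channels.
flattened_size = 13 * 13 * 12

# Dense: one weight per (input, output) pair, plus one bias per output.
dense_params = flattened_size * 10 + 10

total_params = conv_params + dense_params

print("total parameters:", total_params)
print("approx. float32 weight bytes:", total_params * 4)
print("approx. int8 weight bytes:", total_params * 1)
```

The 23968 bytes reported for the quantized file are consistent with roughly one byte per weight plus file overhead.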

In [6]:
# The directory name contains spaces, so the path must be quoted.
%ls -lh "{tflite_models_dir}"
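
Because the directory name contains spaces, a shell-free listing can be more robust than `ls`. The sketch below uses only `pathlib`. The helper name `list_model_sizes` is hypothetical, and a temporary directory with dummy files stands in for `tflite_models_dir` so that the example runs on its own.

```python
import pathlib
import tempfile


def list_model_sizes(models_dir):
    """Return {file name: size in bytes} for every .tflite file in models_dir."""
    return {
        path.name: path.stat().st_size
        for path in sorted(pathlib.Path(models_dir).glob("*.tflite"))
    }


# Stand-in for tflite_models_dir: a temporary directory whose name has spaces.
with tempfile.TemporaryDirectory(prefix="Unit 9 - Model Optimization ") as tmp:
    demo_dir = pathlib.Path(tmp)
    (demo_dir / "original_model.tflite").write_bytes(b"\x00" * 400)
    (demo_dir / "quantized_model.tflite").write_bytes(b"\x00" * 100)
    sizes = list_model_sizes(demo_dir)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

In the notebook, calling `list_model_sizes(tflite_models_dir)` would report the sizes of the real model files.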


### Check Model Accuracy

Next, we evaluate the accuracy of the quantized model on the test dataset and compare it with the original model.
The results show that the accuracy of the quantized model (0.9659) is very close to that of the original model (0.966).

In [7]:
# A helper function to evaluate the TF Lite model using "test" dataset.
def evaluate_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for test_image in test_images:
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with highest
    # probability.
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  # Compare prediction results with ground truth labels to calculate accuracy.
  accurate_count = 0
  for index in range(len(prediction_digits)):
    if prediction_digits[index] == test_labels[index]:
      accurate_count += 1
  accuracy = accurate_count * 1.0 / len(prediction_digits)

  return accuracy


interpreter = tf.lite.Interpreter(model_path=str(tflite_model_file))
interpreter.allocate_tensors()
print("Original model accuracy = ", evaluate_model(interpreter))


interpreter_quant = tf.lite.Interpreter(model_path=str(tflite_model_quant_file))
interpreter_quant.allocate_tensors()
print("Quantized model accuracy = ", evaluate_model(interpreter_quant))

Original model accuracy =  0.966
Quantized model accuracy =  0.9659
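
To put the two accuracies in perspective, the gap can be converted into a number of test images. The sketch assumes the standard MNIST test set of 10,000 images; the variable names are hypothetical.

```python
# Accuracies printed above; the MNIST test set has 10,000 images.
original_accuracy = 0.966
quantized_accuracy = 0.9659
num_test_images = 10_000

# Convert each accuracy to a count of correctly classified images.
original_correct = round(original_accuracy * num_test_images)
quantized_correct = round(quantized_accuracy * num_test_images)

print("net difference in correct images:", original_correct - quantized_correct)
```

The net difference is a single image out of 10,000, which supports the claim that quantization barely affects accuracy for this model.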


## Pruning

Pruning is a technique for reducing model size by removing unimportant weights. Importance is judged by the magnitude of each weight: the weights closest to zero are removed first. We can apply pruning while training the model to reduce its size.
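
The idea of magnitude-based pruning can be sketched in a few lines of plain Python. This is a one-shot illustration on a flat list of weights, whereas TMO applies pruning gradually during training. The function name `prune_by_magnitude` and the sample weights are hypothetical.

```python
# Illustrative sketch only: one-shot magnitude pruning of a flat weight list.
# The function name and sample weights are hypothetical.

def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.2, -0.02, 0.6, 0.1, -0.4]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)
```

With a sparsity of 0.5, the five weights closest to zero are replaced by zeros while the five largest keep their values.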

In [8]:
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude

# Compute end step to finish pruning after 2 epochs.
batch_size = 128
epochs = 2
validation_split = 0.1 # 10% of training set will be used for validation set.

num_images = train_images.shape[0] * (1 - validation_split)
end_step = np.ceil(num_images / batch_size).astype(np.int32) * epochs

# Define model for pruning.
pruning_params = {
      'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=0.50,
                                                               final_sparsity=0.80,
                                                               begin_step=0,
                                                               end_step=end_step)
}

model_for_pruning = prune_low_magnitude(model, **pruning_params)

# `prune_low_magnitude` requires a recompile.
model_for_pruning.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# `summary()` prints the table itself and returns None, so no `print` is needed.
model_for_pruning.summary()

callbacks = [
  tfmot.sparsity.keras.UpdatePruningStep(),
]

model_for_pruning.fit(train_images, train_labels,
                  batch_size=batch_size, epochs=epochs, validation_split=validation_split,
                  callbacks=callbacks)

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 prune_low_magnitude_reshap  (None, 28, 28, 1)         1         
 e (PruneLowMagnitude)                                           
                                                                 
 prune_low_magnitude_conv2d  (None, 26, 26, 12)        230       
  (PruneLowMagnitude)                                            
                                                                 
 prune_low_magnitude_max_po  (None, 13, 13, 12)        1         
 oling2d (PruneLowMagnitude                                      
 )                                                               
                                                                 
 prune_low_magnitude_flatte  (None, 2028)              1         
 n (PruneLowMagnitude)                                           
                                                        

<tf_keras.src.callbacks.History at 0x7f1929e85930>
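
The pruning schedule above can be reproduced by hand. The sketch below recomputes `end_step` from the values in the pruning cell and evaluates a polynomial decay from 50% to 80% sparsity. It assumes the default exponent `power=3` and ignores how often the pruning masks are actually updated; the function name `polynomial_sparsity` is hypothetical and is not part of the `tfmot` API.

```python
import math

# Values taken from the pruning cell above.
num_train_images = 60_000  # size of the MNIST training set
validation_split = 0.1
batch_size = 128
epochs = 2

steps_per_epoch = math.ceil(num_train_images * (1 - validation_split) / batch_size)
end_step = steps_per_epoch * epochs


def polynomial_sparsity(step, initial=0.50, final=0.80, begin=0, end=end_step, power=3):
    """Target sparsity at a training step (assumes an exponent of 3)."""
    progress = min(1.0, max(0.0, (step - begin) / (end - begin)))
    return final + (initial - final) * (1.0 - progress) ** power


for step in (0, end_step // 2, end_step):
    print(step, round(polynomial_sparsity(step), 4))
```

Under these assumptions, sparsity rises quickly at first and flattens out as it approaches the final value, so most of the pruning happens early in training.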

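### Compare Accuracy

The accuracy of the pruned model matches the reported baseline. Note, however, that `prune_low_magnitude` wraps the layers of `model` rather than copying them, so `model` and `model_for_pruning` most likely share the same weights. Evaluating `model` after the pruning run therefore no longer measures the unpruned baseline, which would explain why the two numbers below are identical. For a true baseline, evaluate `model` before calling `fit` on `model_for_pruning`.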

In [9]:
_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)
_, model_for_pruning_accuracy = model_for_pruning.evaluate(
   test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Pruned test accuracy:', model_for_pruning_accuracy)

Baseline test accuracy: 0.965399980545044
Pruned test accuracy: 0.965399980545044


### Compare Model Size

Finally, we compare the size of the pruned model with that of the original model.

In [10]:
model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

pruning_converter = tf.lite.TFLiteConverter.from_keras_model(model_for_export)
pruned_tflite_model = pruning_converter.convert()
pruned_model_file = tflite_models_dir/"pruned_model.tflite"
pruned_model_file.write_bytes(pruned_tflite_model)

84616

In [11]:
# The directory name contains spaces, so the path must be quoted.
%ls -lh "{tflite_models_dir}"
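
The pruned `.tflite` file written above is 84616 bytes, which is larger than the quantized file. Pruned weights are still stored as dense float32 values, zeros included, so the benefit of pruning typically shows up only after the file is compressed. The sketch below illustrates this with synthetic weights and the standard library only; it does not use the notebook's real model, and all names are hypothetical.

```python
import gzip
import random
import struct

# Illustrative sketch with synthetic weights, not the notebook's real model.
rng = random.Random(0)
num_weights = 20_000
dense_weights = [rng.uniform(-1.0, 1.0) for _ in range(num_weights)]

# Emulate 80% sparsity by zeroing four out of every five weights.
sparse_weights = [w if i % 5 == 0 else 0.0 for i, w in enumerate(dense_weights)]


def to_float32_bytes(values):
    """Serialize a list of floats as packed little-endian float32."""
    return struct.pack(f"<{len(values)}f", *values)


dense_bytes = to_float32_bytes(dense_weights)
sparse_bytes = to_float32_bytes(sparse_weights)

dense_gzip_size = len(gzip.compress(dense_bytes))
sparse_gzip_size = len(gzip.compress(sparse_bytes))

print("raw sizes:", len(dense_bytes), len(sparse_bytes))
print("gzip sizes:", dense_gzip_size, sparse_gzip_size)
```

Both buffers have the same raw size, but the sparse one compresses to fewer bytes because long runs of zeros are cheap to encode. Compressing `pruned_model.tflite` and `original_model.tflite` with gzip would be the corresponding check on the real files.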
