# 02 TensorFlow Lite: Conversion and Optimization

This notebook uses TensorFlow Lite to convert the previously designed and trained Keras models.
For the moment, the notebook only enables conversion with the supported quantization modes (see the picture below).


You can check the Appendix notebook `AXX-Exploring-TFL-Conversion.ipynb` to explore all possible quantization options in a systematic manner.

In [None]:
%run '00_README.ipynb'
%run 'H02_TFL-Conversion.ipynb'

## Background Information

[Converter Documentation](https://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter)

[More Converter Documentation for TF 2.2](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/convert/index.md)

[TensorFlow Blog about Integer Quantization](https://blog.tensorflow.org/2019/06/tensorflow-integer-quantization.html)

<img src="https://www.tensorflow.org/lite/performance/images/optimization.jpg" style="width: 800px;"/>

#### More on quantizing with TF Lite
- [Documentation](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/convert/quantization.md)
- [Full integer quantization of weights and activations](https://www.tensorflow.org/lite/convert/quantization)
- [Quantization Specification](https://www.tensorflow.org/lite/performance/quantization_spec)

Multiple models can be selected (shift-click) and collectively converted and optimized:

In [None]:
model_selection = widgets.SelectMultiple(
    options=sorted(glob.glob("keras-model/*.h5")),
    description='Select model:',
    layout=Layout(width='100%', height='200px')
)
display(model_selection)

In [None]:
# load first model
tf_model_file = model_selection.value[0]
tf_model = tf.keras.models.load_model(tf_model_file)

# set model name
model_name = get_tf_model_string(tf_model_file)

In [None]:
data_selection = widgets.Dropdown(
    options=sorted(glob.glob("keras-model/*.py")),
    description='Select data:',
    layout=Layout(width='100%')
)
display(data_selection)

In [None]:
tf_model_data = data_selection.value
%run -i {tf_model_data}

## Convert Model into working formats

At this stage, three conversions are available:
1. none: `float32`
2. mixed: `int8` weights, `float32` activations
3. full: `int8`
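
The three modes above could map onto `tf.lite.TFLiteConverter` settings roughly as follows. This is a minimal sketch, not the actual `convert_tf_model` helper from `H02_TFL-Conversion.ipynb`: the function name `convert_sketch`, its arguments, and the choice to return the serialized model as bytes are assumptions made for illustration.

```python
VALID_MODES = ("none", "mixed", "full")


def convert_sketch(keras_model, mode, calibration_data=None):
    """Return the serialized .tflite model (bytes) for one quantization mode.

    Hypothetical helper for illustration; not the notebook's convert_tf_model.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if mode == "full" and calibration_data is None:
        raise ValueError("full integer quantization needs calibration data")

    # Imported lazily so the argument checks above work without TensorFlow.
    import numpy as np
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)

    if mode in ("mixed", "full"):
        # Enables weight quantization; on its own this gives the "mixed" case.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]

    if mode == "full":
        def representative_dataset():
            # One sample per step, with a leading batch dimension of 1.
            for sample in calibration_data:
                yield [np.expand_dims(sample, axis=0).astype(np.float32)]

        # Calibration samples let the converter estimate activation ranges.
        converter.representative_dataset = representative_dataset
        # Restrict the converter to int8 kernels.
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

    return converter.convert()
```

In this sketch, a call such as `convert_sketch(tf_model, 'full', x_train_normalized[:1000])` would mirror the calibration slice used in the cell below.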

In [None]:
for _tf_model_file in model_selection.value:
    _tf_model = tf.keras.models.load_model(_tf_model_file)
    convert_tf_model(_tf_model, _tf_model_file, 'none', x_train_normalized[:1000])
    convert_tf_model(_tf_model, _tf_model_file, 'mixed', x_train_normalized[:1000])
    convert_tf_model(_tf_model, _tf_model_file, 'full', x_train_normalized[:1000])

## File size differences

In [None]:
tfl_model_files = glob.glob(f'./TFLite-model/*{model_name}*.tflite')
unquantized_model_file = glob.glob(f'./TFLite-model/*{model_name}*Q-none.tflite')[0]

In [None]:
for tfl_model_file in tfl_model_files:
    model_size, reduction = get_tfl_size(tfl_model_file, unquantized_model_file=unquantized_model_file)
    print(f"{tfl_model_file[15:]:<40}" + 
          "{:>20}".format("%10d KiB" %model_size) + 
          "{:>20}".format("(%.2f%% smaller)" %reduction))
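
The size and reduction values printed above can be understood with a small stand-alone computation. The following is a sketch under assumptions: `size_and_reduction` is a hypothetical stand-in for `get_tfl_size` (whose implementation lives in the helper notebook), and the two temporary files merely imitate an unquantized and a quantized model.

```python
import os
import tempfile


def size_and_reduction(model_file, reference_file):
    """Return (size in KiB, percent smaller than the reference file)."""
    size = os.path.getsize(model_file)
    reference = os.path.getsize(reference_file)
    reduction = 100.0 * (reference - size) / reference
    return size / 1024, reduction


# Dummy files standing in for the unquantized and the quantized model.
with tempfile.TemporaryDirectory() as tmp:
    reference_file = os.path.join(tmp, "reference.bin")
    smaller_file = os.path.join(tmp, "smaller.bin")
    with open(reference_file, "wb") as f:
        f.write(bytes(4096))
    with open(smaller_file, "wb") as f:
        f.write(bytes(1024))
    size_kib, reduction = size_and_reduction(smaller_file, reference_file)

print(f"{size_kib:.0f} KiB ({reduction:.2f}% smaller)")  # 1 KiB (75.00% smaller)
```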

#### Using gzip
This is useful for exploring the potential savings of a pruned network, as `.tflite` files do not exploit the sparsity of a pruned model.

In [None]:
for tfl_model_file in tfl_model_files:
    model_size, reduction = get_tfl_size(tfl_model_file, gzip=True, unquantized_model_file=unquantized_model_file)
    print(f"{tfl_model_file[15:]:<40}" + 
          "{:>20}".format("%10d KiB" %model_size) + 
          "{:>20}".format("(%.2f%% smaller)" %reduction))
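
Why gzip is informative for pruned networks can be shown without any model at all. The sketch below uses synthetic byte strings rather than real weights; the 80% sparsity level and the buffer size are arbitrary assumptions chosen only to make the effect visible.

```python
import gzip
import random

random.seed(0)
n_weights = 100_000

# Random non-zero bytes stand in for a dense weight tensor.
dense = bytes(random.randrange(1, 256) for _ in range(n_weights))
# Zero out roughly 80% of the entries to imitate a pruned weight tensor.
pruned = bytes(b if random.random() < 0.2 else 0 for b in dense)

for name, blob in (("dense", dense), ("pruned", pruned)):
    compressed = len(gzip.compress(blob))
    print(f"{name:<8}{len(blob):>10} B raw{compressed:>10} B gzipped")
```

Both buffers have the same raw size, but the long runs of zeros in the pruned buffer compress far better. This is the saving that the gzip comparison is meant to expose.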

## Inference with TFLite Interpreter
[Interpreter Documentation](https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter)

### Invoke the model for the whole test set

In [None]:
# original predictions & score (Keras model)
tf_model_predictions = tf_model.predict(x=x_test_normalized)

tf_model_accuracy = calc_accuracy(tf_model_predictions, y_test)
tf_model_loss = loss_fn_crossentropy(y_test, tf_model_predictions).numpy()

# score = tf_model.evaluate(x=x_test_normalized, y=y_test)
# tf_model_loss = score[0]
# tf_model_accuracy = score[1]

# converted models
for tfl_model_file in tfl_model_files:
    print(f"Evaluating {tfl_model_file}:")
    
    tfl_model_predictions = tfl_predict(tfl_model_file, x=x_test_normalized)
    tfl_model_accuracy = calc_accuracy(tfl_model_predictions, y_test)
    
    print("\tOriginal (Keras) model accuracy:\t", tf_model_accuracy)
    print("\tTF Lite Model Accuracy:\t\t\t", tfl_model_accuracy)
    
    tfl_crossentropy_loss = loss_fn_crossentropy(y_test, tfl_model_predictions).numpy()
    # tfl_meansquared_loss = loss_fn_meansquared(y_test, tfl_model_predictions).numpy()

    print("\tOriginal (Keras) model cross entropy loss:\t", tf_model_loss)
    print("\tTF Lite Model cross entropy loss:\t\t", tfl_crossentropy_loss)
    
    
    # What's the error?
    try:
        np.testing.assert_almost_equal(tf_model_predictions, tfl_model_predictions, decimal=2)
    except AssertionError as err:
        #print(f"\t{err}")
        pass

        
    print("________\n\n")
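
For readers who want to see what a prediction helper such as `tfl_predict` might do internally, here is a minimal sketch based on the documented `tf.lite.Interpreter` calls. The function name `tfl_predict_sketch` is hypothetical, and the sketch assumes a model with a single input and a single output whose tensors are `float32`; a model with quantized inputs or outputs would also need scaling with the tensor's quantization parameters.

```python
def tfl_predict_sketch(tflite_file, x):
    """Run every sample in x through a .tflite model and stack the outputs.

    Hypothetical helper for illustration; assumes one float32 input and output.
    """
    # Imported lazily so the definition itself does not require TensorFlow.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=tflite_file)
    interpreter.allocate_tensors()
    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]

    outputs = []
    for sample in x:
        # The interpreter expects a batch dimension and the model's input dtype.
        batch = np.expand_dims(sample, axis=0).astype(input_detail["dtype"])
        interpreter.set_tensor(input_detail["index"], batch)
        interpreter.invoke()
        # Drop the batch dimension again before collecting the result.
        outputs.append(interpreter.get_tensor(output_detail["index"])[0])
    return np.stack(outputs)
```

Feeding one sample at a time keeps the sketch independent of the batch size fixed at conversion time, at the cost of speed.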

In [None]:
calc_accuracy(tfl_model_predictions, y_test)

## Inspect individual models

### Which model?

In [None]:
tfl_files = glob.glob(f'./TFLite-model/*{model_name}*.tflite')
tfl_model_dropdown = widgets.Dropdown(
    options=tfl_files,
    description='Converted Model:'
)
display(tfl_model_dropdown)

### Investigate input and output details

In [None]:
tfl_model_file = tfl_model_dropdown.value

input_details, output_details = get_tfl_details(tfl_model_file)

print("\ninput_details:\n", input_details)
print("\noutput_details:\n", output_details)
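
Each entry in the printed details carries a `quantization` field holding `(scale, zero_point)`. According to the quantization specification linked above, a real value relates to its `int8` representation as `real = scale * (q - zero_point)`. The sketch below illustrates that mapping in plain Python; the example values and the helper names `quantize` and `dequantize` are made up for illustration.

```python
def quantize(value, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to int8 using q = round(value / scale) + zero_point."""
    q = round(value / scale) + zero_point
    # Clamp to the representable int8 range.
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to a real value: scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Hypothetical entry shaped like one element of input_details.
example_detail = {"quantization": (0.25, -3)}
scale, zero_point = example_detail["quantization"]

q = quantize(0.5, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(q, restored)  # -1 0.5
```

For a tensor that is not quantized, the interpreter reports a scale of `0.0`, so real code should check for that before dividing.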

### Invoke the selected model for a single random image from the test set

In [None]:
image_no = np.random.randint(x_test_normalized.shape[0])

tfl_inference(tfl_model_file, x_test_normalized[image_no])

---

## Analysis with tflite_analyser
Make sure to install [tflite_analyser](https://github.com/PeteBlackerThe3rd/tflite_analyser) and its dependencies and point to the script.

In [None]:
!git clone https://github.com/PeteBlackerThe3rd/tflite_analyser

In [None]:
!python3 tflite_analyser/tflite_analyser.py {tfl_model_file} --all