# Introduction

TensorFlow Lite (TFLite) is a lightweight version of TensorFlow, an open-source machine learning framework developed by Google. TFLite is designed specifically for running machine learning models on edge devices with limited computational resources, such as mobile devices, IoT (Internet of Things) devices, and embedded systems.

TFLite allows developers to deploy trained TensorFlow models on these edge devices, enabling on-device inference without requiring a constant internet connection or relying on cloud services. This is particularly useful for applications where low latency, privacy, and offline operation are important considerations.

Key features of TensorFlow Lite include:

1. **Model Optimization**: TFLite provides tools and techniques for optimizing TensorFlow models for deployment on edge devices, including quantization (for example, post-training quantization) and model pruning.

2. **Interpreter**: TFLite includes an interpreter that allows models to be executed efficiently on various hardware platforms, including CPUs, GPUs, and specialized accelerators such as Neural Processing Units (NPUs).

3. **Flexibility**: TFLite supports a wide range of TensorFlow model architectures, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and custom models built using TensorFlow's high-level APIs.

4. **Portability**: TFLite models can be easily ported across different platforms and operating systems, including Android, iOS, Linux, and microcontroller-based systems.

5. **Integration**: TFLite seamlessly integrates with popular development environments and frameworks, such as Android Studio, TensorFlow.js, and TensorFlow Lite for Microcontrollers.

Overall, TensorFlow Lite provides a streamlined solution for deploying machine learning models on edge devices, enabling efficient inference and real-time decision-making in resource-constrained environments.
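
To make the quantization idea from the feature list more concrete, here is a minimal pure-Python sketch of 8-bit affine quantization, where a float range is mapped onto int8 values through a scale and a zero point. The function names (`quantize_params`, `quantize`, `dequantize`) are hypothetical and chosen for illustration; this shows the general arithmetic under simple assumptions, not TFLite's actual implementation.

```python
def quantize_params(min_val, max_val, qmin=-128, qmax=127):
    """Compute a scale and zero point mapping [min_val, max_val] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate case: all values are zero
    zero_point = int(round(qmin - min_val / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers and clamp them to the int8 range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate float values."""
    return [scale * (q - zero_point) for q in q_values]


# Illustrative weights (made up for this sketch)
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = quantize_params(min(weights), max(weights))
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)

print("quantized:", q)
print("restored: ", restored)
```

Each restored value differs from the original by at most about one quantization step (`scale`). This small loss in precision is the price paid for storing each weight in 8 bits instead of 32.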

# Example

The example below trains a simple neural network, exports it, and converts it to TFLite.

In [None]:
import numpy as np
import tensorflow as tf

# Step 1: Create and train a simple neural network
# Generate some dummy data for demonstration
X_train = np.random.rand(100, 10)
y_train = np.random.randint(0, 2, size=(100,))

model = tf.keras.Sequential([
    # Declare the input shape with an explicit Input object
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=10, batch_size=32)

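# Step 2: Export the TensorFlow model as a SavedModel
# Note: model.export writes a directory (here named "model.pb"), not a single file
model.export("../models/model.pb")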

# Step 3: Convert the TensorFlow model to TFLite
converter = tf.lite.TFLiteConverter.from_saved_model("../models/model.pb")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Save the TFLite model to a file
with open('../models/model.tflite', 'wb') as f:
    f.write(tflite_model)

print("TFLite model saved successfully!")
# https://github.com/tensorflow/tensorflow/issues/63987

# Exercises

## Exercise 1: Reduced model size

Compare the size of the original model with the size of the converted TFLite model.

In [None]:
import os

# The exported SavedModel is a directory, so sum the sizes of all files inside it
# (os.path.getsize on the directory itself would not include its contents)
original_model_size = sum(
    os.path.getsize(os.path.join(root, name))
    for root, _, files in os.walk('../models/model.pb')
    for name in files
)

# Get the size of the converted TFLite model file
tflite_model_size = os.path.getsize('../models/model.tflite')

# Print the sizes of the original model and the converted TFLite model
print(f"Original TensorFlow model size: {original_model_size} bytes")
print(f"Converted TFLite model size: {tflite_model_size} bytes")

If the TFLite model does not come out much smaller, keep in mind that this example network is very small. With a larger network, the difference will be more obvious.

## Exercise 2: Convert an RNN model

Now convert your saved RNN prediction model to TFLite.

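Tip: Use `model.export(filepath)` to export a SavedModel for use with TFLite, TF Serving, etc. In recent Keras versions, `model.save` with a `.pb` file name (for example `simple_nn_model.pb`) is rejected with an error that recommends `model.export` instead.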

If you want to optimize the model further, check out: https://www.tensorflow.org/lite/performance/model_optimization

In [None]:

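# Replace the path below with the location of your exported RNN SavedModel
converter = tf.lite.TFLiteConverter.from_saved_model("../models/model.pb")
# RNN layers may use operations without a built-in TFLite kernel,
# so also allow selected TensorFlow ops as a fallback
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Save the TFLite model to a file (this overwrites any existing model.tflite)
with open('../models/model.tflite', 'wb') as f:
    f.write(tflite_model)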


In [None]:
# The exported SavedModel is a directory, so sum the sizes of all files inside it
original_model_size = sum(
    os.path.getsize(os.path.join(root, name))
    for root, _, files in os.walk('../models/model.pb')
    for name in files
)

# Get the size of the converted TFLite model file
tflite_model_size = os.path.getsize('../models/model.tflite')

# Print the sizes of the original model and the converted TFLite model
print(f"Original TensorFlow model size: {original_model_size} bytes")
print(f"Converted TFLite model size: {tflite_model_size} bytes")
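
One of the options described on the model optimization page linked above is float16 quantization, which stores weights as 16-bit floats. The sketch below shows how the converter might be configured for it. `convert_float16` is a hypothetical helper name, and the effect on size and accuracy depends on your model.

```python
import tensorflow as tf


def convert_float16(saved_model_dir):
    """Convert a SavedModel directory to TFLite bytes with float16 weights."""
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Optimize.DEFAULT alone applies dynamic-range quantization (8-bit weights);
    # adding float16 as a supported type stores the weights as float16 instead
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    return converter.convert()
```

Calling `convert_float16("../models/model.pb")` on the SavedModel exported earlier returns the model as bytes, which can be written to a `.tflite` file in the same way as in the cells above.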

# Bonus exercise

Try to run the converted TFLite model on a Raspberry Pi!

For more information: https://www.tensorflow.org/lite/guide/python
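
As a starting point, here is a minimal sketch of running inference on a `.tflite` file with the TFLite interpreter. It assumes that either the lightweight `tflite_runtime` package or full TensorFlow is installed, and that the model has a single input and a single output. `run_tflite` is a hypothetical helper name.

```python
import numpy as np

try:
    # Interpreter-only package, often used on devices such as a Raspberry Pi
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    # Fall back to the interpreter bundled with full TensorFlow
    import tensorflow as tf
    Interpreter = tf.lite.Interpreter


def run_tflite(model_path, batch):
    """Run one inference pass of a .tflite model on a NumPy batch."""
    interpreter = Interpreter(model_path=model_path)
    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]

    # Resize the input tensor so that any batch size can be passed in
    interpreter.resize_tensor_input(input_detail["index"], list(batch.shape))
    interpreter.allocate_tensors()

    # Cast the data to the dtype the model expects, then run the model
    interpreter.set_tensor(input_detail["index"], batch.astype(input_detail["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(output_detail["index"])
```

For the example network above, a call such as `run_tflite('../models/model.tflite', np.random.rand(4, 10))` should return one prediction per input row.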