### Features and components of mobile AI
* **Lightweight**: Models are kept small to fit on-device storage and memory, which can lead to less accurate models. One example is MobileNet.
* **Low-latency**.
* **Privacy**: Data doesn't need to leave the device, which improves privacy.
* **Improved power consumption**: Normal models are very "power hungry".
* **Efficient model format**.
* **Pre-trained models**.

### Components in TensorFlow Lite
1. **Converter** (to TensorFlow Lite format)
    - Transforms TensorFlow models into a form that the interpreter can read efficiently
    - Introduces optimizations to improve performance and/or reduce model and binary size

2. **Interpreter** (Core)
    - Diverse platform support (Android, iOS, embedded Linux and microcontrollers)
    - Platform APIs for accelerated inference

### Architecture
After training a model, it has to be converted before it can be used on a device.

Steps to **convert** the TensorFlow model:

1. Save the model in the recommended SavedModel format.
2. Use the TensorFlow Lite converter tools to flatten the model into the FlatBuffer-based `.tflite` format, preparing it for mobile or embedded devices.

Steps to use TF Lite:

1. Jump start: Use pretrained or retrained models.
2. Custom model: Develop and deploy custom model.
3. Performance: Explore options, validate and accelerate models.
4. Optimize: Model Optimization Toolkit.

Inference is a very resource-consuming task, so on these devices it has to be performed very quickly. For this purpose, TensorFlow Lite can employ hardware acceleration libraries or APIs on supported devices.

Inference can also be accelerated with Edge TPUs, as they are built solely to run deep learning models. TPUs in general are also used for training models.

[GPU Delegate](https://www.youtube.com/watch?v=QSbAUxWfxQw)
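
As a rough illustration of the hardware acceleration described above, the Python interpreter can be given a delegate through `tf.lite.experimental.load_delegate`. The sketch below is only an assumption about how one might wire this up: the helper names `load_optional_delegate` and `make_interpreter` are hypothetical, and the library name in the commented example depends on which accelerator runtime is installed on the device.

```python
import tensorflow as tf


def load_optional_delegate(library_path):
    """Return [delegate] if the shared library loads, else [] (CPU fallback)."""
    try:
        return [tf.lite.experimental.load_delegate(library_path)]
    except (ValueError, OSError) as err:
        # The delegate library is missing or unsupported on this machine
        print(f"Could not load delegate {library_path!r}, using CPU: {err}")
        return []


def make_interpreter(model_path, delegate_library=None):
    """Build an interpreter, optionally accelerated by a delegate library."""
    delegates = load_optional_delegate(delegate_library) if delegate_library else []
    interpreter = tf.lite.Interpreter(
        model_path=model_path, experimental_delegates=delegates
    )
    interpreter.allocate_tensors()
    return interpreter


# Example (not run here). Assumes the Edge TPU runtime is installed on Linux
# and that the model file was compiled for the Edge TPU:
# interpreter = make_interpreter('model_edgetpu.tflite', 'libedgetpu.so.1')
```

Falling back to an empty delegate list keeps the same code path usable on machines without an accelerator, at the cost of slower CPU inference.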

### Converting and saving a model

In Python, you can use `tf.lite.TFLiteConverter` to do the conversion. The converter can be instantiated from:

1. SavedModel (preferred)

2. Keras model (a model instance or an HDF5 file)

3. Concrete function(s)

In [None]:
import tensorflow as tf
import pathlib

pretrained_model = tf.keras.applications.MobileNetV2(weights='imagenet', input_shape=(224, 224, 3))

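# Optionally keep an HDF5 copy for later use by tflite_convert
pretrained_model.save('model.h5')

# Export the SavedModel (the preferred input for the converter)
version_number = 1
export_dir = f"/tmp/saved_model/{version_number}/"
tf.saved_model.save(pretrained_model, export_dir)

# Convert the model; from_saved_model expects the export directory, not a model object
# (use TFLiteConverter.from_keras_model to convert a model instance directly)
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)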
tflite_model = converter.convert()

# Save the model
tflite_model_file = pathlib.Path('/tmp/foo.tflite')
tflite_model_file.write_bytes(tflite_model)

### Optimization techniques
* **Quantization**: Reduces the precision of the numbers in the weights and biases of the model.
    - All available CPU platforms are supported.
    - Reduces latency and inference cost.
    - Lowers the memory footprint.
    - Allows execution on hardware restricted-to or optimized-for fixed-point operations.
    - Produces models optimized for special-purpose hardware accelerators (TPUs).

    How does it work?
    - It maps the floating-point weights of the model to lower-precision integers (typically 8-bit).
* **Weight pruning**: Reduces the overall number of parameters.
* **Model topology transforms**: Allows you to get a more efficient model to begin with.
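
To make the float-to-int idea behind quantization more concrete, here is a minimal pure-Python sketch of affine (scale and zero-point) quantization to 8-bit integers. It is a conceptual illustration only, not the TensorFlow Lite implementation; the function names `quantize` and `dequantize` and the sample weights are made up for this example.

```python
import struct


def quantize(values, num_bits=8):
    """Map floats to signed integers; returns (ints, scale, zero_point)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The represented range must include 0.0 so that zero maps to an exact integer
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    ints = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return ints, scale, zero_point


def dequantize(ints, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [scale * (q - zero_point) for q in ints]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # hypothetical layer weights
q_weights, scale, zero_point = quantize(weights)
restored = dequantize(q_weights, scale, zero_point)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 weight
float_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q_weights)}b", *q_weights))
print(q_weights, restored, float_bytes, int8_bytes)
```

The restored values differ from the originals by at most about one quantization step (`scale`), which is the precision traded for the fourfold reduction in storage.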

In [1]:
# TENSORFLOW LITE INTERPRETER IN PYTHON
import numpy as np

# Load TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Placeholder input matching the model's expected shape and dtype; replace with real test data
input_data = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])

# Point the data to be used for testing and run the interpreter
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
tflite_results = interpreter.get_tensor(output_details[0]['index'])
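
For a classifier, the output read back from the interpreter is usually a batch of per-class scores. The sketch below assumes that layout (one score vector per input image) and shows one way to pick the top predictions; `top_k` and `example_scores` are hypothetical names, with `example_scores` standing in for something like `tflite_results[0]`.

```python
def top_k(scores, k=3):
    """Return the k (class_index, score) pairs with the highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical score vector for a 4-class model
example_scores = [0.05, 0.70, 0.10, 0.15]
print(top_k(example_scores, k=2))  # [(1, 0.7), (3, 0.15)]
```

The class indices still have to be mapped to human-readable labels using the label file that belongs to the model.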

### Transfer Learning on Cats vs Dogs

1. **Prepare the dataset**: Download the data, split into sets (train, val, test), and preprocess.

2. **Transfer Learning**: Choose a feature vector module (e.g. MobileNet V2) from TFHub and perform transfer learning.

3. **Export and convert**: Export the trained model to SavedModel and convert it to TFLite.

4. **Deploy**: Deploy the converted model on a target device (Android, iOS, embedded Linux, or a microcontroller).
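
The transfer learning step can be sketched roughly as follows. Note the substitution: instead of a TFHub feature vector module, this sketch uses `tf.keras.applications.MobileNetV2` without its classification head as the frozen feature extractor. The function name `build_transfer_model` and its parameters are assumptions made for illustration, and the dataset preparation and training calls are left out.

```python
import tensorflow as tf


def build_transfer_model(num_classes=2, weights="imagenet"):
    """Frozen MobileNetV2 feature extractor with a small trainable classifier on top."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3),
        include_top=False,   # drop the original ImageNet classifier
        weights=weights,     # pass None to skip downloading pretrained weights
        pooling="avg",       # output one feature vector per image
    )
    base.trainable = False   # keep the pretrained features fixed

    inputs = tf.keras.Input(shape=(224, 224, 3))
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Example (not run here): two classes for cats vs dogs
# model = build_transfer_model(num_classes=2)
# model.fit(train_batches, validation_data=validation_batches, epochs=5)
```

Only the final dense layer is trained, which keeps training fast on a small dataset; `train_batches` and `validation_batches` in the commented example are placeholders for the prepared dataset splits.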