## Creating a TF Lite model with RETVec

Using RETVec with TF Lite requires `tensorflow>=2.13.0` and `tensorflow_text>=2.13.0`. To upgrade TensorFlow, follow the [pip installation instructions](https://www.tensorflow.org/install/pip).
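
The version requirement can be checked before running the rest of the notebook. The snippet below is a minimal sketch that uses only the standard library; the helper names `version_tuple` and `meets_minimum` are hypothetical, and it assumes the packages are installed under the distribution names `tensorflow` and `tensorflow-text` (variants such as `tensorflow-cpu` would need to be added to the list).

```python
from importlib import metadata

MIN_VERSION = (2, 13, 0)  # minimum version stated above for both packages


def version_tuple(version):
    """Parse the leading numeric parts of a version string, e.g. '2.13.0rc1' -> (2, 13, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(package, minimum=MIN_VERSION):
    """Return True or False for an installed package, or None if it is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return version_tuple(installed) >= minimum


# Assumed distribution names; adjust if a different TensorFlow build is installed.
for name in ("tensorflow", "tensorflow-text"):
    print(name, meets_minimum(name))
```

A result of `None` means the distribution was not found under that name, which is not the same as being out of date.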

This notebook shows how to create, save, and run a TF Lite-compatible model that uses the RETVec tokenizer.

In [1]:
# installing retvec if needed
try:
    import retvec
except ImportError:
    !pip install retvec

In [2]:
import os
# os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'  # silence TF INFO messages
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers

# import the RETVec tokenizer layer
from retvec.tf import RETVecTokenizer


The only change RETVec needs is `use_native_tf_ops=True`.
With this flag set, the layer uses `tensorflow_text.utf8_binarize`, which is supported natively by TF Lite.
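
To give a feel for what a binarizer of this kind computes, here is a pure-Python illustration of the idea, not the library's implementation. The function name `binarize_word` is hypothetical, and the details are assumptions made for the sketch: each character's code point is written out least significant bit first, words are truncated or zero-padded to `word_length` characters, and code points wider than `bits_per_char` simply keep their low bits.

```python
def binarize_word(word, word_length=16, bits_per_char=24):
    """Turn a word into a flat list of bits, one group of bits per character.

    Illustrative sketch only: the bit order and padding are assumptions,
    and this is not the tensorflow_text implementation.
    """
    bits = []
    for position in range(word_length):
        # Pad short words with code point 0; long words are truncated.
        code_point = ord(word[position]) if position < len(word) else 0
        for bit in range(bits_per_char):
            # Emit the least significant bit first.
            bits.append(float((code_point >> bit) & 1))
    return bits


# Small sizes keep the output readable: 3 characters, 4 bits each.
print(binarize_word("hello", word_length=3, bits_per_char=4))
# [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0]
```

The output length is always `word_length * bits_per_char`, so every word maps to a fixed-size vector regardless of its length.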

In [3]:
# passing raw strings directly requires an input shape of (1,) and dtype tf.string
inputs = layers.Input(shape=(1, ), name="input", dtype=tf.string)

# add RETVec tokenizer layer with `use_native_tf_ops`
x = RETVecTokenizer(model='retvec-v1', use_native_tf_ops=True)(inputs)

# standard two-layer LSTM (commented out, so the model outputs the tokenizer result directly)
# x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(x)
# x = layers.Bidirectional(layers.LSTM(32))(x)
# outputs = layers.Dense(4, activation='sigmoid')(x)
model = tf.keras.Model(inputs, x)
model.summary()



Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input (InputLayer)          [(None, 1)]               0         
                                                                 
 ret_vec_tokenizer (RETVecT  (None, 128, 256)          230144    
 okenizer)                                                       
                                                                 
Total params: 230144 (899.00 KB)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 230144 (899.00 KB)
_________________________________________________________________


In [4]:
save_path = "./demo_models/tflite/retvec_demo_model_2"
model.save(save_path)

INFO:tensorflow:Assets written to: ./demo_models/tflite/retvec_demo_model_2/assets




In [5]:
# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model(save_path) # path to the SavedModel directory
converter.target_spec.supported_ops = [
  tf.lite.OpsSet.TFLITE_BUILTINS, # enable TensorFlow Lite ops.
  # tf.lite.OpsSet.SELECT_TF_OPS # enable TensorFlow ops.
]
converter.allow_custom_ops = True
tflite_model = converter.convert()

2023-10-10 17:31:31.219307: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:378] Ignored output_format.
2023-10-10 17:31:31.219341: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:381] Ignored drop_control_dependency.
2023-10-10 17:31:31.220140: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: ./demo_models/tflite/retvec_demo_model_2
2023-10-10 17:31:31.224928: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2023-10-10 17:31:31.224945: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: ./demo_models/tflite/retvec_demo_model_2
2023-10-10 17:31:31.250505: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:382] MLIR V1 optimization pass is not enabled
2023-10-10 17:31:31.254321: I tensorflow/cc/saved_model/loader.cc:233] Restoring SavedModel bundle.
2023-10-10 17:31:31.339505: I tensorflow/cc/saved_model/loader.cc:217] Running initialization op on Sav
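
The converted model is held in memory as a bytes object, so it can be written to disk for later use. The helper below is a minimal sketch; the name `save_tflite_model` and the `.tflite` path in the usage comment are hypothetical and not part of the original notebook.

```python
from pathlib import Path


def save_tflite_model(model_bytes, path):
    """Write serialized model bytes to `path` and return the file size in bytes."""
    path = Path(path)
    # Create the parent directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(model_bytes)
    return path.stat().st_size


# Hypothetical usage with the converter output from the cell above:
# save_tflite_model(tflite_model, "./demo_models/tflite/retvec_demo_model_2.tflite")
```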

In [6]:
from tensorflow.lite.python import interpreter
import tensorflow_text as tf_text

interp = interpreter.InterpreterWithCustomOps(
    model_content=tflite_model,
    custom_op_registerers=tf_text.tflite_registrar.SELECT_TFTEXT_OPS)
interp.allocate_tensors()

INFO: Created TensorFlow Lite XNNPACK delegate for CPU.


In [None]:
input_data = np.array(['Some minds are better kept apart'])

tokenize = interp.get_signature_runner('serving_default')
output = tokenize(input=input_data)
print('TensorFlow Lite result = ', output['ret_vec_tokenizer'])

### Isolating the issue to `RaggedTensorToTensor`

In [None]:
from absl import app
import numpy as np
import tensorflow as tf
import tensorflow_text as tf_text

from tensorflow.lite.python import interpreter

class TokenizerModel(tf.keras.Model):

  def __init__(self, **kwargs):
    super().__init__(**kwargs)
    self.tokenizer = tf_text.WhitespaceTokenizer()

  @tf.function(input_signature=[
      tf.TensorSpec(shape=[None], dtype=tf.string, name='input')
  ])
  def call(self, input_tensor):
    # tokens = tf_text.pad_along_dimension(self.tokenizer.tokenize(input_tensor), right_pad=["test"])
    # tokens = tf_text.pad_along_dimension(tf.ragged.constant([["test"], ["another", "test"]]), right_pad=["test"])
    # tokens = tf.concat([tokens, tf.ragged.constant([['test'], ['test']])], axis=1)
    # tokens = tokens[:,:1]
    # tokens = self.tokenizer.tokenize(input_tensor)
    # tokens = tokens.to_tensor()
    # tokens = tf.reshape(tokens, (2, 1))
    # to_tensor does not work, gives a tf.Range error so this is hacky
    return {
      'tokens': self.tokenizer.tokenize(input_tensor).to_tensor(default_value="")
    }

In [None]:
# Test input data.
input_data = np.array(['Some minds are better kept apart', 'this is a test'])

# Define a Keras model.
model = TokenizerModel()

# Perform TensorFlow Text inference.
tf_result = model(tf.constant(input_data))
print('TensorFlow result = ', tf_result['tokens'])

TensorFlow result =  tf.Tensor(
[[b'Some' b'minds' b'are' b'better' b'kept' b'apart']
 [b'this' b'is' b'a' b'test' b'' b'']], shape=(2, 6), dtype=string)


In [None]:
# Convert to TensorFlow Lite.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.allow_custom_ops = True
tflite_model = converter.convert()

INFO:tensorflow:Assets written to: /tmp/tmpe_6ldzfc/assets


2023-10-07 00:03:26.661546: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2023-10-07 00:03:26.661576: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2023-10-07 00:03:26.661808: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpe_6ldzfc
2023-10-07 00:03:26.664423: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
2023-10-07 00:03:26.664440: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: /tmp/tmpe_6ldzfc
2023-10-07 00:03:26.677327: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2023-10-07 00:03:26.702668: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpe_6ldzfc
2023-10-07 00:03:26.724896: I tensorflow/cc/saved_model/loader.cc:314] SavedModel

In [None]:
# Perform TensorFlow Lite inference.
interp = interpreter.InterpreterWithCustomOps(
    model_content=tflite_model,
    custom_op_registerers=tf_text.tflite_registrar.SELECT_TFTEXT_OPS)
interp.get_signature_list()

{'serving_default': {'inputs': ['input'], 'outputs': ['tokens']}}

In [None]:
tokenize = interp.get_signature_runner('serving_default')
output = tokenize(input=input_data)
print('TensorFlow Lite result = ', output['tokens'])