<a href="https://colab.research.google.com/github/bhadreshpsavani/UnderstandingNLP/blob/master/Notebooks/TFLite/TFLiteExperimentsForMLM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Import


In [1]:
!pip install -q transformers


## Get Model

In [2]:
from transformers import DistilBertTokenizer, TFDistilBertForMaskedLM
import tensorflow as tf
import warnings
warnings.filterwarnings('ignore')
print(tf.__version__)

2.7.0


In [23]:
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = TFDistilBertForMaskedLM.from_pretrained('distilbert-base-uncased')

inputs = tokenizer("The capital of France is [MASK].", return_tensors="tf")
inputs["labels"] = tokenizer("The capital of France is Paris.", return_tensors="tf")["input_ids"]

outputs = model(inputs)
loss = outputs.loss
logits = outputs.logits

Some layers from the model checkpoint at distilbert-base-uncased were not used when initializing TFDistilBertForMaskedLM: ['activation_13']
- This IS expected if you are initializing TFDistilBertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFDistilBertForMaskedLM were initialized from the model checkpoint at distilbert-base-uncased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForMaskedLM for predictions without further training.


In [24]:
inputs

{'input_ids': <tf.Tensor: shape=(1, 9), dtype=int32, numpy=
array([[ 101, 1996, 3007, 1997, 2605, 2003,  103, 1012,  102]],
      dtype=int32)>, 'attention_mask': <tf.Tensor: shape=(1, 9), dtype=int32, numpy=array([[1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int32)>, 'labels': <tf.Tensor: shape=(1, 9), dtype=int32, numpy=
array([[ 101, 1996, 3007, 1997, 2605, 2003, 3000, 1012,  102]],
      dtype=int32)>}

In [25]:
tokenizer.decode(tf.math.argmax(logits[0], 1))

'. the capital of france is marseille..'
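The decoded string above covers every position, including the special tokens, although only the `[MASK]` position carries the prediction of interest. A minimal sketch of reading the top candidates at that position follows. It assumes the logits have been reduced to shape `(seq_len, vocab_size)`, and `top_k_at_mask` is a hypothetical helper name, not part of `transformers`.

```python
import numpy as np

def top_k_at_mask(logits, input_ids, mask_token_id, k=5):
    """Return the k highest-scoring token ids at the first masked position.

    logits:    array-like of shape (seq_len, vocab_size)
    input_ids: array-like of shape (seq_len,)
    """
    logits = np.asarray(logits)
    input_ids = np.asarray(input_ids)
    mask_positions = np.where(input_ids == mask_token_id)[0]
    if mask_positions.size == 0:
        raise ValueError("input_ids contains no mask token")
    scores = logits[mask_positions[0]]
    # argsort is ascending, so take the last k entries and reverse them
    return np.argsort(scores)[-k:][::-1].tolist()
```

With the objects from this notebook, the call would look roughly like `top_k_at_mask(logits[0].numpy(), inputs["input_ids"][0].numpy(), tokenizer.mask_token_id)`, and the returned ids can be passed to `tokenizer.convert_ids_to_tokens`.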

In [26]:
# The model input has shape (1, 9): batch size 1, sequence length 9
input_spec = tf.TensorSpec([1, 9], tf.int32)
# model._set_inputs(input_spec, training=False) # for tf < 2.2
# The two calls below use private Keras attributes and may change between TF releases
model._saved_model_inputs_spec = None # for tf > 2.2
model._set_save_spec(input_spec) # for tf > 2.2
input_spec

TensorSpec(shape=(1, 9), dtype=tf.int32, name=None)
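Because the saved spec fixes the sequence length at 9, any other sentence has to be brought to that length before it can be fed to the converted model. The sketch below shows one way to do that with NumPy. `fit_to_length` is a hypothetical helper, and `pad_id=0` assumes the `[PAD]` id of the `distilbert-base-uncased` vocabulary.

```python
import numpy as np

def fit_to_length(token_ids, length=9, pad_id=0):
    """Pad or truncate token ids to a fixed length.

    Returns an int32 array of shape (1, length), matching the TensorSpec above.
    Note: plain truncation also drops the trailing [SEP] token of long inputs.
    """
    ids = list(token_ids)[:length]
    ids = ids + [pad_id] * (length - len(ids))
    return np.array([ids], dtype=np.int32)
```

The tokenizer can produce the same layout directly with `padding="max_length", max_length=9, truncation=True`. Since only `input_ids` is part of the spec here, no attention mask reaches the model, so padded positions are attended to like real tokens and may shift the predictions.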

In [7]:
model.save_weights('./tensorflow_distilbert/checkpoint')

# TensorFlow Lite:

## With Normal Conversion:

In [8]:
converter = tf.lite.TFLiteConverter.from_keras_model(model)

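# Normal (non-quantized) conversion: allow TensorFlow ops in the converted model.
# These ops run through the Flex delegate at inference time.
converter.target_spec.supported_ops = [tf.lite.OpsSet.SELECT_TF_OPS]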

In [9]:
tflite_model = converter.convert()
# write() returns the number of bytes written, i.e. the size of the .tflite file
open("distilbert.tflite", "wb").write(tflite_model)





INFO:tensorflow:Assets written to: /tmp/tmpcpw5hhtn/assets
INFO:absl:Using new converter: If you encounter a problem please file a bug. You can opt-out by setting experimental_new_converter=False


268135400
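The number printed above is the byte count returned by `write()`, so it doubles as the size of the converted model. A small sketch of converting it to mebibytes, with `bytes_to_mib` as a hypothetical helper name:

```python
def bytes_to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 ** 2)

size_mib = bytes_to_mib(268135400)
print(f"{size_mib:.1f} MiB")  # 255.7 MiB
```

The unquantized model is therefore roughly 256 MiB on disk.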

In [10]:
# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="distilbert.tflite")
interpreter.allocate_tensors()

In [11]:
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

In [12]:
input_details

[{'dtype': numpy.int32,
  'index': 0,
  'name': 'serving_default_args_0:0',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
   'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32)},
  'shape': array([1, 9], dtype=int32),
  'shape_signature': array([-1,  9], dtype=int32),
  'sparsity_parameters': {}}]

In [13]:
output_details

[{'dtype': numpy.float32,
  'index': 791,
  'name': 'StatefulPartitionedCall:0',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
   'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32)},
  'shape': array([    1,     9, 30522], dtype=int32),
  'shape_signature': array([   -1,     9, 30522], dtype=int32),
  'sparsity_parameters': {}}]

In [14]:
list(inputs['input_ids'].numpy()[0])

[101, 1996, 3007, 1997, 2605, 2003, 103, 1012, 102]

In [27]:
%%time
interpreter.set_tensor(input_details[0]['index'], inputs['input_ids'])
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
output_data

CPU times: user 166 ms, sys: 3.51 ms, total: 170 ms
Wall time: 96.5 ms


In [28]:
output_data

array([[[ -5.4885235,  -5.4923353,  -5.5315347, ...,  -4.8746433,
          -4.7520714,  -2.8670495],
        [-13.757561 , -14.197444 , -13.902032 , ..., -11.339296 ,
         -10.979487 , -13.772917 ],
        [ -8.26604  ,  -8.81924  ,  -8.554883 , ...,  -7.4753966,
          -6.4735336, -11.509466 ],
        ...,
        [ -3.558647 ,  -3.8016448,  -3.4940026, ...,  -2.6646585,
          -3.334065 ,  -3.5945928],
        [-10.06865  ,  -9.919643 , -10.104453 , ...,  -8.763852 ,
          -8.842559 ,  -5.2510233],
        [-11.046496 , -11.240143 , -11.228372 , ..., -10.204575 ,
         -10.414186 ,  -6.8990993]]], dtype=float32)

In [29]:
tokenizer.decode(tf.math.argmax(output_data[0], 1))

'. the capital of france is marseille..'
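The TFLite interpreter decodes to the same string as the Keras model. To check the agreement numerically rather than by eye, the two logit arrays can be compared directly. This is a minimal sketch assuming both arrays share the shape `(1, 9, 30522)`; `compare_logits` is a hypothetical helper.

```python
import numpy as np

def compare_logits(reference, candidate):
    """Compare two logit arrays of identical shape.

    Returns (max_abs_diff, same_argmax):
      max_abs_diff: largest element-wise absolute difference
      same_argmax:  True if the top token matches at every position
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    same_argmax = bool(np.array_equal(reference.argmax(-1), candidate.argmax(-1)))
    return max_abs_diff, same_argmax
```

In this notebook the call would be roughly `compare_logits(logits.numpy(), output_data)`. Small floating-point differences between the two runtimes are to be expected, so the argmax agreement is usually the more meaningful check.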