## Exporting a BERT/SQuAD model to TensorFlow
You can use Fireball's ```exportToTf``` function to export a model to TensorFlow code. This function creates a
Python file that implements the model using TensorFlow APIs, and a NumPy file (npz) that contains the parameters of the network. This notebook shows how to use the function to export a Fireball model to TensorFlow. It assumes that a trained BERT/SQuAD model already exists in the ```Models``` directory. Please refer to the notebook [Question Answering (BERT/SQuAD)](BertSquad.ipynb) for more information about training and using a BERT/SQuAD model.

Fireball can also export models with a reduced number of parameters, pruned models, and quantized models. Please refer to the following notebooks for more information:

- [Reducing number of parameters of BERT/SQuAD Model](BertSquad-Reduce.ipynb)
- [Pruning BERT/SQuAD Model](BertSquad-Prune.ipynb)
- [Quantizing BERT/SQuAD Model](BertSquad-Quantize.ipynb)

## Load a pretrained model

In [1]:
from fireball import Model

# orgFileName = "Models/BertSquad.fbm"        # Original model
# orgFileName = "Models/BertSquadQR.fbm"      # Quantized - Retrained
# orgFileName = "Models/BertSquadPR.fbm"      # Pruned - Retrained
# orgFileName = "Models/BertSquadPRQR.fbm"    # Pruned - Retrained - Quantized - Retrained
# orgFileName = "Models/BertSquadRR.fbm"      # Reduced - Retrained
# orgFileName = "Models/BertSquadRRQR.fbm"    # Reduced - Retrained - Quantized - Retrained
# orgFileName = "Models/BertSquadRRPR.fbm"    # Reduced - Retrained - Pruned - Retrained
orgFileName = "Models/BertSquadRRPRQR.fbm"  # Reduced - Retrained - Pruned - Retrained - Quantized - Retrained

model = Model.makeFromFile(orgFileName, gpus='0')
model.printLayersInfo()
model.initSession()


Reading from "Models/BertSquadRRPRQR.fbm" ... Done.
Creating the fireball model "Bert-SQuAD" ... Done.

Scope            InShape       Comments                 OutShape      Activ.   Post Act.        # of Params
---------------  ------------  -----------------------  ------------  -------  ---------------  -----------
IN_EMB           ≤512 2        LR512                    ≤512 768      None                      4,743,735  
S1_L1_LN         ≤512 768                               ≤512 768      None     DO:0.1           1,536      
S2_L1_BERT       ≤512 768      768/3072, 12 heads       ≤512 768      GELU                      2,839,069  
S2_L2_BERT       ≤512 768      768/3072, 12 heads       ≤512 768      GELU                      2,898,847  
S2_L3_BERT       ≤512 768      768/3072, 12 heads       ≤512 768      GELU                      2,909,099  
S2_L4_BERT       ≤512 768      768/3072, 12 heads       ≤512 768      GELU                      3,036,692  
S2_L5_BERT       ≤512 768      

## Export the model
Here we call the ```exportToTf``` function to export the model. Fireball creates a folder and puts 2 files in it: the Python file that implements the model and the npz file that holds its parameters.

In [2]:

model.exportToTf("Models/BertSquad_TF", runQuantized=True)



Exporting to TensorFlow model "Models/BertSquad_TF" ... 
    Processed all 16 layers.                                     
    Creating parameters file "Params.npz" ... Done.
Done.
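
Before running inference, it can be useful to look inside the parameters file. The following is a minimal sketch, not part of Fireball: ```summarizeParams``` is a hypothetical helper that lists the name, shape, and data type of every array in an npz file. The array names stored in the real ```Params.npz``` are not shown in this notebook, so the sketch runs on a small stand-in file with made-up names.

```python
import os
import tempfile

import numpy as np


def summarizeParams(npzPath):
    """Return a list of (name, shape, dtype) tuples, one per array in the npz file."""
    with np.load(npzPath) as params:
        return [(name, params[name].shape, str(params[name].dtype))
                for name in params.files]


# Stand-in file so the sketch runs without an exported model.
# The array names "weights" and "biases" are made up for illustration.
demoDir = tempfile.mkdtemp()
demoPath = os.path.join(demoDir, "Params.npz")
np.savez(demoPath,
         weights=np.zeros((4, 3), dtype=np.float32),
         biases=np.zeros(3, dtype=np.float32))

summary = summarizeParams(demoPath)
for name, shape, dtype in summary:
    print("%-10s %-10s %s" % (name, shape, dtype))
```

To inspect the exported model instead, pass the path of the parameters file inside the export folder (here that would presumably be ```Models/BertSquad_TF/Params.npz```).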


## Running inference on the exported model
To verify the exported model, we can now run inference on it. Here we have a "context", which is a paragraph about InterDigital copied from Wikipedia, and 3 different questions related to the context. We use our exported TensorFlow model to answer the questions.

**Note:** We could use the "Tokenizer" included in Fireball. But to show the independence of the following code from Fireball, we are using Google's original tokenizer from [here](https://github.com/google-research/bert/blob/master/tokenization.py).

**Note:** Please reset the kernel before running the next cell.

In [1]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import numpy as np
from Models.BertSquad_TF.TfModel import Network
net=Network()

import tokenization   # Google's original BERT tokenizer (tokenization.py); "os" is already imported above
tokenizer = tokenization.FullTokenizer(os.path.expanduser("~")+"/data/SQuAD/vocab.txt")

context = r"""
InterDigital is a technology research and development company that provides wireless and video technologies for 
mobile devices, networks, and services worldwide. Founded in 1972, InterDigital is listed on NASDAQ and is 
included in the S&P SmallCap 600. InterDigital had 2020 revenue of $359 million and a portfolio of about 
32,000 U.S. and foreign issued patents and patent applications.
"""

print(context)
questions = [
    "When was InterDigital established?",
    "How much was InterDigital's revenue in 2020?",
    "What does InterDigital provide?",
]

contextTokens = tokenizer.tokenize(context)

for i, question in enumerate(questions):
    questionTokens = tokenizer.tokenize(question)
    allTokens = ["[CLS]"] + questionTokens + ["[SEP]"] + contextTokens + ["[SEP]"]
    tokIds = tokenizer.convert_tokens_to_ids(allTokens)
    tokTypes = [0]*(len(questionTokens)+2) + [1]*(len(contextTokens)+1)
    sample = ([tokIds], [tokTypes])

    startTok, endTok = net.infer(sample)
    startTok -= len(questionTokens) + 2
    endTok -= len(questionTokens) + 2
    answer = ' '.join(contextTokens[int(startTok):int(endTok+1)]).replace(" ##","")
    print("\nQ%d: %s\n    %s"%(i+1, question, answer))

Instructions for updating:
Use fn_output_signature instead

InterDigital is a technology research and development company that provides wireless and video technologies for 
mobile devices, networks, and services worldwide. Founded in 1972, InterDigital is listed on NASDAQ and is 
included in the S&P SmallCap 600. InterDigital had 2020 revenue of $359 million and a portfolio of about 
32,000 U.S. and foreign issued patents and patent applications.


Q1: When was InterDigital established?
    1972

Q2: How much was InterDigital's revenue in 2020?
    $ 359 million

Q3: What does InterDigital provide?
    wireless and video technologies
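
The inference loop above relies on two pieces of bookkeeping: the input is laid out as ```[CLS] question [SEP] context [SEP]```, and the predicted start and end positions are shifted back by ```len(questionTokens) + 2``` to index into the context tokens. The sketch below isolates that logic so it can be checked without the model or the tokenizer. The helper names ```buildSample``` and ```spanToAnswer```, the toy vocabulary, and the hand-written word pieces are all assumptions made for illustration; the start and end positions are chosen by hand instead of coming from ```net.infer```.

```python
def buildSample(questionTokens, contextTokens, vocab):
    """Lay out tokens as [CLS] question [SEP] context [SEP] and build ids and types."""
    allTokens = ["[CLS]"] + questionTokens + ["[SEP]"] + contextTokens + ["[SEP]"]
    tokIds = [vocab[tok] for tok in allTokens]
    # Type 0 covers [CLS], the question, and the first [SEP];
    # type 1 covers the context and the final [SEP].
    tokTypes = [0] * (len(questionTokens) + 2) + [1] * (len(contextTokens) + 1)
    return allTokens, tokIds, tokTypes


def spanToAnswer(startTok, endTok, questionTokens, contextTokens):
    """Map positions in the full sequence back to context tokens and join word pieces."""
    offset = len(questionTokens) + 2          # skip [CLS], the question, and [SEP]
    pieces = contextTokens[startTok - offset:endTok - offset + 1]
    return " ".join(pieces).replace(" ##", "")


# Hand-written tokens standing in for the tokenizer output (made up for illustration).
questionTokens = ["when", "was", "it", "founded", "?"]
contextTokens = ["founded", "in", "1972", ",", "inter", "##di", "##gi", "##tal", "is", "listed"]

# Toy vocabulary; a real run would use the ids from vocab.txt.
uniqueTokens = ["[CLS]", "[SEP]"] + sorted(set(questionTokens + contextTokens))
vocab = {tok: i for i, tok in enumerate(uniqueTokens)}

allTokens, tokIds, tokTypes = buildSample(questionTokens, contextTokens, vocab)

# Pretend the model picked the span covering the word pieces "inter ... ##tal".
startTok = allTokens.index("inter")
endTok = allTokens.index("##tal")

answer = spanToAnswer(startTok, endTok, questionTokens, contextTokens)
print(answer)
```

Joining with spaces and then removing ```" ##"``` merges word pieces back into whole words, but punctuation that the tokenizer split off stays separated, which is why the second answer above prints as ```$ 359 million```.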


## Where do I go from here?

[Exporting BERT/SQuAD Model to CoreML](BertSquad-CoreML.ipynb)

[Exporting BERT/SQuAD Model to ONNX](BertSquad-ONNX.ipynb)

---

[Fireball Playgrounds](../Contents.ipynb)

[Question Answering (BERT/SQuAD)](BertSquad.ipynb)

[Reducing number of parameters of BERT/SQuAD Model](BertSquad-Reduce.ipynb)

[Pruning BERT/SQuAD Model](BertSquad-Prune.ipynb)

[Quantizing BERT/SQuAD Model](BertSquad-Quantize.ipynb)
