# Convert the model to ONNX runtime format

Install required dependencies

In [1]:
!pip install -r requirements.txt

Collecting torch (from -r requirements.txt (line 1))
  Downloading torch-2.2.2-cp38-cp38-manylinux1_x86_64.whl.metadata (25 kB)
Collecting onnx (from -r requirements.txt (line 2))
  Downloading onnx-1.16.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (16 kB)
Collecting transformers (from -r requirements.txt (line 3))
  Downloading transformers-4.40.0-py3-none-any.whl.metadata (137 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m137.6/137.6 kB[0m [31m1.2 MB/s[0m eta [36m0:00:00[0ma [36m0:00:01[0m
[?25hCollecting filelock (from torch->-r requirements.txt (line 1))
  Downloading filelock-3.13.4-py3-none-any.whl.metadata (2.8 kB)
Collecting sympy (from torch->-r requirements.txt (line 1))
  Using cached sympy-1.12-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch->-r requirements.txt (line 1))
  Downloading networkx-3.1-py3-none-any.whl.metadata (5.3 kB)
Collecting jinja2 (from torch->-r requirements.txt (line 1))
  Using cach

Convert PT model to ONNX

In [3]:
import torch
from transformers import DPRReader, DPRReaderTokenizer

model_id = "facebook/dpr-reader-multiset-base"
tokenizer = DPRReaderTokenizer.from_pretrained(model_id)
model = DPRReader.from_pretrained(model_id)

model.eval()

# Prepare from dummy input
dummy_inputs = tokenizer(
    questions=["What is love?"],
    titles=["Haddaway"],
    texts=["'What Is Love' is a song recorded by the artist Haddaway"],
    return_tensors="pt",
    padding=True,
    truncation=True,
)

inputs_onnx = tuple(dummy_inputs[k] for k in ['input_ids', 'attention_mask'])

#Export the model to ONNX
torch.onnx.export(
    model,
    inputs_onnx,
    "models/facebook-dpr.onnx",
    opset_version=12,
    input_names=["input_ids", "attention_mask"],
    output_names=["start_logits", "end_logits"],
    dynamic_axes = {
        "input_ids": {0: 'batch_size', 1: 'sequence'},
        "attention_mask": {0: 'batch_size', 1: 'sequence'},
        "start_logits": {0: 'batch_size', 1: 'sequence'},
        "end_logits": {0: 'batch_size', 1: 'sequence'}
    },
    do_constant_folding=True
)

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DPRQuestionEncoderTokenizer'. 
The class this function is called from is 'DPRReaderTokenizer'.
Some weights of the model checkpoint at facebook/dpr-reader-multiset-base were not used when initializing DPRReader: ['span_predictor.encoder.bert_model.pooler.dense.bias', 'span_predictor.encoder.bert_model.pooler.dense.weight']
- This IS expected if you are initializing DPRReader from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DPRReader from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


FileNotFoundError: [Errno 2] No such file or directory: 'models/facebook-dpr.onnx'