
Model Conversion Notes

SWHL edited this page Oct 10, 2023 · 2 revisions

My requirements (key packages only)

numpy                    1.21.6
torch                    1.7.1+cu101
x-transformers           0.15.0
transformers             4.23.1
tokenizers               0.13.3
timm                     0.5.4
einops                   0.6.0
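
Since the exports below were produced with these exact versions, it can help to compare them against the local environment first. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The `PINS` dictionary and the `compare_versions` helper are names made up for this illustration; they are not part of Latex-OCR.

```python
from importlib import metadata

# Versions copied from the list above.
PINS = {
    "numpy": "1.21.6",
    "torch": "1.7.1+cu101",
    "x-transformers": "0.15.0",
    "transformers": "4.23.1",
    "tokenizers": "0.13.3",
    "timm": "0.5.4",
    "einops": "0.6.0",
}


def compare_versions(pins):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in compare_versions(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

A mismatch does not necessarily mean the conversion will fail; it only shows where the local setup differs from the one used here.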

Convert image_resizer.pth

Insert the following conversion code at the appropriate location in the Latex-OCR source:

torch.onnx.export(
    self.image_resizer,
    t,
    f='image_resizer.onnx',
    opset_version=12,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        'input': {0: 'batch_size', 1: 'channel', 2: 'height', 3: 'width'},
        'output': {0: 'batch_size', 1: 'output_context'},
    },
    export_params=True,
    verbose=False,
)

Convert weights.pth

weights.pth consists of two parts: an encoder and a decoder.

Convert the encoder to encoder.onnx:

Insert the following conversion code at the appropriate location in the Latex-OCR source:

torch.onnx.export(
    self.encoder,
    x,
    f='encoder.onnx',
    opset_version=11,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        'input': {0: 'batch_size', 1: 'channel', 2: 'height', 3: 'width'},
        'output': {0: 'batch_size', 1: 'output_context'},
    },
    export_params=True,
    verbose=False,
)
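
One way to sanity-check the exported encoder is to run it with ONNX Runtime on two inputs of different sizes and confirm that both are accepted. The sketch below is only an illustration: `output_shapes` and `check_encoder` are hypothetical helper names, and the input sizes (one channel, 64x128 and 96x256) are assumptions rather than values taken from Latex-OCR. `check_encoder` is not called automatically because it needs numpy, onnxruntime and the exported file.

```python
from typing import Any, Dict, List, Sequence, Tuple


def output_shapes(session: Any, feeds: Sequence[Dict[str, Any]]) -> List[List[Tuple[int, ...]]]:
    """Run the session once per feed dict and collect the shape of every output."""
    shapes = []
    for feed in feeds:
        outputs = session.run(None, feed)  # None = return all outputs
        shapes.append([tuple(out.shape) for out in outputs])
    return shapes


def check_encoder(path: str = "encoder.onnx") -> None:
    """Load the exported encoder and print its output shapes for two input sizes."""
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    name = session.get_inputs()[0].name  # 'input' in the export call above
    # Assumed sizes: adjust channels, height and width to what the model expects.
    feeds = [
        {name: np.random.rand(1, 1, 64, 128).astype(np.float32)},
        {name: np.random.rand(1, 1, 96, 256).astype(np.float32)},
    ]
    for shapes in output_shapes(session, feeds):
        print(shapes)
```

If the second input is rejected with a shape error, the corresponding axis was probably exported with a fixed size.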

Convert the decoder to decoder.onnx:

Insert the following conversion code at the appropriate location in the Latex-OCR source:

torch.onnx.export(
    self.net,
    (x, mask, context),
    f='decoder.onnx',
    opset_version=13,
    input_names=['x', 'mask', 'context'],
    output_names=['output'],
    dynamic_axes={
        'x': {0: 'batch_size', 1: 'encoded_context'},
        'output': {0: 'batch_size', 1: 'output_seq'}
    },
    export_params=True,
    verbose=False
)
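
Note that `torch.onnx.export` fixes every axis not listed in `dynamic_axes` to the size of the example tensor passed at export time. The small sketch below lists which inputs of the decoder export call above have no dynamic axes declared. `static_inputs` is a hypothetical helper written for this illustration; whether `mask` and `context` actually need variable sizes depends on how the decoder is called, which this page does not specify.

```python
def static_inputs(input_names, dynamic_axes):
    """Return the input names that have no entry in dynamic_axes."""
    return [name for name in input_names if name not in dynamic_axes]


# Values copied from the decoder export call above.
decoder_inputs = ["x", "mask", "context"]
decoder_dynamic_axes = {
    "x": {0: "batch_size", 1: "encoded_context"},
    "output": {0: "batch_size", 1: "output_seq"},
}

print(static_inputs(decoder_inputs, decoder_dynamic_axes))
# prints ['mask', 'context']
```

If the exported decoder rejects inputs whose `mask` or `context` shape differs from the export-time example, adding entries for those names to `dynamic_axes` is one thing to try.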

⚠️ Attention:

If the conversion fails, try commenting out the conditional statements in the model-building code. The exporter traces a single execution path, so Python-level branches are a common source of export errors.
