
Description
ONNX models are serialized versions of existing AI models. They are somewhat faster at inference than the regular PyTorch or Hugging Face models, so some users might want to use this model type.
We already have a model converted and uploaded to Hugging Face:
https://huggingface.co/TextCortex/codegen-350M-optimized
In total we need to support the following model types (see the loading sketch after this list):
- PyTorch: file extension .pt
- Hugging Face: file extension .bin
- ONNX: file extension .onnx
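A minimal sketch of how loading could dispatch on the file extension; the load_model helper and the assumed directory layout are only illustrations, not existing code in this repo:

from pathlib import Path
import torch
from transformers import AutoModelForCausalLM
from optimum.onnxruntime import ORTModelForCausalLM

def load_model(model_dir):
    # Hypothetical helper: pick a backend based on the weight files present
    suffixes = {p.suffix for p in Path(model_dir).iterdir()}
    if ".onnx" in suffixes:
        # ONNX weights, run through onnxruntime via Optimum
        return ORTModelForCausalLM.from_pretrained(model_dir)
    if ".bin" in suffixes:
        # Standard Hugging Face checkpoint (pytorch_model.bin + config.json)
        return AutoModelForCausalLM.from_pretrained(model_dir)
    if ".pt" in suffixes:
        # Plain PyTorch checkpoint saved with torch.save()
        return torch.load(next(Path(model_dir).glob("*.pt")))
    raise ValueError(f"No supported model weights found in {model_dir}")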
Here is a script for text generation with ONNX models using Hugging Face's Optimum library:
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

# Load the ONNX model and its tokenizer
model = ORTModelForCausalLM.from_pretrained("TextCortex/codegen-350M-optimized")
tokenizer = AutoTokenizer.from_pretrained("TextCortex/codegen-350M-optimized")

def generate_onnx(prompt, min_length=16, temperature=0.1, num_return_sequences=1):
    # Tokenize the prompt into input ids for generation
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generated_ids = model.generate(input_ids, min_length=min_length, temperature=temperature,
                                   num_return_sequences=num_return_sequences, early_stopping=True)
    out = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    return out
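For example, the helper above could be called like this (the prompt string is just an illustration):

print(generate_onnx("def fibonacci(n):", min_length=16))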
For the vanilla PyTorch models (.pt), you can directly use the AutoModel classes from transformers, which also work for the Hugging Face .bin model type.
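A minimal sketch of that path, using AutoModelForCausalLM (the generation-capable Auto class) with the same model id as above; the prompt is only an illustration:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the regular (non-ONNX) PyTorch weights of the same model
pt_model = AutoModelForCausalLM.from_pretrained("TextCortex/codegen-350M-optimized")
pt_tokenizer = AutoTokenizer.from_pretrained("TextCortex/codegen-350M-optimized")

input_ids = pt_tokenizer("def hello_world():", return_tensors="pt").input_ids
generated_ids = pt_model.generate(input_ids, min_length=16)
print(pt_tokenizer.decode(generated_ids[0], skip_special_tokens=True))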