Handling ONNX models with external data #586
Conversation
Saving is correctly done as we discussed @fxmarty, but loading deserves some more discussion:

```python
encoder_file_name: str = ONNX_ENCODER_NAME,
decoder_file_name: str = ONNX_DECODER_NAME,
decoder_with_past_file_name: str = ONNX_DECODER_WITH_PAST_NAME,
subfolder: str = "",
```

Wdyt?
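For context, a hypothetical call exercising those keyword arguments (the repo id, subfolder, and file names below are illustrative placeholders, not from the PR):

```python
# Hypothetical usage of the loading kwargs quoted above; the repo id, subfolder
# and file names are placeholders for illustration only.
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model = ORTModelForSeq2SeqLM.from_pretrained(
    "some-user/some-seq2seq-onnx",                                # hypothetical Hub repo
    encoder_file_name="encoder_model.onnx",                       # ONNX_ENCODER_NAME default
    decoder_file_name="decoder_model.onnx",                       # ONNX_DECODER_NAME default
    decoder_with_past_file_name="decoder_with_past_model.onnx",   # ONNX_DECODER_WITH_PAST_NAME default
    subfolder="onnx",                                             # where the ONNX files live in the repo
)
```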
So we can maybe align on the following. Possible directory layout:
This should work now:

```python
import shutil
from pathlib import Path

from transformers import (
    AutoConfig,
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    BertForSequenceClassification,
    MBartForConditionalGeneration,
)
from transformers.modeling_utils import no_init_weights
from huggingface_hub import HfApi
from optimum.onnxruntime import ORTModelForCausalLM, ORTModelForSeq2SeqLM, ORTModelForSequenceClassification

# model_ckpt = "hf-internal-testing/tiny-bert"
# model_ckpt = "facebook/mbart-large-en-ro"
model_ckpt = "sshleifer/tiny-mbart"

save_path = Path(f"saved_model/{model_ckpt}")
save_path.mkdir(parents=True, exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained(model_ckpt, use_auth_token=True)
config = AutoConfig.from_pretrained(model_ckpt, use_auth_token=True)

# randomly initialize the model so the weights don't need to be downloaded
with no_init_weights():
    model = MBartForConditionalGeneration(config)

# save to local folder
model.save_pretrained(save_path)

model = ORTModelForSeq2SeqLM.from_pretrained(save_path, from_transformers=True)

# save onnx to local folder
model.save_pretrained(save_path / "onnx")
```

which gives:

```
.
|-tiny-mbart
| |-special_tokens_map.json
| |-sentencepiece.bpe.model
| |-tokenizer.json
| |-tokenizer_config.json
| |-onnx
| | |-special_tokens_map.json
| | |-decoder_with_past_model
| | | |-decoder_with_past_model.onnx
| | |-sentencepiece.bpe.model
| | |-tokenizer.json
| | |-decoder_model
| | | |-decoder_model.onnx
| | |-tokenizer_config.json
| | |-encoder_model
| | | |-encoder_model.onnx
| | |-config.json
| |-pytorch_model.bin
| |-config.json
```
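The script above imports `HfApi` (and `shutil`) without using them; presumably the next step was pushing the exported folder to the Hub. A hypothetical continuation, with a placeholder repo id:

```python
# Hypothetical continuation of the script above: upload the exported ONNX
# folder to the Hub. The target repo id is a placeholder.
api = HfApi()
api.create_repo("your-username/tiny-mbart-onnx", exist_ok=True)
api.upload_folder(
    folder_path=str(save_path / "onnx"),
    repo_id="your-username/tiny-mbart-onnx",
    repo_type="model",
)
```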
How different is it from #255?
Edit: never mind this comment, this PR changes only `save_pretrained`.

Great! I tried it, and in this case I get several warnings:
New issue here: #606
Added some tests for saving/loading from a local folder and from the hub. Otherwise the PR should be good to merge once all tests pass. To launch the external data specific tests:
tests/onnxruntime/test_modeling.py (outdated):

```python
def test_save_seq2seq_model_with_external_data(self):
    with tempfile.TemporaryDirectory() as tmpdirname:
        # randomly initialize large model
        config = AutoConfig.from_pretrained(self.LARGE_ONNX_SEQ2SEQ_MODEL_ID)
        with no_init_weights():
            model = MBartForConditionalGeneration(config)

        # save transformers model to be able to load it with `ORTModel...`
        model.save_pretrained(tmpdirname)

        model = ORTModelForSeq2SeqLM.from_pretrained(tmpdirname, from_transformers=True)
        model.save_pretrained(tmpdirname + "/onnx")

        # Verify config and ONNX exported encoder, decoder and decoder with past are each present in their own folder
        folder_contents = os.listdir(tmpdirname + "/onnx")
        self.assertTrue(CONFIG_NAME in folder_contents)
```
This test may slow down the CI a little. We could consider adding `@slow` decorators for such tests.
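For reference, a minimal sketch of what that might look like, assuming the suite reuses the `slow` decorator from `transformers.testing_utils` (that import is an assumption, not from the PR):

```python
# Assumption: `slow` comes from transformers.testing_utils; the real suite may
# define its own marker. The decorator skips the test unless RUN_SLOW=1 is set.
from transformers.testing_utils import slow

@slow
def test_save_seq2seq_model_with_external_data(self):
    ...
```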
yes, please do!
This is the only test failing, it seems. It probably fails because it tries to load the whole model and convert it to external data. We're getting the error:

```
worker 'gw0' crashed while running 'tests/onnxruntime/test_modeling.py::ORTModelIntegrationTest::test_save_seq2seq_model_with_external_data'
```

I'm wondering if we should just use a smaller model instead, and use `FORCE_ONNX_EXTERNAL_DATA`?
Yes, I think it would be good; that way we both test the feature and get a faster (and less OOM-prone) CI.
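A sketch of what that revised test might look like, assuming `FORCE_ONNX_EXTERNAL_DATA` is picked up from the environment when the export runs; the tiny checkpoint and the assertion are illustrative:

```python
# Hypothetical revision of the failing test: use a tiny checkpoint and force
# the external-data code path instead of exporting a genuinely large model.
# Assumes the exporter reads FORCE_ONNX_EXTERNAL_DATA from the environment.
import os
import tempfile

def test_save_seq2seq_model_with_external_data(self):
    os.environ["FORCE_ONNX_EXTERNAL_DATA"] = "1"
    try:
        with tempfile.TemporaryDirectory() as tmpdirname:
            model = ORTModelForSeq2SeqLM.from_pretrained("sshleifer/tiny-mbart", from_transformers=True)
            model.save_pretrained(tmpdirname)
            # every *.onnx file should now have a companion *_data file;
            # walk recursively since each component is exported to its own subfolder
            all_files = [f for _, _, files in os.walk(tmpdirname) for f in files]
            self.assertTrue(any(name.endswith("_data") for name in all_files))
    finally:
        del os.environ["FORCE_ONNX_EXTERNAL_DATA"]
```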
optimum/exporters/onnx/convert.py (outdated):

```python
onnx_model = onnx.load(str(output), load_external_data=False)
model_uses_external_data = check_model_uses_external_data(onnx_model)

if model_uses_external_data or FORCE_ONNX_EXTERNAL_DATA:
    logger.info("Saving external data to one file...")

    # try to free model memory
    del model
    del onnx_model

    onnx_model = onnx.load(
        str(output), load_external_data=True
    )  # TODO: this will probably be too memory heavy, shall we free `model` memory?
    onnx.save(
        onnx_model,
        str(output),
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location=output.name + "_data",
        size_threshold=1024 if not FORCE_ONNX_EXTERNAL_DATA else 0,
    )
```
Should we create a function for this? It would probably be cleaner. Not sure how `tf2onnx` handles files >2GB. Could this be used in `export_tensorflow`?
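For illustration, a minimal sketch of such a helper, reusing the names from the diff above; the function name and signature are hypothetical:

```python
# Hypothetical refactor of the block above into a standalone helper.
from pathlib import Path

import onnx

def _save_external_data_to_one_file(output: Path, force_external_data: bool = False) -> None:
    """Re-save the model at `output`, gathering all tensors into one external data file."""
    onnx_model = onnx.load(str(output), load_external_data=True)
    onnx.save(
        onnx_model,
        str(output),
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location=output.name + "_data",
        size_threshold=1024 if not force_external_data else 0,
    )
```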
I was testing mainly with PyTorch. It would be better if somebody else makes another PR to apply the same modifications to `export_tensorflow`.
@NouamaneTazi Do you think it is likely to be merged today or tomorrow? I think it should be in the release.
Should be ready to merge once all tests pass @fxmarty 🙌
LGTM, just waiting for the tests
This PR aims to handle loading and exporting ONNX models with external data, locally and from the hub. We can also now use `FORCE_ONNX_EXTERNAL_DATA=1` to force using the external data format even for small models. External data is saved to a single `.onnx_data` file per ONNX model (for easy loading from the hub).

cc @fxmarty @mht-sharma @michaelbenayoun

Fixes #254 and #377
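For illustration, an end-to-end sketch of the intended workflow; the checkpoint is an example, and the exact file layout follows the trees shown earlier in the thread:

```python
# Illustrative end-to-end usage under this PR; the checkpoint is an example.
# Setting FORCE_ONNX_EXTERNAL_DATA=1 in the environment would force the same
# code path even for a much smaller model.
from optimum.onnxruntime import ORTModelForSeq2SeqLM

# Export: tensors above the size threshold go to a companion *.onnx_data file.
model = ORTModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-en-ro", from_transformers=True)
model.save_pretrained("mbart-onnx")

# Reload the exported model, external data included.
model = ORTModelForSeq2SeqLM.from_pretrained("mbart-onnx")
```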