
Handling ONNX models with external data #586

Merged
merged 59 commits into huggingface:main on Dec 22, 2022

Conversation

@NouamaneTazi (Member) commented Dec 13, 2022

This PR aims to handle loading and exporting ONNX models with external data, both locally and from the Hub. We can now also use FORCE_ONNX_EXTERNAL_DATA=1 to force the external data format even for small models.

  • Saving/loading a model with external data locally
  • Saving external data in a single file (ends with .onnx_data for easy loading from hub)
  • Saving/loading a model with external data from the hub
  • Writing tests
  • Apply the same changes for other models besides seq2seq

cc @fxmarty @mht-sharma @michaelbenayoun

Fixes #254 and #377
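
For illustration, a minimal sketch of forcing the external data format on a small model. It assumes FORCE_ONNX_EXTERNAL_DATA is read from the environment (possibly at import time), so the variable is set before optimum is imported; the output path is arbitrary.

import os

# Sketch only: force the external-data format even for a tiny model.
os.environ["FORCE_ONNX_EXTERNAL_DATA"] = "1"

from optimum.onnxruntime import ORTModelForSeq2SeqLM

model = ORTModelForSeq2SeqLM.from_pretrained("sshleifer/tiny-mbart", from_transformers=True)
# with the flag set, weights are written to external *.onnx_data files next to the *.onnx files
model.save_pretrained("tiny-mbart-onnx")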

@NouamaneTazi (Member Author)

Saving is correctly done as we discussed @fxmarty, but loading deserves some more discussion.
I'm trying to load the different submodels from the different subfolders (for example here). I think an easy (but bloated) solution would be to have multiple subfolder arguments, like we have for file_name

        encoder_file_name: str = ONNX_ENCODER_NAME,
        decoder_file_name: str = ONNX_DECODER_NAME,
        decoder_with_past_file_name: str = ONNX_DECODER_WITH_PAST_NAME,
        subfolder: str = "",
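
For illustration only, the "bloated" alternative with one subfolder argument per submodel might look like the hypothetical signature below (none of these subfolder parameters exist in optimum; the file-name constants mirror the ones above):

# Hypothetical sketch, not part of this PR: a subfolder argument per submodel,
# mirroring the per-submodel file_name arguments shown above.
ONNX_ENCODER_NAME = "encoder_model.onnx"
ONNX_DECODER_NAME = "decoder_model.onnx"
ONNX_DECODER_WITH_PAST_NAME = "decoder_with_past_model.onnx"

def _from_pretrained_sketch(
    model_id: str,
    encoder_file_name: str = ONNX_ENCODER_NAME,
    decoder_file_name: str = ONNX_DECODER_NAME,
    decoder_with_past_file_name: str = ONNX_DECODER_WITH_PAST_NAME,
    encoder_subfolder: str = "",
    decoder_subfolder: str = "",
    decoder_with_past_subfolder: str = "",
):
    ...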

Wdyt?

@HuggingFaceDocBuilderDev commented Dec 13, 2022

The documentation is not available anymore as the PR was closed or merged.

@fxmarty (Collaborator) commented Dec 13, 2022

So we can maybe align on:

  • use ONNX files from the subfolders encoder/, decoder/, etc., with hard-coded folder names, if those subfolders are found
  • otherwise use top-level (backward compatible)

Possible directory layouts:

t5_model/ (subfolder="onnx")
    onnx/
        encoder/
        decoder/
        decoder_with_past/
t5_model/ (subfolder="")
    encoder/
    decoder/
    decoder_with_past/
t5_model/ (subfolder="")        (ONNX files at the top level, backward compatible)
t5_model/ (subfolder="onnx")
    onnx/                       (ONNX files directly under onnx/, no per-submodel folders)
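
A rough sketch of that resolution rule (illustrative only, not the PR's actual code):

from pathlib import Path

# Prefer the hard-coded per-submodel folder under `subfolder` if it exists,
# otherwise fall back to the backward-compatible top-level layout.
def resolve_submodel_dir(model_dir: str, submodel: str, subfolder: str = "") -> Path:
    root = Path(model_dir) / subfolder if subfolder else Path(model_dir)
    candidate = root / submodel  # "encoder", "decoder" or "decoder_with_past"
    return candidate if candidate.is_dir() else root

For example, resolve_submodel_dir("t5_model", "encoder", subfolder="onnx") returns t5_model/onnx/encoder when that folder exists, and t5_model/onnx otherwise.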

@NouamaneTazi (Member Author) commented Dec 13, 2022

This should work now:

import shutil
from pathlib import Path

from transformers import (
    AutoConfig,
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    BertForSequenceClassification,
    MBartForConditionalGeneration,
)
from transformers.modeling_utils import no_init_weights

from huggingface_hub import HfApi
from optimum.onnxruntime import ORTModelForCausalLM, ORTModelForSeq2SeqLM, ORTModelForSequenceClassification


# model_ckpt = "hf-internal-testing/tiny-bert"
# model_ckpt = "facebook/mbart-large-en-ro"
model_ckpt = "sshleifer/tiny-mbart"
save_path = Path(f"saved_model/{model_ckpt}")
save_path.mkdir(parents=True, exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained(model_ckpt, use_auth_token=True)

config = AutoConfig.from_pretrained(model_ckpt, use_auth_token=True)
with no_init_weights():
    model = MBartForConditionalGeneration(config)

# save to local folder
model.save_pretrained(save_path)

model = ORTModelForSeq2SeqLM.from_pretrained(save_path, from_transformers=True)
# save onnx to local folder
model.save_pretrained(save_path / "onnx")

# gives:
# .
#  |-tiny-mbart
#  | |-special_tokens_map.json
#  | |-sentencepiece.bpe.model
#  | |-tokenizer.json
#  | |-tokenizer_config.json
#  | |-onnx
#  | | |-special_tokens_map.json
#  | | |-decoder_with_past_model
#  | | | |-decoder_with_past_model.onnx
#  | | |-sentencepiece.bpe.model
#  | | |-tokenizer.json
#  | | |-decoder_model
#  | | | |-decoder_model.onnx
#  | | |-tokenizer_config.json
#  | | |-encoder_model
#  | | | |-encoder_model.onnx
#  | | |-config.json
#  | |-pytorch_model.bin
#  | |-config.json

@fxmarty (Collaborator) commented Dec 14, 2022

Edit: never mind this comment, this PR changes only save_pretrained. How different is it from #255?

Great! I tried it, and in this case, I get several warnings:

The ONNX file encoder_model/encoder_model.onnx is not a regular name used in optimum.onnxruntime, the ORTModelForConditionalGeneration might not behave as expected.
The ONNX file decoder_model/decoder_model.onnx is not a regular name used in optimum.onnxruntime, the ORTModelForConditionalGeneration might not behave as expected.
The ONNX file decoder_with_past_model/decoder_with_past_model.onnx is not a regular name used in optimum.onnxruntime, the ORTModelForConditionalGeneration might not behave as expected.


@PoodleWang

note: issue open here #605 @PoodleWang

Let me create a new issue for it. They are different.

New issue here: #606

@NouamaneTazi (Member Author) commented Dec 18, 2022

Added some tests for saving/loading from a local folder and from the Hub.
The tests which save to the Hub, such as test_push_ort_model_with_external_data_to_hub, seem to fail when run locally because they save to my personal repo on the Hub, then try to load it from hf-internal-testing. It's because of this part in push_to_hub. I'm wondering if it works fine on the CI and if we should do anything about it? 🤔

Otherwise the PR should be good to merge once all tests pass

To run the external-data-specific tests:

pytest ./tests/onnxruntime/test_modeling.py::ORTModelIntegrationTest -k "external"

Comment on lines 495 to 511
def test_save_seq2seq_model_with_external_data(self):
    with tempfile.TemporaryDirectory() as tmpdirname:
        # randomly initialize large model
        config = AutoConfig.from_pretrained(self.LARGE_ONNX_SEQ2SEQ_MODEL_ID)
        with no_init_weights():
            model = MBartForConditionalGeneration(config)

        # save transformers model to be able to load it with `ORTModel...`
        model.save_pretrained(tmpdirname)

        model = ORTModelForSeq2SeqLM.from_pretrained(tmpdirname, from_transformers=True)
        model.save_pretrained(tmpdirname + "/onnx")

        # Verify config and ONNX exported encoder, decoder and decoder with past are present each in their own folder
        folder_contents = os.listdir(tmpdirname + "/onnx")
        self.assertTrue(CONFIG_NAME in folder_contents)

@NouamaneTazi (Member Author):

This test may slow down the CI a little. We could consider adding @slow decorators for such tests

cc @michaelbenayoun @fxmarty @mht-sharma

@fxmarty (Collaborator) commented Dec 18, 2022

yes, please do!

@NouamaneTazi (Member Author):

This seems to be the only failing test. It probably fails because it tries to load the whole model and convert it to external data when saving. We're getting the error:

worker 'gw0' crashed while running 'tests/onnxruntime/test_modeling.py::ORTModelIntegrationTest::test_save_seq2seq_model_with_external_data'

I'm wondering if we should just use a smaller model instead and use FORCE_ONNX_EXTERNAL_DATA?

Member:

I think it would be good, yes; that way we both test the feature and get a faster (and less OOM-prone) CI.
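
A rough sketch of what the revised test could look like (hypothetical; note that FORCE_ONNX_EXTERNAL_DATA may need to be set before optimum is imported, e.g. when launching pytest):

# Hypothetical test sketch: use a tiny checkpoint and force the external-data format
# instead of exporting a genuinely large (>2GB) model. Assumes FORCE_ONNX_EXTERNAL_DATA=1
# is already set in the environment when optimum is imported.
def test_save_seq2seq_model_with_external_data_forced(self):
    with tempfile.TemporaryDirectory() as tmpdirname:
        model = ORTModelForSeq2SeqLM.from_pretrained("sshleifer/tiny-mbart", from_transformers=True)
        model.save_pretrained(tmpdirname)

        # external data should be written as *.onnx_data next to each exported *.onnx file
        has_external_data = any(
            name.endswith(".onnx_data") for _, _, files in os.walk(tmpdirname) for name in files
        )
        self.assertTrue(has_external_data)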

Comment on lines 332 to 351
onnx_model = onnx.load(str(output), load_external_data=False)
model_uses_external_data = check_model_uses_external_data(onnx_model)

if model_uses_external_data or FORCE_ONNX_EXTERNAL_DATA:
    logger.info("Saving external data to one file...")

    # try to free model memory
    del model
    del onnx_model

    onnx_model = onnx.load(
        str(output), load_external_data=True
    )  # TODO: this will probably be too memory heavy, shall we free `model` memory?
    onnx.save(
        onnx_model,
        str(output),
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location=output.name + "_data",
        size_threshold=1024 if not FORCE_ONNX_EXTERNAL_DATA else 0,
    )
Contributor:

Should we create a function for this? It would probably be cleaner.

Not sure how tf2onnx handles files >2GB. Could this be used in the export_tensorflow?
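
For instance, a factored-out helper could look roughly like this (a sketch; the function name and placement are made up):

import os

import onnx

# Illustrative helper: reload the exported model together with its tensors and re-save it
# with all external data gathered into a single <model>.onnx_data file.
def save_model_with_external_data(onnx_file: str, force: bool = False) -> None:
    onnx_model = onnx.load(onnx_file, load_external_data=True)
    onnx.save(
        onnx_model,
        onnx_file,
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location=os.path.basename(onnx_file) + "_data",
        size_threshold=0 if force else 1024,
    )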

@NouamaneTazi (Member Author):

I was testing mainly with PyTorch. It would be better if somebody else made another PR to apply the same modifications to export_tensorflow.

@fxmarty (Collaborator) commented Dec 20, 2022

@NouamaneTazi Do you think it is likely to be merged today or tomorrow? I think it should be in the release.

@NouamaneTazi (Member Author)

Should be ready to merge once all tests pass @fxmarty 🙌

@fxmarty (Collaborator) left a comment

LGTM, just waiting for the tests

@NouamaneTazi merged commit 6da9e1a into huggingface:main on Dec 22, 2022
@fxmarty mentioned this pull request on Dec 22, 2022
Successfully merging this pull request may close these issues.

Saving external data for > 2GB models