Do not force download by default in ORTModel by fxmarty · Pull Request #356 · huggingface/optimum

fxmarty · 2022-08-23T10:51:02Z

We passed by default force_download=True, although the caching works well when passing force_download=False. Especially when passing from_transformers=False, everything works as expected as the onnx model is loaded from the cache hub/ subfolder.

When passing from_transformers=True, although we cache only the PyTorch model (before the new transformers cache system in transformers/ cache subfolder, in hub/ subfolder in the next transformers release), the force_download argument is anyway ignored in _from_transformers so this PR has no impact there:

optimum/optimum/onnxruntime/modeling_ort.py

Lines 239 to 303 in e903324

    
               @classmethod 
        
               def _from_transformers( 
        
                   cls, 
        
                   model_id: str, 
        
                   save_dir: Union[str, Path] = default_cache_path, 
        
                   use_auth_token: Optional[Union[bool, str, None]] = None, 
        
                   revision: Optional[Union[str, None]] = None, 
        
                   force_download: bool = False, 
        
                   cache_dir: Optional[str] = None, 
        
                   **kwargs, 
        
               ): 
        
                   """ 
        
                   Converts a vanilla Transformers model into an optimized model using `transformers.onnx.export_onnx`. 
        
                   Arguments: 
        
                       model_id (`str` or `Path`): 
        
                           Directory from which to load 
        
                       save_dir (`str` or `Path`): 
        
                           Directory where the onnx model should be saved, default to `transformers.file_utils.default_cache_path`, which is the cache dir for 
        
                           transformers. 
        
                       use_auth_token (`str` or `bool`): 
        
                           Is needed to load models from a private repository 
        
                       revision (`str`): 
        
                           Revision is the specific model version to use. It can be a branch name, a tag name, or a commit id 
        
                       cache_dir (`Union[str, Path]`, *optional*): 
        
                           Path to a directory in which a downloaded pretrained model configuration should be cached if the 
        
                           standard cache should not be used. 
        
                       force_download (`bool`, *optional*, defaults to `False`): 
        
                           Whether or not to force the (re-)download of the model weights and configuration files, overriding the 
        
                           cached versions if they exist. 
        
                       kwargs (`Dict`, *optional*): 
        
                           kwargs will be passed to the model during initialization 
        
                   """ 
        
                   # create local save dir in cache dir 
        
                   save_dir = Path(save_dir).joinpath(model_id) 
        
                   save_dir.mkdir(parents=True, exist_ok=True) 
        
                   kwargs["model_save_dir"] = save_dir 
        
                   # reads pipeline task from ORTModelForXXX class if available else tries to extract from hub 
        
                   if cls.export_feature is not None: 
        
                       task = cls.export_feature 
        
                   else: 
        
                       task = HfApi().model_info(model_id, revision=revision).pipeline_tag 
        
                       if task in ["sentiment-analysis", "text-classification", "zero-shot-classification"]: 
        
                           task = "sequence-classification" 
        
                       elif task in ["feature-extraction", "fill-mask"]: 
        
                           task = "default" 
        
                   # 2. convert to temp dir 
        
                   # FIXME: transformers.onnx conversion doesn't support private models 
        
                   preprocessor = get_preprocessor(model_id) 
        
                   model = FeaturesManager.get_model_from_feature(task, model_id) 
        
                   _, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=task) 
        
                   onnx_config = model_onnx_config(model.config) 
        
                   # export model 
        
                   export( 
        
                       preprocessor=preprocessor, 
        
                       model=model, 
        
                       config=onnx_config, 
        
                       opset=onnx_config.default_onnx_opset, 
        
                       output=save_dir.joinpath(ONNX_WEIGHTS_NAME), 
        
                   ) 
        
                   kwargs["config"] = model.config.__dict__ 
        
                   # 3. load normal model 
        
                   return cls._from_pretrained(save_dir.as_posix(), **kwargs)

HuggingFaceDocBuilderDev · 2022-08-23T11:07:41Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

regisss

LGTM @fxmarty!

If the force_download argument is not used in _from_transformers, maybe we should remove it what do you think?

philschmid

That's awesome! But we have to add test to validate that it is working as expected. Especially since we are storing the "model.onnx" files at the top level with out any hash. So there is no way for the library to know if the cached "model.onnx" is the correct one or not.

We can add the local_files_only argument to the from_pretrained method and have then working and failing tests for "cached" and regular models. As well as for loading different models after each other to make sure we have the correct ones loaded.
See documentation: https://huggingface.co/docs/huggingface_hub/package_reference/file_download#huggingface_hub.hf_hub_download.local_files_only

We should also check that this works for the seq2seq models.

And we should check if the new cache system allows better distribution of onnx files. Currently, when a user loads and saves a onnx file it is saved at the top cache directory with model.onnx, which means if he loads a new model "from" cache it uses the old model. Thats why i went with "force_download" in the beginning.

fxmarty · 2022-08-23T12:58:09Z

When loading an onnx model with from_transformers=False, here's where it is loaded from in the hub/ subfolder:

For seq2seq models (with from_transformers=False), we have:

which does work as well. Thanks for the suggestion to add tests, I will try to add some!

When from_transformers=True, the argument force_download is ignored.

In case we passed from_transformers=True, in the old caching system, we indeed get e.g. a transformers/facebook/levit-256/model.onnx file with no hash recorded. However, even if this file exists, it is NOT loaded into the InferenceSession and the cached pytorch model is converted to onnx and loaded into an InferenceSession (the file is overwritten at each load).

Under the new caching system, if we pass from_transformers=True, we get e.g. a hub/facebook/levit-256/model.onnx with no hash recorded as well. Looking at the timestamps, this file is overwritten at every conversion.

In any case, I think the case from_transformers=True is independent from this PR.

philschmid

LGTM!

* Override export of ORTSeq2SeqTrainer * Do not force download by default in ORTModel (#356) * Update OnnxConfigWithLoss wrapper * ORT optimizer refactorization (#294) * Refactorization of ORTOptimizer * Refactorization of ORTModel * Adapt examples according to refactorization * Adapt tests * Fix style * Remove quantizer modification * Fix style * Apply modifications from #270 for quantizer and optimizer to have same behavior * Add test for optimization of Seq2Seq models * Fix style * Add ort config saving when optimizing a model * Add ort config saving when quantizing a model * Add tests * Fix style * Adapt optimization examples * Fix readme * Remove unused parameter * Adapt quantization examples * Fix quantized model and ort config saving * Add documentation * Add model configuration saving to simplify loading of optimized model * Fix style * Fix description * Fix quantization tests * Remove opset argument which is onnx config default opset when exporting with ORTModels * Fix import (#360) * Fix export of decoders * Add flag to export only decoders * Fix ORTTrainer inference ort subclass parsing * Fix filenames when empty suffix given (#363) * fix(optimization): handle empty file suffix * fix(quantization): handle empty file suffix * use pathlibfor save_dir * run test again * Update optimum/onnxruntime/quantization.py Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com> * ReRun test that failed because of cache (network) Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com> * Override the evaluation and prediction loop in ORTSeq2SeqTrainer * Fix documentation (#369) * fix class * Update optimization.mdx * Fix label smoother device prob * Fix lm_logits and labels dimension mismatch * Clean up Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com> Co-authored-by: Pierre Snell <ierezell@gmail.com> Co-authored-by: Pierre Snell <pierre.snell@botpress.com> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>

* Inference with ORTModel * Clean up unused imports * Replace Inference session by ort model * Inference with ORTModel * Clean up unused imports * Replace Inference session by ort model * Update modeling for custom tasks * Replace in evaluation_loop * refectoring prediction_loop * ORTSeq2SeqTrainer refactoring - Inference with ORTModel (#359) * Override export of ORTSeq2SeqTrainer * Do not force download by default in ORTModel (#356) * Update OnnxConfigWithLoss wrapper * ORT optimizer refactorization (#294) * Refactorization of ORTOptimizer * Refactorization of ORTModel * Adapt examples according to refactorization * Fix ORTTrainer inference ort subclass parsing * Replace datasets.load_metric by evaluate * Add summarization example * Enable ORT inference * Fix inference args * Mention ORT inference in READMEs * Remove repetitve code in Trainer * Update examples to trfrs 4.22.1 * Fix qa example prediction error * Update summarization/README.md * Fix logger consistency * Make readme consistent with trfrs * Put back onnx config with past and loss test

pass False as default force_download arg

727288c

regisss approved these changes Aug 23, 2022

View reviewed changes

philschmid suggested changes Aug 23, 2022

View reviewed changes

fxmarty added 2 commits August 23, 2022 15:59

add tests

3c0e036

style

a25d41a

fxmarty requested a review from philschmid August 23, 2022 14:02

philschmid approved these changes Aug 24, 2022

View reviewed changes

regisss merged commit 122a9d8 into huggingface:main Aug 24, 2022

fxmarty mentioned this pull request Oct 27, 2022

Enable caching support for from_pretrained method in ORTModelForXXX #258

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do not force download by default in ORTModel#356

Do not force download by default in ORTModel#356
regisss merged 3 commits into
huggingface:mainfrom
fxmarty:do-not-force-download-ortmodel

fxmarty commented Aug 23, 2022

Uh oh!

HuggingFaceDocBuilderDev commented Aug 23, 2022

Uh oh!

regisss left a comment •

edited

Loading

Uh oh!

philschmid left a comment

Uh oh!

fxmarty commented Aug 23, 2022 •

edited

Loading

Uh oh!

philschmid left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	@classmethod
	def _from_transformers(
	cls,
	model_id: str,
	save_dir: Union[str, Path] = default_cache_path,
	use_auth_token: Optional[Union[bool, str, None]] = None,
	revision: Optional[Union[str, None]] = None,
	force_download: bool = False,
	cache_dir: Optional[str] = None,
	**kwargs,
	):
	"""
	Converts a vanilla Transformers model into an optimized model using `transformers.onnx.export_onnx`.
	Arguments:
	model_id (`str` or `Path`):
	Directory from which to load
	save_dir (`str` or `Path`):
	Directory where the onnx model should be saved, default to `transformers.file_utils.default_cache_path`, which is the cache dir for
	transformers.
	use_auth_token (`str` or `bool`):
	Is needed to load models from a private repository
	revision (`str`):
	Revision is the specific model version to use. It can be a branch name, a tag name, or a commit id
	cache_dir (`Union[str, Path]`, optional):
	Path to a directory in which a downloaded pretrained model configuration should be cached if the
	standard cache should not be used.
	force_download (`bool`, optional, defaults to `False`):
	Whether or not to force the (re-)download of the model weights and configuration files, overriding the
	cached versions if they exist.
	kwargs (`Dict`, optional):
	kwargs will be passed to the model during initialization
	"""

	# create local save dir in cache dir
	save_dir = Path(save_dir).joinpath(model_id)
	save_dir.mkdir(parents=True, exist_ok=True)
	kwargs["model_save_dir"] = save_dir

	# reads pipeline task from ORTModelForXXX class if available else tries to extract from hub
	if cls.export_feature is not None:
	task = cls.export_feature
	else:
	task = HfApi().model_info(model_id, revision=revision).pipeline_tag
	if task in ["sentiment-analysis", "text-classification", "zero-shot-classification"]:
	task = "sequence-classification"
	elif task in ["feature-extraction", "fill-mask"]:
	task = "default"
	# 2. convert to temp dir
	# FIXME: transformers.onnx conversion doesn't support private models
	preprocessor = get_preprocessor(model_id)
	model = FeaturesManager.get_model_from_feature(task, model_id)
	_, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=task)
	onnx_config = model_onnx_config(model.config)

	# export model
	export(
	preprocessor=preprocessor,
	model=model,
	config=onnx_config,
	opset=onnx_config.default_onnx_opset,
	output=save_dir.joinpath(ONNX_WEIGHTS_NAME),
	)
	kwargs["config"] = model.config.__dict__
	# 3. load normal model
	return cls._from_pretrained(save_dir.as_posix(), **kwargs)

Conversation

fxmarty commented Aug 23, 2022

Uh oh!

HuggingFaceDocBuilderDev commented Aug 23, 2022

Uh oh!

regisss left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

philschmid left a comment

Choose a reason for hiding this comment

Uh oh!

fxmarty commented Aug 23, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

philschmid left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

regisss left a comment •

edited

Loading

fxmarty commented Aug 23, 2022 •

edited

Loading