
Save model with smaller default max_shard_size for HuggingFace flavor #8567

Merged
7 commits merged into mlflow:master on May 31, 2023

Conversation

@wenfeiy-db (Contributor) commented May 30, 2023

Related Issues/PRs

Follow-up of #8448

What changes are proposed in this pull request?

Save transformers models with a smaller shard size by default (500MB). An environment variable MLFLOW_HUGGINGFACE_MODEL_MAX_SHARD_SIZE is added so that users who prefer different behavior can override the 500MB default. Hugging Face's own default is 10GB.
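As a rough illustration of the described behavior, the sketch below shows how a 500MB default with an environment-variable override could be resolved. The helper function and resolution logic are assumptions for illustration only; MLflow's actual implementation uses its own environment-variable abstraction (`MLFLOW_HUGGINGFACE_MODEL_MAX_SHARD_SIZE.get()`).

```python
import os

# Hypothetical helper illustrating the described behavior; not MLflow's real code.
ENV_KEY = "MLFLOW_HUGGINGFACE_MODEL_MAX_SHARD_SIZE"
DEFAULT_MAX_SHARD_SIZE = "500MB"  # the PR's new default; Hugging Face's own default is "10GB"


def get_max_shard_size() -> str:
    """Resolve the shard size, preferring the user's env var override."""
    return os.environ.get(ENV_KEY, DEFAULT_MAX_SHARD_SIZE)


os.environ.pop(ENV_KEY, None)
assert get_max_shard_size() == "500MB"  # default when no override is set
os.environ[ENV_KEY] = "2GB"
assert get_max_shard_size() == "2GB"    # user override wins
```

The resolved string would then be forwarded to `save_pretrained(..., max_shard_size=...)`.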

This PR also saves the underlying model directly instead of the wrapping pipeline. The pipeline was previously saved to work around a backward-compatibility issue that no longer applies now that a minimum transformers version is enforced.

How is this patch tested?

  • Existing unit/integration tests
  • Manual tests (describe details, including test results, below)

Manually tested with a wheel.

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly in the documentation preview.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

mlflow.transformers.save_model now saves models with a maximum shard size of 500 MB by default. Users can override it via the environment variable MLFLOW_HUGGINGFACE_MODEL_MAX_SHARD_SIZE.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: wenfeiy-db <wenfei.yan@databricks.com>
@mlflow-automation (Collaborator) commented May 31, 2023

Documentation preview for 9b462ae will be available here when this CircleCI job completes successfully.


mlflow/transformers.py
@@ -426,7 +428,10 @@ def save_model(
     flavor_conf.update({_PROCESSOR_TYPE_KEY: _get_instance_type(processor)})

     # Save the pipeline object
-    built_pipeline.save_pretrained(save_directory=path.joinpath(_PIPELINE_BINARY_FILE_NAME))
+    built_pipeline.model.save_pretrained(
Collaborator:

Are we sure that we can stop saving the pipeline entirely in this way? I think there's a risk that the max_shard_size parameter may not be supported for all pipeline types (though I'm not sure). What do we think about the following approach:

  1. Try to save_pretrained on the model
  2. If that doesn't work, just save the pipeline
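The two-step fallback suggested above could look roughly like the sketch below. The function and class names are hypothetical, and the stubs stand in for Hugging Face objects so the control flow can be exercised without transformers installed; this is not the code the PR ultimately adopted (the PR kept the direct model save without a fallback).

```python
# Hypothetical sketch of the suggested fallback; names are illustrative
# and do not appear in MLflow.
def save_with_fallback(pipeline, model_dir, pipeline_dir, max_shard_size="500MB"):
    try:
        # Step 1: try save_pretrained on the bare model with max_shard_size.
        pipeline.model.save_pretrained(
            save_directory=model_dir, max_shard_size=max_shard_size
        )
        return "model"
    except TypeError:
        # Step 2: a model whose overridden save_pretrained rejects the kwarg
        # falls back to saving the whole pipeline, as before the PR.
        pipeline.save_pretrained(save_directory=pipeline_dir)
        return "pipeline"


class ShardingModel:  # accepts max_shard_size, like transformers.PreTrainedModel
    def save_pretrained(self, save_directory, max_shard_size="10GB"):
        pass


class LegacyModel:  # an override that dropped the max_shard_size parameter
    def save_pretrained(self, save_directory):
        pass


class StubPipeline:
    def __init__(self, model):
        self.model = model

    def save_pretrained(self, save_directory):
        pass


assert save_with_fallback(StubPipeline(ShardingModel()), "m", "p") == "model"
assert save_with_fallback(StubPipeline(LegacyModel()), "m", "p") == "pipeline"
```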

Contributor Author:

max_shard_size parameter may not be supported for all pipeline types (though not sure)

Do you mean that maybe not all "model" types support the max_shard_size param? It seems to me that it's supported for any transformers.PreTrainedModel, based on its documentation.

But yeah, I agree we can play it safe. I can add the try/except. Do you have a suggestion on which errors to catch? I'm not super familiar with HF errors and don't want to accidentally swallow them.

Collaborator:

If you think it's broadly supported, I'm good with it; this comment mostly originates from me not understanding the full flexibility of Hugging Face pipelines.

Collaborator:

We're calling this on the base PreTrainedModel, which provides a save_pretrained method with max_shard_size. It's possible that some models that override save_pretrained may be missing it, though.
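One way to guard against such overrides, sketched here purely as an illustration (MLflow did not add this check), is to inspect the method's signature before passing the parameter. The function and class names below are hypothetical.

```python
import inspect


def accepts_max_shard_size(model) -> bool:
    """Illustrative check: does this model's save_pretrained take max_shard_size?"""
    params = inspect.signature(model.save_pretrained).parameters
    return "max_shard_size" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


class BaseLike:  # mirrors the base signature discussed above
    def save_pretrained(self, save_directory, max_shard_size="10GB"):
        pass


class NarrowOverride:  # an override that lost the parameter
    def save_pretrained(self, save_directory):
        pass


assert accepts_max_shard_size(BaseLike())
assert not accepts_max_shard_size(NarrowOverride())
```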

Contributor Author:

For models that override save_pretrained and break its signature, I don't think the MLflow transformers flavor is guaranteed to work anyway, right?
I'll keep this unchanged (i.e., no try/except) for now. We can revisit this if we later find the assumption is broken.

@ankit-db changed the title from "Save Hugginface model with smaller shard size" to "Support max_shard_size parameter for HuggingFace flavor" on May 31, 2023
-    built_pipeline.save_pretrained(save_directory=path.joinpath(_PIPELINE_BINARY_FILE_NAME))
+    built_pipeline.model.save_pretrained(
+        save_directory=path.joinpath(_MODEL_BINARY_FILE_NAME),
+        max_shard_size=MLFLOW_HUGGINGFACE_MODEL_MAX_SHARD_SIZE.get(),
Collaborator:

Can we also check whether the parameter is passed in kwargs, so that we can support users overriding it?

Contributor Author:

They can override it with the env var now. Do we also want to expose it in mlflow.transformers.save_model? I think it depends on how easy we want overriding to be, and I thought we didn't want it to be too easy.
Let me know whether we want (1) the env var, (2) kwargs, or (3) both, and I'll change it accordingly.

Member:

I think restricting this behavior purely to environment variables is fine. Can we add a note in models.rst explaining the behavior and the env var key that would need to be overridden to change the shard size in edge cases?

@ankit-db (Collaborator) commented:

Btw, tests were triggered (several non-audio tests are also failing).

@github-actions bot added the labels area/models (MLmodel format, model serialization/deserialization, flavors) and rn/feature (Mention under Features in Changelogs) on May 31, 2023
@BenWilson2 (Member) left a comment:

LGTM with a small request to add a note in models.rst explaining how to override this setting

@ankit-db (Collaborator) left a comment:

Thanks for the PR - let's get it in before the next MLflow release!

@wenfeiy-db (Contributor Author) commented May 31, 2023

@BenWilson2 I added a note in models.rst, but some tests (all of which were passing before) are failing for some reason. My last commit only changes the doc and one comment, so the new failing tests are quite confusing to me. Do you know what might be going wrong?

Edit: The tests also pass in my local env, so I'm making an empty commit to trigger another run.

@BenWilson2 merged commit 67ed9b7 into mlflow:master on May 31, 2023
35 checks passed
@wenfeiy-db changed the title from "Support max_shard_size parameter for HuggingFace flavor" to "Save model with smaller default max_shard_size for HuggingFace flavor" on Jun 1, 2023
dbczumar added a commit to dbczumar/mlflow that referenced this pull request Jul 1, 2023