Parallelize download of artifacts and improve performance of downloading large models. #5359

Merged
merged 14 commits into from
Feb 12, 2022

Conversation

@mehtayogita (Collaborator) commented Feb 8, 2022

What changes are proposed in this pull request?

Parallelize the artifact download function in ArtifactRepository.
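For readers unfamiliar with the pattern, here is a minimal sketch of parallelizing per-file downloads with a thread pool. It is not the code in this PR: `download_artifacts_in_parallel`, `download_file`, and `max_workers` are hypothetical names chosen for illustration, and `download_file(remote_path, local_path)` stands in for whatever single-file download a given repository implements.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_artifacts_in_parallel(paths, download_file, dst_dir, max_workers=8):
    """Download each path concurrently and return {remote_path: local_path}.

    `download_file(remote_path, local_path)` is a hypothetical callable that
    fetches exactly one file; it is an assumption, not an MLflow API.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for path in paths:
            local_path = os.path.join(dst_dir, *path.split("/"))
            # Create parent directories up front so workers never race on mkdir.
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            futures[pool.submit(download_file, path, local_path)] = (path, local_path)
        # Collect results as they finish; remember failures instead of stopping early.
        for future in as_completed(futures):
            path, local_path = futures[future]
            try:
                future.result()
                results[path] = local_path
            except Exception as exc:
                failures[path] = exc
    if failures:
        raise RuntimeError(f"Failed to download: {sorted(failures)}")
    return results


if __name__ == "__main__":
    def fake_download(remote_path, local_path):
        # Stand-in for a network call: write the remote path into the local file.
        with open(local_path, "w") as handle:
            handle.write(remote_path)

    with tempfile.TemporaryDirectory() as tmp:
        downloaded = download_artifacts_in_parallel(
            ["MLmodel", "data/model.pkl"], fake_download, tmp
        )
        print(sorted(downloaded))
```

Threads suit this workload because each download is I/O-bound. Collecting every failure before raising means one bad file does not hide the others.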

How is this patch tested?

Unit testing
Manual testing with models that have many artifacts. For a model with ~200 artifacts, this change cut load time by about 39% (a roughly 1.6x speedup): the code at HEAD took 109 secs to load the model using ModelsArtifactRepository, while the updated code in this PR took 66.56 secs.
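As a quick check, the two timings reported above can be turned into a speedup factor and a percentage of time saved. The variable names are illustrative only.

```python
baseline_secs = 109.0  # load time at HEAD, as reported above
parallel_secs = 66.56  # load time with this PR, as reported above

speedup = baseline_secs / parallel_secs
time_saved_pct = 100 * (baseline_secs - parallel_secs) / baseline_secs

print(f"speedup: {speedup:.2f}x, time saved: {time_saved_pct:.1f}%")
# prints "speedup: 1.64x, time saved: 38.9%"
```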

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly by following the steps below.
  1. Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the next step; otherwise, fix it.
  2. Click Details on the right to open the job page of CircleCI.
  3. Click the Artifacts tab.
  4. Click docs/build/html/index.html.
  5. Find the changed pages / sections and make sure they render correctly.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Support parallel download of artifacts. Fix issue: #5338

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Add one line in the model signature introduction section and add link to detailed section in the introduction.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
…g sphinx build locally.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
…e while logging model to avoid float precision errors.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
…formance for downloading large models.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
@github-actions github-actions bot added area/artifacts Artifact stores and artifact logging rn/bug-fix Mention under Bug Fixes in Changelogs. labels Feb 8, 2022
mehtayogita and others added 2 commits February 7, 2022 16:43
Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
@mehtayogita mehtayogita changed the title from "Add support to parallel download of artifacts and improve performance of downloading large models." to "Parallelize download of artifacts and improve performance of downloading large models." Feb 8, 2022
@mehtayogita
Collaborator Author

@ankit-db @dbczumar Sending this for early review to confirm whether it is OK to add the parallel download functionality directly to the ArtifactRepository class. If this approach looks good, I will proceed with further testing and add more unit tests.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
Collaborator

@dbczumar dbczumar left a comment

@mehtayogita This is awesome! Thanks for bringing download parallelism to all of the artifact repositories across MLflow. LGTM; feel free to merge once you've supplemented unit tests & conducted manual testing and updated the PR description with a manual test plan.

Collaborator

@ankit-db ankit-db left a comment

Really excited about the contribution! 2 small questions

mlflow/store/artifact/artifact_repo.py (outdated, resolved)
mlflow/store/artifact/databricks_artifact_repo.py (outdated, resolved)
@ankit-db
Collaborator

ankit-db commented Feb 8, 2022

Also, it'd be good to add some unit testing here to verify edge cases - how do we plan on doing that?

@mehtayogita
Collaborator Author

Thanks for the quick review! Will confirm this is working as intended with manual testing, add tests and request another round of review. Thanks!

Also, it'd be good to add some unit testing here to verify edge cases - how do we plan on doing that?

Thanks for the quick review! Yes, the change still needs polish and unit tests. The parallelism is currently exercised by tests/store/artifact/test_databricks_artifact_repo.py, but it belongs in tests/store/artifact/test_artifact_repo.py, where the edge cases will also be tested. I will first confirm with manual testing that this change works as intended, then update the PR with tests and address the TODOs left in the code. Thanks!
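To make "edge cases" concrete, here is a sketch of the kind of tests that could cover a parallel download helper. It does not reproduce the tests added in this PR: `download_all` is a hypothetical stand-in defined inline so the example is self-contained, and the three cases (empty listing, each artifact fetched once, a single failure surfaced) are assumptions about what is worth checking.

```python
from concurrent.futures import ThreadPoolExecutor
from unittest import mock


def download_all(paths, download_file, max_workers=4):
    """Hypothetical helper: call download_file(path) for each path in a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {path: pool.submit(download_file, path) for path in paths}
    # Leaving the `with` block waits for every submitted task to finish.
    failed = sorted(p for p, f in futures.items() if f.exception() is not None)
    if failed:
        raise RuntimeError(f"Failed to download: {failed}")
    return sorted(futures)


def test_empty_listing_downloads_nothing():
    download_file = mock.Mock()
    assert download_all([], download_file) == []
    download_file.assert_not_called()


def test_each_artifact_downloaded_once():
    download_file = mock.Mock()
    download_all(["a", "b", "c"], download_file)
    called_with = sorted(call.args[0] for call in download_file.call_args_list)
    assert called_with == ["a", "b", "c"]


def test_single_failure_is_reported():
    def flaky(path):
        if path == "b":
            raise IOError("simulated download failure")

    try:
        download_all(["a", "b", "c"], flaky)
    except RuntimeError as exc:
        assert "b" in str(exc)
    else:
        raise AssertionError("expected RuntimeError")
```

The functions follow pytest naming, so a runner would collect them automatically. They can also be called directly.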

…_artifacts function.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
…ential per one artifact download.

Signed-off-by: Yogita Mehta <yogita.mehta@databricks.com>
@mehtayogita
Collaborator Author

Updated the PR to include the unit tests and confirmed performance improvements with manual testing.

Collaborator

@ankit-db ankit-db left a comment

Looks good! Thanks for the change - great first contribution to MLflow!

@ankit-db
Collaborator

Btw, could you add a bit more detail about the performance test you did and include a reference to the performance improvement in the release note?

Collaborator

@dbczumar dbczumar left a comment

LGTM! Thanks @mehtayogita !

@mehtayogita
Collaborator Author

Btw, could you add a bit more detail about the performance test you did and include a reference to the performance improvement in the release note?

Added details.

Labels
area/artifacts Artifact stores and artifact logging rn/bug-fix Mention under Bug Fixes in Changelogs.

3 participants