
[FR] Improve performance by lowering amount of calls to retrieve model #5507

Closed
4 of 23 tasks
Davidswinkels opened this issue Mar 17, 2022 · 7 comments
Labels
area/models MLmodel format, model serialization/deserialization, flavors area/server-infra MLflow Tracking server backend area/tracking Tracking service, tracking client APIs, autologging enhancement New feature or request

Comments

@Davidswinkels
Contributor

Davidswinkels commented Mar 17, 2022

Thank you for submitting a feature request. Before proceeding, please review MLflow's Issue Policy for feature requests and the MLflow Contributing Guide.

Please fill in this feature request template to ensure a timely and thorough response.

Willingness to contribute

The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (either as an MLflow Plugin or an enhancement to the MLflow code base)?

  • Yes. I can contribute this feature independently.
  • Yes. I would be willing to contribute this feature with guidance from the MLflow community.
  • No. I cannot contribute this feature at this time.

Proposal Summary

Retrieve models more efficiently by reducing the required number of requests.

Currently, retrieving a model requires 3 requests:

import os
import mlflow

experiment_name = "energy_forecast_10001_Amsterdam"
experiment = mlflow.get_experiment_by_name(experiment_name)
run = mlflow.search_runs(experiment.experiment_id, max_results=1)
model = mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))

It would be nice if this could be sped up by getting the model in only 1 request:
model = mlflow.sklearn.load_latest_model(experiment_name)

or 2 requests:
run = mlflow.search_runs(experiment_name, max_results=1)
model = mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))

Motivation

  • What is the use case for this feature?
    Performance
  • Why is this use case valuable to support for MLflow users in general?
    Faster model loading for all MLflow users.
  • Why is this use case valuable to support for your project(s) or organization?
    Performance.
  • Why is it currently difficult to achieve this use case? (please be as specific as possible about why related MLflow features and components are insufficient)
    It's difficult or impossible to improve performance at a higher level when the lower-level calls are not performant.

What component(s), interfaces, languages, and integrations does this feature affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interfaces

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Languages

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

Details

(Use this section to include any additional information about the feature. If you have a proposal for how to implement this feature, please include it here. For implementation guidelines, please refer to the Contributing Guide.)

@Davidswinkels Davidswinkels added the enhancement New feature or request label Mar 17, 2022
@github-actions github-actions bot added area/artifacts Artifact stores and artifact logging area/model-registry Model registry, model registry APIs, and the fluent client calls for model registry area/models MLmodel format, model serialization/deserialization, flavors area/server-infra MLflow Tracking server backend labels Mar 17, 2022
@BenWilson2
Member

Hi @Davidswinkels have you taken a look at the model registry functionality?
https://www.mlflow.org/docs/latest/model-registry.html#fetching-an-mlflow-model-from-the-model-registry

The ability to retrieve a particular model with a single API call is in there, allowing you to get an artifact by specifying a version or a stage directly. This might simplify your use case.
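For example (a minimal sketch, assuming a model has already been registered under the name "energy_forecast_10001_Amsterdam" and promoted to the Production stage; the name and stage are placeholders, not part of this issue):

import mlflow

# Load the latest Production version of a registered model in a single call.
model = mlflow.pyfunc.load_model("models:/energy_forecast_10001_Amsterdam/Production")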

As far as tightly coupling the tracking server and artifact retrieval into a single API call, I'm afraid that it wouldn't buy any performance improvement (they are separate services) and would only complicate the APIs.

Hopefully the model registry (and perhaps also Projects https://www.mlflow.org/docs/latest/projects.html ) might help to reduce the number of lines of code in your work if that is the concern.

Please let me know if there are any other points that you'd like to discuss.

@Davidswinkels
Contributor Author

Hi Ben. Thanks for the answer! We will look further into the model registry documentation to see if it improves performance and lets us fetch models with less code, and then get back here. Yes, I agree that from a loose-coupling perspective it's nice to keep the tracking server and artifact retrieval separated, so it's not wise to do:
model = mlflow.sklearn.load_latest_model(experiment_name).

Do you or others think it is interesting to be able to search_runs based on experiment_name? Or is there no need for that with model_registry?

run = mlflow.search_runs(experiment_name, max_results=1)
model = mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))

@BenWilson2
Member

There certainly won't be a need to search for the experiment name while using the model registry since there is a very small subset of "production-capable" models that would be registered.
That being said, that doesn't seem to be your use case if you're asking about searching for runs based on experiment names.
We'll have an internal discussion around whether we'd like to pursue something like this. It would add a further layer of complication performance-wise to the search_runs fluent API, since we'd be adding a call to get the experiment information for each of the provided names in the search query, e.g. from the SQLAlchemy store:

def get_experiment_by_name(self, experiment_name):
    """
    Specialized implementation for SQL backed store.
    """
    with self.ManagedSessionMaker() as session:
        stages = LifecycleStage.view_type_to_stages(ViewType.ALL)
        experiment = (
            session.query(SqlExperiment)
            .options(*self._get_eager_experiment_query_options())
            .filter(
                SqlExperiment.name == experiment_name, SqlExperiment.lifecycle_stage.in_(stages)
            )
            .one_or_none()
        )
        return experiment.to_mlflow_entity() if experiment is not None else None

Absolutely no promises here other than the fact that we'll discuss it.

@BenWilson2
Member

Hi @Davidswinkels if you're up for creating a search_runs_by_experiment_name() implementation that performs the client-side resolution of experiment names to experiment_id's and then submits those to the search_runs() API, please feel free to file a PR and we'll be more than happy to review and provide feedback.
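A minimal sketch of what such a helper could look like (the function name search_runs_by_experiment_name and its exact signature are assumptions for illustration, not a final API):

import mlflow

def search_runs_by_experiment_name(experiment_names, **kwargs):
    # Resolve each experiment name to its experiment_id on the client side,
    # skipping names that do not resolve to an existing experiment.
    experiments = [mlflow.get_experiment_by_name(name) for name in experiment_names]
    experiment_ids = [e.experiment_id for e in experiments if e is not None]
    # Delegate to the existing search_runs() API with the resolved ids.
    return mlflow.search_runs(experiment_ids, **kwargs)

runs = search_runs_by_experiment_name(["energy_forecast_10001_Amsterdam"], max_results=1)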

@Davidswinkels
Contributor Author

Davidswinkels commented Mar 25, 2022

Hey Ben. We were thinking of adding this since we were using MLflow with a file-based backend-store-uri. We are currently switching to a database backend to improve performance, and with a database backend we can also switch to the model registry. We still have to test how much performance would improve from using the model registry calls. From an initial performance check, "get_experiment_by_name" no longer seems to be the bottleneck:

experiment = mlflow.get_experiment_by_name(experiment_name)
run = mlflow.search_runs(experiment.experiment_id, max_results=1)
model = mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))

Performance comparison of MLflow model retrieval (file-based vs SQLite database) over 10 calls:

Call                                                                      File-based   Database (SQLite)
mlflow.get_experiment_by_name(experiment_name)                            5.3 s        0.4 s
mlflow.search_runs(experiment.experiment_id, max_results=1)               8.1 s        4.7 s
mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))    0.2 s        0.2 s

The requested feature to get a run by experiment_name would still improve performance quite a bit for people who use the file-based backend, but we won't develop it for now, since with the database backend getting the experiment by experiment_name is much less of a performance issue for us.

@r3stl355
Contributor

I'll give this a try

@github-actions github-actions bot added area/tracking Tracking service, tracking client APIs, autologging and removed area/model-registry Model registry, model registry APIs, and the fluent client calls for model registry area/artifacts Artifact stores and artifact logging labels Mar 31, 2022
@Davidswinkels
Contributor Author

Davidswinkels commented Apr 12, 2022

This issue was resolved by this PR (#5564) and the mlflow 1.25.0 release. I did a small test on mlflow==1.25.0 with a SQLite database. Performance did improve! It varied quite a bit compared to before, probably due to the environment (local vs Kubernetes cluster, and file-based vs SQLite) and also how many runs/models were stored.

Summary of the performance check for model retrieval per code chunk

Retrieval method                                        Average over 10 calls
Tracking server: model via name + experiment + run      1.48 s
Tracking server: model via name + run                   1.46 s
Model registry: model via version                       1.62 s
Model registry: multiple models via stage None          1.46 s
Model registry: single model via stage Production       1.49 s

Tracking server model retrieval

Retrieve model via name + experiment + run (1.48 s ± 44.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))

experiment_name="Blub5"
experiment = mlflow.get_experiment_by_name(experiment_name)
run = mlflow.search_runs(experiment.experiment_id, max_results=1)
model = mlflow.pyfunc.load_model(os.path.join(run.artifact_uri[0], "model/"))

Retrieve model via name + run (1.46 s ± 28.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))

experiment_name="Blub5"
run = mlflow.search_runs(experiment_names=[experiment_name], max_results=1)
model = mlflow.pyfunc.load_model(os.path.join(run.artifact_uri[0], "model/"))

Model registry model retrieval

Retrieve model via version + model registry (1.62 s ± 171 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))

model_name = "Blub5"
client = MlflowClient()
model_versions = client.get_latest_versions(model_name, stages=["None"])
model_version = model_versions[0].version
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version}")

Retrieve model via stage None + model registry (1.46 s ± 43.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))
Ten models are in stage None, but the most recently trained model will be retrieved.

model_name = "Blub5"
stage = 'None'
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{stage}")

Retrieve model via stage Production + model registry (1.49 s ± 66.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))
A single model is in the Production stage.

model_name = "Blub5"
stage = 'Production'
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{stage}")

Thanks @r3stl355 for implementing this. It's much neater to be able to get a run based on experiment_name directly from the tracking server :)
