Async calls to log_metric #1550

Closed
fariasfc opened this issue Jul 4, 2019 · 7 comments
Labels:
- Acknowledged: This issue has been read and acknowledged by the MLflow admins.
- area/tracking: Tracking service, tracking client APIs, autologging
- enhancement: New feature or request
- priority/backlog: We believe it is useful, but don’t see it being prioritized in the next few months.

Comments

fariasfc commented Jul 4, 2019

Is it possible to call log_metrics and execute it asynchronously?

In my situation, I am running experiments with simple models and simple datasets that are very fast to train and test, but when I call log_metrics every epoch, each run takes much longer because of the MLflow logging calls. I think we could build a queue that is processed in parallel with the original code, so that logging does not block the main training and testing processes.
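
The queue idea described above could be sketched roughly as follows. This is a minimal illustration, not an MLflow feature: `BackgroundMetricLogger` and its `sink` argument are hypothetical names, and `sink` stands in for whatever call actually writes the metric.

```python
import queue
import threading


class BackgroundMetricLogger:
    """Hypothetical helper: hands metrics to a worker thread so the
    training loop does not wait on logging I/O."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, sink):
        # `sink` is any callable (key, value, step) -> None. It is an
        # assumption of this sketch and only ever runs on the worker thread.
        self._sink = sink
        self._queue = queue.Queue()
        self.errors = []  # exceptions raised by the sink, kept for inspection
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log_metric(self, key, value, step=None):
        # Returns immediately; the write happens in the background.
        self._queue.put((key, value, step))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            try:
                self._sink(*item)
            except Exception as exc:
                # A failed write is recorded instead of killing the worker.
                self.errors.append(exc)

    def close(self):
        # Process everything queued so far, then stop the worker.
        self._queue.put(self._STOP)
        self._worker.join()
```

Because a single worker drains the queue, metrics reach the sink in the order they were logged. Calling `close()` at the end of the run matters: the worker is a daemon thread, so anything still queued when the interpreter exits would otherwise be lost.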

apurva-koti added the enhancement label Jul 16, 2019
apurva-koti self-assigned this Jul 16, 2019
fabboe (Contributor) commented Nov 11, 2019

This would also make logging more fault-tolerant. Currently, an experiment whose Tracking URI is temporarily unreachable will die.
Sacred handles this, for example.
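
One way to keep a run alive through a temporary outage is to retry the logging call and give up quietly if it keeps failing. The sketch below is only an illustration under that assumption: `log_with_retry` is a hypothetical helper, it is not how MLflow or Sacred implement this, and `log_fn` stands in for the real logging call.

```python
import time


def log_with_retry(log_fn, *args, retries=3, base_delay=0.5,
                   sleep=time.sleep, **kwargs):
    """Call log_fn(*args, **kwargs), retrying with exponential backoff.

    Returns True on success and False once all attempts have failed,
    so a failed write never raises into the training loop.
    """
    for attempt in range(retries + 1):
        try:
            log_fn(*args, **kwargs)
            return True
        except Exception:
            if attempt == retries:
                return False
            # Wait 0.5s, 1s, 2s, ... (with the defaults) before retrying.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the delay can be replaced in tests. Catching every `Exception` is deliberately broad here; a real wrapper would probably narrow it to connection errors.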

juntai-zheng added the Acknowledged label Mar 6, 2020
juntai-zheng (Collaborator) commented

A potential workaround for the lag caused by logging every epoch is to send metrics in batches with mlflow.log_metrics, or the lower-level MlflowClient.log_batch. Having an async process would definitely be useful as a built-in feature, though.
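
The batching workaround could look something like this. It is a sketch under stated assumptions: `MetricBuffer` and `flush_fn` are hypothetical names, and `flush_fn` is where a real setup would turn the buffered tuples into a single `MlflowClient.log_batch` call.

```python
class MetricBuffer:
    """Hypothetical helper: collects metrics and writes them in batches."""

    def __init__(self, flush_fn, max_size=100):
        # `flush_fn` receives a list of (key, value, step) tuples. It is an
        # assumption of this sketch, standing in for one batched write.
        self._flush_fn = flush_fn
        self._max_size = max_size
        self._pending = []

    def add(self, key, value, step):
        self._pending.append((key, value, step))
        # Flush automatically once the buffer is full.
        if len(self._pending) >= self._max_size:
            self.flush()

    def flush(self):
        # Nothing is sent when the buffer is empty.
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush_fn(batch)
```

With this shape, the training loop calls `add` every epoch and `flush` once at the end, so the number of round trips to the tracking server drops from one per metric to one per `max_size` metrics.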

juntai-zheng added the area/tracking label Jul 1, 2020
smurching added the priority/backlog label Jul 13, 2020
dannyfriar (Contributor) commented

Also interested in this. Would be very useful for both fault tolerance and for cases where the running time of an epoch is short.

ygean commented Dec 3, 2021

@apurva-koti Is there any progress on this?

dbczumar (Collaborator) commented

Hi folks, the MLflow fluent API, including mlflow.log_metrics(), is not designed for asynchronous execution (it is not thread-safe). We recommend using MlflowClient().log_batch() to record metrics asynchronously.
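
A rough sketch of that recommendation: route every `log_batch` call through a single-worker executor, with the run ID passed explicitly rather than taken from the fluent API's active run. `SerialBatchLogger` is a hypothetical name, and `client` is assumed to be any object exposing `log_batch(run_id, metrics=...)`, such as an `MlflowClient` instance.

```python
from concurrent.futures import ThreadPoolExecutor


class SerialBatchLogger:
    """Hypothetical wrapper: submits log_batch calls to one background thread."""

    def __init__(self, client, run_id):
        # `client` is assumed to expose log_batch(run_id, metrics=...).
        self._client = client
        self._run_id = run_id
        # One worker: calls are asynchronous for the caller but never run
        # concurrently with each other.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def log_batch(self, metrics):
        # Returns a Future; call .result() on it to surface any error.
        return self._pool.submit(
            self._client.log_batch, self._run_id, metrics=metrics
        )

    def close(self):
        # Block until every submitted batch has been processed.
        self._pool.shutdown(wait=True)
```

With real MLflow, `metrics` would be a list of `mlflow.entities.Metric` objects. Errors stay inside the returned futures, so a caller that never checks them will not notice a failed write.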

you-n-g (Contributor) commented Jan 29, 2022

We created a simple async wrapper around MLflow in our project:

https://github.com/microsoft/qlib/blob/d7d19feb4ebb0c4318ac3bfda32a34c56e28a6a0/qlib/workflow/recorder.py#L298
https://github.com/microsoft/qlib/blob/d7d19feb4ebb0c4318ac3bfda32a34c56e28a6a0/qlib/workflow/recorder.py#L368
https://github.com/microsoft/qlib/blob/d7d19feb4ebb0c4318ac3bfda32a34c56e28a6a0/qlib/utils/paral.py#L94

Hope it helps.

phelps-matthew commented

The quantities that most need async logging tend to be ones other than simple parameters and metrics. For my use case, MlflowClient().log_batch() is clearly insufficient: I needed a way to log images, plots, and analytics much faster.

@you-n-g's solution was of immense help. I added some multi-threading capability to their code and simplified a few downstream patterns, in case anyone finds that helpful.

https://github.com/phelps-matthew/dl-schema/blob/torch-advanced/dl_schema/recorder_base.py
