JAVA API Update Parameters, Metrics and Artifacts #3999
Comments
@pancodia There is a way to update a parameter for an existing run using the MLflowClient() CRUD API. In Java, you can use MLflowClient().logParam(). Does that answer your question? |
Thanks. This partially answers my question. If I need to update a parameter in one of the old runs, how can I get the run (for example by name or run_id) and make it active so that I can update a parameter using |
Yes, you can use |
Nice. @dmatrix Thank you for the clarification. My question is answered. |
Actually, one more question about updating an existing parameter. In my notebook using the Spark kernel, I also use the MLflow Python API to track experiment runs locally (e.g. parameters calculated locally on the SageMaker notebook instance, plots generated in Python locally). From this SO thread, I learned that using However, when I try to use it to update a parameter, I got an error:
mlflow.end_run()
run_id = 'fe03293adb7e4c79a716c11fc938c044'
with mlflow.start_run(run_id=run_id) as run:
    mlflow.log_param("test_rmse", 0)
Error message:
Is this because MLflow by design does not allow overwriting an existing parameter? I actually logged the parameter by mistake. Can I delete an existing parameter? "Changing param values is not allowed" is also raised when I use the Java API. Is there any way to force an overwrite? |
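A minimal sketch of a pre-check for the situation described above, not an official MLflow feature. It assumes `client` behaves like the Python `MlflowClient` (only `get_run` and `log_param` are used) and that logged params are stored as strings. The helper name `log_param_if_unset` is hypothetical. Instead of triggering the "Changing param values is not allowed" error, it reports a conflict when the key already holds a different value.

```python
def log_param_if_unset(client, run_id, key, value):
    """Log a param unless the run already holds a value for the key.

    Returns "logged", "unchanged", or "conflict".
    `client` is assumed to expose get_run(run_id) and
    log_param(run_id, key, value), as mlflow.tracking.MlflowClient does.
    """
    # Params of an existing run, assumed to be a dict of str -> str.
    existing = client.get_run(run_id).data.params
    if key not in existing:
        client.log_param(run_id, key, value)
        return "logged"
    if existing[key] == str(value):
        # Same value already recorded; nothing to do.
        return "unchanged"
    # A different value is already recorded; do not attempt to overwrite it.
    return "conflict"
```

A caller could treat "conflict" as a signal to log the corrected value in a new run rather than in the existing one.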
Any way to rewrite an existing parameter value? |
In addition, in my setup, I use a PostgreSQL database (installed on the same EMR master node) to store the results. |
Hi, did you find a solution? :) |
Not yet. I am still unable to update the value of an existing parameter. The same is true for both the Python and Java APIs. I am not sure whether this is supported by MLflow out of the box. Still waiting for an insider to reply. |
@pancodia @jhagege It seems that you cannot update or replace a previously logged parameter for a specific run id. I suspect this is by design, as you do not want to taint the state or value of parameters from a previous run; it's a snapshot of that run, with all its dependencies, MLflow entities, etc. What you could do is create a new run with the new parameter, which will be distinct from the previously logged run id. |
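The suggestion above (a new run carrying the corrected parameter) can be sketched as follows. This is an illustration under stated assumptions, not the maintainers' method: `client` is assumed to behave like the Python `MlflowClient`, the function name `create_corrected_run` is hypothetical, and the `corrects_run_id` tag is a made-up convention for linking the new run back to the old one. Metrics and artifacts are not copied.

```python
def create_corrected_run(client, old_run_id, corrections):
    """Create a new run that repeats the old run's params with corrections.

    `client` is assumed to expose get_run, create_run, log_param and
    set_terminated, as mlflow.tracking.MlflowClient does.
    `corrections` maps param keys to their corrected values.
    """
    old_run = client.get_run(old_run_id)

    # Start from the old params and apply the corrected values on top.
    params = dict(old_run.data.params)
    params.update({key: str(value) for key, value in corrections.items()})

    # "corrects_run_id" is a hypothetical tag name, not an MLflow built-in.
    new_run = client.create_run(
        old_run.info.experiment_id,
        tags={"corrects_run_id": old_run_id},
    )
    new_run_id = new_run.info.run_id

    for key, value in params.items():
        client.log_param(new_run_id, key, value)

    # Mark the new run as finished; the old run is left untouched.
    client.set_terminated(new_run_id)
    return new_run_id
```

The old run stays intact as a record of what was originally logged, which matches the provenance argument made in this thread.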
@jhagege Thanks for the suggestion and example. @dmatrix I agree that we should avoid using a backdoor as much as possible. However, sometimes I mistakenly log a wrong value for a parameter (it happens when I am switching between different notebooks). Also, I am new to MLflow, and sometimes I log an artifact in a way that I did not intend. For example, I tried to log the "plots/model_diagnostics" folder as an artifact. However, I didn't realize that Just a suggestion. Is it possible to have an MLflow admin account that has overwrite permission? |
The idea behind runs and experiments is trials. Hence, if you make a mistake in a run within an experiment, you can start another run with a different set of params, derived metrics, and artifacts to persist. Changing an experiment run's outcome after the fact goes against that design: it would violate the principle of experiment governance and the provenance of an experiment run.
I don't believe this use case (changing the metrics or parameters after the run is finished) is common enough to warrant administrative APIs, IMHO. A run is experimental in its own right, so experimental runs can always be discarded or dismissed as a wrong trial, or they can be re-run with an alternate or altered set of parameters to produce different or desired outcomes. |
A potential use case is fixing or adding something retroactively when a later discovery warrants a change. It could be that you miscalculated a metric, recorded a wrong parameter, or wanted to change the type of an artifact. Saying "just rerun the experiment" instead of fixing this kind of thing can come at a high cost that the fix itself doesn't justify, which means that the bottleneck is the lack of the feature in the tool. |
As it turns out |
I generally agree, but here's a concrete use-case that doesn't quite fit this mold: I'm using autolog, and it logs one of the parameters incorrectly. I'd still like to continue to leverage the convenience of autolog, but I would also (within the same run) like to correct the parameter as a workaround for the value being incorrectly captured by autolog. A keyword argument like |
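One possible workaround for the same-run case described above, sketched under assumptions: since tags (unlike params) can be set again on a run, a corrected value could be recorded as a tag while the originally logged param stays as it is. `client` is assumed to behave like the Python `MlflowClient`, both function names are hypothetical, and the `corrected.` tag prefix is an invented convention, not an MLflow feature.

```python
CORRECTION_PREFIX = "corrected."  # hypothetical naming convention


def record_param_correction(client, run_id, key, corrected_value):
    """Store a corrected value for `key` as a tag on the run.

    The logged param itself is not modified. `client` is assumed to expose
    set_tag(run_id, key, value), as mlflow.tracking.MlflowClient does.
    """
    tag_key = CORRECTION_PREFIX + key
    client.set_tag(run_id, tag_key, str(corrected_value))
    return tag_key


def effective_param(run, key):
    """Return the corrected value if one was recorded, else the logged param.

    `run` is assumed to look like the object returned by client.get_run(),
    with dict-like run.data.params and run.data.tags.
    """
    return run.data.tags.get(CORRECTION_PREFIX + key, run.data.params.get(key))
```

Downstream code would then need to read values through something like `effective_param` rather than directly from the run's params, which is the main cost of this approach.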
I am facing the same issue: I logged 100+ runs, and one of the parameters was logged incorrectly. This parameter is used to trigger a downstream task, so now I need to recreate these 100+ runs. I understand that by design this is not encouraged behavior, but MLflow should at least give the user the flexibility to change it. |
I think being able to rewrite some parameters is an important feature. It should probably be enabled only through some additional warnings or flags. |
I understand the motivation for not providing the ability to overwrite, but surely there should be a method to delete a parameter and re-enter it? |
Bringing this back again, are there any developments? I agree with the previous comments, it would be very useful to have a delete and a replace option in the API, for tags, metrics and parameters... Thanks :) |
Just adding another vote for being able to go back and modify a param that was erroneously logged |
I am using MLflow (v1.13.1) on AWS. I set up a tracking server on a Spark EMR master node and access EMR from a SageMaker notebook instance through Livy. Since I am developing models in Scala Spark, I am using the MLflow Java API to track my experiments.
One issue I have is that I need to update the location of a model after I found that I had saved and logged the wrong model. However, I could not find in the API docs how to get an existing run and update the value of an existing parameter. Is this supported by MLflow?
Currently I can only create a nested run and log the same parameter in the nested run with the updated value.