[BUG]: MLflow recipe still erroring out because of logging png file twice #11651
Labels
area/artifacts: Artifact stores and artifact logging
area/examples: Example code
area/recipes: MLflow Recipes, Recipes APIs, Recipes configs, Recipe Templates
bug: Something isn't working
integrations/databricks: Databricks integrations
Issues Policy acknowledgement
Where did you encounter this bug?
Local machine
Willingness to contribute
No. I cannot contribute a bug fix at this time.
MLflow version
System information
Describe the problem
This is the same issue as listed in this bug (which has been closed): https://github.com/mlflow/mlflow/issues/8047
So, I have cloned the classification example: https://github.com/mlflow/recipes-examples/tree/main/classification
I am using the VS Code extension for Databricks.
Whether I run the commands in the notebook
projectroot/classification/notebooks/databricks.py
or
mlflow recipes run --profile databricks
I run into the problem in the training step itself.
In the previously reported bug, the error was exactly the same. Although a fix was apparently made by introducing a prefix to differentiate between train and evaluate, the problem is that within the train step itself, confusion_matrix.png is logged twice (as was also reported in the earlier issue).
I tried changing the .venv/lib/python3.10/site-packages/mlflow/recipes/steps/train.py file (line 368) to call mlflow.autolog(disable=True), and that worked for the train step, but then the same problem started happening for the evaluate step.
Initial error
Error after changing the .venv/lib/python3.10/site-packages/mlflow/recipes/steps/train.py file (line 368) to call mlflow.autolog(disable=True)
Please note that I also added some extra logging statements for debugging below.
Strangely, the same fix of disabling autologging manually didn't work for evaluate.py (.venv/lib/python3.10/site-packages/mlflow/recipes/steps/evaluate.py).
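For readers trying the same workaround, it amounts to switching autologging off before the estimator is fitted. Below is a minimal sketch of that idea only. `fit_without_autolog` and its `mlflow_module` parameter are hypothetical names introduced for illustration and are not part of MLflow Recipes; `mlflow.autolog(disable=True)` is the call quoted in the report. The sketch does not reproduce what line 368 of train.py contains, and as noted above, the same approach did not help for the evaluate step.

```python
def fit_without_autolog(estimator, X, y, mlflow_module=None):
    """Fit `estimator` with MLflow autologging switched off.

    Hypothetical helper, not an MLflow API. `mlflow_module` exists only so
    the sketch can be exercised without MLflow installed; by default the
    real `mlflow` package is imported lazily.
    """
    if mlflow_module is None:
        import mlflow as mlflow_module  # real package, imported on demand

    # Same call as in the report: disables autologging for all integrations,
    # so fit() should not log its own artifacts (assumption: the duplicate
    # confusion_matrix.png comes from autologging during fit()).
    mlflow_module.autolog(disable=True)
    return estimator.fit(X, y)
```

With scikit-learn and MLflow installed, this would be called as `fit_without_autolog(model, X_train, y_train)`. Autologging stays disabled afterwards, so anything that relies on it later has to re-enable it explicitly.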
Tracking information
Code to reproduce issue
Stack trace
Other info / logs
Commenting out the upload step gets the Recipe to run, but it then runs into trouble registering the model in Unity Catalog on Databricks, because a tag with . in its name is not allowed:
mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: Tag name mlflow.source.type is not valid
I then had to go into .venv/lib/python3.10/site-packages/mlflow/utils/mlflow_tags.py and replace the . with _
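The renaming described above can also be illustrated without touching the installed package. The sketch below is plain Python and makes no MLflow calls; `sanitize_tag_keys` is a hypothetical helper, and whether Unity Catalog accepts the resulting names has not been verified here.

```python
def sanitize_tag_keys(tags):
    """Return a copy of `tags` with every '.' in a key replaced by '_'.

    Hypothetical helper for illustration only. Values are left untouched.
    Two keys that differ only by '.' versus '_' would collide, in which
    case the later one wins.
    """
    return {key.replace(".", "_"): value for key, value in tags.items()}


if __name__ == "__main__":
    # The tag name is taken from the error message; the value is made up.
    example = {"mlflow.source.type": "LOCAL", "owner": "me"}
    print(sanitize_tag_keys(example))
```

This only mirrors the manual edit to mlflow_tags.py; it is a workaround sketch, not a fix for the underlying validation.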
What component(s) does this bug affect?
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
What language(s) does this bug affect?
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
What integration(s) does this bug affect?
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations