[BUG] Unable to permanently delete an experiment - invalid URI scheme 'file' #11642

Open
3 of 23 tasks
ondratkadlec opened this issue Apr 8, 2024 · 4 comments · May be fixed by #11773
Labels
area/artifacts Artifact stores and artifact logging area/tracking Tracking service, tracking client APIs, autologging bug Something isn't working has-closing-pr This issue has a closing PR

Comments


ondratkadlec commented Apr 8, 2024

Issues Policy acknowledgement

  • I have read and agree to submit bug reports in accordance with the issues policy

Where did you encounter this bug?

Local machine

Willingness to contribute

No. I cannot contribute a bug fix at this time.

MLflow version

  • Client: 2.11.3
  • Tracking server: 2.11.3

System information

  • Ubuntu 22.04
  • Python 3.10

Describe the problem

I run an MLflow server on a remote Linux machine (e.g. 10.83.182.46, port 8001). I want to store the artifacts on this machine and the logs in an SQLite DB, so I run '/home/projects/.local/bin/mlflow server --backend-store-uri sqlite:////home/projects/mlflow/mlruns.db --artifacts-destination /home/projects/mlflow/artifacts -h 0.0.0.0 -p 8001'.
Everything runs smoothly: I can access http://10.83.182.46:8001, and if I create an experiment, I can see it together with its artifacts (which I am also able to download). On the Linux server, I can see the /home/projects/mlflow/artifacts folder.
The problem starts when I manually delete the experiment in the UI. The experiment no longer appears in the UI, but running '/home/projects/.local/bin/mlflow gc --backend-store-uri sqlite:////home/projects/mlflow/mlruns.db' results in the error 'mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}'. I tried changing the artifacts location and other settings, but this error appears every time.


Code to reproduce issue

import mlflow

MLFLOW_TRACKING_URI = "http://10.83.182.46:8001"
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
mlflow.set_experiment("my_experiment")


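# Create a run and log a parameter, a metric, and two artifacts
with mlflow.start_run(run_name="My Run") as run:
    mlflow.log_param("param1", 5)
    mlflow.log_metric("metric1", 1)
    # Both files must exist in the current working directory
    mlflow.log_artifact(local_path="FEATURES.parquet")
    mlflow.log_artifact(local_path="balance.png")
    # No explicit mlflow.end_run() is needed: the with block ends the run on exit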

Stack trace

/home/projects/.local/bin/mlflow gc --backend-store-uri sqlite:////home/projects/mlflow/mlruns.db
Traceback (most recent call last):
  File "/home/projects/.local/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/home/adlec001/.local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/adlec001/.local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/adlec001/.local/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/adlec001/.local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/adlec001/.local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/adlec001/.local/lib/python3.10/site-packages/mlflow/cli.py", line 605, in gc
    artifact_repo = get_artifact_repository(run.info.artifact_uri)
  File "/home/adlec001/.local/lib/python3.10/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 124, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/home/adlec001/.local/lib/python3.10/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 77, in get_artifact_repository
    return repository(artifact_uri)
  File "/home/adlec001/.local/lib/python3.10/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 45, in __init__
    super().__init__(self.resolve_uri(artifact_uri, get_tracking_uri()))
  File "/home/otkadlec001/.local/lib/python3.10/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 59, in resolve_uri
    _validate_uri_scheme(track_parse.scheme)
  File "/home/adlec001/.local/lib/python3.10/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 35, in _validate_uri_scheme
    raise MlflowException(
mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}


What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations
@ondratkadlec ondratkadlec added the bug Something isn't working label Apr 8, 2024
@github-actions github-actions bot added area/artifacts Artifact stores and artifact logging area/tracking Tracking service, tracking client APIs, autologging area/uiux Front-end, user experience, plotting, JavaScript, JavaScript dev server and removed area/uiux Front-end, user experience, plotting, JavaScript, JavaScript dev server labels Apr 8, 2024
Member

harupy commented Apr 10, 2024

@ondratkadlec Thanks for reporting this! I was able to reproduce the error. I'm investigating how it happens.

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.


oleg-z commented Apr 20, 2024

I'm not a contributor. Please let me know if it's OK to comment and provide a workaround; I'm happy to contribute once I figure out how I can help.

It seems the issue is that when running mlflow gc, the tracking URI returned by utils.get_tracking_uri() is not set and defaults to the file scheme. Unfortunately, GC needs an HTTP API endpoint in order to issue delete API calls.
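
The scheme check that fails here can be illustrated with a minimal sketch. This is not MLflow's actual implementation: check_tracking_scheme is a hypothetical helper, and the set of allowed schemes is copied from the error message above. The file:// URI in the usage note is only a placeholder for whatever local default the tracking URI falls back to when nothing is configured.

```python
from urllib.parse import urlparse

# Schemes the error message lists as allowed for proxied artifact access.
ALLOWED_TRACKING_SCHEMES = {"http", "https"}


def check_tracking_scheme(tracking_uri: str) -> str:
    """Return the URI scheme, or raise ValueError if it is not http/https.

    Hypothetical helper for illustration only; not part of MLflow.
    """
    scheme = urlparse(tracking_uri).scheme
    if scheme not in ALLOWED_TRACKING_SCHEMES:
        raise ValueError(
            f"tracking URI scheme {scheme!r} is not one of "
            f"{sorted(ALLOWED_TRACKING_SCHEMES)}"
        )
    return scheme
```

With this sketch, check_tracking_scheme("http://10.83.182.46:8001") passes, while a local URI such as "file:///tmp/mlruns" raises, which mirrors the behaviour seen in the stack trace.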

The workaround is to set the MLFLOW_TRACKING_URI environment variable to point to your MLflow server. In your case, the following commands should run correctly:

export MLFLOW_TRACKING_URI=http://10.83.182.46:8001
/home/projects/.local/bin/mlflow gc --backend-store-uri sqlite:////home/projects/mlflow/mlruns.db

Example of before and after setting the environment variable:
Before:

olegz@dev mlflow % mlflow gc --backend-store-uri sqlite:///$(pwd)/mlruns.db
...
File "/Users/olegz/Library/Python/3.9/lib/python/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 35, in _validate_uri_scheme
    raise MlflowException(
mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}

After:

olegz@dev mlflow % export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
olegz@dev mlflow % mlflow gc --backend-store-uri sqlite:///$(pwd)/mlruns.db
Run with ID 7df4f883bdcc466c90405637cabe62c6 has been permanently deleted.
Run with ID 5d5445d3b97845c4a6d6afe798625d3a has been permanently deleted.
Run with ID c2133b7cf23a4a8586b9ce2fd7dd9b6f has been permanently deleted.
Run with ID d967acc86d1642afac33d89e2fbb68aa has been permanently deleted.
Run with ID b706a93dc78f46dfb6c3a2e27641bd6f has been permanently deleted.
Experiment with ID 1 has been permanently deleted.
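
If the cleanup is driven from a Python script rather than a shell, the same workaround might look like the sketch below. The helper names build_gc_invocation and run_gc are hypothetical; the only MLflow-specific pieces are the MLFLOW_TRACKING_URI variable and the mlflow gc --backend-store-uri command already shown in this thread. The variable is set only for the child process, so the caller's environment is left unchanged.

```python
import os
import subprocess


def build_gc_invocation(tracking_uri, backend_store_uri, base_env=None):
    """Build the command and environment for running `mlflow gc`.

    Hypothetical helper: copies the environment and adds MLFLOW_TRACKING_URI
    so that artifact deletion goes through the tracking server's HTTP API.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MLFLOW_TRACKING_URI"] = tracking_uri
    cmd = ["mlflow", "gc", "--backend-store-uri", backend_store_uri]
    return cmd, env


def run_gc(tracking_uri, backend_store_uri):
    """Run `mlflow gc` as a child process; raises CalledProcessError on failure."""
    cmd, env = build_gc_invocation(tracking_uri, backend_store_uri)
    return subprocess.run(cmd, env=env, check=True)
```

For the setup in this issue, the call would be run_gc("http://10.83.182.46:8001", "sqlite:////home/projects/mlflow/mlruns.db"), assuming the mlflow executable is on the PATH of the user running the script.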

@ondratkadlec
Author

Hello Oleg,
Setting the environment variable indeed solved the issue. Thank you very much.

@github-actions github-actions bot added the has-closing-pr This issue has a closing PR label Apr 22, 2024