Path resolution fixes for DatabricksArtifactRepository #4
What changes are proposed in this pull request?
This PR modifies DatabricksArtifactRepository to compute the path of the repository's artifact_uri relative to the associated MLflow Run's artifact root. All operations are then resolved against the run's artifact root using this relative path.
For example, if the repository is instantiated with the URI dbfs:/databricks/mlflow-tracking/<EXP_ID>/<RUN_ID>/artifacts/my/subpath, all LIST/UPLOAD/DOWNLOAD operations will be performed relative to this location. Calling list_artifacts("foo") will list the artifacts under dbfs:/databricks/mlflow-tracking/<EXP_ID>/<RUN_ID>/artifacts/my/subpath/foo. Previously, artifacts were listed under dbfs:/databricks/mlflow-tracking/<EXP_ID>/<RUN_ID>/artifacts (the run root); this worked because list_artifacts() returned artifact paths relative to the run root, rather than relative to the artifact repo root.
@arjundc-db Let me know if this makes sense and please leave comments / questions! If you can add relevant tests for listing behavior (e.g., tests ensuring that the repo returns paths relative to the artifact repo root rather than the run root), that would be awesome!
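The path resolution described above can be illustrated with a minimal, standalone sketch. It is not the code in this PR: the helper names (relative_path_to_run_root, to_run_relative, to_repo_relative) are hypothetical, and it assumes the run's artifact root URI is already known rather than fetched from the tracking service.

```python
import posixpath
from urllib.parse import urlparse


def relative_path_to_run_root(artifact_uri, run_artifact_root_uri):
    """Path of the repo's artifact_uri relative to the run's artifact root.

    Hypothetical helper: returns "" when the repo root is the run root.
    """
    repo_path = posixpath.normpath(urlparse(artifact_uri).path)
    root_path = posixpath.normpath(urlparse(run_artifact_root_uri).path)
    if repo_path == root_path:
        return ""
    if not repo_path.startswith(root_path + "/"):
        raise ValueError("artifact_uri is not located under the run artifact root")
    return posixpath.relpath(repo_path, root_path)


def to_run_relative(repo_subpath, path=""):
    """Translate a repo-relative path into a run-root-relative path."""
    if not path:
        return repo_subpath
    return posixpath.join(repo_subpath, path)


def to_repo_relative(repo_subpath, run_relative_path):
    """Translate a run-root-relative path (e.g. a listing result) back
    into a path relative to the artifact repo root."""
    if not repo_subpath:
        return run_relative_path
    return posixpath.relpath(run_relative_path, repo_subpath)


# Illustrative values only; "123" and "abc" stand in for <EXP_ID> and <RUN_ID>.
run_root = "dbfs:/databricks/mlflow-tracking/123/abc/artifacts"
subpath = relative_path_to_run_root(run_root + "/my/subpath", run_root)
listing_target = to_run_relative(subpath, "foo")
```

Under these assumptions, a list_artifacts("foo") call would query the run-relative path held in listing_target, and each returned entry would be passed through to_repo_relative before being handed back to the caller, so that results are relative to the artifact repo root rather than the run root.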
Currently missing:
How is this patch tested?
(Details)
Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/projects: MLproject format, project running backends
area/scoring: Local serving, model deployment tools, spark UDFs
area/tracking: Tracking Service, tracking client APIs, autologging
Interface
area/uiux: Front-end, user experience, JavaScript, plotting
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
Language
language/r: R APIs and clients
language/java: Java APIs and clients
Integrations
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
How should the PR be classified in the release notes? Choose one:
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes