Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add databricks_mlflow_experiment data source #2389

Merged
merged 26 commits into from
May 14, 2024
Merged

Conversation

840
Copy link
Contributor

@840 840 commented Jun 8, 2023

Changes

Add databricks_mlflow_experiment data source

Tests

Created 3 tests for this data source

  • make test run locally
  • relevant change in docs/ folder
  • covered with integration tests in internal/acceptance
  • relevant acceptance tests are passing
  • using Go SDK

Partially resolves #2301

@840
Copy link
Contributor Author

840 commented Jun 8, 2023

Currently you can only fetch the mlflow_experiment data source using the name in this PR. I believe it would be nice to provide the option to fetch using the ID, as well. I have to read a bit more into this.

@codecov-commenter
Copy link

codecov-commenter commented Jun 8, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (4213468) 83.57% compared to head (9b8c8be) 83.57%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2389      +/-   ##
==========================================
- Coverage   83.57%   83.57%   -0.01%     
==========================================
  Files         168      169       +1     
  Lines       15083    15107      +24     
==========================================
+ Hits        12606    12625      +19     
- Misses       1736     1739       +3     
- Partials      741      743       +2     
Files Coverage Δ
provider/provider.go 94.47% <100.00%> (+0.03%) ⬆️
mlflow/data_mlflow_experiment.go 91.30% <91.30%> (ø)

... and 1 file with indirect coverage changes

@nkvuong
Copy link
Contributor

nkvuong commented Jun 8, 2023

@840 thank you for raising the PR. a couple of thoughts for the pending tasks

you could look at https://github.com/databricks/terraform-provider-databricks/blob/master/clusters/data_cluster.go for example of searching both by name & id

the Go SDK has GetExperiment method which takes an ExperimentId as well.

For acceptance tests, you could create an mlflow experiment and try to read it in the same block, similar to https://github.com/databricks/terraform-provider-databricks/blob/master/internal/acceptance/data_job_test.go

return err
}
experiment := experimentResponse.Experiment
data.Name = experiment.Name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if name is provided, then you don't need to overwrite it

This data source exports the following attributes:

* `artifact_location` - Location where artifacts for the experiment are stored.
* `name` - Human readable name that identifies the experiment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to be consistent with the name argument reference


* `artifact_location` - Location where artifacts for the experiment are stored.
* `name` - Human readable name that identifies the experiment.
* `lifecycle_stage` - Current life cycle stage of the experiment: "active" or "deleted". Deleted experiments are not returned by APIs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get by Id will work on deleted experiments

Suggested change
* `lifecycle_stage` - Current life cycle stage of the experiment: "active" or "deleted". Deleted experiments are not returned by APIs.
* `lifecycle_stage` - Current life cycle stage of the experiment: `active` or `deleted`. Deleted experiments are not returned by APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it has to be changed in the Databricks documentation, as well.

@840
Copy link
Contributor Author

840 commented Jun 12, 2023

Blocked by databricks/databricks-sdk-go#428

@nkvuong nkvuong requested a review from a team July 4, 2023 15:01
Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick first pass. LMK when no longer WIP.

mlflow/data_mlflow_experiment.go Outdated Show resolved Hide resolved
@840 840 requested review from a team as code owners January 4, 2024 12:14
@840 840 requested review from mgyucht and removed request for a team January 4, 2024 12:14
@840
Copy link
Contributor Author

840 commented Jan 4, 2024

Will continue this week or early next week.

@840 840 changed the title [WIP] Add databricks_mlflow_experiment data source Add databricks_mlflow_experiment data source Jan 12, 2024
@840
Copy link
Contributor Author

840 commented Jan 12, 2024

@mgyucht WIP tag removed, should be ready!

@840 840 marked this pull request as draft February 7, 2024 10:46
@840 840 marked this pull request as ready for review April 8, 2024 15:51
@840 840 requested a review from mgyucht April 8, 2024 15:54
Copy link
Contributor

@alexott alexott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@alexott alexott added this pull request to the merge queue May 14, 2024
Merged via the queue into databricks:main with commit c1b792c May 14, 2024
5 checks passed
mgyucht added a commit that referenced this pull request May 14, 2024
### New Features and Improvements
* Add owner support to `databricks_sql_table` ([#3570](#3570)).
* Add `databricks_catalog` data source ([#3573](#3573)).
* Add `databricks_table` data source ([#3571](#3571)).
* Add `databricks_mlflow_experiment` data source ([#2389](#2389)).

### Documentation Changes

### Exporter
* Fix rare race condition with variables map ([#3568](#3568)).

### Internal Changes

Dependency updates:

 * Bump go SDK to 0.40.1 ([#3574](#3574)), featuring improvements in tracing and debuggability of failed requests.
@mgyucht mgyucht mentioned this pull request May 14, 2024
github-merge-queue bot pushed a commit that referenced this pull request May 15, 2024
### New Features and Improvements
* Add owner support to `databricks_sql_table` ([#3570](#3570)).
* Add `databricks_catalog` data source ([#3573](#3573)).
* Add `databricks_table` data source ([#3571](#3571)).
* Add `databricks_mlflow_experiment` data source ([#2389](#2389)).

### Documentation Changes

### Exporter
* Fix rare race condition with variables map ([#3568](#3568)).

### Internal Changes

Dependency updates:

 * Bump go SDK to 0.40.1 ([#3574](#3574)), featuring improvements in tracing and debuggability of failed requests.
@840 840 deleted the issue-2301-1 branch May 20, 2024 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Data Sources for MLflow experiment and model
5 participants