Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for UC Volumes to the databricks fs commands #1209

Merged
merged 18 commits into from Feb 20, 2024
Merged

Conversation

shreyas-goenka
Copy link
Contributor

@shreyas-goenka shreyas-goenka commented Feb 15, 2024

Changes

shreyas.goenka@THW32HFW6T cli % databricks fs -h
Commands to do file system operations on DBFS and UC Volumes.

Usage:
  databricks fs [command]

Available Commands:
  cat         Show file content.
  cp          Copy files and directories.
  ls          Lists files.
  mkdir       Make directories.
  rm          Remove files and directories.

This PR adds support for UC Volumes to the fs commands. The fs commands for UC volumes work the same as they currently do for DBFS. This is ensured by running the same test matrix we across both DBFS and UC Volumes versions of the fs commands.

Tests

Support for UC volumes is tested by running the same tests as we did originally for DBFS commands. The tests require a main catalog to exist in the workspace, which does in our test workspaces environments which have the TEST_METASTORE_ID environment variable set.

For the Files API filer, we do the same by running mostly common tests to ensure the filers for "local", "wsfs", "dbfs" and "files API" are consistent.

The tests are also made to all run in parallel to reduce the time taken. To ensure the separation of the tests, each test creates its own UC schema (for UC volumes tests) or DBFS directories (for DBFS tests).

Usage:
  databricks fs [command]

Available Commands:
  cat         Show file content
  cp          Copy files and directories to and from DBFS.
  ls          Lists files
  mkdir       Make directories
  rm          Remove files and directories from dbfs.

Flags:
  -h, --help   help for fs

Global Flags:
      --debug            enable debug logging
  -o, --output type      output type: text or json (default text)
  -p, --profile string   ~/.databrickscfg profile
  -t, --target string    bundle target to use (if applicable)

Use "databricks fs [command] --help" for more information about a command. commands
@shreyas-goenka shreyas-goenka changed the title Add support for UC Volumes to the Commands to do DBFS operations. [WIP] Add support for UC Volumes to the Commands to do DBFS operations. Feb 15, 2024
@shreyas-goenka shreyas-goenka changed the title [WIP] Add support for UC Volumes to the Commands to do DBFS operations. [WIP] Add support for UC Volumes to the databricks fs commands Feb 15, 2024
return FileDoesNotExistError{absPath}
}

// This API returns 409 if the underlying path is a directory.
if aerr.StatusCode == http.StatusConflict {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API now allows deleting empty directories

@codecov-commenter
Copy link

codecov-commenter commented Feb 15, 2024

Codecov Report

Attention: 249 lines in your changes are missing coverage. Please review.

Comparison is base (d9f34e6) 52.60% compared to head (597ea7a) 52.05%.

Files Patch % Lines
libs/filer/files_client.go 0.00% 195 Missing ⚠️
internal/helpers.go 44.73% 40 Missing and 2 partials ⚠️
cmd/fs/cp.go 0.00% 3 Missing ⚠️
cmd/fs/cat.go 0.00% 2 Missing ⚠️
cmd/fs/ls.go 0.00% 2 Missing ⚠️
cmd/fs/mkdir.go 0.00% 2 Missing ⚠️
cmd/fs/rm.go 0.00% 2 Missing ⚠️
cmd/fs/fs.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1209      +/-   ##
==========================================
- Coverage   52.60%   52.05%   -0.55%     
==========================================
  Files         308      308              
  Lines       17223    17470     +247     
==========================================
+ Hits         9060     9094      +34     
- Misses       7495     7706     +211     
- Partials      668      670       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@shreyas-goenka shreyas-goenka changed the title [WIP] Add support for UC Volumes to the databricks fs commands Add support for UC Volumes to the databricks fs commands Feb 15, 2024
libs/filer/files_client.go Outdated Show resolved Hide resolved
libs/filer/files_client.go Outdated Show resolved Hide resolved
@shreyas-goenka shreyas-goenka changed the title Add support for UC Volumes to the databricks fs commands [WIP] Add support for UC Volumes to the databricks fs commands Feb 16, 2024
@@ -152,9 +152,6 @@ func newCpCommand() *cobra.Command {
cmd.RunE = func(cmd *cobra.Command, args []string) error {
ctx := cmd.Context()

// TODO: Error if a user uses '\' as path separator on windows when "file"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

underlying issue is closed now

@@ -65,6 +94,3 @@ func TestAccFsCatDoesNotSupportOutputModeJson(t *testing.T) {
_, _, err = RequireErrorRun(t, "fs", "cat", "dbfs:"+path.Join(tmpDir, "hello.txt"), "--output=json")
assert.ErrorContains(t, err, "json output not supported")
}

// TODO: Add test asserting an error when cat is called on an directory. Need this to be
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added TestAccFsCatOnADir for this TODO

}

func (w *FilesClient) Mkdir(ctx context.Context, name string) error {
// Directories are created implicitly.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change allows empty directories to be created, something that was not possible before.

@shreyas-goenka shreyas-goenka changed the title [WIP] Add support for UC Volumes to the databricks fs commands Add support for UC Volumes to the databricks fs commands Feb 19, 2024
Copy link
Contributor

@andrewnester andrewnester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall look good, see the comments on concurrent execution. Also, could you trigger an integration test run for this PR to make sure all tests passed correctly?

libs/filer/files_client.go Outdated Show resolved Hide resolved
libs/filer/files_client.go Outdated Show resolved Hide resolved
libs/filer/files_client.go Outdated Show resolved Hide resolved
internal/helpers.go Show resolved Hide resolved
return err
}

// This API returns a 404 if the file doesn't exist.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This API returns a 404 if the file doesn't exist.
// This API returns a 404 if the directory doesn't exist.

Copy link

@bonnetn bonnetn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the errors returned by the Files API.
Leaving the review of the actual code to the team.

@andrewnester andrewnester added this pull request to the merge queue Feb 20, 2024
Merged via the queue into main with commit 5ba0aaa Feb 20, 2024
4 checks passed
@andrewnester andrewnester deleted the uc-volumes branch February 20, 2024 16:22
andrewnester added a commit that referenced this pull request Feb 20, 2024
CLI:
 * Add support for UC Volumes to the `databricks fs` commands ([#1209](#1209)).

Bundles:
 * Use dynamic configuration model in bundles ([#1098](#1098)).
 * Allow use of variables references in primitive non-string fields ([#1219](#1219)).
 * Add an experimental default-sql template ([#1051](#1051)).
 * Add an experimental dbt-sql template ([#1059](#1059)).

Internal:
 * Add fork-user to winget release workflow ([#1214](#1214)).
 * Use `any` as type for data sources and resources in `tf/schema` ([#1216](#1216)).
 * Avoid infinite recursion when normalizing a recursive type ([#1213](#1213)).
 * Fix issue where interpolating a new ref would rewrite unrelated fields ([#1217](#1217)).
 * Use `dyn.Value` as input to generating Terraform JSON ([#1218](#1218)).

API Changes:
 * Changed `databricks lakehouse-monitors update` command with new required argument order.
 * Added `databricks online-tables` command group.

OpenAPI commit cdd76a98a4fca7008572b3a94427566dd286c63b (2024-02-19)
Dependency updates:
 * Bump Terraform provider to v1.36.2 ([#1215](#1215)).
 * Bump github.com/databricks/databricks-sdk-go from 0.32.0 to 0.33.0 ([#1222](#1222)).
@andrewnester andrewnester mentioned this pull request Feb 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Feb 20, 2024
CLI:
* Add support for UC Volumes to the `databricks fs` commands
([#1209](#1209)).

Bundles:
* Use dynamic configuration model in bundles
([#1098](#1098)).
* Allow use of variables references in primitive non-string fields
([#1219](#1219)).
* Add an experimental default-sql template
([#1051](#1051)).
* Add an experimental dbt-sql template
([#1059](#1059)).

Internal:
* Add fork-user to winget release workflow
([#1214](#1214)).
* Use `any` as type for data sources and resources in `tf/schema`
([#1216](#1216)).
* Avoid infinite recursion when normalizing a recursive type
([#1213](#1213)).
* Fix issue where interpolating a new ref would rewrite unrelated fields
([#1217](#1217)).
* Use `dyn.Value` as input to generating Terraform JSON
([#1218](#1218)).

API Changes:
* Changed `databricks lakehouse-monitors update` command with new
required argument order.
 * Added `databricks online-tables` command group.

OpenAPI commit cdd76a98a4fca7008572b3a94427566dd286c63b (2024-02-19)
Dependency updates:
* Bump Terraform provider to v1.36.2
([#1215](#1215)).
* Bump github.com/databricks/databricks-sdk-go from 0.32.0 to 0.33.0
([#1222](#1222)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants