Add workspace export-dir command #449

Merged
merged 14 commits into from
Jun 8, 2023

shreyas-goenka (Contributor) commented Jun 7, 2023

Changes

This PR:

  1. Adds the export-dir command
  2. Changes filer.Read to return an error if a user tries to read a directory
  3. Returns the internal file structures from filer.Stat().Sys()

Tests

Integration tests and manual testing.

shreyas-goenka (Contributor, Author) commented Jun 7, 2023

Output:

case 1: command is run, directory foo does not exist

shreyas.goenka@THW32HFW6T vandaley-industries % cli workspace export-dir /Users/me@example.com/december foo
Export started. Download files from  /Users/me@example.com/december
Downloaded /Users/me@example.com/december/.gitignore -> foo/.gitignore
Downloaded /Users/me@example.com/december/apple -> foo/apple
Downloaded /Users/me@example.com/december/hello_world -> foo/hello_world.py
Downloaded /Users/me@example.com/december/mango -> foo/mango
Downloaded /Users/me@example.com/december/veggies/tomatoes -> foo/veggies/tomatoes
Export complete. Files can be found at foo

case 2: foo exists, with a couple of files in it.

shreyas.goenka@THW32HFW6T vandaley-industries % rm foo/apple foo/mango
shreyas.goenka@THW32HFW6T vandaley-industries % cli workspace export-dir /Users/me@example.com/december foo
Export started. Download files from  /Users/me@example.com/december
Skipping /Users/me@example.com/december/.gitignore because foo/.gitignore already exists
Downloaded /Users/me@example.com/december/apple -> foo/apple
Skipping /Users/me@example.com/december/hello_world because foo/hello_world.py already exists
Downloaded /Users/me@example.com/december/mango -> foo/mango
Skipping /Users/me@example.com/december/veggies/tomatoes because foo/veggies/tomatoes already exists
Export complete. Files can be found at  foo

case 3: foo exists, with a couple of files in it. We use the overwrite flag

shreyas.goenka@THW32HFW6T vandaley-industries % cli workspace export-dir /Users/me@example.com/december foo --overwrite
Export started. Download files from  /Users/me@example.com/december
Downloaded /Users/me@example.com/december/.gitignore -> foo/.gitignore
Downloaded /Users/me@example.com/december/apple -> foo/apple
Downloaded /Users/me@example.com/december/hello_world -> foo/hello_world.py
Downloaded /Users/me@example.com/december/mango -> foo/mango
Downloaded /Users/me@example.com/december/veggies/tomatoes -> foo/veggies/tomatoes
Export complete. Files can be found at foo

case 4: output with JSON mode enabled

shreyas.goenka@THW32HFW6T vandaley-industries % cli workspace export-dir /Users/me@example.com/december foo --overwrite --output=json
{
  "source_path":"/Users/me@example.com/december",
  "type":"EXPORT_STARTED"
}{
  "source_path":"/Users/me@example.com/december/.gitignore",
  "target_path":"foo/.gitignore",
  "type":"DOWNLOAD_COMPLETE"
}{
  "source_path":"/Users/me@example.com/december/apple",
  "target_path":"foo/apple",
  "type":"DOWNLOAD_COMPLETE"
}{
  "source_path":"/Users/me@example.com/december/hello_world",
  "target_path":"foo/hello_world.py",
  "type":"DOWNLOAD_COMPLETE"
}{
  "source_path":"/Users/me@example.com/december/mango",
  "target_path":"foo/mango",
  "type":"DOWNLOAD_COMPLETE"
}{
  "source_path":"/Users/me@example.com/december/veggies/tomatoes",
  "target_path":"foo/veggies/tomatoes",
  "type":"DOWNLOAD_COMPLETE"
}{
  "target_path":"foo",
  "type":"EXPORT_COMPLETED"
}%

shreyas-goenka (Contributor, Author) commented Jun 7, 2023

GitHub is experiencing an outage, so my latest commits are not in yet. They are lint fixes, TODO cleanups, and a rename of the test prefixes to TestAcc.

// 2. Allows us to error out if the path is a directory. This is needed
// because the Dbfs.Open method on the SDK does not error when the path is
// a directory.
stat, err := w.Stat(ctx, name)
Contributor

Fixed in databricks/databricks-sdk-go#415

I would prefer not to add a stat call for every read if we can avoid it.

Contributor Author

I have added a TODO and filed a follow-up issue. I would rather not block on a Go SDK release or remove test coverage.

defer f.Close()

// Write content to the local file
r, err := workspaceFiler.Read(ctx, relPath)
Contributor

As written, this means two stat calls for every object.

shreyas-goenka (Contributor, Author) Jun 8, 2023

Well, I think this is OK if a few assumptions hold true:

  1. Stat API calls do not overload the webapp
  2. The added latency is reasonable

I am fine with it because this should be solved once we have the Files API, which is supposed to be a "dumb" filesystem (no fancy extension stripping, and the service returns an error when we try to read a directory).

Right now we are handling backend problems on the client side, which is what causes the excessive API calls.

pietern (Contributor) commented Jun 8, 2023

This looks great -- comments are minor.

shreyas-goenka merged commit 4818541 into main Jun 8, 2023
4 checks passed
shreyas-goenka deleted the export-dir branch June 8, 2023 16:15
pietern mentioned this pull request Jun 12, 2023
pietern added a commit that referenced this pull request Jun 12, 2023
## Changes

CLI:
* Add directory tracking to sync
([#425](#425)).
* Add fs cat command for dbfs files
([#430](#430)).
* Add fs ls command for dbfs
([#429](#429)).
* Add fs mkdirs command for dbfs
([#432](#432)).
* Add fs rm command for dbfs
([#433](#433)).
* Add installation instructions
([#458](#458)).
* Add new line to cmdio JSON rendering
([#443](#443)).
* Add profile on `databricks auth login`
([#423](#423)).
* Add readable console logger
([#370](#370)).
* Add workspace export-dir command
([#449](#449)).
* Added secrets input prompt for secrets put-secret command
([#413](#413)).
* Added spinner when loading command prompts
([#420](#420)).
* Better error message if can not load prompts
([#437](#437)).
* Changed service template to correctly handle required positional
arguments ([#405](#405)).
* Do not generate prompts for certain commands
([#438](#438)).
* Do not prompt for List methods
([#411](#411)).
* Do not use FgWhite and FgBlack for terminal output
([#435](#435)).
* Skip path translation of job task for jobs with a Git source
([#404](#404)).
* Tweak profile prompt
([#454](#454)).
* Update with the latest Go SDK
([#457](#457)).
* Use cmdio in version command for `--output` flag
([#419](#419)).

Bundles:
* Check for nil environment before accessing it
([#453](#453)).

Dependencies:
* Bump github.com/hashicorp/terraform-json from 0.16.0 to 0.17.0
([#459](#459)).
* Bump github.com/mattn/go-isatty from 0.0.18 to 0.0.19
([#412](#412)).

Internal:
* Add Mkdir and ReadDir functions to filer.Filer interface
([#414](#414)).
* Add Stat function to filer.Filer interface
([#421](#421)).
* Add check for path is a directory in filer.ReadDir
([#426](#426)).
* Add fs.FS adapter for the filer interface
([#422](#422)).
* Add implementation of filer.Filer for local filesystem
([#460](#460)).
* Allow equivalence checking of filer errors to fs errors
([#416](#416)).
* Fix locker integration test
([#417](#417)).
* Implement DBFS filer
([#139](#139)).
* Include recursive deletion in filer interface
([#442](#442)).
* Make filer.Filer return fs.DirEntry from ReadDir
([#415](#415)).
* Speed up sync integration tests
([#428](#428)).