feat(core): service add DBFS API 2.0 support #3334

Merged 9 commits on Oct 26, 2023

@morristai morristai commented Oct 18, 2023

Description

Additional Notes

  • The read operation is a bit tricky, as discussed here. For now, it can read a whole object or a specific range, but reqwest::Body::bytes_stream() in poll_next() does not return the whole response at once if the file is too big, which causes serde_json parsing to fail. We will need to discuss possible implementations.
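
To illustrate the issue above, here is a minimal sketch of one possible approach: buffer the streamed chunks and only parse once the body is complete. This is not the code in this PR. The helper names (`is_complete_json_object`, `collect_body`) are hypothetical, the chunk boundaries are made up, and the brace-tracking check is only a stand-in for a real parser such as serde_json. The sample body assumes the DBFS read response shape with `bytes_read` and base64-encoded `data` fields.

```rust
/// Hypothetical stand-in for a real JSON parser: returns true once `body`
/// holds one complete top-level object. It only tracks braces and string
/// state, which is enough to show why a partial chunk cannot be parsed.
fn is_complete_json_object(body: &[u8]) -> bool {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    let mut seen_open = false;
    for &b in body {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => {
                depth += 1;
                seen_open = true;
            }
            b'}' => {
                if depth == 0 {
                    return false; // closing brace without an opening one
                }
                depth -= 1;
            }
            _ => {}
        }
    }
    seen_open && depth == 0 && !in_string
}

/// Hypothetical helper: appends every chunk to one buffer and returns the
/// body only if the accumulated bytes form a complete object.
fn collect_body<'a, I>(chunks: I) -> Option<Vec<u8>>
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut buf = Vec::new();
    for chunk in chunks {
        buf.extend_from_slice(chunk);
    }
    if is_complete_json_object(&buf) {
        Some(buf)
    } else {
        None
    }
}

/// Usage example with made-up chunk boundaries.
fn example() {
    let chunks: [&[u8]; 3] = [b"{\"bytes_read\":5,", b"\"data\":\"aGVs", b"bG8=\"}"];
    // No single chunk is a complete object, so parsing any of them would fail.
    assert!(chunks.iter().all(|c| !is_complete_json_object(c)));
    // The accumulated body is complete and can be handed to the parser.
    let body = collect_body(chunks).expect("complete body");
    assert_eq!(body, b"{\"bytes_read\":5,\"data\":\"aGVsbG8=\"}");
}
```

The trade-off is that the whole response is held in memory before decoding, so this only suits reads that are bounded in size.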

TODO

  • Append write needs a new oio write trait, as we discussed here.
  • Add a new GitHub Actions e2e test workflow; this requires setting up a Databricks cluster.
  • Add necessary unit tests based on PR review.

@morristai morristai marked this pull request as draft October 18, 2023 05:43
@morristai morristai force-pushed the feat/service_add_dbfs_api_2.0 branch from 0109912 to a06ad4c Compare October 25, 2023 14:30
@morristai morristai marked this pull request as ready for review October 25, 2023 14:31

Xuanwo commented Oct 26, 2023

By the way, testing DBFS can be challenging, as it requires Databricks, which is a substantial project.


@morristai

Is there anything I can do on my end?

Xuanwo commented Oct 26, 2023

I assume we require a sponsor to provide us with a Databricks workspace for testing.

@Xuanwo Xuanwo left a comment

Mostly LGTM, let's rock!

@Xuanwo Xuanwo merged commit a4b260a into apache:main Oct 26, 2023
110 of 111 checks passed
Development

Successfully merging this pull request may close these issues.

Add new service support: DBFS API 2.0
2 participants