[Data] Clean up Datasource abstractions #40296

Closed
15 of 16 tasks
bveeramani opened this issue Oct 12, 2023 · 0 comments
Comments

bveeramani added a commit that referenced this issue Nov 3, 2023
This PR adds `_FileDatasink` and its user-facing subclasses `RowBasedFileDatasink` and `BlockBasedFileDatasink`. #40693 migrates `FileDatasource` implementations to the new APIs. These changes are part of a larger effort to clean up `Datasource` interfaces (#40296).

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
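
To illustrate the split described in the commit above, here is a minimal, self-contained sketch of a file datasink base class with a row-based and a block-based variant. It does not use Ray, and every class and method name in it (`SketchFileDatasink`, `write_row_to_file`, `write_block_to_file`, and so on) is a hypothetical stand-in chosen for this example, not the actual Ray Data API. It assumes a "block" is just a list of row dictionaries.

```python
import json
import os
from abc import ABC, abstractmethod
from typing import Any, BinaryIO, Dict, List

Row = Dict[str, Any]
Block = List[Row]  # assumption: a simple stand-in for a real block type


class SketchFileDatasink(ABC):
    """Hypothetical base class: owns path handling, delegates serialization."""

    def __init__(self, path: str, file_format: str) -> None:
        self.path = path
        self.file_format = file_format

    def write(self, blocks: List[Block]) -> List[str]:
        """Write every block under `self.path` and return the file paths."""
        os.makedirs(self.path, exist_ok=True)
        written: List[str] = []
        for block_index, block in enumerate(blocks):
            written.extend(self.write_block(block, block_index))
        return written

    @abstractmethod
    def write_block(self, block: Block, block_index: int) -> List[str]:
        """Persist one block and return the paths it produced."""


class SketchRowBasedFileDatasink(SketchFileDatasink):
    """Hypothetical variant that writes one file per row."""

    def write_block(self, block: Block, block_index: int) -> List[str]:
        paths = []
        for row_index, row in enumerate(block):
            name = f"{block_index}_{row_index}.{self.file_format}"
            file_path = os.path.join(self.path, name)
            with open(file_path, "wb") as f:
                self.write_row_to_file(row, f)
            paths.append(file_path)
        return paths

    @abstractmethod
    def write_row_to_file(self, row: Row, file: BinaryIO) -> None:
        """Serialize a single row into an open binary file."""


class SketchBlockBasedFileDatasink(SketchFileDatasink):
    """Hypothetical variant that writes one file per block."""

    def write_block(self, block: Block, block_index: int) -> List[str]:
        file_path = os.path.join(self.path, f"{block_index}.{self.file_format}")
        with open(file_path, "wb") as f:
            self.write_block_to_file(block, f)
        return [file_path]

    @abstractmethod
    def write_block_to_file(self, block: Block, file: BinaryIO) -> None:
        """Serialize a whole block into an open binary file."""


class TextRowSink(SketchRowBasedFileDatasink):
    """Example subclass: each row's "text" field becomes its own file."""

    def write_row_to_file(self, row: Row, file: BinaryIO) -> None:
        file.write(str(row["text"]).encode("utf-8"))


class JsonLinesBlockSink(SketchBlockBasedFileDatasink):
    """Example subclass: each block becomes one JSON Lines file."""

    def write_block_to_file(self, block: Block, file: BinaryIO) -> None:
        for row in block:
            file.write((json.dumps(row, sort_keys=True) + "\n").encode("utf-8"))
```

In this sketch a concrete sink only implements serialization (`write_row_to_file` or `write_block_to_file`), while file naming and directory handling live in the shared base class. Whether the real classes divide responsibilities the same way is an assumption.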
bveeramani added a commit that referenced this issue Nov 3, 2023
This PR is part of a larger effort to clean up Datasource interfaces (#40296). #40199 introduces a new `Datasink` abstraction, and this PR migrates the write-supporting database-related `Datasource`s (BigQuery and SQL) to the new API.

The primary motivation for these changes is to reduce the complexity of our internal code base. For more information, see https://docs.google.com/document/d/1Bqhbzvxv7liwpOhyBzRVy5tOzXdy-NiMSFa-6hupr18/edit#heading=h.rytitv546vx5.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
bveeramani added a commit that referenced this issue Nov 3, 2023
This PR is part of a larger effort to clean up Datasource interfaces (#40296). #40691 added the new FileDatasink base class, and this PR migrates FileDatasource implementations to the new API.

The primary motivation for these changes is to reduce the complexity of our internal code base. For more information, see https://docs.google.com/document/d/1Bqhbzvxv7liwpOhyBzRVy5tOzXdy-NiMSFa-6hupr18/edit#heading=h.rytitv546vx5.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
bveeramani added a commit that referenced this issue Nov 3, 2023
This PR is part of a larger effort to clean up Datasource interfaces (#40296). #40691 added the new FileDatasink base class, and this PR migrates ParquetDatasource to the new API.

The primary motivation for these changes is to reduce the complexity of our internal code base. For more information, see https://docs.google.com/document/d/1Bqhbzvxv7liwpOhyBzRVy5tOzXdy-NiMSFa-6hupr18/edit#heading=h.rytitv546vx5.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
stephanie-wang pushed a commit that referenced this issue Nov 13, 2023
#40296 copied write-related code from Datasource implementations to Datasink implementations. As a result, existing Datasource implementations now contain unused write-related code. This PR removes that code.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this issue Nov 29, 2023
This PR adds `_FileDatasink` and its user-facing subclasses `RowBasedFileDatasink` and `BlockBasedFileDatasink`. ray-project#40693 migrates `FileDatasource` implementations to the new APIs. These changes are part of a larger effort to clean up `Datasource` interfaces (ray-project#40296).

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this issue Nov 29, 2023
This PR is part of a larger effort to clean up Datasource interfaces (ray-project#40296). ray-project#40199 introduces a new `Datasink` abstraction, and this PR migrates the write-supporting database-related `Datasource`s (BigQuery and SQL) to the new API.

The primary motivation for these changes is to reduce the complexity of our internal code base. For more information, see https://docs.google.com/document/d/1Bqhbzvxv7liwpOhyBzRVy5tOzXdy-NiMSFa-6hupr18/edit#heading=h.rytitv546vx5.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this issue Nov 29, 2023
This PR is part of a larger effort to clean up Datasource interfaces (ray-project#40296). ray-project#40691 added the new FileDatasink base class, and this PR migrates FileDatasource implementations to the new API.

The primary motivation for these changes is to reduce the complexity of our internal code base. For more information, see https://docs.google.com/document/d/1Bqhbzvxv7liwpOhyBzRVy5tOzXdy-NiMSFa-6hupr18/edit#heading=h.rytitv546vx5.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this issue Nov 29, 2023
This PR is part of a larger effort to clean up Datasource interfaces (ray-project#40296). ray-project#40691 added the new FileDatasink base class, and this PR migrates ParquetDatasource to the new API.

The primary motivation for these changes is to reduce the complexity of our internal code base. For more information, see https://docs.google.com/document/d/1Bqhbzvxv7liwpOhyBzRVy5tOzXdy-NiMSFa-6hupr18/edit#heading=h.rytitv546vx5.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this issue Nov 29, 2023
ray-project#40296 copied write-related code from Datasource implementations to Datasink implementations. As a result, existing Datasource implementations now contain unused write-related code. This PR removes that code.

---------

Signed-off-by: Balaji Veeramani <balaji@anyscale.com>