[Data] Add read_avro
#43663
Force-pushed from 93998eb to c7028e9
Signed-off-by: Stefan He <hebiaobuaa@gmail.com>
Force-pushed from c7028e9 to 5ac94e7
Thanks!
Looks like the tests aren't currently run in CI. Could you update the BUILD file like how we've done for test_text?
Lines 300 to 306 in d40ef0c
    py_test(
        name = "test_text",
        size = "small",
        srcs = ["tests/test_text.py"],
        tags = ["team:data", "exclusive"],
        deps = ["//:ray_lib", ":conftest"],
    )
Also, could you update input_output.rst?
ray/doc/source/data/api/input_output.rst
Lines 58 to 65 in d40ef0c
    Text
    ----

    .. autosummary::
       :nosignatures:
       :toctree: doc/

       read_text
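Following the reviewer's request, the new BUILD entry would presumably mirror the test_text target quoted above. A minimal sketch, assuming the target is called test_avro and the test file is tests/test_avro.py (neither name is stated in this conversation):

```starlark
# Hypothetical target: name and srcs are assumed by analogy with test_text.
py_test(
    name = "test_avro",
    size = "small",
    srcs = ["tests/test_avro.py"],
    tags = ["team:data", "exclusive"],
    deps = ["//:ray_lib", ":conftest"],
)
```

The size, tags, and deps values are copied unchanged from the quoted test_text entry; whether the Avro tests need different settings is not discussed here.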
Co-authored-by: Balaji Veeramani <bveeramani@berkeley.edu> Signed-off-by: Stefan He <hebiaobuaa@gmail.com>
Force-pushed from 06c0927 to 0d9d782
/rerun-check buildkite/premerge
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
read_avro
LGTM w/ one minor nit.
…into pr/43663 Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Stefan He <hebiaobuaa@gmail.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Co-authored-by: Biao He(bhe) <bhe@linkedin.com>
Co-authored-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Balaji Veeramani <balaji@anyscale.com>
Why are these changes needed?
Avro is a widely used data serialization system that integrates well with many big data processing environments. Supporting the Avro storage format for I/O operations in Ray improves Ray's interoperability with the data ecosystem and lets users work with Avro files directly in their Ray applications. This addition aims to provide a seamless experience for users working with Avro-formatted data for analytics, machine learning, and other data-intensive tasks.
Related issue number
To Close #43548
Checks
… git commit -s) in this PR.
… scripts/format.sh to lint the changes in this PR.
… method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
Testing