.. toctree::

    Overview <overview>
    quickstart
    user-guide
    examples
    api/api
    data-internals

Ray Data is a scalable data processing library for ML workloads. It provides flexible and performant APIs for scaling :ref:`offline batch inference <batch_inference_overview>` and :ref:`data preprocessing and ingest for ML training <ml_ingest_overview>`. Ray Data uses streaming execution to efficiently process large datasets.

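
To make the idea of streaming execution concrete, the following is a minimal plain-Python sketch, not Ray Data's API or implementation. It assumes a hypothetical record source (``read_records``) and hypothetical helpers (``batched``, ``transform_batches``), and only illustrates the general pattern: records flow through the pipeline one batch at a time, so the full dataset is never held in memory at once.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def read_records(n: int) -> Iterator[int]:
    """Hypothetical source: yields records lazily, one at a time."""
    for i in range(n):
        yield i


def batched(records: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group a lazy stream into batches without buffering the whole stream."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def transform_batches(
    batches: Iterable[List[int]],
    fn: Callable[[List[int]], List[int]],
) -> Iterator[List[int]]:
    """Apply fn to each batch as it arrives (hypothetical helper)."""
    for batch in batches:
        yield fn(batch)


def double(batch: List[int]) -> List[int]:
    """Stand-in for a preprocessing or inference step."""
    return [x * 2 for x in batch]


# Nothing runs until the pipeline is consumed; each batch is read,
# transformed, and handed to the consumer before the next one is read.
pipeline = transform_batches(batched(read_records(10), batch_size=4), double)
results = list(pipeline)
```

In this sketch only one batch is resident at any moment, which is the property that lets a streaming design handle datasets larger than memory. A real system adds parallelism, scheduling, and backpressure on top of this pattern.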
To install Ray Data, run:

.. code-block:: console

    $ pip install -U 'ray[data]'

To learn more about installing Ray and its libraries, see :ref:`Installing Ray <installation>`.

.. grid:: 1 2 2 2

    .. grid-item-card::

        Ray Data Overview
        ^^^

        Get an overview of Ray Data, the workloads that it supports, and how it compares to alternatives.

        +++
        .. button-ref:: data_overview
            :color: primary
            :outline:
            :expand:

            Ray Data Overview

    .. grid-item-card::

        Quickstart
        ^^^

        Understand the key concepts behind Ray Data. Learn what Datasets are and how they're used.

        +++
        .. button-ref:: data_quickstart
            :color: primary
            :outline:
            :expand:

            Quickstart

    .. grid-item-card::

        User Guides
        ^^^

        Learn how to use Ray Data, from basic usage to end-to-end guides.

        +++
        .. button-ref:: data_user_guide
            :color: primary
            :outline:
            :expand:

            Learn how to use Ray Data

    .. grid-item-card::

        Examples
        ^^^

        Find both simple and scaled-out examples of using Ray Data.

        +++
        .. button-ref:: examples
            :color: primary
            :outline:
            :expand:

            Ray Data Examples

    .. grid-item-card::

        API
        ^^^

        Get more in-depth information about the Ray Data API.

        +++
        .. button-ref:: data-api
            :color: primary
            :outline:
            :expand:

            Read the API Reference

    .. grid-item-card::

        Ray Blogs
        ^^^

        Get the latest engineering updates from the Ray team and learn how companies are using Ray Data.

        +++
        .. button-link:: https://www.anyscale.com/blog?tag=ray-datasets
            :color: primary
            :outline:
            :expand:

            Read the Ray blogs