[doc] remove references to AIR (ray-project#38338)
Signed-off-by: Victor <vctr.y.m@example.com>
richardliaw authored and Victor committed Oct 11, 2023
1 parent 467ae98 commit 61e1ab2
Showing 27 changed files with 52 additions and 65 deletions.
8 changes: 3 additions & 5 deletions README.rst
Original file line number Diff line number Diff line change
@@ -14,14 +14,14 @@

|
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for simplifying ML compute:
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI libraries for simplifying ML compute:

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/what-is-ray-padded.svg

..
https://docs.google.com/drawings/d/1Pl8aCYOsZCo61cmp57c7Sja6HhIygGCvSZLi_AuBuqo/edit
Learn more about `Ray AIR`_ and its libraries:
Learn more about `Ray AI Libraries`_:

- `Data`_: Scalable Datasets for ML
- `Train`_: Distributed Training
@@ -66,7 +66,6 @@ More Information

- `Documentation`_
- `Ray Architecture whitepaper`_
- `Ray AIR Technical whitepaper`_
- `Exoshuffle: large-scale data shuffle in Ray`_
- `Ownership: a distributed futures system for fine-grained tasks`_
- `RLlib paper`_
@@ -78,15 +77,14 @@ More Information
- `Ray HotOS paper`_
- `Ray Architecture v1 whitepaper`_

.. _`Ray AIR`: https://docs.ray.io/en/latest/ray-air/getting-started.html
.. _`Ray AI Libraries`: https://docs.ray.io/en/latest/ray-air/getting-started.html
.. _`Ray Core`: https://docs.ray.io/en/latest/ray-core/walkthrough.html
.. _`Tasks`: https://docs.ray.io/en/latest/ray-core/tasks.html
.. _`Actors`: https://docs.ray.io/en/latest/ray-core/actors.html
.. _`Objects`: https://docs.ray.io/en/latest/ray-core/objects.html
.. _`Documentation`: http://docs.ray.io/en/latest/index.html
.. _`Ray Architecture v1 whitepaper`: https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/preview
.. _`Ray Architecture whitepaper`: https://docs.google.com/document/d/1tBw9A4j62ruI5omIJbMxly-la5w4q_TjyJgJL_jN2fI/preview
.. _`Ray AIR Technical whitepaper`: https://docs.google.com/document/d/1bYL-638GN6EeJ45dPuLiPImA8msojEDDKiBx3YzB4_s/preview
.. _`Exoshuffle: large-scale data shuffle in Ray`: https://arxiv.org/abs/2203.05072
.. _`Ownership: a distributed futures system for fine-grained tasks`: https://www.usenix.org/system/files/nsdi21-wang.pdf
.. _`Ray paper`: https://arxiv.org/abs/1712.05889
2 changes: 1 addition & 1 deletion doc/source/cluster/kubernetes/examples/ml-example.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
(kuberay-ml-example)=

# Ray AIR XGBoostTrainer on Kubernetes
# Ray Train XGBoostTrainer on Kubernetes

:::{note}
To learn the basics of Ray on Kubernetes, we recommend taking a look
2 changes: 1 addition & 1 deletion doc/source/cluster/vms/examples/ml-example.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
(clusters-vm-ml-example)=

# Ray AIR XGBoostTrainer on VMs
# Ray Train XGBoostTrainer on VMs

:::{note}
To learn the basics of Ray on VMs, we recommend taking a look
8 changes: 4 additions & 4 deletions doc/source/data/preprocessors.rst
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ Using Preprocessors

Data preprocessing is a common technique for transforming raw data into features for a machine learning model.
In general, you may want to apply the same preprocessing logic to your offline training data and online inference data.
Ray AIR provides several common preprocessors out of the box and interfaces to define your own custom logic.
Ray Data provides several common preprocessors out of the box and interfaces to define your own custom logic.

.. https://docs.google.com/drawings/d/1ZIbsXv5vvwTVIEr2aooKxuYJ_VL7-8VMNlRinAiPaTI/edit
@@ -55,8 +55,8 @@ Finally, call ``transform_batch`` on a single batch of data.
:start-after: __preprocessor_transform_batch_start__
:end-before: __preprocessor_transform_batch_end__
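
As a rough, self-contained sketch of that fit/``transform_batch`` flow (the ``MinMaxScaler`` and the ``"value"`` column here are illustrative choices, not taken from the diffed snippet):

.. code-block:: python

    import pandas as pd
    import ray
    from ray.data.preprocessors import MinMaxScaler

    # Fit the preprocessor on a Ray Dataset of training data.
    train_ds = ray.data.from_items([{"value": float(i)} for i in range(8)])
    scaler = MinMaxScaler(columns=["value"])
    scaler.fit(train_ds)

    # Apply the fitted preprocessor to a single in-memory batch.
    batch = pd.DataFrame({"value": [0.0, 3.5, 7.0]})
    print(scaler.transform_batch(batch))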

Life of an AIR preprocessor
---------------------------
Life of a preprocessor
----------------------

Now that we've gone over the basics, let's dive into how ``Preprocessor``\s fit into an end-to-end application built with AIR.
The diagram below depicts an overview of the main steps of a ``Preprocessor``:
@@ -132,7 +132,7 @@ Types of preprocessors
Built-in preprocessors
~~~~~~~~~~~~~~~~~~~~~~

Ray AIR provides a handful of preprocessors out of the box.
Ray Data provides a handful of preprocessors out of the box.

**Generic preprocessors**

4 changes: 2 additions & 2 deletions doc/source/ray-air/api/checkpoint.rst
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
.. _checkpoint-api-ref:

Ray AIR Checkpoint
==================
Checkpoints
===========

.. seealso::

2 changes: 1 addition & 1 deletion doc/source/ray-air/api/configs.rst
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
Ray AIR Configurations
======================

.. TODO(ml-team): Add a general AIR configuration guide that covers all of these configs.
.. TODO(ml-team): Add a general configuration guide that covers all of these configs.
.. currentmodule:: ray

4 changes: 2 additions & 2 deletions doc/source/ray-air/api/dataset-ingest.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Ray Data Ingest into AIR Trainers
=================================
Ray Data Ingest into Ray Train
==============================

.. seealso::

4 changes: 2 additions & 2 deletions doc/source/ray-air/api/integrations.rst
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
.. _air-ml-integrations:

Ray AIR Integrations with ML Libraries
======================================
Integrations with ML Libraries
==============================

.. currentmodule:: ray

10 changes: 5 additions & 5 deletions doc/source/ray-air/deployment.rst
Original file line number Diff line number Diff line change
@@ -18,9 +18,9 @@ Design Principles
Pick and choose your own libraries
----------------------------------

You can pick and choose which Ray AIR libraries you want to use.
You can pick and choose which Ray AI libraries you want to use.

This is applicable if you are an ML engineer who wants to independently use a Ray AIR library for a specific AI app or service use case and do not need to integrate with existing ML platforms.
This is applicable if you are an ML engineer who wants to independently use a Ray library for a specific AI app or service use case and does not need to integrate with existing ML platforms.

For example, Alice wants to use RLlib to train models for her work project. Bob wants to use Ray Serve to deploy his model pipeline. In both cases, Alice and Bob can leverage these libraries independently without any coordination.

@@ -32,14 +32,14 @@ In the above diagram:

* Only one library is used -- showing that you can pick and choose and do not need to replace all of your ML infrastructure to use Ray.
* You can use one of :ref:`Ray's many deployment modes <jobs-overview>` to launch and manage Ray clusters and Ray applications.
* AIR libraries can read data from external storage systems such as Amazon S3 / Google Cloud Storage, as well as store results there.
* Ray AI libraries can read data from external storage systems such as Amazon S3 / Google Cloud Storage, as well as store results there.



Existing ML Platform integration
--------------------------------

You may already have an existing machine learning platform but want to use some subset of Ray's AIR libraries. For example, an ML engineer wants to use Ray within the ML Platform their organization has purchased (e.g., SageMaker, Vertex).
You may already have an existing machine learning platform but want to use some subset of Ray's ML libraries. For example, an ML engineer wants to use Ray within the ML Platform their organization has purchased (e.g., SageMaker, Vertex).

Ray can complement existing machine learning platforms by integrating with existing pipeline/workflow orchestrators, storage, and tracking services, without requiring a replacement of your entire ML platform.

@@ -49,7 +49,7 @@ Ray can complement existing machine learning platforms by integrating with exist

In the above diagram:

1. A workflow orchestrator such as AirFlow, Oozie, SageMaker Pipelines, etc. is responsible for scheduling and creating Ray clusters and running Ray AIR apps and services. The Ray AIR app may be part of a larger orchestrated workflow (e.g., Spark ETL, then Training on Ray).
1. A workflow orchestrator such as Airflow, Oozie, SageMaker Pipelines, etc. is responsible for scheduling and creating Ray clusters and running Ray apps and services. The Ray application may be part of a larger orchestrated workflow (e.g., Spark ETL, then training on Ray).
2. Lightweight orchestration of task graphs can be handled entirely within Ray. External workflow orchestrators will integrate nicely but are only needed if running non-Ray steps.
3. Ray clusters can also be created for interactive use (e.g., Jupyter notebooks, Google Colab, Databricks Notebooks, etc.).
4. Ray Train, Data, and Serve provide integration with Feature Stores like Feast for Training and Serving.
10 changes: 5 additions & 5 deletions doc/source/ray-air/examples/dreambooth_finetuning.rst
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
:orphan:

Fine-tuning DreamBooth with Ray AIR
===================================
Fine-tuning DreamBooth with Ray Train
=====================================

This example shows how to do DreamBooth fine-tuning of a Stable Diffusion model using Ray AIR.
See the original `DreamBooth project homepage <https://dreambooth.github.io/>`_ for more details on what this fine-tuning method achieves.
@@ -12,7 +12,7 @@ See the original `DreamBooth project homepage <https://dreambooth.github.io/>`_

This example is built on top of `this HuggingFace 🤗 tutorial <https://huggingface.co/docs/diffusers/training/dreambooth>`_.
See the HuggingFace tutorial for useful explanations and suggestions on hyperparameters.
**Adapting this example to Ray AIR allows you to easily scale up the fine-tuning to an arbitrary number of distributed training workers.**
**Adapting this example to Ray Train allows you to easily scale up the fine-tuning to an arbitrary number of distributed training workers.**

**Compute requirements:**

@@ -92,14 +92,14 @@ Distributed training

The central part of the training code is the *training function*. This function accepts a configuration dict that contains the hyperparameters. It then defines a regular PyTorch training loop.

There are only a few locations where we interact with the Ray AIR API. We marked them with in-line comments in the snippet below.
There are only a few locations where we interact with the Ray Train API. We marked them with in-line comments in the snippet below.

Remember that we want to do data-parallel training for all our models.


#. We load the data shard for each worker with session.get_dataset_shard("train")
#. We iterate over the dataset with train_dataset.iter_torch_batches()
#. We report results to Ray AIR with session.report(results)
#. We report results to Ray Train with session.report(results)

The code was compacted for brevity. The `full code <https://github.com/ray-project/ray/tree/master/doc/source/templates/05_dreambooth_finetuning/dreambooth/train.py>`_ is more thoroughly annotated.
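
A rough sketch of that overall structure, with a stand-in model in place of the actual Stable Diffusion components (the hyperparameters and dataset below are illustrative assumptions, not the real DreamBooth code):

.. code-block:: python

    import ray
    import torch
    from ray.air import session
    from ray.air.config import ScalingConfig
    from ray.train.torch import TorchTrainer

    def train_fn(config):
        # Stand-in model and optimizer; the real example builds the Stable Diffusion models.
        model = torch.nn.Linear(4, 1)
        optimizer = torch.optim.SGD(model.parameters(), lr=config["lr"])

        # 1. Load this worker's shard of the training data.
        train_dataset = session.get_dataset_shard("train")

        for epoch in range(config["num_epochs"]):
            # 2. Iterate over the shard as torch batches.
            for batch in train_dataset.iter_torch_batches(
                batch_size=config["batch_size"], dtypes=torch.float32
            ):
                loss = model(batch["x"]).mean()  # placeholder loss
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            # 3. Report results back to Ray Train.
            session.report({"epoch": epoch, "loss": loss.item()})

    trainer = TorchTrainer(
        train_fn,
        train_loop_config={"lr": 1e-3, "num_epochs": 2, "batch_size": 8},
        scaling_config=ScalingConfig(num_workers=2),
        datasets={"train": ray.data.from_items([{"x": [0.0, 1.0, 2.0, 3.0]}] * 32)},
    )
    trainer.fit()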

2 changes: 0 additions & 2 deletions doc/source/ray-contribute/whitepaper.rst
Original file line number Diff line number Diff line change
@@ -7,5 +7,3 @@ For an in-depth overview of Ray internals, check out the `Ray 2.0 Architecture w
The previous v1.0 whitepaper can be found `here <https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/preview>`__.

For more about the scalability and performance of the Ray dataplane, see the `Exoshuffle paper <https://arxiv.org/abs/2203.05072>`__.

To learn about the technical design and value proposition of Ray AI Runtime (AIR) see the `Ray AIR Technical Whitepaper <https://docs.google.com/document/d/1bYL-638GN6EeJ45dPuLiPImA8msojEDDKiBx3YzB4_s/preview>`__.
12 changes: 5 additions & 7 deletions doc/source/ray-observability/user-guides/configure-logging.md
Original file line number Diff line number Diff line change
@@ -176,19 +176,17 @@ import logging
logger = logging.getLogger("ray")
logger # Modify the Ray logging config
```
Similarly, to modify the logging configuration for Ray AIR or other libraries, specify the appropriate logger name:
Similarly, to modify the logging configuration for Ray libraries, specify the appropriate logger name:

```python
import logging

# First, get the handle for the logger you want to modify
ray_air_logger = logging.getLogger("ray.air")
ray_data_logger = logging.getLogger("ray.data")
ray_tune_logger = logging.getLogger("ray.tune")
ray_rllib_logger = logging.getLogger("ray.rllib")
ray_train_logger = logging.getLogger("ray.train")
ray_serve_logger = logging.getLogger("ray.serve")
ray_workflow_logger = logging.getLogger("ray.workflow")

# Modify the ray.data logging level
ray_data_logger.setLevel(logging.WARNING)
@@ -224,8 +222,8 @@ If you want to control the logger for particular actors or tasks, view [customiz

:::

:::{tab-item} Ray AIR or other libraries
If you are using Ray AIR or any of the Ray libraries, follow the instructions provided in the documentation for the library.
:::{tab-item} Ray libraries
If you are using any of the Ray libraries, follow the instructions provided in the documentation for the library.
:::

::::
@@ -405,8 +403,8 @@ logging_setup_func()
```
:::

:::{tab-item} Ray AIR or other libraries
If you are using Ray AIR or any of the Ray libraries, follow the instructions provided in the documentation for the library.
:::{tab-item} Ray libraries
If you are using any of the Ray libraries, follow the instructions provided in the documentation for the library.
:::

::::
6 changes: 3 additions & 3 deletions doc/source/ray-overview/index.md
Original file line number Diff line number Diff line change
@@ -70,7 +70,7 @@ Ray's unified compute framework consists of three layers:
:outline:
:expand:
Ray AIR Libraries
Ray AI Libraries
.. grid-item-card::
@@ -113,9 +113,9 @@ Each of [Ray's](../ray-air/getting-started) five native libraries distributes a
- [Train](../train/train): Distributed multi-node and multi-core model training with fault tolerance that integrates with popular training libraries.
- [Tune](../tune/index): Scalable hyperparameter tuning to optimize model performance.
- [Serve](../serve/index): Scalable and programmable serving to deploy models for online inference, with optional microbatching to improve performance.
- [RLlib](../rllib/index): Scalable distributed reinforcement learning workloads that integrate with the other Ray AIR libraries.
- [RLlib](../rllib/index): Scalable distributed reinforcement learning workloads.

Ray's libraries are for both data scientists and ML engineers alike. For data scientists, AIR can be used to scale individual workloads, and also end-to-end ML applications. For ML Engineers, AIR provides scalable platform abstractions that can be used to easily onboard and integrate tooling from the broader ML ecosystem.
Ray's libraries are for both data scientists and ML engineers. For data scientists, these libraries can be used to scale individual workloads as well as end-to-end ML applications. For ML engineers, these libraries provide scalable platform abstractions that can be used to easily onboard and integrate tooling from the broader ML ecosystem.

For custom applications, the [Ray Core](../ray-core/walkthrough) library enables Python developers to easily build scalable, distributed systems that can run on a laptop, cluster, cloud, or Kubernetes. It's the foundation that Ray AI Runtime libraries and third-party integrations (Ray ecosystem) are built on.

1 change: 0 additions & 1 deletion doc/source/ray-overview/learn-more.md
Original file line number Diff line number Diff line change
@@ -46,7 +46,6 @@ Please raise an issue if any of the below links are broken, or if you'd like to

- [Ray 2.0 Architecture whitepaper](https://docs.google.com/document/d/1tBw9A4j62ruI5omIJbMxly-la5w4q_TjyJgJL_jN2fI/preview)
- [Ray 1.0 Architecture whitepaper (old)](https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/preview)
- [Ray AIR Technical whitepaper](https://docs.google.com/document/d/1bYL-638GN6EeJ45dPuLiPImA8msojEDDKiBx3YzB4_s/preview)
- [Exoshuffle: large-scale data shuffle in Ray](https://arxiv.org/abs/2203.05072)
- [RLlib paper](https://arxiv.org/abs/1712.09381)
- [RLlib flow paper](https://arxiv.org/abs/2011.12719)
2 changes: 1 addition & 1 deletion doc/source/ray-overview/use-cases.rst
Original file line number Diff line number Diff line change
@@ -183,7 +183,7 @@ Read more about building ML platforms with Ray in :ref:`this section <ray-for-ml
End-to-End ML Workflows
-----------------------

The following highlights examples utilizing Ray AIR to implement end-to-end ML workflows.
The following highlights examples utilizing Ray AI libraries to implement end-to-end ML workflows.

- `[Example] Text classification with Ray </ray-air/examples/huggingface_text_classification>`_
- `[Example] Image classification with Ray </ray-air/examples/torch_image_example>`_
14 changes: 4 additions & 10 deletions doc/source/ray-references/glossary.rst
Original file line number Diff line number Diff line change
@@ -98,12 +98,6 @@ documentation, sorted alphabetically.
A batch size in the context of model training is the number of data points used
to compute and apply one gradient update to the model weights.

Batch predictor
A :class:`Ray AIR Batch Predictor<ray.train.predictor.Predictor>` builds on the Predictor class
to parallelize inference on a large dataset. A Batch predictor shards the
dataset to allow multiple workers to do inference on a smaller number of data
points and then aggregating all the worker predictions at the end.

Block
A processing unit of data. A :class:`~ray.data.Dataset` consists of a
collection of blocks.
@@ -118,8 +112,8 @@
:ref:`Learn more<ray-placement-group-doc-ref>`.

Checkpoint
An AIR Checkpoint is a common interface for accessing data and models across
different AIR components and libraries. A Checkpoint can have its data
A Ray Train Checkpoint is a common interface for accessing data and models across
different Ray components and libraries. A Checkpoint can have its data
represented as a directory on local (on-disk) storage, as a directory on an
external storage (e.g., cloud storage), and as an in-memory dictionary.
:ref:`Learn more<checkpoint-api-ref>`,
@@ -246,7 +240,7 @@ documentation, sorted alphabetically.
.. TODO: Event
Fault tolerance
Fault tolerance in Ray AIR consists of experiment-level and trial-level
Fault tolerance in Ray Train and Tune consists of experiment-level and trial-level
restoration. Experiment-level restoration refers to resuming all trials,
in the event that an experiment is interrupted in the middle of training due
to a cluster-level failure. Trial-level restoration refers to resuming
@@ -418,7 +412,7 @@

Preprocessor
:ref:`An interface used to preprocess a Dataset<air-preprocessor-ref>` for
training and inference (prediction) with other AIR components. Preprocessors
training and inference (prediction). Preprocessors
can be stateful, as they can be fitted on the training dataset before being
used to transform the training and evaluation datasets.

2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-examples.rst
Original file line number Diff line number Diff line change
@@ -221,7 +221,7 @@ Community Examples
Example of training robotic control policies in SageMaker with RLlib.
- `Sequential Social Dilemma Games <https://github.com/eugenevinitsky/sequential_social_dilemma_games>`__:
Example of using the multi-agent API to model several `social dilemma games <https://arxiv.org/abs/1702.03037>`__.
- `Simple custom environment for single RL with Ray 2.0, Tune and Air <https://github.com/lcipolina/Ray_tutorials/blob/main/RLLIB_Ray2_0.ipynb>`__:
- `Simple custom environment for single RL with Ray and RLlib <https://github.com/lcipolina/Ray_tutorials/blob/main/RLLIB_Ray2_0.ipynb>`__:
Create a custom environment and train a single agent RL using Ray 2.0 with Tune and Air.
- `StarCraft2 <https://github.com/oxwhirl/smac>`__:
Example of training in StarCraft2 maps with RLlib / multi-agent.
Original file line number Diff line number Diff line change
@@ -27,7 +27,7 @@ or a single :py:class:`~ray.rllib.policy.policy.Policy` instance.
The Algorithm- or Policy instances that were used to create the checkpoint in the first place
may or may not have been trained prior to this.

RLlib uses the new Ray AIR :py:class:`~ray.air.checkpoint.Checkpoint` class to create checkpoints and
RLlib uses the :py:class:`~ray.air.checkpoint.Checkpoint` class to create checkpoints and
restore objects from them.
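
As a hedged illustration of the typical round trip (PPO on ``CartPole-v1`` is just an example; the exact return type of ``save()`` has shifted across Ray versions):

.. code-block:: python

    from ray.rllib.algorithms.algorithm import Algorithm
    from ray.rllib.algorithms.ppo import PPOConfig

    # Build and briefly train an algorithm, then write a checkpoint.
    algo = PPOConfig().environment("CartPole-v1").build()
    algo.train()
    checkpoint_path = algo.save()

    # Restore a fresh Algorithm instance from that checkpoint.
    restored = Algorithm.from_checkpoint(checkpoint_path)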

The main file in a checkpoint directory, containing the state information, is currently
Original file line number Diff line number Diff line change
@@ -36,7 +36,7 @@ The pre-trained models for these models is quite large (12.8G for 7B model and 1

### Cloud storage

Similarly the checkpoints during training can be quite large and we would like to be able to save those checkpoints to the familiar huggingface format so that we can serve it conveniently. The fine-tuning script in this template uses Ray Air Checkpointing to sync the checkpoints created by each node back to a centralized cloud storage on AWS S3. The final file structure for each checkpoint will have a look similar to the following structure:
Similarly, the checkpoints created during training can be quite large, and we would like to save them in the familiar Hugging Face format so that we can serve the model conveniently. The fine-tuning script in this template uses Ray Train checkpointing to sync the checkpoints created by each node back to centralized cloud storage on AWS S3. The final file structure for each checkpoint looks similar to the following:

```
aws s3 ls s3://<bucket_path>/checkpoint_00000