# Deep dive into the Ray task lifecycle

We start by visualizing a task's execution using the following diagram:

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_detail_0.svg" width="800px">

In case you skipped it, this is the same diagram presented in our high-level overview of Ray tasks.

We will now add more color to this diagram by providing useful details for each step of the process.

## Small diversion: what are the main components of a Ray cluster?

A Ray cluster consists of:
- One or more **worker nodes**, where each worker node consists of the following processes:
    - **Worker processes** responsible for task submission and execution.
    - A **raylet** responsible for resource management and task placement.
- One of the worker nodes is designated the **head node** and additionally runs:
    - A **global control service** (GCS) responsible for keeping track of **cluster-level state** that is not expected to change frequently.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/ray_cluster_detail_0.svg" width="800px">

<!-- 
Reference: 
- See [V2 architecture document -> Architecture Overview -> Design -> Components](https://docs.google.com/document/d/1tBw9A4j62ruI5omIJbMxly-la5w4q_TjyJgJL_jN2fI/preview#heading=h.cclei73t0j5p)
 -->

## Task Execution: Component attribution

Now that we are familiar with the different components of a Ray cluster, here is the same task execution diagram, revisited with colors indicating which component is responsible for each step.

- One **worker process** submits the task
- The cluster **autoscaler** will handle upscaling nodes to meet new resource requirements
- **Raylet(s)** will handle task scheduling/placement on a worker
- **One worker process** executes the task
- The result information is sent back to the **submitter worker** once complete

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_detail_1.svg" width="800px">

## Task Execution: Exporting and Loading Function Code

Remember that a task wraps a given function. In Python, `@ray.remote` decorates a Python function to turn it into a task.

- The submitter worker will serialize the function definition
    - In the case of Python, Ray uses cloudpickle, a variant of pickle, to serialize the function
- The submitter worker will then export the function definition to the GCS Store
- The executor worker will then load and cache the function definition from the GCS Store
- The executor worker will then deserialize the code and execute the function


<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_detail_export_load.svg" width="900px">
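
The export, load, and cache steps above can be mimicked in a few lines of plain Python. This is only a toy sketch under loud assumptions: a dict named `fake_gcs_kv` stands in for the GCS store, the standard-library `marshal` module stands in for cloudpickle, and the helpers `export_function` and `load_function` are hypothetical names, not Ray APIs.

```python
import marshal
import types

# Assumption: a plain dict stands in for the GCS key-value store.
fake_gcs_kv = {}

# Executor-side cache: function key -> deserialized callable.
function_cache = {}


def export_function(key, func):
    """Submitter side: serialize the function's code object and publish it."""
    fake_gcs_kv[key] = marshal.dumps(func.__code__)


def load_function(key):
    """Executor side: fetch and deserialize once, then serve from the cache."""
    if key not in function_cache:
        code = marshal.loads(fake_gcs_kv[key])
        function_cache[key] = types.FunctionType(code, {}, key)
    return function_cache[key]


def add(x, y):
    return x + y


export_function("add", add)      # submitter worker
loaded_add = load_function("add")  # executor worker
result = loaded_add(1, 2)
print(result)
```

A second call to `load_function("add")` returns the cached callable without touching the store again, which is the behavior the "load and cache" step describes.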

<!-- References:
- See code:
    - Exporting Function to GCS Store
        1. [Python .remote calls ._remote](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/remote_function.py#L140)
        2. [Python ._remote pickles the function](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/remote_function.py#L299)
        3. [Python ._remote call exports the function via the function manager.export](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_private/function_manager.py#L273)
        4. [Which calls the cython GcsClient.internal_kv_put](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L2579)
        5. [Which calls the gcs_client.cc PythonGcsClient::InternalKVPut](https://github.com/ray-project/ray/blob/55ab6dfd6b415f8795dd1dfed7b3fde2558efc46/src/ray/gcs/gcs_client/gcs_client.cc#L312) that sets the key, value in the proper namespace 
    - Importing Function from GCS Store
        1. [When instantiating a CoreWorker, we add task receivers which will callback CoreWorker::ExecuteTask](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/core_worker.cc#L147)
        2. [CoreWorker::ExecuteTask() will prepare a RayFunction and submit it to its execution callback](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/core_worker.cc#L2721C21-L2721C44)
        3. [The task execution callback in the case of python will execute the function from cython given the set task_execution_handler](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L3075C43-L3075C65)
        4. [The task execution handler will execute the task with a cancellation handler](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L2064C17-L2064C55)
        5. [The handler will call the function_manager.get_execution_info](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L1944)
        6. [function_manager.get_execution_info will in turn call function_manager._wait_for_function](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_private/function_manager.py#L393)
        7. [function_manager._wait_for_function will in turn call function_manager.fetch_and_register_remote_function](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_private/function_manager.py#L455C33-L455C67)
        8. [function_manager.fetch_and_register_remote_function will in turn call function_manager.fetch_registered_method](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_private/function_manager.py#L299C37-L299C60)
        9. [function_manager.fetch_registered_method will in turn call gcs_client.internal_kv_get to read the function definition from the GCS KV store](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_private/function_manager.py#L281)
    - Caching the function definition:
        10. [As a continuation the execution_infos in-memory dictionary mapping is updated to store the function definition](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L1946)
-->

## Task Execution: Submission - Resolving Dependencies and Data Locality 

Here are some key steps in task submission:

- A submitter worker won't request execution of a task before the task's dependencies are resolved.
- A submitter worker will choose the worker node that has the most dependency data local to it.
- A submitter worker will request what Ray calls a "worker lease" from the raylet on that data-locality-optimal node.


<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_detail_resolving_deps_data_locality.svg" width="800px">

Let's unpack the above steps.

### Resolving Task Dependencies

Given a particular task `task1` that depends on objects `A` and `B` as inputs, the submitter worker process performs two main steps:

1. Wait for each object to become available, using async callbacks.
    - Remember that `A` and `B` could very well be the outputs of other tasks, which is why we need to wait.
2. Proceed with scheduling once all dependencies are resolved.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/resolving_deps.svg" width="600px">
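
The callback-driven wait above can be sketched with standard-library futures. This is a minimal, single-threaded illustration: `schedule_when_resolved` is a hypothetical helper, and `concurrent.futures.Future` stands in for Ray's internal notion of a pending object.

```python
from concurrent.futures import Future


def schedule_when_resolved(dependencies, schedule):
    """Invoke schedule(values) once every dependency future has completed."""
    if not dependencies:
        schedule([])
        return
    remaining = len(dependencies)

    def on_dependency_ready(_future):
        # Called once per dependency, as soon as its value exists.
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            schedule([dep.result() for dep in dependencies])

    for dep in dependencies:
        dep.add_done_callback(on_dependency_ready)


object_a, object_b = Future(), Future()
scheduled_calls = []
schedule_when_resolved([object_a, object_b], scheduled_calls.append)

object_a.set_result("value of A")
scheduled_after_a = list(scheduled_calls)  # snapshot taken while B is still pending
object_b.set_result("value of B")          # last dependency: scheduling proceeds
```

Nothing blocks while waiting: the submitter only reacts when the last dependency reports in, which is the point of using callbacks.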

<!-- References:
- See code:
    1. [Python .remote calls ._remote](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/remote_function.py#L140)
    2. [python ._remote call calls submit_task](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/remote_function.py#L420)
    3. [submit_task calls the cython submit_task function defined here](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L3574)
    4. [cython submit_task Delegates to C++ CoreWorker::SubmitTask](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L3643)
    5. [CoreWorker::SubmitTask calls direct_task_submitter.SubmitTask](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/core_worker.cc#L1949)
    6. [direct_task_submitter.SubmitTask calls ResolveDependencies to resolve dependencies](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/transport/direct_task_transport.cc#L28)
    7. [ResolveDependencies calls InlineDependencies](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/transport/dependency_resolver.cc#L117)
    8. [InlineDependencies fetches task metadata like the size](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/transport/dependency_resolver.cc#L44C10-L44C10)
  -->

### Enforcing Data Locality

The submitter worker will choose the node that already has the **largest number of object argument bytes** local to it.

The diagram below shows the same task `task1` we saw before.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/data_locality.svg" width="600px">

Note: the "enforcing data locality" stage is skipped when the task's specified scheduling policy is stringent (e.g. a node-affinity policy). Scheduling policies are discussed in more detail later.
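
The "most argument bytes already local" rule can be sketched as a small function. This is an illustrative model only: `best_node_for_task` and its two input dicts are hypothetical names, and ties are broken arbitrarily here.

```python
from collections import defaultdict


def best_node_for_task(arg_sizes, arg_locations):
    """Return the node holding the most argument bytes, or None if nothing is stored yet."""
    local_bytes = defaultdict(int)
    for object_id, size in arg_sizes.items():
        for node in arg_locations.get(object_id, ()):
            local_bytes[node] += size
    if not local_bytes:
        return None
    return max(local_bytes, key=local_bytes.get)


# task1 depends on A (200 MB, stored on node-1) and B (50 MB, stored on node-2 and node-3).
arg_sizes = {"A": 200_000_000, "B": 50_000_000}
arg_locations = {"A": {"node-1"}, "B": {"node-2", "node-3"}}
chosen = best_node_for_task(arg_sizes, arg_locations)
print(chosen)
```

Here `node-1` wins because moving `B` to it transfers far fewer bytes than moving `A` anywhere else.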

<!-- References:
- See code:
    1. [Python .remote calls ._remote](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/remote_function.py#L140)
    2. [python ._remote call calls submit_task](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/remote_function.py#L420)
    3. [submit_task calls the cython submit_task function](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L3643)
    4. [cython submit_task Delegates to SubmitTask from c++ Core Worker with calls direct_task_submitter.SubmitTask](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/core_worker.cc#L1949)
    5. [direct_task_submitter.SubmitTask calls ResolveDependencies to resolve dependencies](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/transport/direct_task_transport.cc#L28)
    6. [direct_task_submitter.SubmitTask as a callback will now call RequestNewWorkerIfNeeded](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/transport/direct_task_transport.cc#L135)
    7. [RequestNewWorkerIfNeeded will in turn call GetBestNodeForTask](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/transport/direct_task_transport.cc#L394)
    8. [GetBestNodeForTask will pick a node for locality in case the scheduling strategy is not stringent (i.e. node affinity or spread)](https://github.com/ray-project/ray/blob/master/src/ray/core_worker/lease_policy.cc#L39)
    9. [GetBestNodeIdForTask will find the node with the most object bytes](https://github.com/ray-project/ray/blob/master/src/ray/core_worker/lease_policy.cc#L47C1-L48C1) -->

## Task Execution: Autoscaling nodes given resource requirements

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_autoscaling.svg" width="900px">

## Task Execution: Scheduling - Finding The Best Node and Allocating Resources

Now that a worker lease request has been sent, here are the steps that follow to schedule the task:

- The raylet on the data-locality-optimal node receives the worker lease request
    - It receives a view of the entire cluster state from the GCS via a periodic broadcast
    - It decides which node is the best based on its view of the cluster state
- The raylet on the best node now has to allocate the resources to lease a worker
    - It will attempt to reserve the resources on the node
    - It will then periodically update the GCS with any changes to the resource state of the node

This is shown in the diagram below. The potential autoscaling step prior to finding the best node is left out for simplicity.


<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_detail_find_best_node_allocate_resources.svg" width="900px">

<!-- References:
- See code:
    1. [When a worker lease request comes in a raylet's NodeManager::HandleRequestWorkerLease gets called](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/node_manager.cc#L1783)
    2. [NodeManager::HandleRequestWorkerLease will delegate a call to ClusterTaskManager::QueueAndScheduleTask()](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/node_manager.cc#L1830)
    3. [ClusterTaskManager::QueueAndScheduleTask it will delegate a call to ClusterTaskManager::ScheduleAndDispatchTasks()](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/scheduling/cluster_task_manager.cc#L64)
    4. [ClusterTaskManager.ScheduleAndDispatchTasks() will call LocalTaskManager::ScheduleAndDispatchTasks()](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/scheduling/cluster_task_manager.cc#L227)
    5. [LocalTaskManager::ScheduleAndDispatchTasks() will call LocalTaskManager::DispatchScheduledTasksToWorkers()](https://github.com/ray-project/ray/blob/master/src/ray/raylet/local_task_manager.cc#L96)
    6. [LocalTaskManager::DispatchScheduledTasksToWorkers() will secure a worker for a task in the call to WorkPool.PopWorker() only after securing that owner is active, arguments are available, resources are allocated)](https://github.com/ray-project/ray/blob/master/src/ray/raylet/local_task_manager.cc#L267)
-->

### Leases as an optimization to avoid communication with the scheduler for similar scheduling requests

- A scheduling request at task submission can reuse a leased worker if it has the same:
    - Resource requirements, as these must be acquired from the node during task execution.
    - Shared-memory task arguments, as these must be made local on the node before task execution.
- This "hot path" most commonly occurs for **subsequent task executions**. We visualize it in the diagram below. Note how we skip:
    - sending a request to a raylet altogether
    - storing and fetching the function code in GCS

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_scheduling_hot_path.svg" width="700px">
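
The lease-reuse idea can be sketched as a cache keyed by the two properties listed above. This is a toy model, not Ray's implementation: `LeaseCache` and `fake_raylet` are hypothetical, and the function-code caching part of the hot path is not modeled.

```python
class LeaseCache:
    """Toy model: reuse a leased worker for tasks with the same scheduling key."""

    def __init__(self, request_lease_from_raylet):
        self._request_lease = request_lease_from_raylet
        self._leased_workers = {}

    def get_worker(self, resources, shared_memory_args):
        key = (frozenset(resources.items()), frozenset(shared_memory_args))
        if key not in self._leased_workers:
            # Slow path: ask the raylet for a new worker lease.
            self._leased_workers[key] = self._request_lease(resources)
        # Hot path: the lease already exists, so no raylet round trip is needed.
        return self._leased_workers[key]


raylet_requests = []


def fake_raylet(resources):
    """Stand-in for a raylet: records each lease request and hands out a worker name."""
    raylet_requests.append(resources)
    return f"worker-{len(raylet_requests)}"


cache = LeaseCache(fake_raylet)
first = cache.get_worker({"CPU": 1}, ["obj-A"])
second = cache.get_worker({"CPU": 1}, ["obj-A"])  # same key: lease reused
third = cache.get_worker({"CPU": 2}, ["obj-A"])   # different resources: new lease
```

Only two requests reach the fake raylet even though three tasks were submitted.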

## Task Execution: Object Handling (Storage and Distributed Ownership)

Let's revisit our mental model of the Ray cluster and add more detail about which components control and manage objects in Ray.

- Each worker process stores:
    - An ownership table: system metadata for the objects to which the worker has a reference, e.g. ref counts and object locations.
    - An in-process store, used to store small objects.
- Each raylet runs:
    - A shared-memory object store (also known as the Plasma Object Store), responsible for storing, transferring, and spilling large objects. Together, the individual object stores in a cluster make up the Ray distributed object store.


<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/ray_cluster_distributed_ownership.svg" width="800px">

<!-- 
Reference: 
- See [V2 architecture document -> Architecture Overview -> Design -> Components](https://docs.google.com/document/d/1tBw9A4j62ruI5omIJbMxly-la5w4q_TjyJgJL_jN2fI/preview#heading=h.cclei73t0j5p)
 -->

Let's take a look at the steps involved in object handling:

- The submitter worker creates an object reference for the future output value of the task in its ownership table
- The submitter worker then submits the task for scheduling
- The executor worker will execute the task function
- The executor worker will then prepare the return object
    - If the return object is small (< 100 KB)
        - Return the value inline, directly to the submitter's in-process object store.
    - If the return object is large
        - Store the object in the raylet object store.
- The executor worker updates the submitter's ownership table with a reference to the new object's address


<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/task_execution_detail_distributed_object_store_ownership.svg" width="800px">
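
The small-versus-large return path can be sketched with dictionaries. This is a simplified model under stated assumptions: the stores and the ownership table are plain dicts, `return_object` is a hypothetical helper, and the threshold constant mirrors the roughly 100 KB cutoff mentioned above.

```python
INLINE_THRESHOLD_BYTES = 100 * 1024  # assumed cutoff, mirroring the ~100 KB above


def return_object(object_id, payload, node_id, in_process_store, plasma_stores, ownership_table):
    """Executor side: deliver a task result and record where it lives."""
    if len(payload) < INLINE_THRESHOLD_BYTES:
        # Small object: send it inline to the submitter's in-process store.
        in_process_store[object_id] = payload
        location = "inline"
    else:
        # Large object: keep it in the object store of the executor's node.
        plasma_stores.setdefault(node_id, {})[object_id] = payload
        location = node_id
    # Tell the owner (the submitter) where the object can be found.
    ownership_table[object_id]["location"] = location


# Submitter side: register the future outputs before the tasks run.
ownership_table = {
    "small-result": {"location": None},
    "large-result": {"location": None},
}
in_process_store, plasma_stores = {}, {}

return_object("small-result", b"x" * 10, "node-2", in_process_store, plasma_stores, ownership_table)
return_object("large-result", b"x" * 200_000, "node-2", in_process_store, plasma_stores, ownership_table)
```

In both cases the owner ends up knowing where the value is, but only the small value actually travels back to it.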

<!-- References: 
- See code:
    1. [When instantiating a CoreWorker, we add task receivers which will callback CoreWorker::ExecuteTask](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/core_worker.cc#L147)
    2. [CoreWorker::ExecuteTask() will prepare a RayFunction and submit it to its execution callback](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/core_worker/core_worker.cc#L2721C21-L2721C44)
    3. [The task execution callback in the case of python will execute the function from cython given the set task_execution_handler](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L3075C43-L3075C65)
    4. [The task execution handler will execute the task with a cancellation handler](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L2064C17-L2064C55)
    5. [The handler will call execute_task handling a KeyboardInterrupt error](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L1960C7-L1960C7)
    6. [execute_task will invoke the function_executor](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L1675)
    7. [execute_task will store the outputs in the object store](https://github.com/ray-project/ray/blob/releases/2.8.1/python/ray/_raylet.pyx#L1810C12-L1810C12) -->

## Distributed ownership in Ray

### How does it work?
The process that submits a task is considered to be the owner of the task's result.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/distributed_ownership_overview.svg" width="600px">


## Upsides to distributed ownership:

- Latency: Faster than communicating all ownership information back to a head node.
- Scalability: There is no central bottleneck when scaling the cluster, since every worker maintains its own ownership information.

## Downsides to distributed ownership:

- Objects fate-share with their owner: even if an object is still available on a node, it is no longer reachable once its owner fails.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/distributed_ownership_fate_share_with_owner.svg" width="800px">
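
Fate-sharing with the owner can be illustrated with a lookup that always goes through the owner's table. This is a toy model: `locate_object` is hypothetical, and the `OwnerDiedError` class is defined locally for the sketch.

```python
class OwnerDiedError(Exception):
    """Raised in this sketch when an object's owner is no longer alive."""


def locate_object(object_id, owner_id, live_workers, ownership_tables):
    """Resolve an object's location by asking its owner."""
    if owner_id not in live_workers:
        raise OwnerDiedError(f"owner {owner_id} of {object_id} is gone")
    return ownership_tables[owner_id][object_id]


ownership_tables = {"worker-1": {"obj-X": "node-2"}}
live_workers = {"worker-1", "worker-2"}
location_before = locate_object("obj-X", "worker-1", live_workers, ownership_tables)

live_workers.discard("worker-1")  # the owner fails; the bytes on node-2 are untouched
try:
    locate_object("obj-X", "worker-1", live_workers, ownership_tables)
    reachable_after_failure = True
except OwnerDiedError:
    reachable_after_failure = False
```

The object's data never moved, yet nobody can find it anymore because the only process that knew its location is gone.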

## Distributed Object Store

The raylet's object store can be thought of as shared memory across all workers on a node.

For values that can be zero-copy deserialized, passing the ObjectRef to `ray.get` or as a task argument gives the worker a direct pointer to the shared-memory buffer instead of a copy.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/distributed_ownership_data_sharing.svg" width="600px">
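
As a loose analogy for zero-copy reads, Python's built-in `memoryview` lets several readers look at one buffer without copying it. This sketch runs in a single process and only illustrates the idea; the `bytearray` playing the role of a shared-memory object is an assumption, not how Ray allocates memory.

```python
# Assumption: this bytearray plays the role of one object in the shared-memory store.
shared_buffer = bytearray(b"large object payload")

# Each "worker" receives a read-only view over the same memory, not a copy.
worker_1_view = memoryview(shared_buffer).toreadonly()
worker_2_view = memoryview(shared_buffer).toreadonly()[6:12]

# Both views are backed by the very same underlying buffer.
same_memory = worker_1_view.obj is shared_buffer and worker_2_view.obj is shared_buffer

# A copy is made only when one is explicitly requested.
payload_slice = bytes(worker_2_view)
```

The views are read-only, which matches the fact that readers of a shared buffer must not modify it under each other.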

### Downside to a shared object-store

Sharing one object store per node also means that worker processes fate-share with their local raylet process.

A simple mental model is `raylet = node`: if a raylet fails, all workloads on that node fail.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/distributed_ownership_fate_share_with_raylet.svg" width="600px">

# Overview of Scheduling Strategies

Ray provides different scheduling strategies that you can set on your task.

We will go over:
- How a raylet assesses the feasibility and availability of nodes
- How every scheduling strategy/policy works and when you should use it

## How does a raylet classify nodes as feasible/infeasible and available/unavailable?

Given a resource requirement, a raylet classifies a node as one of the following:
- feasible
    - available
    - not available
- infeasible

Let's understand this by looking at an example task `my_task` that has a resource requirement of 3 CPUs:

- all nodes with >= 3 CPUs in total are classified as **feasible**
    - all **feasible nodes** that have >= 3 CPUs **idle** are classified as **available**
- all nodes with fewer than 3 CPUs in total are classified as **infeasible**

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/raylet_node_classification.svg" width="700px">
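
The classification rule can be written as a short function. This is a sketch of the rule as described above, with a hypothetical `classify_node` helper and resource dicts whose shape is an assumption.

```python
def classify_node(total, idle, demand):
    """Classify a node for a resource demand such as {"CPU": 3}."""
    if any(total.get(name, 0) < amount for name, amount in demand.items()):
        return "infeasible"
    if all(idle.get(name, 0) >= amount for name, amount in demand.items()):
        return "feasible and available"
    return "feasible but not available"


demand = {"CPU": 3}  # the requirement of my_task
labels = {
    "node-1": classify_node({"CPU": 8}, {"CPU": 4}, demand),
    "node-2": classify_node({"CPU": 4}, {"CPU": 1}, demand),
    "node-3": classify_node({"CPU": 2}, {"CPU": 2}, demand),
}
for node, label in labels.items():
    print(node, label)
```

Feasibility depends only on a node's total capacity, while availability depends on what is idle right now, so a node can move between "available" and "not available" over time without ever becoming infeasible.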

## Default Scheduling Strategy

This is the scheduling policy Ray uses when no other strategy is specified.

### Motivation

Ray attempts to strike a balance between favoring nodes that already cater for data locality and favoring those that have low resource utilization.

### How does it work?
It is a hybrid policy that combines the following two heuristics:
- Bin packing heuristic
- Load balancing heuristic

<!-- ### References:
- See code here:
    - [Default Hybrid Scheduling Policy is defined here](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/scheduling/policy/hybrid_scheduling_policy.cc) -->

The diagram below shows the policy in action in its bin-packing mode.

Note that the **Local Node** shown in the diagram is the node local to the raylet that received the worker lease request, which in almost all cases is the raylet that satisfies data locality requirements.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/scheduling_policy_hybrid_policy_binpacking.svg" width="900px">

The diagram below shows the policy in action in its load-balancing mode.

This occurs when our preferred local node is being heavily utilized. The strategy will now spread new tasks among the other feasible and available nodes.

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/scheduling_policy_hybrid_policy_balancing.svg" width="900px">
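
The switch between the two heuristics can be sketched as follows. This is a simplified model and not Ray's actual policy code, which has more detail than shown: `hybrid_pick` is hypothetical, and the `SPREAD_THRESHOLD` value is an assumption chosen for illustration.

```python
SPREAD_THRESHOLD = 0.5  # assumed utilization cutoff for this sketch


def hybrid_pick(local_node, utilization, available_nodes):
    """Pick a node: pack while nodes are lightly used, otherwise balance load.

    utilization maps every node (including local_node) to a fraction in [0, 1].
    """
    # Consider the local node first, then the others in a fixed order.
    ordered = [local_node] + sorted(n for n in utilization if n != local_node)
    candidates = [n for n in ordered if n in available_nodes]
    if not candidates:
        return None
    # Bin packing: take the first candidate still under the threshold.
    for node in candidates:
        if utilization[node] < SPREAD_THRESHOLD:
            return node
    # Load balancing: every candidate is busy, so take the least utilized one.
    return min(candidates, key=lambda n: utilization[n])


everyone = {"local", "n1", "n2"}
packed = hybrid_pick("local", {"local": 0.2, "n1": 0.0, "n2": 0.9}, everyone)
balanced = hybrid_pick("local", {"local": 0.8, "n1": 0.6, "n2": 0.9}, everyone)
```

In the first call the lightly used local node keeps receiving work even though `n1` is completely idle; in the second call every node is busy, so the least loaded one is chosen.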

## Node Affinity Strategy

### How does it work?
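It assigns a task to a given node in either a strict or a soft manner. With `soft=False` the task may only run on that node; with `soft=True` Ray is allowed to run it elsewhere if the node does not exist or is infeasible.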

### Use-cases
- When you want to ensure that your task runs on a specific node: e.g. you want to make sure a given accelerator is used for a compute-intensive task.


<!-- ### References:
- See code here
    - [Node Affinity Policy is defined here](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/scheduling/policy/node_affinity_scheduling_policy.cc)
  -->

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/scheduling_policy_node_affinity.svg" width="1000px">

### Sample code

In [1]:
import ray
from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy

# pin this task to only run on the current node id
run_on_same_node = NodeAffinitySchedulingStrategy(
    node_id=ray.get_runtime_context().get_node_id(), 
    soft=False,
)

@ray.remote(
    scheduling_strategy=run_on_same_node
)
def node_affinity_schedule():
    return 2


ray.get(node_affinity_schedule.remote())

2023-12-18 15:03:45,916	INFO worker.py:1633 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265


2

## SPREAD Scheduling Strategy

### How does it work?
It behaves like a best-effort round-robin: tasks are spread across all the available nodes first, and then across the remaining feasible nodes.
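
A best-effort round-robin of this kind can be sketched with a rotating index. This is an illustrative model only: the `SpreadPicker` class is hypothetical, and the sets of available and feasible nodes are passed in rather than computed.

```python
class SpreadPicker:
    """Toy round-robin picker that prefers available nodes over merely feasible ones."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._next_index = 0

    def pick(self, available, feasible):
        """Return the next node in round-robin order, or None if nothing is feasible."""
        for eligible in (available, feasible):
            for offset in range(len(self._nodes)):
                index = (self._next_index + offset) % len(self._nodes)
                node = self._nodes[index]
                if node in eligible:
                    # Resume after the chosen node on the next call.
                    self._next_index = (index + 1) % len(self._nodes)
                    return node
        return None


picker = SpreadPicker(["n1", "n2", "n3"])
feasible = {"n1", "n2", "n3"}
picks = [picker.pick({"n1", "n3"}, feasible) for _ in range(3)]
fallback = picker.pick(set(), feasible)  # nothing available: fall back to feasible nodes
```

While `n2` is busy the picker alternates between `n1` and `n3`; once no node is available it still returns a feasible node so the task can queue there.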

### Use-cases
- When you want to load-balance your tasks across nodes. e.g. you are building a web service and want to avoid overloading certain nodes.


<!-- ### References:
- See code here
    - [Spread Scheduling Policy is defined here](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/scheduling/policy/spread_scheduling_policy.cc)
  -->

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/scheduling_policy_spread.svg" width="700px">

### Sample Code

In [2]:
import ray


@ray.remote(scheduling_strategy="SPREAD")
def spread_default_func():
    return 2


ray.get(spread_default_func.remote())

2

## Placement Group Scheduling Strategy

In cases when we want to treat a set of resources as a single unit, we can use placement groups.

### How does it work?

- A **placement group** is formed from a set of **resource bundles**
  - A **resource bundle** is a list of resource requirements that fit in a single node
- A **placement group** can specify a **placement strategy** that determines how the **resource bundles** are placed
  - The **placement strategy** can be one of the following:
    - **PACK**: pack the **resource bundles** into as few nodes as possible (best effort)
    - **SPREAD**: spread the **resource bundles** across as many nodes as possible (best effort)
    - **STRICT_PACK**: place all the **resource bundles** on a single node and fail if that is not possible
    - **STRICT_SPREAD**: place each **resource bundle** on a different node and fail if that is not possible
- **Placement Groups** are **atomic** 
  -  i.e. either all the **resource bundles** are placed or none are placed
  -  GCS uses a two-phase commit protocol to ensure atomicity
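
The all-or-nothing behavior can be sketched as a prepare step followed by a commit step. This is a single-process illustration in the spirit of a two-phase commit, not the protocol the GCS actually runs between nodes; `reserve_placement_group` and its arguments are hypothetical.

```python
def reserve_placement_group(bundles, assignment, free_resources):
    """Reserve every bundle on its assigned node, or reserve nothing at all.

    bundles: list of dicts such as {"GPU": 1}
    assignment: list of node ids, one per bundle
    free_resources: node id -> dict of free resources (changed only on success)
    """
    # Phase 1 (prepare): tentatively reserve on a scratch copy.
    tentative = {node: dict(res) for node, res in free_resources.items()}
    for bundle, node in zip(bundles, assignment):
        for name, amount in bundle.items():
            if tentative[node].get(name, 0) < amount:
                return False  # abort: the real state was never touched
            tentative[node][name] = tentative[node].get(name, 0) - amount
    # Phase 2 (commit): every bundle fits, so apply all reservations together.
    free_resources.update(tentative)
    return True


free = {"n1": {"GPU": 2}, "n2": {"GPU": 1}}

# Four 1-GPU bundles cannot fit on three free GPUs: nothing is reserved.
four_ok = reserve_placement_group(
    [{"GPU": 1} for _ in range(4)], ["n1", "n1", "n2", "n2"], free
)
free_after_failure = {node: dict(res) for node, res in free.items()}

# Three bundles fit, so all three are reserved at once.
three_ok = reserve_placement_group(
    [{"GPU": 1} for _ in range(3)], ["n1", "n1", "n2"], free
)
```

The failed attempt leaves the free resources exactly as they were, so no partial reservation is left hogging GPUs.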

### Use-cases

Placement groups are used for **atomic gang scheduling**. Imagine a distributed training job that requires 4 GPUs in total. Other distributed schedulers might first reserve 3 GPUs and then hang waiting for the fourth, hogging resources in the meantime. Ray will instead either reserve all 4 GPUs or not schedule the group at all.

- Use SPREAD when you want to load-balance your tasks across nodes. e.g. you are building a web service and want to avoid overloading certain nodes.
- Use PACK when you want to maximize resource utilization. e.g. you are running training and want to cut costs by packing all your resource bundles on a small subset of nodes.

<!-- ### References
- [See code here for Bundle Scheduling Policy](https://github.com/ray-project/ray/blob/releases/2.8.1/src/ray/raylet/scheduling/policy/bundle_scheduling_policy.cc) -->

<img src="https://assets-training.s3.us-west-2.amazonaws.com/ray-core/task-actor-lifecycle/v2/scheduling/scheduling_policy_placement_group.svg" width="600px">

### Example Code

In [4]:
import ray
from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy
# Import placement group related functions
from ray.util.placement_group import (
    placement_group,
    placement_group_table,
    remove_placement_group,
)

# Reserve a placement group of 1 bundle that reserves 0.1 CPU
pg = placement_group([{"CPU": 0.1}], strategy="PACK", name="my_pg")

# Wait until placement group is created.
ray.get(pg.ready(), timeout=10)

# look at placement group states using the table
print(placement_group_table(pg))


@ray.remote(
    scheduling_strategy=PlacementGroupSchedulingStrategy(
        placement_group=pg,
    ),
    # the task's resource requirement must not exceed the placement group's capacity
    num_cpus=0.1,
)
def placement_group_schedule():
    return 2


out = ray.get(placement_group_schedule.remote())
print(out)

# Remove placement group.
remove_placement_group(pg)

{'placement_group_id': '1f68b1aaa61cf3a4130f9bd5ec2c01000000', 'name': 'my_pg', 'bundles': {0: {'CPU': 0.1}}, 'bundles_to_node_id': {0: 'ed50ecd2dcb4443e107b5b0b9da78065e86e8428924fd1ae82c0a780'}, 'strategy': 'PACK', 'state': 'CREATED', 'stats': {'end_to_end_creation_latency_ms': 2.57, 'scheduling_latency_ms': 2.349, 'scheduling_attempt': 1, 'highest_retry_delay_ms': 0.0, 'scheduling_state': 'FINISHED'}}
2
