## 5. Resource allocation and management

By default, each Ray task requests one logical CPU, so Ray schedules a task as soon as at least one CPU is available.

You can state this request explicitly in the `ray.remote` decorator:

In [None]:
@ray.remote(num_cpus=1)
def remote_add(a, b):
    return a + b

However, these resource specifications are not enforced: they are entirely [logical and not physical](https://docs.ray.io/en/latest/ray-core/scheduling/resources.html#physical-resources-vs-logical-resources).

This means that a task can, for instance, use multiprocessing or multithreading internally and consume more physical CPU than it requested, oversubscribing the machine.

In [None]:
import time

import numpy as np

@ray.remote(num_cpus=1)
def mm(n: int = 4000):
    # Build two random n x n matrices
    A = np.random.rand(n, n)
    B = np.random.rand(n, n)

    # Time the dot product
    start = time.time()
    C = np.dot(A, B)
    end = time.time()
    print(f"Took {end - start}s")

# Same logical request (1 CPU), but a different number of physical threads
ray.get(mm.options(runtime_env={"env_vars": {"OMP_NUM_THREADS": "1"}}).remote())
ray.get(mm.options(runtime_env={"env_vars": {"OMP_NUM_THREADS": "8"}}).remote())

<div class="alert alert-info">

Note: by default, Ray sets the `OMP_NUM_THREADS` environment variable to the `num_cpus` value of the task or actor when one is specified, and to `1` otherwise.

Learn more about <strong><a href="https://docs.ray.io/en/latest/ray-core/scheduling/resources.html#physical-resources-and-logical-resources" target="_blank">physical resources and logical resources</a></strong>.
</div>

### 5.1. Note on resource requests, available resources, and configuring large clusters

<p>During the <em>scheduling stage</em>, Ray evaluates the <strong>resource requirements</strong> specified via the <code>@ray.remote</code> decorator or within the <code>resources={...}</code> argument. These requirements may include:</p>

<ul>
    <li><strong>CPU</strong> (e.g., <code>@ray.remote(num_cpus=2)</code>)</li>
    <li><strong>GPU</strong> (e.g., <code>@ray.remote(num_gpus=1)</code>)</li>
    <li><strong>Custom resources</strong>: User-defined custom resources like <code>"TPU"</code></li>
    <li><strong>Memory</strong></li>
</ul>

<p>Ray's scheduler checks the <strong>resource specification</strong> (sometimes referred to as <strong>resource shape</strong>) to match tasks and actors with available resources in the cluster. If no node can currently satisfy the requested combination, the task or actor waits; on a cluster with autoscaling enabled, Ray may add nodes to meet the demand.</p>
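The matching step can be pictured with a small toy model. This is only a sketch of logical bookkeeping, not Ray's scheduler implementation; the helper names `fits` and `acquire` and the example node are hypothetical.

```python
from typing import Dict

Resources = Dict[str, float]


def fits(request: Resources, available: Resources) -> bool:
    """Return True if every requested quantity is currently available."""
    return all(available.get(name, 0.0) >= amount for name, amount in request.items())


def acquire(request: Resources, available: Resources) -> Resources:
    """Return what is left after reserving `request` (logical accounting only)."""
    if not fits(request, available):
        raise ValueError(f"request {request} does not fit in {available}")
    remaining = dict(available)
    for name, amount in request.items():
        remaining[name] = remaining.get(name, 0.0) - amount
    return remaining


# Hypothetical node advertising 4 logical CPUs and 1 logical GPU
node = {"CPU": 4.0, "GPU": 1.0}

print(fits({"CPU": 2, "GPU": 1}, node))  # True
print(fits({"CPU": 1, "TPU": 1}, node))  # False: the node has no "TPU" resource
print(acquire({"CPU": 0.5}, node))       # {'CPU': 3.5, 'GPU': 1.0}
```

Nothing in this model looks at actual CPU usage: the check is purely arithmetic on the declared quantities, which is the sense in which the resources are logical.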

<p>You can inspect the current resource availability using:</p>
<pre><code>
ray.available_resources()
</code></pre>

<p>This returns a dictionary showing the currently available CPUs, GPUs, memory, and any custom resources, for example:</p>

<pre><code>{'CPU': 24.0, 'GPU': 1.0, 'memory': 2147483648.0}</code></pre>

In [None]:
ray.available_resources()

<div class="alert alert-info">

<strong>Pattern:</strong> configure the head node to be unavailable for compute tasks.

When scaling to large clusters, it's important to ensure that the <strong>head node</strong> does not handle any compute tasks. Users can indicate that the head node is unavailable for compute by setting its resources:

```resources: {"CPU": 0}```

Learn more about <strong><a href="https://docs.ray.io/en/latest/cluster/vms/user-guides/large-cluster-best-practices.html#configuring-the-head-node" target="_blank">configuring the head node</a></strong>.
</div>

### 5.2. Fractional resources

Fractional resources allow Ray Tasks to request a fraction of a CPU or GPU (e.g., 0.5), enabling finer-grained resource allocation.

Let's revisit the `remote_add` example from above, this time requesting half a CPU:

In [None]:
@ray.remote(num_cpus=0.5)
def remote_add(a, b):
    return a + b

In [None]:
ref = remote_add.remote(2, 3)
ref

In [None]:
ray.get(ref)

<div class="alert alert-info">
    Fractional resources include support for <strong><a href="https://docs.ray.io/en/latest/ray-core/scheduling/accelerators.html#fractional-accelerators" target="_blank">fractional accelerators</a></strong>, allowing several tasks or actors to share a single GPU, for example to load multiple smaller models onto it. This is especially useful for scenarios like batch inference. Learn more about <strong><a href="https://docs.ray.io/en/latest/ray-core/scheduling/resources.html#fractional-resource-requirements" target="_blank">fractional resource requirements</a></strong>.
</div>

### 5.3. I/O-bound tasks and fractional resources

Setting fractional CPUs, or even `num_cpus=0`, is a pattern for <strong>I/O-bound tasks</strong> that do not require CPU-intensive computation.

<p>This allows Ray to oversubscribe CPUs, scheduling many such tasks concurrently without reserving CPU cores. Since a request of <code>num_cpus=0</code> never has to wait for a free CPU, these tasks can be scheduled immediately, provided any other resources they request are available.</p>

<p>This can lead to <strong>resource savings</strong> and better utilization in workloads with high I/O.</p>

<p>More details about <a href="https://docs.ray.io/en/latest/ray-core/scheduling/resources.html#fractional-resource-requirements" target="_blank">
fractional resource requirements</a>.</p>