import ray
from ray.autoscaler.sdk import request_resources

ray.init(address="auto")

@ray.remote(num_cpus=0.2)
class ActorA:
    def __init__(self):
        pass

a = ActorA.remote()
request_resources(bundles=[{"CPU": 0.1}, {"CPU": 0.1}])
I have verified my script runs in a clean environment and reproduces the issue.
I have verified the issue also occurs with the latest wheels.
PidgeyBE added the bug and triage labels on Nov 26, 2020.
Hi @PidgeyBE,
When you call request_resources(bundles=[{"CPU": 0.1}, {"CPU": 0.1}]), you get these resources "immediately", but they do not add on top of what you already have.
So if your existing resource demand is {"CPU": 0.2}, the total resource demand becomes [{"CPU": 0.2}, {"CPU": 0.1}]: one of the requested {"CPU": 0.1} bundles is treated as already covered by the running task, while the remaining {"CPU": 0.1} might take more time to become available.
Check out how request_resources() works here.
Does it make sense?
the total Resource Demands become [{'CPU': 0.2}, {'CPU': 0.1}], because one requested bundle {'CPU': 0.1} fits into the currently deployed task with {'CPU': 0.2}.
But if I do
the total Resource Demand becomes [{'CPU': 0.1}, {'CPU': 0.2}, {'CPU': 0.2}], because none of the requested bundles fits into one of the already deployed tasks?
So the Resource Demands are basically the requested resources, minus the bundles that are smaller or equal to running tasks?
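If that reading is right, the deduplication can be sketched roughly like this. This is a simplified model of the bin-packing step described above, assuming each running task can absorb at most one requested bundle; it is not Ray's actual implementation, and `effective_demands` is a hypothetical helper name:

```python
def effective_demands(requested, running):
    """Model the autoscaler's demand deduplication (simplified sketch).

    Each requested bundle that fits inside the resource shape of an
    already running task is considered covered and dropped; each running
    task can absorb at most one bundle. The total demand reported is the
    running tasks' shapes plus the unmet requested bundles.
    """
    remaining_running = list(running)
    unmet = []
    for bundle in requested:
        # Look for a running task whose shape can contain this bundle.
        for i, task in enumerate(remaining_running):
            if all(bundle.get(k, 0) <= task.get(k, 0) for k in bundle):
                remaining_running.pop(i)  # absorbed; cannot absorb again
                break
        else:
            unmet.append(bundle)  # nothing fits, the demand remains
    return running + unmet

# One {'CPU': 0.1} bundle fits into the running {'CPU': 0.2} task:
print(effective_demands([{"CPU": 0.1}, {"CPU": 0.1}], [{"CPU": 0.2}]))
# -> [{'CPU': 0.2}, {'CPU': 0.1}]
```

Under this model, requesting two {'CPU': 0.2} bundles while only a {'CPU': 0.1} task runs leaves both bundles unmet, matching the [{'CPU': 0.1}, {'CPU': 0.2}, {'CPU': 0.2}] demand observed above.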
@PidgeyBE that's right. Note that what you're seeing here is an artifact of the implementation of request_resources() (the internal bin-packing algorithm). If only {"CPU": 1} bundles were used instead of different shapes, this artifact would disappear. We could improve the algorithm in the future to fix this edge case.
The intended use of request_resources() is as a hint to scale the cluster to accommodate the requests, ignoring any existing utilization of the cluster; the resulting cluster size might be slightly larger than necessary due to sub-optimal packing.
What is the problem?
Reproduction (REQUIRED)
-> Output of autoscaling monitor is
-> The expected output is
[{'CPU': 0.2}, {'CPU': 0.1}, {'CPU': 0.1}]
In other tests I did, the requested resources were totally ignored.
ray.kill(a)
Now the output shows: Resource demands: [{'CPU': 0.2}, {'CPU': 0.1}, {'CPU': 0.1}]
So the missing request shows up, but the request related to the actor is not cleaned up ([autoscaler] Actor resource demands are not cleared after actor is scheduled #12441).