[RFC] Add release_resources flag to ray.get/wait #12912

@ericl

In IO-intensive use cases, the application may not want ray.get() to release resources. For example, when object spilling is on, ray.get() may need to read remote objects from external storage. In that case the worker should still be considered "busy"; releasing its resources only causes more workers to be created needlessly, which can lead to thrashing.

Example:

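import ray

@ray.remote
def read(refs):
    # Assume the refs point to objects spilled to external storage.
    for r in refs:
        # Proposed flag (not in the current API): keep resources while blocked.
        ray.get(r, release_resources=False)

# refs_list: a list of lists of ObjectRefs, one inner list per read task.
ray.get([read.remote(refs) for refs in refs_list])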

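Without the release_resources=False flag, every blocked ray.get() releases the worker's resources, so Ray keeps starting new workers for the remaining read tasks. The above program will then spawn too many workers and cause Ray to crash.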
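To make the worker blow-up concrete, here is a toy model in plain Python. It does not use Ray and is not how Ray's scheduler is implemented. The function `peak_workers` and its parameters are hypothetical, and it assumes every task blocks on IO immediately and stays blocked for `io_steps` scheduler ticks.

```python
def peak_workers(num_tasks, num_cpus, io_steps, release_on_get):
    """Return the peak number of live workers in a toy scheduler.

    Each task needs one CPU slot and spends its whole life blocked in a
    get() call. If release_on_get is True, a blocked worker gives its
    slot back, so the scheduler sees the slot as free.
    """
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    pending = num_tasks
    workers = []  # remaining IO ticks for each live (blocked) worker
    peak = 0
    while pending or workers:
        # Slots currently held: blocked workers count only if they keep resources.
        held = 0 if release_on_get else len(workers)
        # Start new workers while the scheduler believes slots are free.
        while pending and held < num_cpus:
            workers.append(io_steps)
            pending -= 1
            if not release_on_get:
                held += 1
        peak = max(peak, len(workers))
        # Advance one tick; drop workers whose IO has finished.
        workers = [w - 1 for w in workers if w > 1]
    return peak


released = peak_workers(num_tasks=100, num_cpus=4, io_steps=5, release_on_get=True)
kept = peak_workers(num_tasks=100, num_cpus=4, io_steps=5, release_on_get=False)
print(released, kept)
```

In this model, releasing on get lets one worker per task exist at the same time, while keeping resources caps live workers at the number of CPU slots.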

Alternatives:

  • A possible alternative is to optimize ray.wait/get to release resources only when waiting on task completion, not when waiting on IO transfers. However, this mechanism would be more complex to implement.
  • Implement as an experimental flag for now.
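The first alternative amounts to a policy decision made at the point where a worker blocks. A minimal sketch of that decision follows, in plain Python. `WaitReason` and `should_release_resources` are hypothetical names for illustration and do not exist in Ray. How the runtime would determine the wait reason is the complex part and is not shown.

```python
from enum import Enum, auto


class WaitReason(Enum):
    TASK_PENDING = auto()     # the task producing the object has not finished
    OBJECT_TRANSFER = auto()  # the object exists but must be fetched or restored


def should_release_resources(reason):
    """Release only when blocked on another task, not on an IO transfer."""
    return reason is WaitReason.TASK_PENDING


for reason in WaitReason:
    print(reason.name, should_release_resources(reason))
```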
