Skip to content

[core] add ray.util.concurrent.futures.RayExecutor#44922

Closed
GuillaumeDesforges wants to merge 2 commits intoray-project:masterfrom
tweag:feature/executor
Closed

[core] add ray.util.concurrent.futures.RayExecutor#44922
GuillaumeDesforges wants to merge 2 commits intoray-project:masterfrom
tweag:feature/executor

Conversation

@GuillaumeDesforges
Copy link
Copy Markdown
Contributor

RayExecutor is a drop-in replacement for ProcessPoolExecutor and ThreadPoolExecutor from concurrent.futures but distributes and executes the specified tasks over a pool of dedicated actors belonging to a Ray cluster instead of multiple processes or threads, respectively.

Why are these changes needed?

This change provides an easy way to use Ray for futures, allowing to scale an existing codebase that relies on them.

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@GuillaumeDesforges GuillaumeDesforges requested a review from a team as a code owner April 23, 2024 13:54
@aslonnie aslonnie requested a review from a team April 26, 2024 06:25
Copy link
Copy Markdown
Collaborator

@aslonnie aslonnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asking core team to review first.

@GuillaumeDesforges GuillaumeDesforges mentioned this pull request Apr 26, 2024
8 tasks
@anyscalesam anyscalesam requested a review from jjyao April 26, 2024 18:12
@anyscalesam anyscalesam added triage Needs triage (eg: priority, bug/not-bug, and owning component) core Issues that should be addressed in Ray Core @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. labels Apr 26, 2024
@jjyao jjyao removed the triage Needs triage (eg: priority, bug/not-bug, and owning component) label Apr 29, 2024
@anyscalesam anyscalesam added triage Needs triage (eg: priority, bug/not-bug, and owning component) and removed @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. labels May 14, 2024
@judahrand
Copy link
Copy Markdown

This would be phenomenally useful! Is this likely to get reviewed soon? Anyway I can do to help?

@judahrand
Copy link
Copy Markdown

@GuillaumeDesforges I made a few fixes in #46249. Feel free to pull them into this PR if you like and then I will close the other one.

@GuillaumeDesforges
Copy link
Copy Markdown
Contributor Author

The changes you propose make sense to me, I'll pull them into this PR asap and try to fix tests 👍

@GuillaumeDesforges
Copy link
Copy Markdown
Contributor Author

GuillaumeDesforges commented Jul 30, 2024

Sorry it took so much time. I've rebased on #46249 to include your changes @judahrand 🙏

I'll try to fix failed tests if I can.

@anyscalesam
Copy link
Copy Markdown
Contributor

@GuillaumeDesforges any progress on this?

@anyscalesam anyscalesam added enhancement Request for new feature and/or capability P1 Issue that should be fixed within a few weeks and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Aug 19, 2024
@anyscalesam
Copy link
Copy Markdown
Contributor

@GuillaumeDesforges any update here?

@GuillaumeDesforges
Copy link
Copy Markdown
Contributor Author

GuillaumeDesforges commented Aug 20, 2024

I have been short on time, but I do plan to make these tests pass. 👍

EDIT 2024-10-04: still on my TODO 😄

@anyscalesam anyscalesam added the @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. label Aug 26, 2024
RayExecutor is a drop-in replacement for ProcessPoolExecutor and
ThreadPoolExecutor from concurrent.futures but distributes and
executes the specified tasks over a pool of dedicated actors
belonging to a Ray cluster instead of multiple processes or threads,
respectively.

Co-authored-by: Mohamed Nidabdella <mohamed.nidabdella@tweag.io>, Johann Eicher <johann.eicher@tweag.io>, Judah Rand <17158624+judahrand@users.noreply.github.com>
Signed-off-by: Guillaume Desforges <guillaume.desforges.pro@gmail.com>
Signed-off-by: Judah Rand <17158624+judahrand@users.noreply.github.com>
Signed-off-by: Guillaume Desforges <guillaume.desforges.pro@gmail.com>
@GuillaumeDesforges
Copy link
Copy Markdown
Contributor Author

I have looked into the doc error. I have no clue how to solve them.

For the Python tests, we seem to have relied on an API that is not available in Python 3.9.

@elijahbenizzy
Copy link
Copy Markdown

Hey folks! Any chance this PR is likely to get merged at some point? Would be very useful, and probably better than my (extremely hacky) prototype:

class RayExecutor(concurrent.futures.ThreadPoolExecutor):
    def __init__(self, threadpool_kwargs: Dict[str, Any], ray_kwargs: Dict[str, Any]):
        super(RayExecutor, self).__init__(**threadpool_kwargs)
        self.ray_kwargs = ray_kwargs

    def init(self):
        # TODO - synchronize number of workers
        ray.init(**self.ray_kwargs)

    def submit(self, __fn, *args, **kwargs):
        return ray.remote(__fn).remote(*args, **kwargs).future()
        ```

@GuillaumeDesforges
Copy link
Copy Markdown
Contributor Author

GuillaumeDesforges commented Dec 17, 2024

@elijahbenizzy happy to see interest in our work 😄

I won't have time anytime soon to try to satisfy the CI unfortunately. Anyone is free to pick this up. 🙂

EDIT: just saw someone was assigned to it. Nice!

@davidwagnerkc
Copy link
Copy Markdown
Contributor

Not sure about the docs / linter, but for the actual core tests I think could use something like this to make the _result_or_cancel available to Python 3.9.

diff --git a/python/ray/util/concurrent/futures/ray_executor.py b/python/ray/util/concurrent/futures/ray_executor.py
index 237a9f17cd..62e0a4cfee 100644
--- a/python/ray/util/concurrent/futures/ray_executor.py
+++ b/python/ray/util/concurrent/futures/ray_executor.py
@@ -2,7 +2,21 @@ from abc import ABC, abstractmethod
 import time
 from functools import partial
 from concurrent.futures import Executor, Future
-from concurrent.futures._base import _result_or_cancel  # type: ignore
+
+try:
+    from concurrent.futures._base import _result_or_cancel  # type: ignore
+except ImportError:
+    # Backport private Python 3.12 function to Python 3.9
+    # https://github.com/python/cpython/tree/main/Lib/concurrent/futures/_base.py#L306
+    def _result_or_cancel(fut, timeout=None):
+        try:
+            try:
+                return fut.result(timeout)
+            finally:
+                fut.cancel()
+        finally:
+            del fut
+
 from typing import (
     Any,
     ParamSpec,

@davidwagnerkc
Copy link
Copy Markdown
Contributor

@jjyao @GuillaumeDesforges Not sure if we want to get this merged or not, but #51933 makes CI all green for this PR. Changes in the description.

@davidwagnerkc
Copy link
Copy Markdown
Contributor

@GuillaumeDesforges

So this pattern of accessing results:

from concurrent.futures import ProcessPoolExecutor
with ProcessPoolExecutor() as ex:
    results = ex.map(str, list(range(25)))
print(list(results))

Will hang for RayExecutor:

from ray.util.concurrent.futures.ray_executor import RayExecutor
with RayExecutor() as ex:
    results = ex.map(str, list(range(25)))
print(list(results))

In fact you quite nicely have the test_results_are_not_accessible_after_shutdown test to test for this fact. But what prevents us from keeping this behavior the same? You should not need anything Ray, any refs, and futures, etc. after the exit of the context manager, so if wait=True on shutdown as it is by default we should not only future.result() to wait on the computations, but also return those results correct?

I can poke at this if we want to change it, but wanted to ask first since I imagine you thought through this given you wrote the test for it.

Cool PR by the way. I have done serious damage with the ProcessPoolExecutor in many different contexts so think it is a great utility API to provide.

@hainesmichaelc hainesmichaelc added the community-contribution Contributed by the community label Apr 4, 2025
@davidwagnerkc
Copy link
Copy Markdown
Contributor

I made a change to address the above comment - moving discussion to #51933

@codope
Copy link
Copy Markdown
Contributor

codope commented May 27, 2025

Closing this PR in favor of #51933 as it fixes the CI.

@codope codope closed this May 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community core Issues that should be addressed in Ray Core enhancement Request for new feature and/or capability @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. P1 Issue that should be fixed within a few weeks

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants