[Core] Ray-based concurrent.futures.Executor #30826

Closed
wants to merge 45 commits into from

Conversation

@jeicher jeicher commented Dec 1, 2022

Why are these changes needed?

This PR provides RayExecutor, a drop-in replacement for ProcessPoolExecutor and ThreadPoolExecutor from concurrent.futures, which executes tasks on a Ray cluster.
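
As a rough illustration of what "drop-in replacement" means here: code written against the concurrent.futures.Executor interface should not care which backend runs the tasks. The sketch below uses only the standard library. InlineExecutor, square and sum_of_squares are hypothetical names made up for this example, and InlineExecutor merely stands in for an alternative backend such as the proposed RayExecutor, whose actual implementation is not shown in this conversation.

```python
from concurrent.futures import Executor, Future, ThreadPoolExecutor


class InlineExecutor(Executor):
    """Hypothetical stand-in backend: runs each task immediately in the caller."""

    def submit(self, fn, /, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Store the failure so that future.result() re-raises it.
            future.set_exception(exc)
        return future


def square(value):
    return value * value


def sum_of_squares(executor, values):
    # Relies only on the Executor interface (map), not on a concrete class.
    return sum(executor.map(square, values))


# The same calling code works with either backend.
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded_total = sum_of_squares(pool, range(5))

with InlineExecutor() as inline:
    inline_total = sum_of_squares(inline, range(5))

print(threaded_total, inline_total)
```

Because the base class implements map and the context-manager protocol on top of submit and shutdown, a new backend only has to supply those two methods to be interchangeable with the standard executors.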

Related issue number

Closes #29456

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

jeicher and others added 9 commits December 1, 2022 13:31
Co-authored-by: Mohamed Nidabdella <mohamed.nidabdella@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Co-authored-by: Mohamed Nidabdella <mohamed.nidabdella@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: Johann Eicher <johann.eicher@tweag.io>
Signed-off-by: mohamed <mohamed.nidabdella@tweag.io>
@jeicher
Author

jeicher commented Dec 1, 2022

The three failing tests don't seem to have anything to do with our contribution. The Docker Bootstrap test, at the very least, failed because Ubuntu failed to update.

Perhaps someone at Ray could shed some light on the failures?

@jeicher jeicher marked this pull request as ready for review December 1, 2022 14:05
@jeicher jeicher requested a review from a team as a code owner December 1, 2022 14:05
@jeicher jeicher changed the title Feature/executor2 [Core] Ray-based concurrent.futures.Executor Dec 1, 2022
@richardliaw richardliaw added the core Issues that should be addressed in Ray Core label Dec 2, 2022
@rkooo567
Contributor

sorry for the delay. I will review this tomorrow!

Contributor

@rkooo567 rkooo567 left a comment

Hmm, actually, before diving deep into the review: does this currently only support submitting one task? It looks like the current subclasses of Executor (ProcessPoolExecutor and ThreadPoolExecutor) both have a concept of a pool (max_workers). Should we have the same thing?

python/ray/util/ray_executor.py (outdated)
python/ray/util/ray_executor.py (outdated)
return fn(*args, **kwargs)

self.__remote_fn = remote_fn
self.context = ray.init(ignore_reinit_error=True, **kwargs)
Contributor

Should we expose context as a @property method?

The initial intent was that RayExecutor would be used as a context manager (in a Python with block), so once we exit the context, Ray is shut down. I don't think we need to expose context as a @property method, at least for now.
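
To make the context-manager behaviour described above concrete, here is a minimal standard-library sketch. TeardownSketchExecutor and its backend_running flag are hypothetical names for this example only. The flag stands in for a running Ray runtime, and the assumption that a Ray-backed executor would tear Ray down inside shutdown is inferred from the comment above, not taken from the PR's code.

```python
from concurrent.futures import Executor, Future


class TeardownSketchExecutor(Executor):
    """Hypothetical executor whose backend lives only as long as the with block."""

    def __init__(self):
        # Stands in for starting a runtime (the reviewed snippet calls ray.init here).
        self.backend_running = True

    def submit(self, fn, /, *args, **kwargs):
        if not self.backend_running:
            raise RuntimeError("cannot schedule new futures after shutdown")
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future

    def shutdown(self, wait=True, *, cancel_futures=False):
        # Executor.__exit__ calls shutdown(wait=True); a Ray-backed executor
        # would presumably shut Ray down at this point (assumption).
        self.backend_running = False


with TeardownSketchExecutor() as executor:
    result = executor.submit(pow, 2, 10).result()
    running_inside = executor.backend_running

running_after = executor.backend_running
```

With this shape, callers never touch the backend handle directly, which is consistent with the argument that exposing context as a property is not needed for now.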

python/ray/util/ray_executor.py Outdated Show resolved Hide resolved
@nidabdella

nidabdella commented Jan 2, 2023

@rkooo567 can you take another look at this?

@jeicher
Author

jeicher commented Jan 4, 2023

Hmm, actually, before diving deep into the review: does this currently only support submitting one task? It looks like the current subclasses of Executor (ProcessPoolExecutor and ThreadPoolExecutor) both have a concept of a pool (max_workers). Should we have the same thing?

Thanks @rkooo567! I've refactored quite a bit to make multiple task submission much clearer. The executor now has a max_workers argument, which executes tasks over an actor pool. If that kwarg is not specified, multiple tasks can still be submitted to the same RayExecutor instance, but scheduling is managed entirely by Ray (i.e. there is no fixed set of actors/workers).
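
The two modes described above can be sketched with the standard library alone. This is not the PR's implementation: PooledSketchExecutor is a hypothetical name, single-threaded ThreadPoolExecutor instances play the role of actors, and a default-sized thread pool plays the role of "let Ray schedule it".

```python
import threading
from concurrent.futures import Executor, ThreadPoolExecutor
from itertools import cycle
from typing import Optional


class PooledSketchExecutor(Executor):
    """Hypothetical sketch: optional fixed pool of workers, else backend-managed."""

    def __init__(self, max_workers: Optional[int] = None):
        if max_workers is not None and max_workers < 1:
            raise ValueError("max_workers must be at least 1")
        if max_workers is None:
            # No fixed pool: hand every task to one backend-managed scheduler.
            self._workers = [ThreadPoolExecutor()]
        else:
            # Fixed pool: each single-threaded executor stands in for one actor.
            self._workers = [
                ThreadPoolExecutor(max_workers=1) for _ in range(max_workers)
            ]
        self._round_robin = cycle(self._workers)
        self._lock = threading.Lock()

    def submit(self, fn, /, *args, **kwargs):
        # Pick the next worker in round-robin order; the lock keeps the
        # selection safe when several threads submit at once.
        with self._lock:
            worker = next(self._round_robin)
        return worker.submit(fn, *args, **kwargs)

    def shutdown(self, wait=True, *, cancel_futures=False):
        for worker in self._workers:
            worker.shutdown(wait=wait, cancel_futures=cancel_futures)


with PooledSketchExecutor(max_workers=2) as pooled:
    pooled_results = list(pooled.map(pow, [2, 3, 4], [2, 2, 2]))

with PooledSketchExecutor() as unpooled:
    unpooled_results = list(unpooled.map(pow, [2, 3, 4], [2, 2, 2]))
```

Round-robin assignment is just one simple policy chosen for the sketch. An actor pool could equally hand each task to whichever worker becomes idle first.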

@rkooo567
Contributor

thanks for the update and sorry for the delay! I am planning to review this tomorrow (have been off around new year!)

@rkooo567
Contributor

btw there are lots of test failures. Do you mind merging the latest master?

@nidabdella

@rkooo567 any update on this?

@jjyao
Contributor

jjyao commented Apr 26, 2023

Hi @nidabdella,

Could you reply to the comments I posted above? We can also schedule a meeting to go through the design and make sure we are aligned.

@jeicher jeicher closed this May 3, 2023
@jeicher jeicher reopened this May 3, 2023
@jeicher
Author

jeicher commented May 3, 2023

Hi @nidabdella,

Could you reply to the comments I posted above? We can also schedule a meeting to go through the design and make sure we are aligned.

Thanks @jjyao, I've reverted to an Actor-based model. In the meantime, I've pinged you on Slack. Let's try and organise a chat to make sure we're on the same page.

@rkooo567
Contributor

I am a bit occupied lately, so I will rely on @jjyao to finish the review!

@rkooo567 rkooo567 removed their assignment May 10, 2023
@stale

stale bot commented Jun 10, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Jun 10, 2023
@jjyao jjyao removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Jun 14, 2023
@jjyao
Contributor

jjyao commented Jun 14, 2023

Hi @jeicher, do you have any updates on this?

@jeicher
Author

jeicher commented Jun 14, 2023

Hi @jeicher, do you have any updates on this?

Hi @jjyao, thanks for checking. I am on leave at the moment so it will be a while before I get to this again.

@stale

stale bot commented Jul 15, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Jul 15, 2023
@jeicher
Author

jeicher commented Jul 17, 2023

Hey guys, I'll be back from leave soon. Hoping to get to this in the near future!

@stale

stale bot commented Aug 11, 2023

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

@stale stale bot closed this Aug 11, 2023
@fkaleo
Contributor

fkaleo commented Nov 14, 2023

Just chiming in to say I love the idea :)

@jeicher
Author

jeicher commented Nov 14, 2023

@fkaleo I'll have a finished solution for this very soon 👍🏻

@jeicher jeicher mentioned this pull request Nov 14, 2023
8 tasks
Labels
core Issues that should be addressed in Ray Core stale The issue is stale. It will be closed within 7 days unless there are further conversation
Development

Successfully merging this pull request may close these issues.

[Core] Implement Ray based concurrent.futures.Executor
8 participants