[batch] Create Pipeline State Manager #4

Closed
pdames opened this issue Mar 9, 2022 · 4 comments

Comments

@pdames
Member

pdames commented Mar 9, 2022

The Ray Pipeline State Manager is a central service that consolidates the execution state of all scheduled pipeline work items in Ray's object store. It should be based on the current single-process implementation used in Beam's FnApiRunner.

At a high level, this should be a Ray Actor that worker tasks use for (1) durable persistence of any ObjectRef they have stored in Ray's object store via `ref = ray.put(obj)` and (2) on-demand retrieval of any persisted ObjectRef, which they can materialize via `obj = ray.get(ref)`.
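A minimal sketch of the registry interface such an actor might expose. The class name `PipelineStateManager` and the methods `register`, `lookup`, and `keys` are hypothetical, not taken from the prototype. The class is plain Python so it runs without a cluster; the trailing comments show how it could be wrapped as a Ray actor. Holding a ref in the actor only keeps the object referenced, it does not by itself make it durable.

```python
from typing import Any, Dict, List


class PipelineStateManager:
    """Hypothetical registry mapping a state key to an object reference."""

    def __init__(self) -> None:
        self._refs: Dict[str, Any] = {}

    def register(self, key: str, wrapped_ref: List[Any]) -> None:
        # The ref arrives wrapped in a one-element list: Ray resolves an
        # ObjectRef passed as a top-level argument to its value, but leaves
        # refs nested inside a container untouched.
        (ref,) = wrapped_ref
        self._refs[key] = ref

    def lookup(self, key: str) -> List[Any]:
        # Return the ref wrapped in a list as well, mirroring register().
        return [self._refs[key]]

    def keys(self) -> List[str]:
        return sorted(self._refs)


# Assumed usage under Ray (not executed here):
#   manager = ray.remote(PipelineStateManager).remote()
#   ref = ray.put(obj)
#   ray.get(manager.register.remote("stage-1/output", [ref]))
#   (ref_back,) = ray.get(manager.lookup.remote("stage-1/output"))
#   obj = ray.get(ref_back)
```

The key string `"stage-1/output"` is only an illustration; the real key scheme would presumably follow whatever FnApiRunner uses for its state.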

The state manager should also support efficient, atomic checkpointing and restoration of all state persisted in Ray's in-memory object store to durable storage (e.g. local disk or a durable cloud storage service).
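One way the atomic part could work for the on-disk case is the usual write-to-temp-file-then-rename pattern. The sketch below is an assumption, not the planned design: `checkpoint`, `restore`, `materialize`, and `rehydrate` are hypothetical names. Under Ray, `materialize` would likely be `ray.get` and `rehydrate` would be `ray.put`, so the file holds values rather than serialized refs.

```python
import os
import pickle
import tempfile
from typing import Any, Callable, Dict


def checkpoint(
    state: Dict[str, Any],
    path: str,
    materialize: Callable[[Any], Any] = lambda ref: ref,
) -> None:
    """Write all state to `path` so readers see the old or the new file, never a partial one."""
    values = {key: materialize(ref) for key, ref in state.items()}
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(values, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic swap of the checkpoint file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def restore(
    path: str,
    rehydrate: Callable[[Any], Any] = lambda value: value,
) -> Dict[str, Any]:
    """Load a checkpoint and turn each value back into a reference."""
    with open(path, "rb") as f:
        values = pickle.load(f)
    return {key: rehydrate(value) for key, value in values.items()}
```

This materializes every object in one process, which is simple but probably not the "efficient" variant for large state; a cloud storage backend would also need its own atomic-commit mechanism.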

@pdames
Member Author

pdames commented Mar 9, 2022

This work is required as part of #2

@ericl

ericl commented Mar 9, 2022

> The state manager should also support efficient, atomic checkpointing and restoration of all state persisted in Ray's in-memory object store to durable storage (e.g. on-disk or to a durable cloud storage service etc.).

This is interesting. We might have to do some work on improving the semantics of ObjectRef serialization (in particular the interaction with ref-counting), since right now refs are pinned in memory forever once they are exported. Hence, checkpointing may cause these objects to be leaked in the object store. cc @jjyao

@pdames
Member Author

pdames commented Apr 1, 2022

@pabloem prototype: #6

pabloem closed this as completed Oct 26, 2022
@pabloem
Collaborator

pabloem commented Oct 26, 2022

@iasoon implemented something like this
