move Worker management to separate CPI operation #377
This was referenced Jul 26, 2023
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Aug 9, 2023
Reimplement `scalems.radical.executor.manage_raptor()` as `scalems.radical.manager.manage_raptor()`

- [X] Acquire the Raptor master task through the RuntimeManager.
- [ ] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [ ] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Normalize the "stop" command to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate RPExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)

Ref SCALE-MS#345, SCALE-MS#377
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Aug 9, 2023
* Introduce some containers for CPI Command management.
* Reimplement a `cpi` method on the RuntimeManager.
* Shut down queue runner threads on `close()`
* Translate CPI command messages into appropriate messages to the ScalemsRaptor and fulfil Futures with `CommandItem.run()`

- [X] Acquire the Raptor master task through the RuntimeManager.
- [X] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [ ] Make sure CPI Session is properly shut down.
- [ ] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Normalize the "stop" command to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate RPExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)

Ref SCALE-MS#345, SCALE-MS#377
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Aug 9, 2023
Add and rearrange some program state management. Add some notes and describe incomplete state management. Ref SCALE-MS#378, SCALE-MS#383.

- [X] Acquire the Raptor master task through the RuntimeManager.
- [X] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [X] Make sure CPI Session is properly shut down. (Partially deferred)
- [ ] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Normalize the "stop" command to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate RPExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)

Ref SCALE-MS#345, SCALE-MS#377
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Aug 9, 2023
Implement the START_SCOPE and EXIT_SCOPE calls to trigger Worker submission and to leave the scope of those Workers.

Note that Raptor does not actually provide an API for stopping Workers from the Raptor master, so the CPI semantics do not map precisely to the runtime logic.

- [X] Acquire the Raptor master task through the RuntimeManager.
- [X] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [X] Make sure CPI Session is properly shut down. (Partially deferred)
- [X] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Normalize the CPI usage to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate CPIExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)
- [ ] Separate `messages` for intercomponent communication from `cpi` calls and responses.

Ref SCALE-MS#345, SCALE-MS#377
Merged
eirrgang
added a commit
that referenced
this issue
Aug 10, 2023
Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls).

* Introduce some containers for CPI Command management.
* Reimplement a `cpi` method on the RuntimeManager.
* Shut down queue runner threads on `close()`
* Translate CPI command messages into appropriate messages to the ScalemsRaptor and fulfil Futures with `CommandItem.run()`

Ref #345, #377
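
The queue-runner pattern described in this commit can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scalems implementation: `CommandItem` here is a hypothetical stand-in that borrows the name from the commit message, and `CommandQueue`, `submit()`, and the `None` sentinel are invented for the example.

```python
import concurrent.futures
import queue
import threading
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class CommandItem:
    """Hypothetical container pairing a command with the Future for its result."""

    name: str
    handler: Callable[[], Any]
    future: concurrent.futures.Future = field(default_factory=concurrent.futures.Future)

    def run(self) -> None:
        # Skip items whose Future was cancelled before the runner reached them.
        if not self.future.set_running_or_notify_cancel():
            return
        try:
            self.future.set_result(self.handler())
        except Exception as e:
            self.future.set_exception(e)


class CommandQueue:
    """Hypothetical manager: one runner thread drains the queue in order."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[Optional[CommandItem]]" = queue.Queue()
        self._runner = threading.Thread(target=self._run, daemon=True)
        self._runner.start()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # Sentinel placed by close().
                break
            item.run()

    def submit(self, name: str, handler: Callable[[], Any]) -> concurrent.futures.Future:
        item = CommandItem(name, handler)
        self._queue.put(item)
        return item.future

    def close(self) -> None:
        # Items already queued are processed before the sentinel is reached.
        self._queue.put(None)
        self._runner.join()
```

In this sketch, `close()` returns as soon as the runner has drained the queue, so shutdown does not depend on a timeout. Anything submitted after `close()` would never run, which a real implementation would need to guard against.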
eirrgang
added a commit
that referenced
this issue
Aug 10, 2023
Add and rearrange some program state management. Add some notes and describe incomplete state management.

Deferred:
- Make sure CPI Session is properly shut down. (Partially deferred)
- Acquire the Worker(s) through CPI call to the RuntimeManager.
- Normalize the "stop" command to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- Isolate RPExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)

Ref #345, #377
eirrgang
added a commit
that referenced
this issue
Aug 11, 2023
* RP TaskDescription *metadata* field changes from a `dict` to a `tuple[str, None|dict]` of operation name and operand.
* `scalems.messages.Command` class hierarchy is replaced by simple `TypedDict`s and utility functions in new `scalems.cpi` module.
* `CpiCommand._command_name()` replaces `CpiCommand.command_class()` required classmethod to support `scalems.radical.CpiCommand` subclass registration (for creation function dispatching).
* Update signature for `scalems.radical.manager.RuntimeManager.cpi()`.

Ref #377
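
As a rough illustration of the `TypedDict`-plus-utility-functions approach and the `tuple[str, None|dict]` metadata shape, consider the sketch below. All names in it (`CpiCall`, `make_call`, `to_metadata`, `from_metadata`, and the lowercase operation strings) are hypothetical and chosen for the example; the actual contents of `scalems.cpi` may differ.

```python
from typing import Optional, Tuple, TypedDict


class CpiCall(TypedDict):
    """Hypothetical message shape: an operation name and an optional operand."""

    operation: str
    operand: Optional[dict]


# Assumed operation names, based on the commands mentioned in this issue.
_OPERATIONS = ("start_scope", "exit_scope", "stop")


def make_call(operation: str, operand: Optional[dict] = None) -> CpiCall:
    """Build a call, rejecting operations this sketch does not know about."""
    if operation not in _OPERATIONS:
        raise ValueError(f"Unknown operation: {operation!r}")
    return CpiCall(operation=operation, operand=operand)


def to_metadata(call: CpiCall) -> Tuple[str, Optional[dict]]:
    """Pack a call into the (operation name, operand) tuple form."""
    return (call["operation"], call["operand"])


def from_metadata(metadata: Tuple[str, Optional[dict]]) -> CpiCall:
    """Unpack the tuple form back into a call."""
    operation, operand = metadata
    return make_call(operation, operand)
```

Because a `TypedDict` instance is a plain `dict` at run time, messages of this kind serialize without custom encoders, which is one plausible motivation for moving away from a class hierarchy.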
eirrgang
added a commit
that referenced
this issue
Aug 23, 2023
Implement the START_SCOPE and EXIT_SCOPE calls to trigger Worker submission and to leave the scope of those Workers. Clarify the difference between EXIT_SCOPE and STOP.

Note that Raptor does not actually provide an API for stopping Workers from the Raptor master, so the CPI semantics do not map precisely to the runtime logic.

- [X] Acquire the Raptor master task through the RuntimeManager.
- [X] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [X] Make sure CPI Session is properly shut down. (Partially deferred)
- [X] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Normalize the CPI usage to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate CPIExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)
- [ ] Separate `messages` for intercomponent communication from `cpi` calls and responses.

Ref #345, #377
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Oct 19, 2023
* RP TaskDescription *metadata* field changes from a `dict` to a `tuple[str, None|dict]` of operation name and operand.
* `scalems.messages.Command` class hierarchy is replaced by simple `TypedDict`s and utility functions in new `scalems.cpi` module.
* `CpiCommand._command_name()` replaces `CpiCommand.command_class()` required classmethod to support `scalems.radical.CpiCommand` subclass registration (for creation function dispatching).
* Update signature for `scalems.radical.manager.RuntimeManager.cpi()`.

Ref SCALE-MS#377
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Oct 19, 2023
Implement the START_SCOPE and EXIT_SCOPE calls to trigger Worker submission and to leave the scope of those Workers. Clarify the difference between EXIT_SCOPE and STOP.

Note that Raptor does not actually provide an API for stopping Workers from the Raptor master, so the CPI semantics do not map precisely to the runtime logic.

- [X] Acquire the Raptor master task through the RuntimeManager.
- [X] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [X] Make sure CPI Session is properly shut down. (Partially deferred)
- [X] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Implement EXIT_SCOPE in the ScalemsMaster.
- [ ] Normalize the CPI usage to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate CPIExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)
- [ ] Separate `messages` for intercomponent communication from `cpi` calls and responses.

Ref SCALE-MS#345, SCALE-MS#377
eirrgang
added a commit
to eirrgang/scale-ms
that referenced
this issue
Oct 19, 2023
FIXME: Getting Aborted while queue runner is waiting for next item

Implement the START_SCOPE and EXIT_SCOPE calls to trigger Worker submission and to leave the scope of those Workers. Clarify the difference between EXIT_SCOPE and STOP.

Note that Raptor does not actually provide an API for stopping Workers from the Raptor master, so the CPI semantics do not map precisely to the runtime logic.

- [X] Acquire the Raptor master task through the RuntimeManager.
- [X] Manage a CPI command queue translating CPI calls to Raptor-backed Futures (RPTasks or RPC calls)
- [X] Make sure CPI Session is properly shut down. (Partially deferred)
- [X] Acquire the Worker(s) through CPI call to the RuntimeManager.
- [ ] Implement EXIT_SCOPE in the ScalemsMaster.
- [ ] Normalize the CPI usage to shut down everything cleanly and expeditiously. (In too many cases right now, tests take an improperly long time because of various timeouts.)
- [ ] Isolate CPIExecutor concurrent.futures.Executor support from asyncio support (avoid blocking the event loop by avoiding event loop usage in the main implementation)
- [ ] Separate `messages` for intercomponent communication from `cpi` calls and responses.

Ref SCALE-MS#345, SCALE-MS#377
Let RuntimeManager mediate Raptor resource acquisition.

RPExecutor merely represents the allocated resources, and does not
directly implement the resource management. This allows RPExecutor to
be responsible for providing the concurrent.futures.Executor interface,
which isn't particularly friendly to the asyncio protocols. For regular
scalems usage, RPExecutor should be used in a non-main thread via the
scalems asyncio utilities.

Isolate RPExecutor concurrent.futures.Executor support from
asyncio support (avoid blocking the event loop by avoiding event loop
usage in the main implementation)

We also need this so that we can restore and normalize the "stop"
command to shut down everything cleanly and expeditiously. In too many
cases right now, tests take an improperly long time because of various
timeouts.
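
For context, the general technique of driving a `concurrent.futures.Executor` from asyncio code without blocking the event loop might look like the sketch below. It uses the standard library `ThreadPoolExecutor` as a stand-in for RPExecutor, and `blocking_work` is a hypothetical placeholder for a blocking call; neither reflects the actual scalems code or its asyncio utilities.

```python
import asyncio
import concurrent.futures
import time


def blocking_work(x: int) -> int:
    """Hypothetical stand-in for a call that blocks its calling thread."""
    time.sleep(0.05)
    return x * 2


async def main() -> list:
    loop = asyncio.get_running_loop()
    # ThreadPoolExecutor stands in for any concurrent.futures.Executor.
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        # Option 1: let the event loop submit the call and return an awaitable.
        a = loop.run_in_executor(executor, blocking_work, 1)
        # Option 2: submit directly, then wrap the concurrent.futures.Future
        # so that it can be awaited from a coroutine.
        b = asyncio.wrap_future(executor.submit(blocking_work, 2))
        # Await both before leaving the `with` block: its exit calls
        # executor.shutdown(wait=True), which would otherwise block the loop.
        return await asyncio.gather(a, b)


results = asyncio.run(main())
```

Both options keep the blocking work off the event loop thread. The executor itself never touches the event loop, which is the kind of separation this issue asks for.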
Supports #335