Precise exception support #742

Open
pmccormick opened this issue Feb 7, 2020 · 3 comments
Labels: enhancement, Legion (issues pertaining to Legion)

Comments

@pmccormick
Contributor

A proposal to implement precise exception support within the runtime. There are a number of factors that would make this useful:

  • Advanced debugging mechanisms.
  • Expanded capabilities for fault/error recovery and restartable tasks (beyond current feature set -- perhaps more localized/fine-grained with this path?)
  • Controlled injection of tasks into an executing application (e.g., advanced workflows with a REPL-based mechanism for in situ data analytics, avoiding the abort-and-requeue model in HPC workflows).
  • Etc... These are the top ones (in no particular order) on my mind...
@lightsighter self-assigned this Feb 10, 2020
@lightsighter added the backlog (feature/fix that is desirable, but not currently planned), enhancement, and Legion labels Feb 10, 2020
@lightsighter
Contributor

Mainly what we want is the ability to pause the runtime at a point in time and have the state of the regions and the program look like it is in the middle of a sequential execution. This will allow debuggers to be built around Legion that can introspect Legion programs in a sane way.
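To illustrate the property described above (and only the property, not how Legion does or would implement it), here is a minimal toy sketch in Python. It assumes tasks may finish in any order while their effects become visible strictly in program order, similar in spirit to in-order commit in processors. The class `InOrderCommitLog` and its methods are hypothetical names invented for this example.

```python
class InOrderCommitLog:
    """Toy model: tasks may finish in any order, effects commit in program order.

    Hypothetical illustration only; this is not the Legion API or design.
    """

    def __init__(self):
        self.state = {}       # committed, externally visible state
        self._buffered = {}   # task index -> buffered writes, or an exception
        self._next = 0        # index of the next task allowed to commit

    def finish(self, index, writes=None, error=None):
        """Record the result of task `index`, which may arrive out of order."""
        self._buffered[index] = error if error is not None else dict(writes or {})

    def commit_ready(self):
        """Commit finished tasks in program order; stop at the first failed one.

        Returns the index of the failed task, or None if no failure was reached.
        """
        while self._next in self._buffered:
            result = self._buffered[self._next]
            if isinstance(result, Exception):
                return self._next  # visible state reflects tasks 0.._next-1 only
            self.state.update(result)
            del self._buffered[self._next]
            self._next += 1
        return None


log = InOrderCommitLog()
log.finish(2, {"x": 30})           # task 2 finishes first, but cannot commit yet
log.finish(0, {"x": 10})
first = log.commit_ready()         # commits task 0 only
log.finish(1, error=RuntimeError("boom"))
failed = log.commit_ready()        # stops at task 1; task 2 stays invisible
```

When the commit loop stops at the failed task, `log.state` holds exactly the effects of the tasks that precede it in program order, which is the "middle of a sequential execution" view a debugger would want to inspect.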

@magnatelee
Contributor

Legate libraries (or any Python library built on top of Legion) can benefit from precise exceptions, and I'd like to put this back on our priority list. Any errors from the C++ tasks in those libraries are currently hard errors. They should instead be propagated back to Python without halting the runtime and caught at the point where the faulty task was launched. The latter also requires that nothing beyond that faulty task make any perceivable changes to the program state. Since this precise handling of exceptions is inherently a sequential, blocking process, which is bad for performance, it shouldn't be the only mode of execution. We will want another mode in which exceptions are checked in a deferred manner (possibly without any recourse).

@magnatelee removed the backlog label Apr 23, 2021
@lightsighter
Contributor

> Since this precise handling of exceptions is inherently a sequential, blocking process, which is bad for performance, this shouldn't be the only mode of execution and we will want to have another one where exceptions are checked in a deferred manner (possibly without any recourse).

To be clear, the default will be non-precise exceptions, and users will have to opt in to running the runtime in a mode that supports precise exceptions.
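The two modes discussed in this thread can be sketched with a small toy runtime in plain Python. This is only an illustration under stated assumptions: `ToyRuntime`, its `precise` flag, `launch`, and `check_deferred` are hypothetical names, not Legion or Legate APIs. Precise mode is opt-in and blocks at each launch so that a failure surfaces at the launch site; the default mode defers exception checks until later.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRuntime:
    """Hypothetical stand-in for a task runtime; not the Legion or Legate API."""

    def __init__(self, precise=False):  # non-precise by default, precise is opt-in
        self.precise = precise
        self._pool = ThreadPoolExecutor(max_workers=1)  # one worker keeps launch order
        self._pending = []

    def launch(self, task, *args):
        future = self._pool.submit(task, *args)
        if self.precise:
            # Block so a failure is raised at the launch site,
            # before any later task has been submitted.
            future.result()
        else:
            self._pending.append(future)
        return future

    def check_deferred(self):
        """Wait for all pending tasks and return the exceptions they raised."""
        pending, self._pending = self._pending, []
        errors = [f.exception() for f in pending]
        return [e for e in errors if e is not None]

    def close(self):
        self._pool.shutdown(wait=True)


def run_demo(precise):
    state = []
    rt = ToyRuntime(precise=precise)

    def ok(value):
        state.append(value)

    def bad():
        raise ValueError("task failed")

    caught_at_launch = False
    try:
        rt.launch(ok, 1)
        rt.launch(bad)
        rt.launch(ok, 2)  # never reached in precise mode
    except ValueError:
        caught_at_launch = True
    deferred_errors = rt.check_deferred()
    rt.close()
    return state, caught_at_launch, len(deferred_errors)
```

In this sketch, `run_demo(True)` catches the error at the launch of the faulty task and the later task never runs, whereas `run_demo(False)` lets the later task modify `state` and only reports the error when `check_deferred` is called, which is the "possibly without any recourse" trade-off mentioned above.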
