Dynamic tracing TODO #407
This is not actually dead code:
I had to disable the memoize tests for circuit_sparse.rg and pennant_fast.rg in the nopaint branch because they both hit the "dead code" assertion from the previous comment.
I checked the items that I know are handled by either me or Mike. I propose we revisit the list and create a new issue per outstanding item.
Per Mike's suggestion, I'm adding a comment. We regularly run into non-idempotent traces with S3D when making small extensions (e.g., adding a new boundary condition). I don't think we can expect users to understand and debug this, so I'd suggest a mode under mapper control that allows non-replayable traces to be replayed by issuing copies to satisfy the preconditions where necessary. There is a question of whether the system would just pick some instances to move, or whether we should have mapper calls for establishing preconditions that pick which of multiple instances to use.
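To make the proposal concrete, here is a minimal sketch of the policy described above: before replaying a trace, check each captured precondition and, if the mapper allows it, issue copies to re-establish the ones that no longer hold. All names (`Precondition`, `TraceTemplate`, `replay_with_fixups`, the instance names) are hypothetical illustrations, not Legion's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Precondition:
    region: str    # logical region the recorded trace expects
    instance: str  # physical instance it must be valid in

@dataclass
class TraceTemplate:
    preconditions: list  # Preconditions captured at record time
    operations: list     # opaque recorded operations

def replay_with_fixups(template, valid_instances, allow_fixups=True):
    """Return the list of (region, src, dst) copies needed before replay,
    or None if the trace cannot be replayed under the current policy."""
    copies = []
    for pre in template.preconditions:
        if pre.instance in valid_instances.get(pre.region, set()):
            continue  # precondition already satisfied
        if not allow_fixups:
            return None  # mapper disallowed fixups: fall back to re-analysis
        # Pick any currently valid instance as the copy source; a richer
        # mapper call could choose among multiple candidate instances here.
        sources = valid_instances.get(pre.region, set())
        if not sources:
            return None  # no valid data anywhere: cannot fix up
        copies.append((pre.region, next(iter(sources)), pre.instance))
    return copies

# Usage: "faces" already satisfies its precondition; "cells" needs one copy.
template = TraceTemplate(
    preconditions=[Precondition("faces", "gpu_fb_0"),
                   Precondition("cells", "gpu_fb_1")],
    operations=["task_a", "task_b"])
valid = {"faces": {"gpu_fb_0"}, "cells": {"sysmem_0"}}
print(replay_with_fixups(template, valid))
# → [('cells', 'sysmem_0', 'gpu_fb_1')]
```

With `allow_fixups=False` the same call returns `None`, which corresponds to the current behavior of refusing to replay a non-idempotent trace.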
@lightsighter and I have been thinking about the composability of programs that use tracing (especially in a high-level context, such as when user programs in cuNumeric might try to use tracing). There seem to be two main problems around tracing in this area:
We've been thinking so far about two solutions to these problems. The first solution is supporting nested traces, which fixes problem 1 but doesn't address problem 2. Supporting nested traces would allow for arbitrary composition of code that uses tracing, since composing two pieces of traced code corresponds to nesting one trace record/replay inside an existing record/replay. I don't know that much about the implementation of tracing, but Mike says that something like this would not be the hardest thing to do. The second solution is to move towards more automation inside Legion, where we automatically detect when programs are replaying the same sequence of operations, and replay traces when we identify memoizable operation sequences. This solution solves both problems 1 and 2. Mike is already planning to build infrastructure that would help with identifying when repeated sequences of operations occur. The main difficulty is deciding what to do when the runtime decides that a trace should be replayed, but the application's operation stream then diverges from what the runtime predicted. A potential solution, inspired by JIT compilers, is the following:
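The detect-then-speculate idea above can be sketched in a few lines: watch the operation stream, mark a window of operations as "hot" once it repeats back-to-back, speculatively replay a hot sequence when its first operation reappears, and abandon the replay the moment the application diverges from the prediction. This is a toy model under assumed names (`AutoMemoizer`, single-character ops), not Legion's implementation.

```python
class AutoMemoizer:
    """Toy model of automatic trace detection with divergence fallback."""

    def __init__(self, window=3):
        self.window = window
        self.history = []    # full operation stream observed so far
        self.hot = set()     # sequences seen to repeat back-to-back
        self.pending = None  # (sequence, next_index) while replaying

    def next_op(self, op):
        self.history.append(op)
        w = self.window
        # Mid-replay: verify the application followed the prediction.
        if self.pending is not None:
            seq, i = self.pending
            if seq[i] == op:
                self.pending = (seq, i + 1) if i + 1 < w else None
                return "replayed"
            self.pending = None  # diverged: abandon the trace
        # Learn: the last w ops repeat the w before them -> hot sequence.
        if len(self.history) >= 2 * w and \
                self.history[-w:] == self.history[-2 * w:-w]:
            self.hot.add(tuple(self.history[-w:]))
        # Speculate: if op starts a known hot sequence, replay the rest.
        for seq in self.hot:
            if seq[0] == op:
                self.pending = (seq, 1)
                break
        return "analyzed"

# Usage: the repeated "ABC" pattern becomes hot after two occurrences, the
# tail of later repetitions is replayed, and the divergent "X" falls back
# to normal analysis without corrupting anything.
tracer = AutoMemoizer(window=3)
decisions = [tracer.next_op(op) for op in "ABCABCABCABX"]
```

A real implementation would of course key on full operation descriptions (task IDs, region requirements) rather than single tokens, and would need the divergence fallback to undo or avoid any speculative state.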
There is potential to take this further by pushing the granularity of memoization down to the operation level: if the preconditions for an individual operation are satisfied, we could skip/replay the physical analysis for just that operation. Checking the preconditions is the expensive part. If we see a prefix of operations that we have seen before, followed by operations that we haven't, we could replay the analysis of the operations in the prefix, apply the physical analysis effects (equivalence set updates, etc.) at the final operation of the prefix, and have everything after the prefix go through the pipeline normally. Something like this is reminiscent of what @magnatelee wanted to see in Legate: if the Legate runtime is consistently making the same decisions, analysis costs should go down. The second solution (and the extension to it) is more forward-looking than the first, but at the same time, we don't have any programs right now that would not be handled by nested tracing but would be handled by automatic tracing.
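The prefix idea above amounts to a longest-matching-prefix lookup against previously analyzed sequences: replay the analysis for the matched prefix, and run the full pipeline only on the new suffix. A minimal sketch, with hypothetical operation names chosen for illustration:

```python
def split_on_cached_prefix(ops, cached_sequences):
    """Return (prefix, suffix) where prefix is the longest leading run of
    `ops` matching the start of some previously analyzed sequence."""
    best = 0
    for seq in cached_sequences:
        n = 0
        while n < len(ops) and n < len(seq) and ops[n] == seq[n]:
            n += 1
        best = max(best, n)
    return ops[:best], ops[best:]

# The prefix can have its physical analysis replayed (with the cumulative
# equivalence-set updates applied at its last operation), while the suffix
# goes through the pipeline normally.
prefix, suffix = split_on_cached_prefix(
    ["init", "halo_exchange", "stencil", "new_bc"],
    [["init", "halo_exchange", "stencil", "reduce"]])
print(prefix)  # → ['init', 'halo_exchange', 'stencil']
print(suffix)  # → ['new_bc']
```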
Dead code elimination: just because code is dead in an entire trace doesn't mean it is actually dead when the trace is replayed with different downstream operations. I don't think we necessarily need to pick between the two approaches. The important thing is to create a framework for tracing that lets us explore the trade-offs; the current implementation is too rigid for that. I think we can make the current implementation work just as efficiently with a more "operation-based" implementation that looks backwards at the operations that came before it and infers whether it can be replayed or needs to redo its analysis. If we do that, then I think we can explore both nested tracing and dynamic discovery of traces.
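A tiny sketch of why per-trace dead code elimination is unsound as a baked-in template decision: whether a write recorded inside a trace is dead depends on which operations follow the trace, and those can differ between replays. The function and region names are illustrative.

```python
def dead_writes(trace_writes, downstream_reads):
    """A write recorded in the trace is dead only if no operation after
    the trace reads the region it produced."""
    return [region for region in trace_writes
            if region not in set(downstream_reads)]

# The same recorded trace yields different dead sets under different
# downstream operation streams, so the decision cannot be frozen into
# the trace template at record time.
trace = ["temp", "pressure"]  # regions written inside the trace
print(dead_writes(trace, {"pressure"}))          # → ['temp']
print(dead_writes(trace, {"pressure", "temp"}))  # → []
```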
Here is the list of features that are missing and will likely be implemented in the current dynamic tracing. I'll check off the boxes when I add them to the code.
`find_event` calls: some of them are overly strict. Preconditions can come from anywhere outside the trace, not just from calls to merge events.

These features need some discussion before we decide to add them to the code.