-
Notifications
You must be signed in to change notification settings - Fork 424
Dumping an intermediate representation #2493
Conversation
34ed021
to
38c8203
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far.
src/prepack-cli.js
Outdated
@@ -365,6 +373,20 @@ function run( | |||
flags | |||
); | |||
if (heapGraphFilePath !== undefined) resolvedOptions.heapGraphFormat = "DotLanguage"; | |||
if (dumpIRFilePath !== undefined) { | |||
resolvedOptions.onExecute = realm => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the "onExecute" name a bit surprising. This is called when Abstract Interpretation (execution) is completed, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, name matches onParse
, which is called after parsing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both names are confusing. If anything they should get the prefix after
or something to show they happen after the action. We might as well get the naming right for both in this PR?
src/utils/TextPrinter.js
Outdated
let dataTexts = []; | ||
if (args.length > 0) dataTexts.push(`args ${this.describeValues(args)}`); | ||
if (data.unaryOperator !== undefined) dataTexts.push(data.unaryOperator); // used by UNARY_EXPRESSION | ||
if (data.binaryOperator !== undefined) dataTexts.push(data.binaryOperator); // used by BINARY_EXPRESSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably these are mutually exclusive and this if can be inside an else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of invariants over this data
structure could be enforced somewhere and then assumed here. But that's for another PR... Here, I'll make the entries unique (there should be a parser one day).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great step in the right direction and is going to be invaluable to working with the Prepack internals. As a future point, maybe we want to add tests around checking this output at a later point?
Release notes: None The functionality that Prepack provides can be seen as being done by two separate engines: 1) There's the "front-end", which does symbolic execution / abstract interpretation, computing "effects" / "generator trees" for the global code and optimized functions. 2) There's the "back-end", which takes this intermediate representation, computes what's reachable, and then turns it into a new executable program by performing transformations such as breaking cycles. Prepack developers have few tools available to understand what goes on between the front-end and back-end. Usually, they look just at the final output, imagine what the intermediate representation might have been, or maybe use to debugger to poke around memory locations. This PR (and other associated PRs) try to improve that state by turning the intermediate representation into a first-class printable and human-readable data structure. This will help... - in understanding what's going on - enabling different back-ends by ensuring that there's a first-class intermediate representation - enabling future transformation on a well-defined intermediate data structure - new ways of testing, e.g. via snapshots of the intermediate representation, or (once there is a parser) feeding hand-written IR into the back-end There are still a good number of things left to do. In particular: - discovering and printing (nested) optimized functions - printing the structure of abstract values and objects - some form of testing - the current IR is really just an artefact of 2 years of hacking. It is in need of some additional rounds of refactorings.
Add mode --ir to test-runner where it records on prints IR when it would also print generated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NTillmann is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Release notes: None
This resolves #1944.
The functionality that Prepack provides can be seen as being done by two separate engines:
computing "effects" / "generator trees" for the global code and optimized functions.
reachable, and then turns it into a new executable program by performing transformations
such as breaking cycles.
Before, Prepack developers had very few tools available to understand what goes on between the
front-end and back-end. Usually, they just look at the final output, imagine what the
intermediate representation might have been, or maybe use to debugger to poke around
memory locations, or to invoke helper functions to inspect live state that way, hoping the debugger doesn't crash along the way.
This PR (and other associated PRs such as #2490, #2491) try to improve that state by turning the intermediate representation into a first-class data structure that is printable into a human-readable textual form. This will help...
(once there is a parser) feeding hand-written IR into the back-end
Example 1
For example, for this program...
... the IR would currently look like this:
Notes:
* value#N = f(args)
// atemporal abstract value_$N := op-type<op-data>(args)[metadata]
// generator entry that defines a temporal valueop-data
contains various essential dataargs
tends to be a projection ofdata
capturing all values that must be visited, but it's not consistentmetadata
is information that helps the visitor compute minimal reachability, but it's not semantically relevant.op-type<op-data>(args)[metadata]
// generator entry that does not define a temporal valueif ... then ... else
.Example 2
=>
Example 3
Things get slightly ugly with functions.
=>
Example 4
Things get really ugly with optimized functions, but that's what it is right now:
=>
Future work (not for this PR)
There are still a good number of things left to do. In particular:
rounds of refactorings and simplifications.
TemporalOperationEntry
, there's some duplication going on withargs
anddata
. Consider eliminatingargs
and deriving this information when needed fromdata
.OperationDescriptorData
could use some structure, or a (sub)type hierarchy.TemporalOperationEntry
with itsdata
dumping ground for everything else. This is all a bit arbitrary and should be unified.In a way, this PR just provides yet another way of dumping values and generators. Some of the other existing ways should be consolidated or killed.
Added option --ir to test-runner to activate (and test) IR dumping.