C++: Share `TInstruction` across IR stages #3612

dbartol · 2020-06-03T14:15:15Z

Strongly suggest reviewing commit-by-commit
~~C# is currently broken. Will be fixed before I make this draft ready for review~~

Main changes

Each stage of the IR reuses the majority of the instructions from previous stages. Previously, we've been wrapping each reused old instruction in a branch of the TInstruction type for the next stage. This causes use to create roughly three times as many TInstruction objects as we actually need.

Now that IPA union types are supported in the compiler, we can share a single TInstruction IPA type across stages. We create a single TInstruction IPA type, with individual branches of this type for instructions created directly from the AST (TRawInstruction) and for instructions added by each stage of SSA construction (T*PhiInstruction, T*ChiInstruction, T*UnreachedInstruction). Each stage then defines a TStageInstruction type that is a union of all of the branches that can appear in that particular stage. The public Instruction class for each phase extends the TStageInstruction type for that stage.

The interface that each stage exposes to the pyrameterized modules in the IR is now split into three pieces:

The Raw module, exposed only by the original IR construction stage. This module identifies which functions have IR, which TRawInstructions exist, and which IRVariables exist.
The SSA module, exposed only by the two SSA construction stages. This identifiers which Phi, Chi, and Unreached instructions exist.
The global module, exposed by all three stages. This module has all of the predicates whose implementation is different for each stage, like gathering definitions of MemoryOperands.

Similarly, there is now a single TIRFunction IPA type that is shared across all three stages. There is a single IRFunctionBase class that exposes the stage-indepdendent predicates; the IRFunction class for each stage extends IRFunctionBase.

Most of the other changes are largely mechanical.

Subsequent changes

Now that TInstruction is shared between IR stages, several of the per-stage IR construction predicates can now be moved into the Raw interface exposed only by the initial construction of IR from the ASTs. This also removed a couple predicates that were not used previously at all.

Making these part of the IPA object identity changes the failure mode for cases where we assign multiple result types to an instruction. Previously, we would just have one instruction with two result types, but now we'd have two instructions, which breaks things worse. This change goes back to how things were before, to avoid any new surprises on real-world code with invalid ASTs or IR.

Performance tuning

The initial version of this PR slowed down RedundantNullCheckSimple.ql on Wireshark and ChakraCore by ~10%. Most of this slowdown was due to various large and expensive relations being recomputed in multiple stages (especially tuple numbering). I've forced the entire IR (all three phases) to be evaluated in the same phase, which gets back all of the lost performance plus a few percent more. I also made some small changes to ensure that the commonly-used predicates on Instruction are just trivial wrappers around cached predicates in the IR/SSA construction code.

Each stage of the IR reuses the majority of the instructions from previous stages. Previously, we've been wrapping each reused old instruction in a branch of the `TInstruction` type for the next stage. This causes use to create roughly three times as many `TInstruction` objects as we actually need. Now that IPA union types are supported in the compiler, we can share a single `TInstruction` IPA type across stages. We create a single `TInstruction` IPA type, with individual branches of this type for instructions created directly from the AST (`TRawInstruction`) and for instructions added by each stage of SSA construction (`T*PhiInstruction`, `T*ChiInstruction`, `T*UnreachedInstruction`). Each stage then defines a `TStageInstruction` type that is a union of all of the branches that can appear in that particular stage. The public `Instruction` class for each phase extends the `TStageInstruction` type for that stage. The interface that each stage exposes to the pyrameterized modules in the IR is now split into three pieces: - The `Raw` module, exposed only by the original IR construction stage. This module identifies which functions have IR, which `TRawInstruction`s exist, and which `IRVariable`s exist. - The `SSA` module, exposed only by the two SSA construction stages. This identifiers which `Phi`, `Chi`, and `Unreached` instructions exist. - The global module, exposed by all three stages. This module has all of the predicates whose implementation is different for each stage, like gathering definitions of `MemoryOperand`s. Similarly, there is now a single `TIRFunction` IPA type that is shared across all three stages. There is a single `IRFunctionBase` class that exposes the stage-indepdendent predicates; the `IRFunction` class for each stage extends `IRFunctionBase`. Most of the other changes are largely mechanical.

Now that `TInstruction` is shared between IR stages, several of the per-stage IR construction predicates can now be moved into the `Raw` interface exposed only by the initial construction of IR from the ASTs. This also removed a couple predicates that were not used previously at all.

Making these part of the IPA object identity changes the failure mode for cases where we assign multiple result types to an instruction. Previously, we would just have one instruction with two result types, but now we'd have two instructions, which breaks things worse. This change goes back to how things were before, to avoid any new surprises on real-world code with invalid ASTs or IR.

This updates C#'s IR to share `TInstruction` across stages the same way C++ does. The only interesting part is that, since we have not yet ported full alias analysis to C#, I stubbed out the required parts of the aliased SSA interface in `AliasedSSAStub.qll`.

…lysis-team/69_union

Before this change, evaluation of the IR was spread out across about 5 stages. This resulted in a lot of redundant evaluation, especially tuple numbering of large IPA types like `TInstruction`. This change makes two small changes that, when combined, ensure that the IR is evaluated all in one stage: First, we mark `TInstruction` as `cached`. This collapses all of the work to create instructions, across all three IR phases, into a single phase. Second, we make the `SSA` module in `SSAConstruction.qll` just contain aliases to `cached` predicates defined in the `Cached` module. This ensures that all of the `Operand`-related SSA computation happens in the same stage as all of the `Instruction`-related SSA computation.

There were a few places in the IR itself where we use `Instruction.getResultType()`, which returns the C++ `Type` of the result, instead of `Instruction.getResultIRType()`, which returns the language-neutral `IRType` of the result. By removing this usage, we can avoid evaluating `getResultType()` at all. There are still other uses of `Instruction.getResultType()` in other libraries. We should switch those as well.

Most of the predicates on `Instruction` are thin wrappers around cached predicates in the `IRConstruction` or `SSAConstruction` modules. However, `getResultIRType()` has to join `Construction::getInstructionResultType()` with `LanguageType::getIRType()`. `getResultIRType()` is called frequently both within the IR code and by IR consumers, and that's a big join to have to repeat in multiple stages. I looked at most of the other predicates in `Instruction.qll`, and didn't see any other predicates that met all of the criteria of "large, commonly called, and not already inline".

The four cached predicates used to access common properties of instructions took a `TStageInstruction` as a parameter. This requires the calling code, in `Instruction.qll`, to then join the results with `hasInstruction()` to filter out results for `TRawInstruction`s that were discarded as unreachable. By simply switching the parameter types to `Instruction`, we can force that join to happen in the cached predicate itself. This makes the various accessor predicates on `Instruction` trivially inlinable to the cached predicate, instead of being joins of two huge relations that might have to be recomputed in later stages.

…lysis-team/69_union

The optimizer picked a terrible join order in `VirtualDispatch::DataSensitiveCall::flowsFrom()`. Telling it that `getAnOutNode()` has a unique result convinces it to join first on the `Callable`, rather than on the `ReturnKind`.

dbartol · 2020-06-09T12:41:09Z

After fixing a big performance regression in virtual dispatch, I'm running what I hope will be my final CPP-Differences job. If that comes back without any regressions, this will be ready for review.

dbartol · 2020-06-09T20:33:24Z

CPP-Differences job came back with some pretty impressive (>15%) wins across the board in overall suite analysis time. A significant portion of that is likely to be from this PR accidentally getting rid of a caching issue that caused the IR to be reevaluated during dataflow analysis, but even the IR-only queries at least look no worse than they were before this change.

jbj · 2020-06-10T19:02:04Z

Lots of conflicts: the C# IR was moved to experimental.

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/IRFunction.qll

dbartol · 2020-06-11T16:26:00Z

Merge conflicts with the big C# move to experimental are now fixed.

cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll

MathiasVP · 2020-06-12T11:23:31Z

I really like the changes! I think they all look good, the only thing I can't seem to understand is the small difference in instruction counts. I tried counting the number of instructions in git with this PR and with master:

Master:
PhiInstructions: 56452
ChiInstructions: 185212
Total:           1748874

This PR:
PhiInstructions: 56449
ChiInstructions: 185208
Total:           1748855

it's not a large difference, so I don't know if it's a sign of a bug somewhere. But do you know what the cause of this difference could be?

dbartol · 2020-06-12T13:35:54Z

it's not a large difference, so I don't know if it's a sign of a bug somewhere. But do you know what the cause of this difference could be?

It could have something to do with how we handle multiple results for predicates that should always have a unique result. I'm not too worried, but it should be easy enough to investigate a few examples by comparing IR dumps for functions whose instruction count changed. I'll take a look.

jbj

These are my comments after reviewing the first commit only. I've checked that all my comments apply to the PR as a whole, not just the first commit.

cpp/ql/src/semmle/code/cpp/ir/implementation/internal/TInstruction.qll

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll

jbj · 2020-06-12T18:09:17Z

cpp/ql/src/semmle/code/cpp/ir/implementation/internal/IRFunctionBase.qll

+private import IRFunctionBaseInternal
+
+private newtype TIRFunction =
+  MkIRFunction(Language::Function func) { IRConstruction::Raw::functionHasIR(func) }


I'm surprised this isn't cached. But it wasn't cached before, so this PR is not the right time to experiment with its caching.

cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll

jbj · 2020-06-12T18:25:24Z

cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll

-      result = element.getInstructionResultSize(tag)
-    )
+  IRFunctionBase getInstructionEnclosingIRFunction(TStageInstruction instr) {
+    instr = TRawInstruction(result, _, _, _, _)


I'm not sure it's helpful for these trivial predicates to be cached. But we can measure that separately (after this PR), and it also interacts with the change I'm proposing to remove columns from TRawInstruction.

These are no longer cached, part of the overall removal of caching of raw IR construction.

dbartol · 2020-06-12T19:08:16Z

@MathiasVP Can you share your query for those instruction counts? I'm getting numbers that are in the same general range, but far enough off that I'm not sure I counted exactly the same way you did.

dbartol · 2020-06-12T19:22:43Z

@MathiasVP The only IR diff I found locally in git was newly detected unreachable code because IntegerConstantInstruction now includes enums, thanks to checking getResultIRType() instead of getResultType(). Again, my numbers are different from yours for some reason.

MathiasVP · 2020-06-13T11:10:16Z

@MathiasVP Can you share your query for those instruction counts? I'm getting numbers that are in the same general range, but far enough off that I'm not sure I counted exactly the same way you did.

My query was:

import semmle.code.cpp.ir.IR
select count(Instruction i), select count(PhiInstruction i), select count(ChiInstruction i)

on the snapshot from https://jenkins.internal.semmle.com/job/Changes/job/Build-Snapshots/13817/.

jbj

I've now read all commits in this PR, and it LGTM apart from these comments!

jbj · 2020-06-15T07:31:15Z

cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll

+) {
+  element = getInstructionTranslatedElement(instruction) and
+  tag = getInstructionTag(instruction)
+}


Is there a need for this predicate now? I think instructionOrigin(i, e, t) can now be written as i = TRawInstruction(e, t). That results in a different column order, but at least the body of this predicate can be simplified if it's important to preserve the column order (and then pragma[noinline] should be unnecessary).

Perhaps it would be even better to inline getInstructionTranslatedElement and getInstructionTag (maybe the compiler does that already), and then callers of instructionOrigin can be changed to call those two inline predicates instead. If the compiler does everything right, two such calls should result in one join with TRawInstruction, and that should be just as performant as joining with instructionOrigin.

Removing instructionOrigin() cleaned up the RA a bit. At basically every usage, it had the wrong column order and needed a SCAN to reorder the columns. TRawInstruction() has the right join order already. Net improvement:

Eliminated four SCANs of instructionOrigin()

Added one new SCAN of TRawInstruction()

Eliminated a materialized join_rhs predicate for TranslatedElement::hasInstruction().

Eliminated instructionOrigin() itself.

Changing those TRawInstruction() calls to just use getInstructionTranslatedElement()/getInstructionTag() like we do everywhere else had essentially no effect on the RA, but made everything look more consistent.

cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll

cpp/ql/test/library-tests/syntax-zoo/raw_consistency.expected

cpp/ql/src/semmle/code/cpp/ir/implementation/IRType.qll

cpp/ql/src/semmle/code/cpp/ir/implementation/raw/Instruction.qll

See #2272. I've added code comments in all of the places that future me will be tempted to hoist these overrides.

These predicates are only used within the new single IR stage, so there's no need to cache them beyond that. RA diffs are trivial. Where previously many of the predicate on `Instruction` were inline wrappers around cached predicates from `IRConstruction`, now the predicates from `IRConstruction` get inlined into the `Instruction` predicates, and the `Instruction` predicates get materialized. The net amount of work is the same, but now it's not getting cached unnecessarily.

This noopt predicate is no longer necessary. It's equivalent to `instruction = TRawInstruction(element, tag)`, which is already materialized and has a more favorable column order anyway.

Replace most direct calls to `TRawInstruction()` with calls to `getInstructionTranslatedElement()` and `getInstructionTag()`, matching existing practice. One tiny RA diff in an inconsequential join order in `getInstructionVariable`.

dbartol · 2020-06-17T16:15:00Z

@MathiasVP OK, I get the same numbers as you when running on that snapshot. I was using the one from lgtm.com. The diffs are due to two instances of unreachable code elimination due to constant propagation of enums. One instance is the same one I saw on the lgtm.com snapshot, and the other appears to be in code that only exists in the Jenkins snapshot.

dbartol · 2020-06-17T16:15:55Z

@jbj I think I've addressed everything that can reasonably be addressed in this PR. Please take a last look and merge when you get a chance.

Dave Bartolomeo added 3 commits June 1, 2020 11:15

dbartol added the C++ label Jun 3, 2020

dbartol changed the title ~~GitHub/codeql c analysis team/69 union~~ C++: Share TInstruction across IR stages Jun 3, 2020

Dave Bartolomeo added 3 commits June 3, 2020 13:49

C++: Add missing QLDoc

e65a5c9

C++/C#: Remove dead QL code

15f41c0

dbartol added the C# label Jun 3, 2020

Dave Bartolomeo added 10 commits June 3, 2020 16:10

Merge remote-tracking branch 'github/master' into github/codeql-c-ana…

a409b9d

…lysis-team/69_union

C++/C#: Fix formatting

cb2370c

C++/C#: Fix formatting

94c2bba

Merge remote-tracking branch 'github/master' into github/codeql-c-ana…

0ae98e7

…lysis-team/69_union

C++: Better caching for getPrimaryInstructionForSideEffect()

c511cc3

C++: Fix join order in virtual dispatch with unique

3fc02ce

The optimizer picked a terrible join order in `VirtualDispatch::DataSensitiveCall::flowsFrom()`. Telling it that `getAnOutNode()` has a unique result convinces it to join first on the `Callable`, rather than on the `ReturnKind`.

jbj mentioned this pull request Jun 9, 2020

C++: Performance tweak for 1-field struct loads #3659

Merged

dbartol marked this pull request as ready for review June 9, 2020 20:31

dbartol requested review from a team as code owners June 9, 2020 20:31

jbj reviewed Jun 10, 2020

View reviewed changes

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/IRFunction.qll Show resolved Hide resolved

Dave Bartolomeo added 2 commits June 11, 2020 10:24

C#: Rename IR module references to point to experimental

b116a3e

Merge from master, including fixing up merge conflicts

41df700

MathiasVP reviewed Jun 12, 2020

View reviewed changes

cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll Outdated Show resolved Hide resolved

C++: Simplify logic to an implication

4331b9b

jbj reviewed Jun 12, 2020

View reviewed changes

Dave Bartolomeo added 6 commits June 12, 2020 16:09

C++/C#: More efficient evaluation of SSA::hasInstruction()

ac16993

C++/C#: Stop caching getOldInstruction()

2aabe43

C++/C#: Move ast out of TRawInstruction

07c1520

C++/C#: Move irFunc out of various TInstruction branches

978275c

C++:/C# Remove opcode from TRawInstruction

73d2e09

C#: Fix up C# IR construction to latest interface

eac3b06

C++/C#: Fix formatting

89a1fd4

jbj reviewed Jun 15, 2020

View reviewed changes

Dave Bartolomeo added 5 commits June 16, 2020 16:37

Merge from master

24c3110

C++/C#: Move overrides of IRType::getByteSize() into leaf classes

8e977dc

See #2272. I've added code comments in all of the places that future me will be tempted to hoist these overrides.

C++: Remove instructionOrigin()

c101674

This noopt predicate is no longer necessary. It's equivalent to `instruction = TRawInstruction(element, tag)`, which is already materialized and has a more favorable column order anyway.

C++: Replace TRawInstruction() calls

687d6d2

Replace most direct calls to `TRawInstruction()` with calls to `getInstructionTranslatedElement()` and `getInstructionTag()`, matching existing practice. One tiny RA diff in an inconsequential join order in `getInstructionVariable`.

dbartol assigned jbj Jun 17, 2020

jbj approved these changes Jun 19, 2020

View reviewed changes

jbj merged commit 09d7ed0 into github:master Jun 19, 2020

rdmarsh2 mentioned this pull request Dec 15, 2020

C++: share TOperand across IR stages #4825

Merged

C++: Share TInstruction across IR stages #3612

C++: Share TInstruction across IR stages #3612

Uh oh!

Conversation

dbartol commented Jun 3, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Main changes

Subsequent changes

Performance tuning

Uh oh!

dbartol commented Jun 9, 2020

Uh oh!

dbartol commented Jun 9, 2020

Uh oh!

jbj commented Jun 10, 2020

Uh oh!

Uh oh!

dbartol commented Jun 11, 2020

Uh oh!

Uh oh!

MathiasVP commented Jun 12, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dbartol commented Jun 12, 2020

Uh oh!

jbj left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jbj Jun 12, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jbj Jun 12, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dbartol Jun 17, 2020

Choose a reason for hiding this comment

Uh oh!

dbartol commented Jun 12, 2020

Uh oh!

dbartol commented Jun 12, 2020

Uh oh!

MathiasVP commented Jun 13, 2020

Uh oh!

jbj left a comment

Choose a reason for hiding this comment

Uh oh!

jbj Jun 15, 2020

Choose a reason for hiding this comment

Uh oh!

dbartol Jun 17, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dbartol commented Jun 17, 2020

Uh oh!

dbartol commented Jun 17, 2020

Uh oh!

Uh oh!

C++: Share `TInstruction` across IR stages #3612

C++: Share `TInstruction` across IR stages #3612

dbartol commented Jun 3, 2020 •

edited

Loading

MathiasVP commented Jun 12, 2020 •

edited

Loading

jbj Jun 12, 2020 •

edited

Loading