-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C#: Initial port of IR for C# #1600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
697556c
to
8a0c3cb
Compare
I'd suggest deleting the ir/implementation/*ssa directories for now. They aren't required to build the "raw" stage of the IR, and they'll require additional work to port. For now, they're mostly just files that don't compile that clutter up the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like most of your changes to the shared files should be covered by the new compatibility layer. I'll attempt to apply by compatibility layer changes to your branch and see if I can back out your changes to those files.
/** | ||
* Gets the function whose IR is represented. | ||
*/ | ||
final Callable getCallable() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all the places where you replaced the type Function
with Callable
, you also changed the name of the predicate to use Callable
. If the predicate is to be shared between languages, it needs to have the same name. I'd recommend just changing the predicate names back to use Function
for now. It's possible that some of the predicates that use the Function
type don't actually need to be in the shared part of the IR, but I'd rather try to figure out how to factor them out later.
@@ -0,0 +1,21 @@ | |||
import csharp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this file is even used anywhere. You should be able to just delete it. I'll delete it on the C++ side.
@@ -0,0 +1,23 @@ | |||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend removing the DataFlow files from this PR as well. DataFlow isn't really interesting until you can build SSA, and SSA was already removed from this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments, most of which have auto-fixes that you can just click on in GitHub to automatically push to your branch.
Other than those simple changes, I'm happy with this PR as an initial working implementation.
Somebody from @Semmle/cs needs to review the changes to Type.qll
, since they affect things beyond the IR.
|
||
import csharp | ||
|
||
class Locatable extends Element { } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is needed after my changes. You can just make Language::AST
resolve to Element
.
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/IRSanity.ql
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/PrintIR.ql
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A partial review. My main observation is that there is still quite a lot of code that could be shared better. I suggest making a list of the changes that are required, and forwarding them to @dave-bartolomeo
@@ -0,0 +1,4 @@ | |||
// Most queries should operate on the aliased SSA IR, so that's what we expose |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation-style comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -0,0 +1,19 @@ | |||
import csharp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File-level documentation comment is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
} | ||
|
||
override Instruction getChildFalseSuccessor(TranslatedCondition child) { | ||
(child = getLeftOperand() or child = getRightOperand()) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there getAnOperand()
? If not it might be quite good to add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
override LogicalOrExpr expr; | ||
|
||
override Instruction getChildTrueSuccessor(TranslatedCondition child) { | ||
(child = getLeftOperand() or child = getRightOperand()) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getAnOperand()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Type resultType, boolean isLValue) { | ||
tag = ValueConditionConditionalBranchTag() and | ||
opcode instanceof Opcode::ConditionalBranch and | ||
resultType instanceof VoidType and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use Language::UnknownType
instead of VoidType
here (and elsewhere).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in all the places where I used ObjectType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Language::VoidType
is for instructions that don't return a result at all, including ConditionalBranch
. Language::UnknownType
is for instructions that return a result, but we don't know or can't represent its type)
Type resultType, boolean isLValue) { | ||
opcode instanceof Opcode::NoOp and | ||
tag = OnlyInstructionTag() and | ||
resultType instanceof VoidType and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Language::UnknownType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
class Type = CSharp::Type; | ||
//REVIEW: This might not exist in the database. | ||
class UnknownType = CSharp::UnknownType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use NullType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/InstructionTag.qll
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/TranslatedCall.qll
Outdated
Show resolved
Hide resolved
Type resultType, boolean isLValue) { | ||
tag = ValueConditionConditionalBranchTag() and | ||
opcode instanceof Opcode::ConditionalBranch and | ||
resultType instanceof VoidType and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Language::VoidType
is for instructions that don't return a result at all, including ConditionalBranch
. Language::UnknownType
is for instructions that return a result, but we don't know or can't represent its type)
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/TranslatedDeclarationEntry.qll
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of my feedback has been addressed, and the C++ changes LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a really great start, and you have already covered many parts of the language! I have not combed the generated IR for correctness, or validated every single edge. Some of the feedback can be addressed in follow-up PRs.
|
||
override Type getCallResultType() { result instanceof VoidType } | ||
|
||
override predicate hasQualifier() { any() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about static
constructors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not quite sure when those should be called, since they do not have an exact entry point. From what I've read, they are called some time before the first access of the static field (but it is not exactly known when), though if they are not accessed the static constructor is never called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A related question is when or how field initialization should be shown.
csharp/ql/src/semmle/code/csharp/ir/internal/IRCSharpLanguage.qll
Outdated
Show resolved
Hide resolved
// * The IR translation of an object creation expression. | ||
// * Eg. Point a = new Point(1, 2); | ||
// */ | ||
//class TranslatedObjectCreationExpr extends TranslatedNonConstantExpr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this getting reinstated or has it been moved already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should consider splitting this file up further, for example in a translated/
folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectCreation
is dealt with in the TranslatedInitialization.qll
file, removed the comments.
I was thinking of refactoring the file to better reflect the structure of the exprs
folder in the C# library.
result = TranslatedElementInitialization.super.getInstructionConstantValue(tag) | ||
or | ||
tag = getElementDefaultValueTag() and | ||
result = getZeroValue(getElementType()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about null
elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TranslatedElementValueInitialization
will not be used (I will probably delete it when I will refactor the whole TranslatedInitialization.qll
file), since when an array is created and initialized we have 2 possibilities: either the size is specified, in which case the initializer list must be that size, or the size is not specified and we initialize as many elements as the initializer list has. Should I refactor this file for this PR?
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/TranslatedStmt.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/TranslatedStmt.qll
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
class TranslatedSwitchStmt extends TranslatedStmt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguably, we could model this as a chain of ifs (unlike C++), so you could synthesise something like
switch (x)
{
case 1:
case "foo:
...
break;
case 2:
...
}
as
tmp = x;
if (tmp == 1 || tmp == "foo")
...
else if (tmp == 2)
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also have when
conditions to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do this for this PR? I believe this is a good idea, since treating when
and pattern matching would be easier inside conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do it later.
@@ -0,0 +1,31 @@ | |||
class Geeks { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misplaced {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. One or two small things, let me know when you're ready to merge this.
32e3a14
to
65e7cef
Compare
Ported basic functionalities from the C++ IR Added a simple test that passes the IR sanity check and produces sensible IR (together with the .expected files) to the C# test folder
Refactored the C++ specific names Tidied the code Updated TODOs
Objects now work, although I will refactor the code quite a bit for clarity If and while statements now produce good code Began work on try statements
Introduced 2 new tags to support multidimensional arrays Multidimensional arrays produce correct code All types of initializations for arrays work correctly
Fixed a bug in TranslatedExpr: decl + init where the rhs is a reference now work as expected Uncommented the code for the switch statement
Correct code is now generated from ObjectCreation exprs and ObjectInitializer exprs. Removed TranslatedFieldInitialization and its subclasses and further refactored TranslatedInitialization
Added tests to showcase the instructions generated for object creation and object initialization Updated raw_ir.expected PrintIR now uses the qualified name (with types) when printing the IR for more clarity
Throw statements now give correct code, apart from the case of rethrows: need to make explicit the fact that a finally block is executed even if stack unwinding happens. Added 2 new classes to TranslatedStmt.qll, one for throws that have an exception, one for rethrows. Fixed a bug in TranslatedDeclarationEntry.qll where some local declaration would be missed. Changed toString into getQualifiedName for more clarity when generating the instructions in Instruction.qll. Some general refactoring in TranslatedExpr.qll and TranslatedStmt.qll.
Correct code is now produced for increment and decrement expressions Modified producesExprResult() and TTranslatedLoad() so that no loads are done from outside the crement exprs and that the VariableAddress generated from the access of the operator variable is recognized as an expr that produces result.
Some preliminary refactoring of the TranslatedDeclaration.qll file
Added support for `ForStmt` and `DoWhileStmt` Added test cases
Broke down the class `TranslatedJump` to have more control on the IR control flow. Now GotoLabelStmt, GotoCaseStmt, GotoDefaultStmt and BreakStmt are translated separately. This also fixes an issue when having a switch as the last statement of a void function would create an incorrect CFG.
Properties and property access produce correct code. Fixed a function qualifier bug in `TranslatedCall.qll`. Added a new class to translate `ExprStmt`s whose expr is an `AssignExpr` whose lvalue is an accessor call: we translate only the accessor call in for the translated AST.
Began working o inheritance, polymorphism and constructor init. Correct code is produced for them (though some more work is needed to accurately treat conversions between classes). Removed commented code. Added classes to properly deal with constructor init and modified and refactored TranslatedFunction to accomodate for the changes.
Fixed a bug in `TranslatedArrayExpr` that would prevent the element to produce the correct instruction result, hence creating problems with loads and stores. `ElementsAddress` opcode now inherits from the `UnaryOpcode`, as it should.
Fixed some inconsistencies with casts Fixed some bugs related to which translated elements need loads Added support for IsExpr expressions
Tidied up the code for review Added more comments
Synced the files that are needed for this PR
Added some new classes for built in operations that for the moment have no effect (added to remove errors)
private predicate fieldIsInitialized(Field field) { | ||
exists(field.getInitializer()) | ||
} | ||
//private predicate fieldIsInitialized(Field field) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well delete this properly.
GGreart` |
This is a PR which contains a port of the C++ IR library to C#. The following functionalities have been tested:
if
,for
,while
,do ... while
,switch
(C# 6.0 syntax),try ... catch
, specific jumps(
break
s,goto
s);base
,this
);++
,--
), casts (as
, implicit casts),is
;This PR should be mostly reviewed as a whole, since recent commits contain more comments and tidy ups.
Tests have been added for each implemented feature (can be found in
../library-tests/ir/ir
).The generated instructions can be found in the file
raw_ir.expected
.The generated instructions pass the sanity checks found is
IRSanity.ql
(results in the fileraw_ir_sanity.expected
).