Skip to content

C#: Initial port of IR for C# #1600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 41 commits into from
Aug 29, 2019
Merged

C#: Initial port of IR for C# #1600

merged 41 commits into from
Aug 29, 2019

Conversation

kiridi
Copy link
Contributor

@kiridi kiridi commented Jul 16, 2019

This is a PR which contains a port of the C++ IR library to C#. The following functionalities have been tested:

  • functions: parameters, calls, return statement;
  • statements: if, for, while, do ... while, switch (C# 6.0 syntax), try ... catch, specific jumps
    (breaks, gotos);
  • variables: declaration, initialization, access;
  • classes: inheritance, polymorphism, constructos, constructor initializers (base, this);
  • objects: creation via constructors, object initializers, field access;
  • arrays: creation, element access and initializers;
  • expressions: basic expressions(unary, binary, ++, --), casts (as, implicit casts), is;

This PR should be mostly reviewed as a whole, since recent commits contain more comments and tidy ups.

Tests have been added for each implemented feature (can be found in ../library-tests/ir/ir).
The generated instructions can be found in the file raw_ir.expected.
The generated instructions pass the sanity checks found is IRSanity.ql (results in the file raw_ir_sanity.expected).

@kiridi kiridi requested a review from a team as a code owner July 16, 2019 17:20
@ghost
Copy link

ghost commented Jul 16, 2019

CLA assistant check
All committers have signed the CLA.

@dave-bartolomeo
Copy link
Contributor

I'd suggest deleting the ir/implementation/*ssa directories for now. They aren't required to build the "raw" stage of the IR, and they'll require additional work to port. For now, they're mostly just files that don't compile that clutter up the PR.

Copy link
Contributor

@dave-bartolomeo dave-bartolomeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like most of your changes to the shared files should be covered by the new compatibility layer. I'll attempt to apply by compatibility layer changes to your branch and see if I can back out your changes to those files.

/**
* Gets the function whose IR is represented.
*/
final Callable getCallable() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all the places where you replaced the type Function with Callable, you also changed the name of the predicate to use Callable. If the predicate is to be shared between languages, it needs to have the same name. I'd recommend just changing the predicate names back to use Function for now. It's possible that some of the predicates that use the Function type don't actually need to be in the shared part of the IR, but I'd rather try to figure out how to factor them out later.

@@ -0,0 +1,21 @@
import csharp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this file is even used anywhere. You should be able to just delete it. I'll delete it on the C++ side.

@@ -0,0 +1,23 @@
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend removing the DataFlow files from this PR as well. DataFlow isn't really interesting until you can build SSA, and SSA was already removed from this PR.

@kiridi kiridi requested a review from a team as a code owner August 2, 2019 08:43
Copy link
Contributor

@dave-bartolomeo dave-bartolomeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments, most of which have auto-fixes that you can just click on in GitHub to automatically push to your branch.

Other than those simple changes, I'm happy with this PR as an initial working implementation.

Somebody from @Semmle/cs needs to review the changes to Type.qll, since they affect things beyond the IR.


import csharp

class Locatable extends Element { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is needed after my changes. You can just make Language::AST resolve to Element.

Copy link
Contributor

@calumgrant calumgrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A partial review. My main observation is that there is still quite a lot of code that could be shared better. I suggest making a list of the changes that are required, and forwarding them to @dave-bartolomeo

@@ -0,0 +1,4 @@
// Most queries should operate on the aliased SSA IR, so that's what we expose
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation-style comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,19 @@
import csharp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level documentation comment is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}

override Instruction getChildFalseSuccessor(TranslatedCondition child) {
(child = getLeftOperand() or child = getRightOperand()) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there getAnOperand()? If not it might be quite good to add.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

override LogicalOrExpr expr;

override Instruction getChildTrueSuccessor(TranslatedCondition child) {
(child = getLeftOperand() or child = getRightOperand()) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getAnOperand()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Type resultType, boolean isLValue) {
tag = ValueConditionConditionalBranchTag() and
opcode instanceof Opcode::ConditionalBranch and
resultType instanceof VoidType and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use Language::UnknownType instead of VoidType here (and elsewhere).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in all the places where I used ObjectType.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Language::VoidType is for instructions that don't return a result at all, including ConditionalBranch. Language::UnknownType is for instructions that return a result, but we don't know or can't represent its type)

Type resultType, boolean isLValue) {
opcode instanceof Opcode::NoOp and
tag = OnlyInstructionTag() and
resultType instanceof VoidType and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Language::UnknownType

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


class Type = CSharp::Type;
//REVIEW: This might not exist in the database.
class UnknownType = CSharp::UnknownType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use NullType.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Type resultType, boolean isLValue) {
tag = ValueConditionConditionalBranchTag() and
opcode instanceof Opcode::ConditionalBranch and
resultType instanceof VoidType and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Language::VoidType is for instructions that don't return a result at all, including ConditionalBranch. Language::UnknownType is for instructions that return a result, but we don't know or can't represent its type)

Copy link
Contributor

@dave-bartolomeo dave-bartolomeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of my feedback has been addressed, and the C++ changes LGTM.

Copy link
Contributor

@calumgrant calumgrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a really great start, and you have already covered many parts of the language! I have not combed the generated IR for correctness, or validated every single edge. Some of the feedback can be addressed in follow-up PRs.


override Type getCallResultType() { result instanceof VoidType }

override predicate hasQualifier() { any() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about static constructors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not quite sure when those should be called, since they do not have an exact entry point. From what I've read, they are called some time before the first access of the static field (but it is not exactly known when), though if they are not accessed the static constructor is never called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A related question is when or how field initialization should be shown.

// * The IR translation of an object creation expression.
// * Eg. Point a = new Point(1, 2);
// */
//class TranslatedObjectCreationExpr extends TranslatedNonConstantExpr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this getting reinstated or has it been moved already?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should consider splitting this file up further, for example in a translated/ folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObjectCreation is dealt with in the TranslatedInitialization.qll file, removed the comments.
I was thinking of refactoring the file to better reflect the structure of the exprs folder in the C# library.

result = TranslatedElementInitialization.super.getInstructionConstantValue(tag)
or
tag = getElementDefaultValueTag() and
result = getZeroValue(getElementType())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about null elements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TranslatedElementValueInitialization will not be used (I will probably delete it when I will refactor the whole TranslatedInitialization.qll file), since when an array is created and initialized we have 2 possibilities: either the size is specified, in which case the initializer list must be that size, or the size is not specified and we initialize as many elements as the initializer list has. Should I refactor this file for this PR?

}
}

class TranslatedSwitchStmt extends TranslatedStmt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably, we could model this as a chain of ifs (unlike C++), so you could synthesise something like

switch (x)
{
case 1:
case "foo:
  ...
  break;
case 2:
  ...
}

as

tmp = x;
if (tmp == 1 || tmp == "foo")
   ...
else if (tmp == 2)
  ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also have when conditions to consider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this for this PR? I believe this is a good idea, since treating when and pattern matching would be easier inside conditions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do it later.

@@ -0,0 +1,31 @@
class Geeks {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misplaced {

Copy link
Contributor

@calumgrant calumgrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. One or two small things, let me know when you're ready to merge this.

calumgrant
calumgrant previously approved these changes Aug 22, 2019
@kiridi kiridi force-pushed the ircsharp branch 2 times, most recently from 32e3a14 to 65e7cef Compare August 28, 2019 11:14
Andrei Diaconu added 3 commits August 28, 2019 12:25
Ported basic functionalities from the C++ IR
Added a simple test that passes the IR sanity check and produces
sensible IR (together with the .expected files) to the C# test folder
Refactored the C++ specific names
Tidied the code
Updated TODOs
kiridi and others added 22 commits August 28, 2019 12:25
Objects now work, although I will refactor the code quite a bit for clarity
If and while statements now produce good code
Began work on try statements
Introduced 2 new tags to support multidimensional arrays
Multidimensional arrays produce correct code
All types of initializations for arrays work correctly
Fixed a bug in TranslatedExpr: decl + init where the rhs is a reference now work as expected
Uncommented the code for the switch statement
Correct code is now generated from ObjectCreation exprs and ObjectInitializer exprs.
Removed TranslatedFieldInitialization and its subclasses and further refactored TranslatedInitialization
Added tests to showcase the instructions generated for object creation and object initialization
Updated raw_ir.expected
PrintIR now uses the qualified name (with types) when printing the IR for more clarity
Throw statements now give correct code, apart from the case of rethrows: need to make explicit the fact that a finally block is executed even if stack unwinding happens.
Added 2 new classes to TranslatedStmt.qll, one for throws that have an exception, one for rethrows.
Fixed a bug in TranslatedDeclarationEntry.qll where some local declaration would be missed.
Changed toString into getQualifiedName for more clarity when generating the instructions in Instruction.qll.
Some general refactoring in TranslatedExpr.qll and TranslatedStmt.qll.
Correct code is now produced for increment and decrement expressions
Modified producesExprResult() and TTranslatedLoad() so that no loads are done from outside the crement exprs and that the VariableAddress generated from the access of the operator variable is recognized as an expr that produces result.
Some preliminary refactoring of the TranslatedDeclaration.qll file
Added support for `ForStmt` and `DoWhileStmt`
Added test cases
Broke down the class `TranslatedJump` to have more control on the IR control flow.
Now GotoLabelStmt, GotoCaseStmt, GotoDefaultStmt and BreakStmt are translated separately.
This also fixes an issue when having a switch as the last statement of a void function would create an incorrect CFG.
Properties and property access produce correct code.
Fixed a function qualifier bug in `TranslatedCall.qll`.
Added a new class to translate `ExprStmt`s whose expr is an `AssignExpr` whose lvalue is an accessor call: we translate only the accessor call in for the translated AST.
Began working o inheritance, polymorphism and constructor init. Correct code is produced for them (though some more work is needed to accurately treat conversions between classes).
Removed commented code.
Added classes to properly deal with constructor init and modified and refactored TranslatedFunction to accomodate for the changes.
Fixed a bug in `TranslatedArrayExpr` that would prevent the element to produce the correct instruction result, hence creating problems with loads and stores.
`ElementsAddress` opcode now inherits from the `UnaryOpcode`, as it should.
Fixed some inconsistencies with casts
Fixed some bugs related to which translated elements need loads
Added support for IsExpr expressions
Tidied up the code for review
Added more comments
Synced the files that are needed for this PR
kiridi added 3 commits August 28, 2019 12:30
Added some new classes for built in operations that for the moment
have no effect (added to remove errors)
private predicate fieldIsInitialized(Field field) {
exists(field.getInitializer())
}
//private predicate fieldIsInitialized(Field field) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well delete this properly.

@calumgrant calumgrant merged commit d2bee79 into github:master Aug 29, 2019
@j504
Copy link

j504 commented Nov 18, 2020

GGreart`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants