Treat address-of array subscripts the same way as address-of dereferences #1163
… C99-specific logic
…ng the address of an lvalue
…of array subscripts
It's great to see this change. I have a question about whether this is Checked C specific or something that should be upstreamed.
clang/lib/Sema/SemaExpr.cpp
Outdated
@@ -14201,6 +14199,15 @@ QualType Sema::CheckAddressOfOperand(ExprResult &OrigOp, SourceLocation OpLoc) {
  CheckAddressOfPackedMember(op);
  if (getLangOpts().C99) {
I'm curious why this code is not placed where the comment was removed. Why place it here?
Is this a change that is specific to the Checked C version of clang? Or should it be propagated upstream? Put another way, is this a bug in clang? Or is there something specific about Checked C that requires this change?
This code was placed here (rather than where the comment was removed) so that it occurs after the check on line 14127 for taking the address of an lvalue. If this code is placed where the comment is removed, then certain tests (e.g. Sema/complex-imag.c) fail due to missing expected errors (they expect errors "cannot take the address of an rvalue of type " to be emitted).
I can add a comment explaining why this code is placed where it is.
To the best of my knowledge, this change isn't Checked C specific.
If this code is moved to where the comment was removed, then the Sema/expr-address-of.c test fails. This test expects the following error:
void foo() {
register int x[10];
&x[10]; // expected-error {{address of register variable requested}}
}
The "address of register variable requested" error is emitted in the call to diagnoseAddressOfInvalidType on line 14187. If the code in this PR is added before line 14187, then this expected error will not be emitted.
I looked at the code you are adding and I don't think it is needed for the regular C path. Taking the address of a subscript expression e1[e2], where e1 or e2 is a pointer to T, will result in a pointer to T being created with the existing code. The subscript expression will have type T and the final statement will create a pointer to T.
I believe the code is really only needed for the Checked C path. It might be better to put this under a Checked C flag. You could then explain in the comment that the code avoids the unexpected result of &e1[e2] having a different kind of pointer type than the pointer type that is being subscripted. This can happen in unchecked scopes where the & operator is applied to a subscript expression involving a checked pointer.
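The regular C half of this argument can be checked with a small sketch. It assumes a C11 compiler and uses only plain C, so it says nothing about checked pointers or unchecked scopes. The TYPE_TAG macro and the function names are hypothetical helpers introduced for illustration; they are not part of clang or Checked C.

```c
#include <assert.h>

/* Hypothetical helper: classify the static type of an expression with
   C11 _Generic. Yields 1 for int *, 2 for int, and 0 for anything else.
   The operand of _Generic is not evaluated. */
#define TYPE_TAG(expr) _Generic((expr), int *: 1, int: 2, default: 0)

/* The subscript expression p[1] has type T (here, int). */
int tag_of_subscript(int *p) {
    (void)p; /* p is otherwise used only in an unevaluated operand */
    return TYPE_TAG(p[1]);
}

/* Taking its address produces a pointer to T (here, int *). */
int tag_of_address_of_subscript(int *p) {
    (void)p;
    return TYPE_TAG(&p[1]);
}

/* The reversed form 1[p] is typed the same way. */
int tag_of_address_of_reversed_subscript(int *p) {
    (void)p;
    return TYPE_TAG(&1[p]);
}

/* Runs all three checks against a small local array. */
void check_regular_c_typing(void) {
    int buf[4] = {0, 1, 2, 3};
    assert(tag_of_subscript(buf) == 2);
    assert(tag_of_address_of_subscript(buf) == 1);
    assert(tag_of_address_of_reversed_subscript(buf) == 1);
}
```

In plain C there is only one kind of pointer to T, so the result type always matches the subscripted pointer. The mismatch described above can only arise when the subscripted pointer is a checked pointer, which this sketch does not model.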
LGTM! Thank you!
I looked at the code and think the change is only needed for Checked C. I think it would be helpful to clarify that.
LGTM. Thanks!
* Revert "[BoundsWidening] Determine checked scope specifier per statement (#1139)" (#1141)
  This reverts commit 980321d.
* Determine checked scopes per statement (#1142)
  We introduce a 2-bit field called CheckedScopeSpecifier in the Stmt class. During parsing, when a compound statement is created, we iterate over its elements (statements) and set the checked scope specifier of each element to that of the compound statement. We can get the checked scope specifier for a statement by calling the getCheckedScopeSpecifier method on the statement.
* Update the instructions for upgrade of LLVM/Clang. (#1146)
* Updated the instructions for upgrade of LLVM/Clang. Also added a new file LLVM-Upgrade-Notes.md to track important information related to upgrades.
* Fixed typos.
* Addressed review comments.
* Fixed an inadvertent deletion.
* Addressed review comments.
* Incorporated review comments.
* Fixed minor typos.
* Fixed typos.
* Add new flags for available facts analysis
* Add the analysis into the build script and the sema bounds
* Add utility functions to check whether a var is used in an Expr and a BoundsExpr
* Add AbstractFact as a basic available fact; Add InferredFact and adjust WhereClauseFact to be a subclass of AbstractFact
* Add data structures used in the analysis
* Add print and dump functions
* Add utility functions which are also used by BoundsWideningAnalysis.
* Add other utility functions.
  `IsSwitchCaseBlock`: use `dyn_cast_or_null` to cover the null pointer case.
  `ConditionOnEdge`: do not test whether there is an edge from pred to curr, since it is only called when there is an edge.
  `GetModifiedVars`: use `TranspareCasts` to bypass some casting. Handling of member access and array indexing is still TODO.
* Add fact comparison and fact-related set operations (contains TODO).
* Add test cases (one covers basic features and the other is converted from the previous available facts analysis)
* Dataflow analysis: Add statement-based Gen/Kill.
* Dataflow analysis: Add block-edge-based Gen set.
* Dataflow analysis: Add function to compute In and Out set.
* Dataflow analysis: Add worklist algorithm.
* Add destructors to release the memory
* Fix: modify the Gen/Kill rules to match the design doc; It also fixes a bug to visit dead blocks.
* Cleanup comments
* Fix: use the existing functions to find a `VarDecl` in an expr
* Change the equal check on fact collections to equal size check
* Update the test cases with the updated Gen/Kill
* Remove debug flag for available facts.
* Use lexco-compare for `EqualityOpFact` and `InferredFact`.
* Add a map to store the comparison results of facts.
* Change the source location of a fact to its near expr.
* Use a dedicated list to collect created facts and clean them finally.
* Verify if an expr contains errors before checking invertibility (#1154)
  The community has introduced a new annotation called "contains-errors" on AST nodes that contain semantic errors. As a result, after the upgrade of Checked C sources to LLVM 12 we need to check if an expr contains errors before operating on the expr. One such place is in InverseUtil::IsInvertible where we need to check if the input modifying expr contains errors.
* Added containsErrors checks to InverUtil::Inverse
* [BoundsWidening] Handle complex conditionals in bounds widening (#1149)
  Support bounds widening in presence of complex conditionals like: "if (*p != 0)", "if ((c = *p) == 'a')", etc.
* Don't record temporary equality between expressions such as x and x + 1 in TargetSrcEquality (#1162)
* Add AllowTempEquality parameter to RecordEqualityWithTarget
* Use a ModifiedSameValue variable to determine the return value for UpdateSameValueAfterAssignment
* Rename ModifiedSameValue to RemovedAnyExprs and clean up comments
* Treat address-of array subscripts the same way as address-of dereferences (#1163)
* In CheckAddressOfOperand, add case for address-of array subscripts to C99-specific logic
* Move address-of array subscript check after other checks such as taking the address of an lvalue
* Adjust expected AST output to account for different types of address-of array subscripts
* Restore deleted comment about checking for array subscript expressions
* Add comment explaining the placement of the address-of array subscript logic
* Put &e1[e2] typing rules under a Checked C flag
* Update the available facts analysis.
Co-authored-by: Mandeep Singh Grang <magrang@microsoft.com>
Co-authored-by: Sulekha Kulkarni <Sulekha.Kulkarni@microsoft.com>
Co-authored-by: Katherine Kjeer <6687333+kkjeer@users.noreply.github.com>
Fixes #1148
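This PR modifies the type checker so that, if an expression `e` has type `T`, then `&e[idx]` and `&idx[e]` also have type `T`. This is similar to the current behavior where, if `e` has type `T`, then `&*e` also has type `T`.
From the C spec section 6.5.3.2:
"Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator."
This is similar to the rules for `&*e`:
"If the operand is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue."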
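The C rule that `&e[idx]` behaves like `e + idx` can be observed in plain C. The sketch below is illustrative only: the function names are hypothetical, and it uses unchecked `int *` pointers, so it shows the standard C behavior that this PR mirrors rather than the Checked C typing itself.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when &e[idx], &idx[e], and e + idx all designate the same
   address. No element is read: in each form the & cancels the implied
   dereference, leaving only pointer arithmetic. */
int address_forms_agree(int *e, size_t idx) {
    return &e[idx] == e + idx && &idx[e] == e + idx;
}

/* Because &a[10] is treated as a + 10, forming the one-past-the-end
   address of a 10-element array is valid as long as it is never
   dereferenced. */
int one_past_end_is_valid(void) {
    int a[10];
    int *end = &a[10]; /* equivalent to a + 10; no element is accessed */
    return end == a + 10 && (end - a) == 10;
}
```

Both functions return 1 for in-bounds indices (and for the one-past-the-end index), which is the same "as if the & were removed" behavior that `&*e` already has.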