Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add alias analysis #194

Open
wants to merge 17 commits into
base: develop
Choose a base branch
from

Conversation

LukasKlenner
Copy link
Contributor

Note: This pull request is based on #178 and should only be merged after that pull request.

This pull request adds two alias analyses for the alias properties added in #171: One intraprocedural alias analysis and one based on the existing points-to analysis.

stmt match {
case goto: Goto =>
val targetPC = tac.stmts(goto.targetStmt).pc
if (targetPC <= pc && goto.pc >= pc) return false // jumping from behind the allocation site in front of it -> might break aliasing because allocation site is executed multiple times
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inefficient and incorrect:

Inefficient:

  • Checking every statement whether it is a goto, instead of using already available information (CFG, dominator tree)
  • Matching on the type of the instruction instead of its astID, which allows for an integer check instead of a type check
  • Extracting pcs instead of just working with stmt ids, which are ordered the same way

Incorrect:

  • Would interpret some linear control flow as loops (not unsound, but still):
...
5: goto 8
6: allocation site
7: goto 9
8: goto 6
...
  • Misses other branching instructions, such as if and switch, so it is unsound as it may miss loops

A better way to do this is to check whether a jump is dominated by its target. That is still not perfect (it misses non-natural loops). OPAL provides facilities to compute dominator trees and check dominance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correct, a better way to do it would be iterate over all basic block with a endPC >= allocationSitePC (don't care what previous bb's do) and check whether they are dominated by a bb in front of the allocation site? I don't see a way to directly check "whether a jump is dominated by its target". That would still require me to iterate over all goto statements to identify them. Or should i keep iterating over the statements and only perform the check, whether the goto is problematic, using the dominatorTree? Regarding the implementation: To check the dominance I only found the DominatorTree.foreachDom(int)(int => U) method. How can I map the given int back to the associated bb to check whether that is a problematic dominance?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to do basic blocks before the allocation, too. See the example above how (linear control flow) jumps can reorder code irrespective of program counters.
Jumps are only at the end of basic blocks, so you don't need to look for gotos (and also, you get all kinds of jumps not just gotos).
There is method strictlyDominates on DominatorTree (strictly just means it excludes the BB itself).
I can't come up with the correct procedure from my head just now, but I suppose it will be something like: Is there a dominator (loop header) of the allocation (loop body) which has a predecessor that it dominates (back edge).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. So far I have this code:

// check if the allocation site is dominated by a loop header, i.e., is executed multiple times
domTree.foreachDom(cfg.bb(tac.properStmtIndexForPC(pc)).nodeId)(dom => {

    // check if the dominator is a loop header
    cfg.foreachPredecessor(cfg.allBBs.toSeq(dom).nodeId)(pred => {

        // if the dominator itself dominates one of its predecessors, it is a loop header
        if (domTree.strictlyDominates(dom, cfg.bb(pred).nodeId)) {
            return false
        }
    })

}) 

It checks whether the allocation site is dominated by a loop header. It works so far, but overapproximates cases where the allocation site is after the loop and not inside of it. E.g.:

public static void mustAliasLoopInFront() {
    for (int i = 0; i < 10; i++) {
        [...]
    }

    Object o2 = new Object();

    o2.hashCode();
}

It computes that (o2,o2) is a may alias because the definition site is dominated by a loop header. Do you know whether it is possible to detect such cases, e.g., by checking if the allocation site is behind all predecessors of the loop header? Or would that cause other problems?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess also check that the backedge is a postdominator of the allocation site?

* @param stmts The statements of the method the UVar is defined in.
* @return A persistent UVar for the given UVar.
*/
final def persistentUVar(uVar: UVar[ValueInformation])(implicit stmts: Array[Stmt[V]]): PersistentUVar = {
Copy link
Collaborator

@errt errt Jun 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like it should be part of this package object (and neither does the rest of this file). Aren't they already available elsewhere? If not, place them in a more general place.

// they refer to the same allocation site but aren't necessarily the same object (e.g. if the allocation site
// is inside a loop or a different method and is executed multiple times)

val (pointsToContext: Context, pc: PC) = pointsTo1.allPointsTo.head
Copy link
Collaborator

@errt errt Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You used .head here. This might be correct, but most often it isn't because it ignores the other elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this line is executed we require that both pointsTo sets contain exactly one element. If that is not the case we directly return false. So that assumption shouldn't be a problem

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is pointsTo1.size the same as pointsTo1.allPointsTo.size? Then yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's the same

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants