Skip to content

Commit

Permalink
Fix undeterministic tail call identification results exposed in paral…
Browse files Browse the repository at this point in the history
…lel code parsing

The tail call identification algorithm contains two related heuristics:
(1) A jump to a block within the same function is not a tail call
(2) A jump to a known entry point is a tail call

For (1), whether or not the jump target is within the current function depends on
the parsing order of other functions, especially the callee functions. To have a
consistent tail call identification results, I add a tail call cleaning phase in
the parsing finalizing phase. Because we already know the complete function
boundary at function finalizing time, we can rectify the bogus tail calls and
removed the functions caused by those bogus tail calls.

For (2), if a known entry point is created by a bogus tail call, (2) will lead to
more bogus tail calls. This is exposed by a special case. Suppose function A
contains multiple jumps to block B. Block B has no incoming edges from any other
function, nor does it has a symbol associated. So, B should be part of A. One of
the jump in A (denoted as J1) will be marked as tail call because it tears down
the stack frame before the jump, while other jumps (denoted as non-J1) will not
be marked as tail calls. If J1 is parsed before non-J1, block B will be marked as
a tail call entry, and all non-J1 will be marked as tail calls because they jump
to ``a known entry''. Function A will thus not contain B, and the rectify method
mentioned in the previous parapgraph will not work. So, I change to only mark a
jump as a tail call when the target is a hint.
  • Loading branch information
mxz297 committed Aug 5, 2018
1 parent a89e8c9 commit 1e30a12
Show file tree
Hide file tree
Showing 5 changed files with 61 additions and 3 deletions.
5 changes: 4 additions & 1 deletion parseAPI/src/IA_aarch64.C
Expand Up @@ -144,7 +144,10 @@ bool IA_aarch64::isTailCall(const Function* context, EdgeTypeEnum type, unsigned
valid &&
callee &&
callee != context &&
!context->contains(target)
// We can only trust entry points from hints
callee->src() == HINT &&
/* the target can either be not parsed or not within the current context */
((target == NULL) || (target && !context->contains(target)))
)
{
parsing_printf("\tjump to 0x%lx, TAIL CALL\n", addr);
Expand Down
6 changes: 4 additions & 2 deletions parseAPI/src/IA_power.C
Expand Up @@ -139,8 +139,10 @@ bool IA_power::isTailCall(const Function* context, EdgeTypeEnum type, unsigned i
valid &&
callee &&
callee != context &&
target &&
!context->contains(target)
// We can only trust entry points from hints
callee->src() == HINT &&
/* the target can either be not parsed or not within the current context */
((target == NULL) || (target && !context->contains(target)))
)
{
parsing_printf("\tjump to 0x%lx, TAIL CALL\n", addr);
Expand Down
2 changes: 2 additions & 0 deletions parseAPI/src/IA_x86.C
Expand Up @@ -265,6 +265,8 @@ bool IA_x86::isTailCall(const Function *context, EdgeTypeEnum type, unsigned int
valid &&
callee &&
callee != context &&
// We can only trust entry points from hints
callee->src() == HINT &&
/* the target can either be not parsed or not within the current context */
((target == NULL) || (target && !context->contains(target)))
)
Expand Down
50 changes: 50 additions & 0 deletions parseAPI/src/Parser.C
Expand Up @@ -848,6 +848,27 @@ Parser::finalize(Function *f)
// finish delayed parsing and sorting
Function::blocklist blocks = f->blocks_int();

// Check whether there are tail calls to blocks within the same function
dyn_hash_map<Block*, bool> visited;
for (auto bit = blocks.begin(); bit != blocks.end(); ++bit) {
Block * b = *bit;
visited[b] = true;
}
for (auto bit = blocks.begin(); bit != blocks.end(); ++bit) {
Block * b = *bit;
for (auto eit = b->targets().begin(); eit != b->targets().end(); ++eit) {
Edge *e = *eit;
if (e->interproc() && (e->type() == DIRECT || e->type() == COND_TAKEN)) {
if (visited.find(e->trg()) != visited.end()) {
// Find a tail call targeting a block within the same function
// So, this edge is not a tail call
e->_type._interproc = false;
}
}
}
}


// is this the first time we've parsed this function?
if (unlikely( !f->_extents.empty() )) {
_parse_data->remove_extents(f->_extents);
Expand Down Expand Up @@ -922,6 +943,7 @@ Parser::finalize()
if(_parse_state < FINALIZED) {
finalize_funcs(hint_funcs);
finalize_funcs(discover_funcs);
clean_bogus_funcs(discover_funcs);
finalize_ranges(hint_funcs);
finalize_ranges(discover_funcs);
_parse_state = FINALIZED;
Expand Down Expand Up @@ -967,6 +989,34 @@ Parser::finalize_ranges(vector<Function *> &funcs)
}
}

void
Parser::clean_bogus_funcs(vector<Function*> &funcs)
{
for (auto fit = funcs.begin(); fit != funcs.end(); ) {
Function *f = *fit;
if (f->src() == HINT) {
fit++;
continue;
}
bool interprocEdge = false;
for (auto eit = f->entry()->sources().begin(); !interprocEdge && eit != f->entry()->sources().end(); ++eit)
if ((*eit)->interproc()) interprocEdge = true;
if (!interprocEdge) {
// This is a discovered function that has no inter-procedural entry edge.
// This function should be created because tail call heuristic makes a mistake
// We have already fixed such bogos tail calls in the previous step of finalizing,
// so now we should remove such bogus function
if (sorted_funcs.end() != sorted_funcs.find(f)) {
sorted_funcs.erase(f);
}
fit = funcs.erase(fit);
_parse_data->remove_func(f);
} else {
fit++;
}
}
}

void
Parser::record_func(Function *f)
{
Expand Down
1 change: 1 addition & 0 deletions parseAPI/src/Parser.h
Expand Up @@ -252,6 +252,7 @@ namespace Dyninst {
void finalize();

void finalize_funcs(vector<Function *> &funcs);
void clean_bogus_funcs(vector<Function*> &funcs);
void finalize_ranges(vector<Function *> &funcs);


Expand Down

0 comments on commit 1e30a12

Please sign in to comment.