Use of points-to analysis (PTA) to get indirect edges #4

Open
acidghost opened this issue Aug 16, 2023 · 3 comments

Comments

@acidghost

First, really cool work! :)

I was experimenting with your prototype and found that the server component (SVF) takes a lot of time and memory to compute the graphs for some large real-world programs, regardless of whether the --get-indirect flag is used.

Digging into the code, I see that it computes a full pointer analysis regardless of the --get-indirect flag:

Andersen* ander = AndersenWaveDiff::createAndersenWaveDiff(pag);
/// Call Graph
PTACallGraph* callgraph = ander->getPTACallGraph();
spdlog::info("Performing electrification");
// ICFG
ICFG* icfg = pag->getICFG();
// Updating ICFG with indirect call resolution
if (GetIndirect) {
    icfg->updateCallGraph(callgraph);
}

For reference, see SVF's code:

Hence, regardless of the --get-indirect flag, SieveFuzz is using a call graph augmented with the indirect edges found by PTA.

In case the --get-indirect flag is given, it will also add the indirect edges from the PTA to the ICFG.

Given that the paper does not discuss the use of PTA, I was wondering if the intended use of SieveFuzz (i.e. what is evaluated in the paper) is with or without PTA and the --get-indirect flag.

@prashast
Member

prashast commented Aug 24, 2023

Thanks a lot for your interest, and sorry for the late reply! As you correctly pointed out, we do use a PTA call graph and then add those edges to the ICFG. This was intended as an optimization and an ease-of-implementation tactic: instead of creating new edges from scratch in the ICFG whenever an indirect call is observed dynamically, we overlay the PTA's indirect edges on the ICFG and, during our reachability analysis, follow an indirect edge only if we have already seen it dynamically. For the purposes of the evaluation, we used the PTA call graph with the --get-indirect flag turned on for all targets. The only exception was mJS, where we turned the --get-indirect flag off because the version of SVF we used at the time would segfault when trying to overlay the indirect call edges onto the ICFG.
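
To make the overlay idea above concrete, here is a minimal sketch of a reachability pass that always follows direct edges but follows a precomputed indirect edge only if it has been observed dynamically. It is not SieveFuzz's or SVF's code: `OverlayGraph`, `reachableFrom`, and the `observed` set are hypothetical names chosen for illustration, and functions are identified by plain strings.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Node = std::string;
using Edge = std::pair<Node, Node>;

// Hypothetical graph: direct edges are always valid; indirect edges are
// candidates (e.g. produced by a points-to analysis) overlaid on top.
struct OverlayGraph {
    std::map<Node, std::vector<Node>> direct;
    std::map<Node, std::vector<Node>> indirect;
};

// Breadth-first reachability from `entry`. An indirect edge is followed
// only if it is in `observed`, i.e. it has been seen at run time.
std::set<Node> reachableFrom(const OverlayGraph& g, const Node& entry,
                             const std::set<Edge>& observed) {
    std::set<Node> seen{entry};
    std::queue<Node> work;
    work.push(entry);

    // Enqueue a node the first time it is encountered.
    auto visit = [&](const Node& next) {
        if (seen.insert(next).second) work.push(next);
    };

    while (!work.empty()) {
        Node cur = work.front();
        work.pop();

        auto d = g.direct.find(cur);
        if (d != g.direct.end())
            for (const Node& next : d->second) visit(next);

        auto i = g.indirect.find(cur);
        if (i != g.indirect.end())
            for (const Node& next : i->second)
                if (observed.count(Edge(cur, next))) visit(next);
    }
    return seen;
}
```

With this shape, an indirect call target that the static analysis proposes but that has never been executed stays unreachable until the corresponding edge is added to `observed`.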

Let me know if you have any further questions.

@acidghost
Author

@prashast Thanks for the explanation.

Unfortunately, I've not been able to run PTA on some targets from the MAGMA dataset. PHP runs out of memory on a machine with 128GB of RAM, and others (e.g., openssl, sqlite) consume tens of GBs, making it impossible to run a decent number of instances in parallel (for evaluation purposes).

Would it be possible to patch the prototype and not use PTA but add the indirect edges to the graphs as they're discovered?
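
As a rough illustration of what I mean, here is a sketch of a call graph that starts with direct edges only and grows as indirect calls are observed at run time, with no PTA involved. `DynamicCallGraph` and its methods are hypothetical names, not part of SieveFuzz or SVF, and functions are identified by plain strings.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph that holds direct edges up front and gains
// indirect edges only when the fuzzer reports them.
class DynamicCallGraph {
public:
    void addDirectEdge(const std::string& caller, const std::string& callee) {
        succs_[caller].insert(callee);
    }

    // Records an indirect call seen at run time. Returns true if the edge
    // is new, signalling that cached reachability results are stale.
    bool addObservedIndirectEdge(const std::string& caller,
                                 const std::string& callee) {
        return succs_[caller].insert(callee).second;
    }

    // Depth-first search: can `target` be reached from `entry`?
    bool reaches(const std::string& entry, const std::string& target) const {
        std::set<std::string> seen;
        std::vector<std::string> stack{entry};
        while (!stack.empty()) {
            std::string cur = stack.back();
            stack.pop_back();
            if (cur == target) return true;
            if (!seen.insert(cur).second) continue;  // already expanded
            auto it = succs_.find(cur);
            if (it == succs_.end()) continue;
            for (const std::string& next : it->second) stack.push_back(next);
        }
        return false;
    }

private:
    std::map<std::string, std::set<std::string>> succs_;
};
```

The trade-off is that the graph is only as complete as the executions seen so far, so reachability would need to be recomputed whenever `addObservedIndirectEdge` returns true.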

@acidghost
Author

Also, the version of SVF used in this prototype is quite old. Newer versions haven't changed the API much but include lots of improvements and bug fixes.
