SVFG: Obtain SVFGNode corresponding to argument of CallSite #13
Hi Oliver, thanks for your question. Your case looks a bit complex. If I understand correctly, you want to track the value-flows of a particular pointer parameter at a callsite, say "*sp" in your example? If so, you may wish to refer to "getActualParmSVFGNode(const PAGNode* aparm,llvm::CallSite cs)".

If you want to track the value-flows of an address-taken object, say the target pointed to by "*sp" at callsite "log2(*sp,foo)", then you may wish to refer to "getActualINSVFGNodes(llvm::CallSite cs)". Any object that may be modified inside callee "log2" is represented as an ActualIN SVFGNode before the callsite.

During the past few months, we have extended SVF to support C++ programs, e.g., virtual calls. We tried to model as many C++ library calls as possible. However, when a C++ program is lowered into a bitcode file, there are still things that may not be modelled, especially some of the C++ internal classes, such as class string in your case. String is internal in C++, and its function bodies are not included in the bc file. String operations, e.g., =, +, <<, *, are no longer simple assignments or dereferences; rather, they are translated into a series of invocations of the string class's overloaded functions. You can see some C++ function names like: line 624 in Logger.bc line 629 in Logger.bc

The above function calls are not modeled by SVF, so the value-flows of the address-taken objects pointed to by "*sp" may be missed. We will need to add the side-effects of those functions in file Util/ExtAPI.cpp after demangling their names (Util/CPPUtil.cpp). For now, another way to achieve your goal is to change "string&" to "char*&". Then the value-flows will be soundly generated.
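To make the demangling step mentioned above concrete, here is a minimal sketch. It does not use SVF's Util/CPPUtil.cpp; it assumes a GCC or Clang toolchain and calls `abi::__cxa_demangle` from `<cxxabi.h>` directly. The helper names `demangle` and `isCopyAssignLike` are hypothetical and exist only for illustration, as is the idea of recognising a candidate for an external summary by looking for "operator=" in the demangled name.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI, GCC/Clang)
#include <cstddef>
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of a symbol name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // the buffer is malloc-allocated by __cxa_demangle
    return result;
}

// Hypothetical helper: true if the demangled name contains "operator="
// that is not part of "operator==".
bool isCopyAssignLike(const std::string& mangled) {
    const std::string name = demangle(mangled);
    const std::string key = "operator=";
    const std::size_t pos = name.find(key);
    if (pos == std::string::npos) {
        return false;
    }
    const std::size_t next = pos + key.size();
    return next >= name.size() || name[next] != '=';
}
```

For instance, `isCopyAssignLike("_ZNSsaSERKSs")` should hold for the libstdc++ symbol of the string copy-assignment operator, which is the kind of call that would need a side-effect summary.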
Thanks for the detailed help!! Seems like getActualParmSVFGNode() provides the functionality I was looking for. However I can't traverse the value flow graph from there because the returned SVFGNode doesn't have any incoming or outgoing edges. Is this intended? Fortunately, I managed to get the SVFGNode corresponding to a function argument using
From this SVFGNode I can traverse the edges through With the information from your answer about C++ internal classes, I was able to track back the value flow of Thinking a little bit further, the following question comes to mind:
A very good try!
You will see two ActualINSVFGNodes and one ActualOUTSVFGNode. These are the ones you want to obtain.
Hi Yulei, sorry for this delayed reply! Unfortunately my time to work with SVF is limited at the moment. The patch (53dafe6) you provided works great, thanks! I was able to get a big part of my work done after applying it.
So it seems that there is more work to be done with the standard C++ library. But I will try to cover this topic in another issue on a higher level.

Regarding points 1 and 2 of your last post: so far I have accomplished (a) and (b), but there are a few bugs due to my lack of understanding of the SVFG and how it is built in all the corner cases.

Question 1: The FormalINSVFGNodeSet does not seem to guarantee that its order corresponds to the parameter ordering. How can I know that a FormalINSVFGNode represents the i-th parameter of function X?

For (a) and (b): Sometimes while traversing the SVFG I visit a FormalRetSVFGNode. Then it is obvious that the value comes from the return value of another function. But sometimes I only get FormalOUTSVFGNodes. It is obvious which function the value comes from, but it is hard to tell which parameter (No. 0, 1, 2, ..., n) of the function it comes from, or whether it actually comes from the return value of the function.

Question 2: Is this workaround the way to go? In all corner cases? Or is there a simpler and more reliable way to get parameter No. i?

For (c): Very rarely while traversing the SVFG I visit a FormalParmSVFGNode. Then it is easy to know which input parameter of which function the value comes from. But most of the time I get a FormalINSVFGNode. As with FormalOUTSVFGNodes, I cannot tell which parameter number the node corresponds to. And I cannot use the workaround(*) as I can't detect any StmtSVFGNode around.

Question 3: How can I know which parameter number i the FormalINSVFGNode corresponds to?

Many thanks in advance!
Hi Oliver, We didn’t label an index (e.g., My advice is to start the backtracking from a statement SVFGNode (as your tainted source) rather than from a formal parameter on the SVFG. The traversal can then always reach a formal parameter if the object/pointer is passed directly or indirectly via a parameter. I have listed several examples to help you understand.
A DirectSVFGEdge (value-flow of
A DirectSVFGEdge (value-flow of
A DirectSVFGEdge (value-flow of “x”) from parameter “x” (FormalParmSVFGNode) to “*r=x” (StoreSVFGNode), and another IndirectSVFGEdge (value-flow of “o”) from “*r=x” (StoreSVFGNode) to “p=*q” (LoadSVFGNode)
A DirectSVFGEdge (value-flow of
An IndirectSVFGEdge (value-flow of Please let me know if there is anything more you want to discuss.
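The backtracking advice above can be sketched on a toy graph. This is not SVF code: `ToyVFG`, `Kind`, `reachableFormals`, and `buildExample` are hypothetical names, and the graph only mimics the shape of the example with parameter “x”, “*r=x”, and “p=*q”. The sketch starts at a statement node, follows incoming edges with a worklist, and collects every formal-parameter-like node it reaches.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical node kinds, loosely named after the SVFG node classes in this thread.
enum class Kind { Stmt, FormalParm, FormalIN, Other };

struct Node {
    Kind kind;
    std::string label;
};

// Toy value-flow graph: nodes by id plus predecessor (incoming-edge) lists.
struct ToyVFG {
    std::map<int, Node> nodes;
    std::map<int, std::vector<int>> preds;

    void addNode(int id, Kind k, const std::string& label) { nodes[id] = Node{k, label}; }
    void addEdge(int src, int dst) { preds[dst].push_back(src); }
};

// Walks incoming edges backwards from `start` and returns the ids of all
// FormalParm/FormalIN nodes reached (including `start` itself if it is one).
std::set<int> reachableFormals(const ToyVFG& g, int start) {
    std::set<int> visited{start};
    std::set<int> formals;
    std::queue<int> worklist;
    worklist.push(start);
    while (!worklist.empty()) {
        const int cur = worklist.front();
        worklist.pop();
        const Node& n = g.nodes.at(cur);
        if (n.kind == Kind::FormalParm || n.kind == Kind::FormalIN) {
            formals.insert(cur);
        }
        const auto it = g.preds.find(cur);
        if (it == g.preds.end()) {
            continue;  // no incoming edges
        }
        for (int p : it->second) {
            if (visited.insert(p).second) {  // enqueue each node only once
                worklist.push(p);
            }
        }
    }
    return formals;
}

// Shape of the example: parameter x -> "*r=x" -> "p=*q", plus one unrelated node.
ToyVFG buildExample() {
    ToyVFG g;
    g.addNode(0, Kind::FormalParm, "x");
    g.addNode(1, Kind::Stmt, "*r=x");
    g.addNode(2, Kind::Stmt, "p=*q");
    g.addNode(3, Kind::Other, "unrelated");
    g.addEdge(0, 1);  // direct value-flow of x
    g.addEdge(1, 2);  // indirect value-flow of o
    return g;
}
```

Calling `reachableFormals(buildExample(), 2)` starts at “p=*q” and ends at the node for parameter “x”. In a real client the same worklist pattern would presumably iterate over the incoming edges of the actual SVFG nodes instead of the `preds` map.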
Hi Oliver, If possible, could you help us collect the summaries of some C++ library calls (as you did for string operator=)? It will benefit other users too. Thanks
Hi Yulei, yes, I will help you collect the libc++ summaries. After I get my little example to work, I plan to analyze a real-world C++ project with lots of C++ library calls. I will provide a git patch or pull request, or whatever you prefer, to commit the ExtAPI summaries. But I am really confused at the moment. Your examples seem straightforward to me, except example 5. Where does x come from? (Is it just a typo that should be q? I assume x is q in the following.) Another case: what if my particular parameter is not a pointer, but just an i8, for example? If I understood you correctly, there is no Actual/FormalParmSVFGNode for this parameter and no ActualIN/FormalINSVFGNode, right? Do I have to use the PAG then to track its value?
Oliver,
Please see the revised one below:
In this example, when you perform backtracking from "p=*r" (tainted source) to "*x=z", the traversed value-flows have nothing to do with parameter L4 --o--> L3 --o--> L2 --o--> L1 We don't model value-flows (def-use chains) for non-pointer values, as they are explicit in LLVM SSA. Please let me know if the above still doesn't answer your question.
Ah okay, that makes sense. So there are definitely CHI and MU nodes that do not correspond to function parameters. What about this example, where x and y are local allocations in the main function? Is the placement of the CHIs and MUs correct, and can we relate them to their actual and formal parameters?
I believe you can use You can apply |
Hi,
I recently tested SVF (commit 5355fc2). Great piece of work from my point of view!
Unfortunately I have problems using the API correctly and I would be pleased if you could guide me a little.
I initialize SVF with the following instructions:
Later I obtain a llvm::CallSite and want to access the SVFGNodes corresponding to the arguments of that CallSite with
But the SVFG::callSiteToActualINMap (include/MSSA/SVFG.h:100) is empty every time. What am I missing here? Are my initialization steps wrong?
I attached the code of my LLVM Pass as well as source code and LLVM IR of the module under test.
Logger.zip
See LoggerOO.cpp: my ultimate goal is to track back the value of parameter 1 of Logger::log2() [line 75] so that SVF reports that its value originates either as the return value of Encryptor::encrypt() [line 69] or as an output parameter of assign() [line 71].
It would be nice if you could help me with this.
Thank you.