Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrating SVF to LLVM 4 #14

Closed
jcarlson23 opened this issue Apr 1, 2017 · 11 comments
Closed

Migrating SVF to LLVM 4 #14

jcarlson23 opened this issue Apr 1, 2017 · 11 comments

Comments

@jcarlson23
Copy link

Hello,

I am very new to SVF and somewhat naively I took a quick spin in the 3.8 LLVM code base (as shipped so to speak), and liking the code base, and capabilities, I then started towards migrating SVF towards the LLVM 4.0 release. Perhaps not unsurprisingly I ran into a few problems, but primarily and this was one of the things that attracted me to SVF, the GraphTraits in 4.0 looks like it has some changes that are marked as a fixme:

typedef typename GTraits::NodeRef NodeRef;
typedef typename GTraits::nodes_iterator node_iterator;
typedef typename GTraits::ChildIteratorType child_iterator;
DOTTraits DTraits;

static_assert(std::is_pointer::value,
"FIXME: Currently GraphWriter requires the NodeRef type to be "
"a pointer.\nThe pointer usage should be moved to "
"DOTGraphTraits, and removed from GraphWriter itself.”);

(that’s in the GraphWriter.h from LLVM 4.0 release).

So, is there anywhere, a paper perhaps, that you could refer me towards for understanding some of the details as well as the high level implementation of the graphing algorithms being used? I come from a numerical background so I’m used to sparse solvers such as GMRES, etc, but the LLVM implementation (while I’m somewhat familiar with a few parts of LLVM) is somewhat new to me - I’ve generally re-used the SCC and other graph related algorithms without needing to dive in headfirst - that said, that looks like it’s on tap for my weekend :)

Another question I had was in the Memory separation capabilities, does this require disjoint regions? It seems to read as if “no”? But then that does limit any typical Formal Methods applications where analysis (that I’ve seen at least) typically requires disjoint regions. For example I’ve seen several FM related LLVM papers replacing PHI nodes with simplifications.

Thanks!

@yuleisui
Copy link
Collaborator

yuleisui commented Apr 2, 2017

Hi Jared,

Thanks for your questions.

  1. On LLVM's GraphTraits
    Type traits is a good C++ feature for graph traversal. As SVF uses graph traversal frequently, LLVM's GraphTraits is a perfect fit for our purpose.

The original graph traits idea came from boost library:
http://www.boost.org/doc/libs/1_46_1/libs/graph/doc/graph_traits.html
http://stackoverflow.com/questions/3979766/how-do-traits-classes-work
http://www.drdobbs.com/cpp/c-type-traits/184404270
or feel free to search some others reference.

To correctly use LLVM's GraphTraits, first you need to write a graph representation (e.g., PAG) based on a generic graph ("MemoryModel/GenericGraph.h"). Then you need to instantiate a GraphTraits for the graph. Please refer to lines 640-654 in "MemoryModel/PAG.h". Finally, you have to specify iterators (lines 181-235 in "MemoryModel/GenericGraph.h") to be used by different graph clients, such as GraphWriter in "Util/GraphUtil.h" and SCC in "include/Util/SCC.h".

  1. On graph algorithms in SVF
    SVF abstracts LLVM IR into various higher level graph representations, on which we have implemented a few classic algorithms, e.g., SCC (include/Util/SCC.h). They are implemented based on papers (e.g., "On finding the strongly connected components in a directed graph") as specified in our C++ header files.

  2. On disjoint memory regions
    Do you refer to our memory region partitioning algorithms in "MSSA/MemRegion.h" and "MSSA/MemPartition.h"?
    We have several options for memory region partitioning. Disjoint partitioning is one of them. You may wish to refer to our TSE 2014 paper (Section 4.2.2). For pointer analysis and/or dependence analysis, other partitioning strategies can also be applied to trade precision for efficiency, or the other way around.

Hopefully the above answers your questions. Good luck!

@yuleisui
Copy link
Collaborator

yuleisui commented Apr 2, 2017

Hi Jared,

Would you mind giving a pull request once you finish the migration? We can start testing the correctness and performance from then.

Migrating SVF to LLVM 4.0 would benefit other users too.

Thanks!

@jcarlson23
Copy link
Author

Hi Yulei,

Let me digest your above answers a bit (thank you for the details!) but in terms of the latter, of course, happy to give a pull request. One of my motivations is to hopefully to extend some formal proof capabilities, hence my question on disjointness, and on that yes, I was referring to the memory region partitioning in MSSA - I haven't looked at all the options, I thought that migrating to 4.0 was both a useful exercise to touching a large, or significant, part of the code base and getting familiar while helping...

@jcarlson23
Copy link
Author

Thanks again for comments Yulei. I can see most of what I was concerned about is easily fixed. Most of the instantiations become something along the line of:

template<> struct GraphTraits<PTACallGraph*> : public GraphTraits<GenericGraph<PTACallGraphNode,PTACallGraphEdge>* > { typedef PTACallGraphNode *NodeRef; };

Where the NodeRef is required and I just need to dig up the right type to pass back. It does looks to me like this might be unnecessary later if the NodeRef requirement drops but for 4.0 that doesn't seem to be the case. There are some other API changes but making my way through them...

@jcarlson23
Copy link
Author

Hi Yulei,

Running an error in pushing the nodes onto the work list. The error I'm seeing is:

In file included from /Users/jcarlson/Projects/SVF/include/WPA/WPAFSSolver.h:39: /Users/jcarlson/Projects/SVF/include/WPA/WPASolver.h:129:45: error: no viable conversion from 'child_iterator' (aka 'mapped_iterator<__tree_const_iterator<SVFGEdge *, std::__1::__tree_node<SVFGEdge *, void *> *, long>, pointer_to_unary_function<SVFGEdge *, SVFGNode *> >') to 'GNODE *' (aka 'SVFGNode **') pushIntoWorklist(Node_Index(EI)); ^~

Generally with adding the NodeRef I saw a few GNODE referring to a **T rather than *T. Generally it seems a relatively minor change as I was working through this, for example in WPASolver.h, lines 125-130 (I don't think I touched much more than this):

child_iterator EI = GTraits::direct_child_begin(*v); child_iterator EE = GTraits::direct_child_end(*v); for (; EI != EE; ++EI) { if (propFromSrcToDst(*(EI.getCurrent()))) pushIntoWorklist(Node_Index(EI)); } }

So looking at the former error, seems like I need to dereference the mapped iterator to the appropriate node type. From looking through the graph traits, it looks like this portion of the WPASolver is a base propagation method for the "points-to" analysis and the SCC node/representation to be added to the working list - for either the deep or wave propagation methods.

So is my high level understanding right? Close? Or completely off? And secondly, just looking for any quick thoughts as to working with the mapped_iterator and how to propagate that - I've skimmed the papers in the header but haven't delved into the details of the algorithm just yet. Thanks for any help!

@jcarlson23
Copy link
Author

Ah, think I figured this out...

@yuleisui
Copy link
Collaborator

yuleisui commented Apr 6, 2017

Good!

@jcarlson23
Copy link
Author

jcarlson23 commented Apr 6, 2017

Alright, some products starting to compile, but linkage errors indicate that perhaps some virtual functions aren't implemented (so new methods to 4.0 classes) and running those down is my next step. Get started on that tomorrow or Monday...

Took a again this morning (cleaned out the build directory and started the compilation from scratch) and really, I think it's just a simple linkage error.

[ 16%] Linking CXX shared module Svf.dylib cd /Users/jcarlson/Projects/SVF/build/lib && /usr/local/Cellar/cmake/3.7.2/bin/cmake -E cmake_link_script CMakeFiles/Svf.dir/link.txt --verbose=1 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -fno-rtti -bundle -Wl,-headerpad_max_install_names -Wl,-dead_strip -o Svf.dylib CMakeFiles/Svf.dir/Util/AnalysisUtil.cpp.o CMakeFiles/Svf.dir/Util/CPPUtil.cpp.o CMakeFiles/Svf.dir/Util/DataFlowUtil.cpp.o CMakeFiles/Svf.dir/Util/PTACallGraph.cpp.o CMakeFiles/Svf.dir/Util/RaceAnnotator.cpp.o CMakeFiles/Svf.dir/Util/ThreadCallGraph.cpp.o CMakeFiles/Svf.dir/Util/BreakConstantExpr.cpp.o CMakeFiles/Svf.dir/Util/Conditions.cpp.o CMakeFiles/Svf.dir/Util/ExtAPI.cpp.o CMakeFiles/Svf.dir/Util/PathCondAllocator.cpp.o CMakeFiles/Svf.dir/Util/PTAStat.cpp.o CMakeFiles/Svf.dir/Util/ThreadAPI.cpp.o CMakeFiles/Svf.dir/MemoryModel/ConsG.cpp.o CMakeFiles/Svf.dir/MemoryModel/LocationSet.cpp.o CMakeFiles/Svf.dir/MemoryModel/LocMemModel.cpp.o CMakeFiles/Svf.dir/MemoryModel/MemModel.cpp.o CMakeFiles/Svf.dir/MemoryModel/PAGBuilder.cpp.o CMakeFiles/Svf.dir/MemoryModel/PAG.cpp.o CMakeFiles/Svf.dir/MemoryModel/CHA.cpp.o CMakeFiles/Svf.dir/MemoryModel/PointerAnalysis.cpp.o CMakeFiles/Svf.dir/MSSA/MemPartition.cpp.o CMakeFiles/Svf.dir/MSSA/MemRegion.cpp.o CMakeFiles/Svf.dir/MSSA/MemSSA.cpp.o CMakeFiles/Svf.dir/MSSA/SVFGBuilder.cpp.o CMakeFiles/Svf.dir/MSSA/SVFG.cpp.o CMakeFiles/Svf.dir/MSSA/SVFGOPT.cpp.o CMakeFiles/Svf.dir/MSSA/SVFGStat.cpp.o CMakeFiles/Svf.dir/SABER/DoubleFreeChecker.cpp.o CMakeFiles/Svf.dir/SABER/FileChecker.cpp.o CMakeFiles/Svf.dir/SABER/LeakChecker.cpp.o CMakeFiles/Svf.dir/SABER/ProgSlice.cpp.o CMakeFiles/Svf.dir/SABER/SaberAnnotator.cpp.o CMakeFiles/Svf.dir/SABER/SaberCheckerAPI.cpp.o CMakeFiles/Svf.dir/SABER/SaberSVFGBuilder.cpp.o CMakeFiles/Svf.dir/SABER/SrcSnkDDA.cpp.o CMakeFiles/Svf.dir/WPA/Andersen.cpp.o CMakeFiles/Svf.dir/WPA/AndersenLCD.cpp.o CMakeFiles/Svf.dir/WPA/AndersenStat.cpp.o CMakeFiles/Svf.dir/WPA/AndersenWave.cpp.o CMakeFiles/Svf.dir/WPA/AndersenWaveDiff.cpp.o CMakeFiles/Svf.dir/WPA/FlowSensitive.cpp.o CMakeFiles/Svf.dir/WPA/FlowSensitiveStat.cpp.o CMakeFiles/Svf.dir/WPA/WPAPass.cpp.o -Wl,-rpath,@loader_path/../lib -Wl,-rpath,/Users/jcarlson/Projects/llvm-4-build/./lib Undefined symbols for architecture x86_64: "llvm::cl::generic_parser_base::printOptionInfo(llvm::cl::Option const&, unsigned long) const", referenced from: vtable for llvm::cl::parser<PointerAnalysis::PTATY> in WPAPass.cpp.o vtable for llvm::cl::parser<WPAPass::AliasCheckRule> in WPAPass.cpp.o

No inclusion of what looks like are necessary LLVM libs..(might be because I'm on a Mac?)

So I'm going to look at the tools/CMakeLists a bit, but the virtual function warnings/errors I saw doesn't appear to be the main problem (once looking at the compilation verbosely from a clean build). I need to write up some of my own notes hopefully tomorrow/Monday and then I'll get back to this but the good news is that the compilation of the sources (minus linkage) looks like it's coming along fairly well (I'll need to run through again once to clean up after I fix the linkage errors - and potentially filling out virtuals) before issuing the pull request...

@jcarlson23
Copy link
Author

Confirmed the linkage issue. The CMakeFiles look good at a first glance but compiling by hand worked (so just typing it in). Likely my environment, or a Mac issue. So I'll put around with this today/tomorrow and see where I am, perhaps comparing to a clean build on a linux VM (seems like a good check). Anyway, confirmed the good news regarding the compilation, as I'm getting some delis with some intervention.

@jcarlson23
Copy link
Author

jcarlson23 commented Apr 17, 2017

Alright everything builds with llvm-4. I need to get go over my changes one last time and then I can issue a PR with some notes for what I hope is a relatively painless merge. It'll be a "llvm-4" branch PR.

@yuleisui
Copy link
Collaborator

Hi Jared,

Thanks a lot for your efforts!

Yes, please make some notes for important updates.
We will take a look at your patches and give you feedback later.

@yuleisui yuleisui closed this as completed May 7, 2017
yuleisui added a commit that referenced this issue Aug 30, 2019
delete oscc in AndersenHCD
yuleisui added a commit that referenced this issue Nov 21, 2022
change namespace to SVF
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants