cmd/ll2dot: R7, Produce CFGs from LLVM IR using the new Go library #97

mewmew · 2015-01-06T20:48:07Z

The control flow graph generation tool is currently using the Go bindings for LLVM to generate CFGs from LLVM IR.

A pure Go library for interacting with LLVM IR will replace the Go bindings for LLVM in v0.2. This library is being developed at https://github.com/llir/llvm

mewmew · 2015-02-26T13:44:37Z

Marked as future ambition, closing for now. See the following comment for further information.

mewmew · 2016-05-25T10:38:06Z

This work is currently being tracked by the experimental ll2dot tool developed at https://github.com/decomp/exp/tree/master/cmd/ll2dot

mewmew · 2017-01-22T22:04:19Z

As of commit decomp/exp@3cfd64c, the ll2dot tool of the exp repository has become capable of producing control flow graphs in Graphviz DOT file format from LLVM IR assembly input. Only a subset of LLVM IR assembly is currently supported, and full support for parsing LLVM IR assembly is tracked by issue llir/llvm#15.
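To illustrate what producing a control flow graph in Graphviz DOT format involves, here is a minimal sketch in Go. It is not the ll2dot implementation and does not use the llir/llvm API: the `Block` type and the `CFGToDOT` function are hypothetical, and the sketch assumes that each basic block's successors have already been read off its terminator instruction.

```go
package main

import (
	"fmt"
	"strings"
)

// Block is a hypothetical, simplified basic block: a label plus the labels
// of the blocks its terminator may branch to.
type Block struct {
	Name  string
	Succs []string
}

// CFGToDOT renders the control flow graph of one function as a Graphviz DOT
// digraph, with one node per basic block and one edge per branch target.
// Names are quoted with Go's %q verb, which is adequate for plain labels;
// unusual characters may need DOT-specific escaping.
func CFGToDOT(funcName string, blocks []Block) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "digraph %q {\n", funcName)
	// Emit every node first so that blocks without edges still appear.
	for _, b := range blocks {
		fmt.Fprintf(&sb, "\t%q;\n", b.Name)
	}
	// Emit one edge per successor.
	for _, b := range blocks {
		for _, s := range b.Succs {
			fmt.Fprintf(&sb, "\t%q -> %q;\n", b.Name, s)
		}
	}
	sb.WriteString("}\n")
	return sb.String()
}

func main() {
	// An if/else diamond: entry branches to then or else, both join at exit.
	blocks := []Block{
		{Name: "entry", Succs: []string{"then", "else"}},
		{Name: "then", Succs: []string{"exit"}},
		{Name: "else", Succs: []string{"exit"}},
		{Name: "exit"},
	}
	fmt.Print(CFGToDOT("f", blocks))
}
```

The resulting text can be passed to the Graphviz `dot` command to render the graph as an image.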

The ll2dot tool of the exp repository will be merged with the decomp repository within the next couple of months, after which there is only one tool remaining which makes use of the old bindings for LLVM IR; namely the ll2go tool. It will also be updated within the next couple of months, after which it should once again be fun to play with the decomp project as we've finally gotten rid of the C++ library dependency, which brought with it long compile times.

mewmew · 2017-02-08T04:50:36Z

Status update as of 2017-02-07: a preliminary version of the ll2go tool which relies on the llir/llvm package has been implemented in the exp repository. Once the llir/llvm implementation matures a little, the experimental versions of the ll2dot and ll2go tools will be cleaned up and merged with the decomp repository.

mewmew · 2017-03-18T20:39:18Z

The ll2dot tool has now reached maturity, and will be merged from the dev branch.

pfalcon · 2017-03-24T13:54:47Z

Once the llir/llvm implementation matures a little

Btw, a week ago I learned about 2 other pretty high-profile projects which waved LLIR bye-bye in preference of their own IRs: pfalcon/ScratchABlock@bf784f1

mewmew · 2017-03-24T21:43:54Z

Hej @pfalcon!

Very interesting to read about B3, it could be a contender to LLVM IR for a decompilation pipeline. I know you've previously expressed an opinion against SSA-form as the process of translating into and out of it is rather involved.

Personally, I very much enjoy working on a SSA-form level of abstraction, as it enables very interesting information retrieval, such as recovering the individual scope of local variables that end up sharing the same register in the assembly representation of the CPU architecture. Simply tracing the data flow of SSA-variables is enough to provide a clear separation between two independent local variables that have ended up sharing/reusing the same CPU register. Furthermore, it vastly simplifies the type analysis stage, as the decoupled register accesses may now be analysed individually.

Well, for now at least, those are ideas in my mind that I've discussed with my friend Daniel, and we both believe that SSA has a use in these situations. Future exploration will tell. However, type analysis is not planned until version 0.6 of the decompilation pipeline. Before that we will explore control flow analysis in v0.4, data flow analysis in v0.5. Version 0.2 is all about getting rid of the LLVM C++ dependency, and we are almost there! Once the dev branch is merged with the master branch, we are there! This should happen within the next two weeks or so. After that version 0.3 is focused on the general robustness of the decompilation pipeline, being able to handle corner cases without crashing :)

We will see how these preliminary plans change as development continues. In either case, playing with decompilation is tremendously fun and we rejoice in the experience!

I'm happy to see that you are making steady progress on ScratchABlock. What are the major stumbling blocks (if any) that you've come across in the last three months? I'd be very interested to know what issues you are or have been struggling with, as the easy parts are, well, easy.

pfalcon · 2017-03-24T22:18:26Z

opinion against SSA-form as the process of translating into and out of it is rather involved

Well, the opinion was that: because SSA is a) rather involved to translate into/out of and b) it breaks "one-to-one" (well, at least "direct") correspondence between original code and decompilation intermediate code, my preference is not to call SSA in until all other options are explored. And I already find b) less relevant, because even without SSA, there're enough transformations which skew representation and its correspondence to the original code. And I already use a poor man's SSA, of treating input registers to functions as 0-subscripted regs, which then get assigned to unsubscripted regs, as that's the prerequisite of doing proper propagation, stack var rewriting and preserveds tracking.

Personally, I very much enjoy working on a SSA-form

Did you write all the SSA code yourself? Can you explain how it works to a 5-year-old (not the trivialities, but all the dominance frontier cases and the need to perform graph coloring (==register allocation) to convert out of it)? I have yet to meet a person who can do that. Otherwise, I too enjoy somebody else's SSA being crunched with somebody else's code on my CPU, but it has little to do with me actually...

What are the major stumbling blocks (if any) that you've come across in the last three months?

It's hard to bootstrap interprocedural analysis. Requires even more complex tools and even more complex use cases, no real-world cases work because they're too complex, etc.

mewmew · 2017-03-24T22:28:10Z

Did you write all the SSA code yourself?

No, indeed I have not. I reckon it will take me quite some time to figure out exactly how to. Hopefully it will turn out to be a fun learning adventure, as many other parts have been so far. For me personally, that's the main reason to keep playing with these kinds of projects, for fun. If it is fun, all else naturally follows.

It's hard to bootstrap interprocedural analysis.

Do you mean constant propagation between procedures, type analysis, or what kind of interprocedural analysis?

In decomp we are still very much working on the basic block and control flow recovery level, so these stumbling blocks are still far away.

mewmew · 2017-03-24T22:29:32Z

Oh, and glad to hear you have revised your thoughts regarding SSA. It gives me confidence in exploring it more deeply.

pfalcon · 2017-03-25T07:30:50Z

Do you mean constant propagation between procedures, type analysis, or what kind of interprocedural analysis?

Interprocedural dataflow analysis, as required to recover function arguments/returns. Which is required for proper intraprocedural analysis of a function. But of course, you need to do intraprocedural analysis first before you can even approach interprocedural analysis.

In decomp we are still very much working on the basic block and control flow recovery level, so these stumbling blocks are still far away.

Well, we discussed that when we met - that selecting a tool is what guides how much progress one can have, and how fast :-P ;-).

pfalcon · 2017-03-25T19:52:32Z

Oh, and glad to hear you have revised your thoughts regarding SSA. It gives me confidence in exploring it more deeply.

Your SSA at its best: https://github.com/zneak/fcd-tests/blob/osx/output/1993-leo.c#L397 . My thoughts remain the same: people who did not write conversion into and out of SSA themselves should not use SSA, it's far too powerful sorcery. Hiding behind the back of LLVM leads to hilarious results, sorry.

On the good news, I thought that there's no hope not only of explaining SSA to a 5-year-old, but that even PhDs gave up on explaining it to other PhDs, with the epic project http://ssabook.gforge.inria.fr/latest/book.pdf not updating since 2015. I however just pulled from their repo, and there're even fresh commits. But looks mostly like styling, not new chapters.

mewmew added the MUST label Jan 6, 2015

mewmew self-assigned this Jan 6, 2015

mewmew added this to the Meeting 13 milestone Jan 6, 2015

mewmew mentioned this issue Jan 8, 2015

Produce Control-Flow Graphs from LLVM IR basic blocks llir/llvm#5

Closed

mewmew added the future ambition label Feb 26, 2015

mewmew closed this as completed Feb 26, 2015

mewmew removed future ambition MUST labels May 31, 2015

mewmew modified the milestones: v0.2, Meeting 13 May 31, 2015

mewmew changed the title ~~requirement: R7, Produce CFGs from LLVM IR basic blocks~~ ll2dot: R7, Produce CFGs from LLVM IR basic blocks May 31, 2015

mewmew added llvm labels May 31, 2015

mewmew changed the title ~~ll2dot: R7, Produce CFGs from LLVM IR basic blocks~~ ll2dot: R7, Produce CFGs from LLVM IR using the new Go library May 31, 2015

mewmew reopened this May 31, 2015

mewmew changed the title ~~ll2dot: R7, Produce CFGs from LLVM IR using the new Go library~~ cmd/ll2dot: R7, Produce CFGs from LLVM IR using the new Go library May 25, 2016

mewmew mentioned this issue Aug 14, 2016

llvm: Replace the Go bindings for LLVM with a pure Go implementation. #167

Closed

2 tasks

mewmew closed this as completed Mar 18, 2017

mewmew added this to Decompilation components in v0.1 Mar 23, 2017
