-*- org -*-
#- type inference
#- datalog interprocedural analysis?
subtitle: codemap, codegraph, codequery
terrified when I joined: 5M lines of PHP, IMHO badly organized
visualization + program analysis
30" screen
first thing I did: try to visualize the whole codebase. If you cannot visualize the mess, you cannot understand it
google maps for code
treemap, color = aspect, e.g. test code (=> visual clue about test coverage), core code
helps to see a huge subdirectory => I actually deleted 1M LoC at FB with that :)
## treemap is code oriented: filter .o, etc; also skip code; as you zoom in, render the content of the file
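A minimal sketch of the "treemap, color = aspect" idea, assuming a plain slice-and-dice layout (not necessarily what codemap uses). The `Node` type, the aspect rules, the color table and the skipped suffixes are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

@dataclass
class Node:
    name: str
    loc: int = 0  # lines of code (files only)
    children: List["Node"] = field(default_factory=list)

# Hypothetical choices: aspect colors and build artifacts to filter out.
ASPECT_COLORS = {"test": "green", "core": "red", "other": "grey"}
SKIPPED_SUFFIXES = (".o", ".cmo", ".cmi")

def aspect_of(path: str) -> str:
    """Guess the aspect of a file from its directory names (assumed rule)."""
    parts = path.split("/")
    if "tests" in parts:
        return "test"
    if "core" in parts:
        return "core"
    return "other"

def size(node: Node) -> int:
    """Total LoC under a node, ignoring filtered files such as .o."""
    if node.name.endswith(SKIPPED_SUFFIXES):
        return 0
    return node.loc + sum(size(c) for c in node.children)

Rect = Tuple[str, float, float, float, float, str]

def layout(node: Node, x: float, y: float, w: float, h: float,
           path: str = "", horizontal: bool = True) -> Iterator[Rect]:
    """Yield (path, x, y, w, h, color) for each file, area proportional to LoC."""
    full = f"{path}/{node.name}" if path else node.name
    if not node.children:
        yield (full, x, y, w, h, ASPECT_COLORS[aspect_of(full)])
        return
    kids = [c for c in node.children if size(c) > 0]
    total = sum(size(c) for c in kids)
    offset = 0.0
    for c in kids:
        frac = size(c) / total
        if horizontal:  # split the width, then alternate direction
            yield from layout(c, x + offset * w, y, w * frac, h, full, False)
        else:           # split the height
            yield from layout(c, x, y + offset * h, w, h * frac, full, True)
        offset += frac

tree = Node("www", children=[
    Node("core", children=[Node("db.php", 600)]),
    Node("tests", children=[Node("db_test.php", 300)]),
    Node("build", children=[Node("db.o", 5000)]),  # filtered out
    Node("misc.php", 100),
])
rects = list(layout(tree, 0, 0, 100, 100))
```

With this layout a huge subdirectory shows up as a huge rectangle, which is the point of the note above.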
precise identifier highlighting, bad smells, tiling multi column (xmonad wm), use eyes to scroll
emacs integration
semantic feedback (bigger road = bigger functions, more important)
layers (age, number of authors, coverage, cyclomatic complexity, etc)
layer nb users?
the more I was using it, the more I realized I wanted to understand the “software architecture”. focus not on source code, but on code relationships! What are the entry points? What is the core code? What is all the code depending on it? etc
package mode, external mode, module mode, gephi, flibotomy, lots of attempts. graph? I tried but it does not scale, and I need flexibility: visualize at a different granularity, with a different focus => DSM.
same entities on the left and at the top; each cell = number of times x uses y (call, import, etc), aggregated. hypertree
good structure = layers = empty upper right (enforced by the ocaml linker actually)
core code at the top, entry points at the bottom
unfold
reslice
see patterns more easily, visualize the mess (can’t fix what you can’t easily see); usually backward deps => ugly hacks, things that need to be documented anyway
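A small sketch of the DSM reading described above, assuming rows and columns share one ordering (core first, entry points last) and cell (row, column) counts how many times the row entity uses the column entity. Module names and helper names are hypothetical.

```python
from typing import Iterable, List, Tuple

def build_dsm(order: List[str], uses: Iterable[Tuple[str, str]]) -> List[List[int]]:
    """m[i][j] = number of times order[i] uses order[j] (calls, imports, ...)."""
    index = {name: i for i, name in enumerate(order)}
    n = len(order)
    m = [[0] * n for _ in range(n)]
    for user, used in uses:
        m[index[user]][index[used]] += 1
    return m

def backward_deps(order: List[str], m: List[List[int]]) -> List[Tuple[str, str, int]]:
    """Non-empty cells in the upper-right triangle: lower layers using higher ones."""
    n = len(order)
    return [(order[i], order[j], m[i][j])
            for i in range(n) for j in range(i + 1, n) if m[i][j]]

# Hypothetical modules, core first, entry point last.
order = ["commons", "parsing", "analysis", "main"]
uses = [
    ("parsing", "commons"), ("parsing", "commons"),
    ("analysis", "parsing"),
    ("main", "analysis"), ("main", "commons"),
    ("commons", "analysis"),  # a backward dependency
]
m = build_dsm(order, uses)
violations = backward_deps(order, m)  # [("commons", "analysis", 1)]
```

An empty `violations` list corresponds to the "empty upper right" of a well-layered codebase.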
PLUG: NP problem: find a minimal reorganization so that there are more layers
hard to see the value, but when I plan to change something, I look at the deps quickly; it helps evaluate the difficulty.
more semantic feedback in codemap
layer bottom-up (good at macro level, good also at micro level)
layer nb users?
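One plausible reading of a "bottom-up" layer, sketched under assumptions: an entity that uses nothing is at layer 0, and any other entity is one layer above the highest thing it uses. This assumes an acyclic dependency graph; the function and module names are hypothetical.

```python
from typing import Dict, Iterable

def bottom_up_layers(uses: Dict[str, Iterable[str]]) -> Dict[str, int]:
    """Map each entity to its layer: 0 for leaves, else 1 + max layer of its deps."""
    layer: Dict[str, int] = {}
    in_progress = set()

    def visit(name: str) -> int:
        if name in layer:
            return layer[name]
        if name in in_progress:
            raise ValueError(f"dependency cycle through {name}")
        in_progress.add(name)
        deps = uses.get(name, ())
        # default=-1 makes an entity with no deps land on layer 0
        layer[name] = 1 + max((visit(d) for d in deps), default=-1)
        in_progress.discard(name)
        return layer[name]

    for name in list(uses):
        visit(name)
    return layer

layers = bottom_up_layers({
    "main": ["analysis", "commons"],
    "analysis": ["parsing"],
    "parsing": ["commons"],
    "commons": [],
})
```

The same computation applies at the macro level (directories) and the micro level (functions), only the granularity of `uses` changes.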
uses, users, file level (dead code)
uses, users, fine grained level, e.g. fields (when something is immediate (not running git grep), it changes the game: can scroll a set of fields and immediately see if each one is used or not, and where)
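A sketch of the uses/users idea, assuming the "uses" relation is already extracted: inverting it gives "users", and anything without users that is not an entry point is a dead-code candidate. Entity names and the entry-point set are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

def invert(uses: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Turn 'x uses y' into 'y is used by x'."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for user, used_set in uses.items():
        for used in used_set:
            users[used].add(user)
    return users

def dead_candidates(uses: Dict[str, Set[str]], entry_points: Set[str]) -> List[str]:
    """Entities with no users that are not entry points."""
    users = invert(uses)
    return sorted(e for e in uses if e not in entry_points and not users.get(e))

# Works the same at file level and at fine-grained level (e.g. fields).
uses = {
    "main.php": {"lib.php", "User::name"},
    "lib.php": {"User::name"},
    "old.php": {"lib.php"},
    "User::name": set(),
    "User::legacy_id": set(),
}
users = invert(uses)
dead = dead_candidates(uses, entry_points={"main.php"})
```

Precomputing `users` once is what makes the lookup immediate, compared to running a grep per field.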
reslice => focus on current task
syncweb? it also helps for large codebase understanding in some sense
ocamltarzan?
PLUG: uncaught exceptions are a recurring problem, in cron especially: change something and boom, later you have to catch it
side notes during the talk about ocaml, complaints or positive stuff:
need ocamltarzan, -dump_xxx ast useful for beginners. need syncweb? :)
common thread: Huge codebase, terrified => set of tools to help. idea: intuition that visual could maybe help; huge screen, make use of it
google maps on code => treemap + code thumbnails
better than emacs: identifier coloring
syntactical: use of refs, := or <-, in big and purple
light db => semantic info, important stuff
layers
codegraph: focus on deps, understand global organization, software architecture, understand “layers”
QUESTION for audience: a tool to help find a better organization? NP-complete probably, but heuristics? minimize the elements in the upper triangle; the organization is hierarchical, so the operations are move parent, move children. monte-carlo?
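A rough sketch of what such a heuristic could look like, not a known solution: a random local search that moves one entity at a time and keeps the move when the upper-triangle count does not increase. It is simplified to a flat ordering (the hierarchical move parent / move children operations are left out), and every name here is hypothetical.

```python
import random
from typing import Dict, List, Tuple

Uses = Dict[Tuple[str, str], int]  # (user, used) -> number of uses

def upper_triangle_cost(order: List[str], uses: Uses) -> int:
    """Total weight of deps where the user is placed before what it uses."""
    pos = {name: i for i, name in enumerate(order)}
    return sum(n for (user, used), n in uses.items() if pos[user] < pos[used])

def improve_order(order: List[str], uses: Uses,
                  steps: int = 2000, seed: int = 0) -> Tuple[List[str], int]:
    """Randomly move entities, accepting moves that do not increase the cost."""
    rng = random.Random(seed)
    best = list(order)
    best_cost = upper_triangle_cost(best, uses)
    for _ in range(steps):
        if best_cost == 0 or len(best) < 2:
            break
        i, j = rng.sample(range(len(best)), 2)
        cand = list(best)
        cand.insert(j, cand.pop(i))  # move one entity to a new position
        cost = upper_triangle_cost(cand, uses)
        if cost <= best_cost:        # also accept plateau moves
            best, best_cost = cand, cost
    return best, best_cost

uses = {
    ("parsing", "commons"): 2,
    ("analysis", "parsing"): 1,
    ("main", "analysis"): 1,
    ("main", "commons"): 1,
}
start = ["main", "analysis", "parsing", "commons"]  # entry point first: worst case
best, cost = improve_order(start, uses)
```

The cost never goes up by construction; on a small acyclic example like this one the search should settle on an ordering with an empty upper triangle, but on real code with cycles it only gives a local improvement.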
last iteration: codemap + codegraph integration
bottom up layer is super nice.
demo on ocaml source code, they are familiar with that.
take what I presented at IRISA? Lessons learned :) Engler work … wait first need to detect easy bugs.
common thread: Huge codebase, terrified => set of tools to help. (reaper, t, coverage, lint, … codemap … codegraph)
stat #lines removed
stat #bugfixes diffs
stat #lint rules, sgrepLint
pfff_logger stats?
tags (stats? hsh on all servers and look if www has a symlink to TAGS?)
prolog
sgrep, spatch (git log and search for codemod/spatch ?)
codemap, codegraph
software architecture!!
failures: codemap because X11? => pfff-web. weird, but a small barrier and boom, they don't use it. also no action.
FBIDE focused on a few core things, that was useful, especially for beginners (search entity + string search, completion, goto defs, no config for good color)
failures? codegraph needs X11? not enough marketing? => pfff-web better? we have no software architect at FB.
success: prolog, spatch, tags
lessons: need to push for your ideas a lot. Cf hack.
see google’s paper, similarity: cmf -n
skip_code => big improvement on my process. gradual fixing made tractable and reviewable.