pad edited this page Mar 10, 2011 · 11 revisions



Static Analysis for PHP

  • SEMI dataflow analysis,
  • class analysis,
  • type inference,
  • interprocedural context-sensitive dataflow analysis,
  • analysis of SQL strings and the database schema to help type inference,
  • holistic analysis of PHP, JavaScript, and SQL together.

Dataflow analysis could help to find bugs such as XSS holes statically, and type inference could make sgrep queries more precise (you could search for function calls that have an argument of a specific type). With better interprocedural static analysis we could try to enforce certain security/privacy rules.
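To make the XSS idea concrete, here is a minimal sketch of forward taint propagation over a toy list of straight-line statements. The statement format and the SOURCES, SANITIZERS, and SINKS sets are hypothetical illustrations written for this page, not pfff data structures, and a real analysis would also have to handle control flow and function calls.

```python
# Toy forward taint propagation over straight-line statements.
# The statement format and the three sets below are hypothetical,
# chosen only to illustrate the idea.

SOURCES = {"$_GET", "$_POST"}        # values controlled by the user
SANITIZERS = {"htmlspecialchars"}    # calls whose result is considered safe
SINKS = {"echo"}                     # places where tainted data is dangerous


def find_xss(program):
    """Return the line numbers of sinks that receive tainted data.

    Each statement is one of:
      ("assign", line, target, [operand variables])
      ("call",   line, target, function name, [argument variables])
      ("sink",   line, sink name, [argument variables])
    """
    tainted = set(SOURCES)
    findings = []
    for stmt in program:
        kind, line = stmt[0], stmt[1]
        if kind == "assign":
            _, _, target, operands = stmt
            if any(op in tainted for op in operands):
                tainted.add(target)
            else:
                tainted.discard(target)  # overwritten with clean data
        elif kind == "call":
            _, _, target, func, args = stmt
            if func not in SANITIZERS and any(a in tainted for a in args):
                tainted.add(target)
            else:
                tainted.discard(target)
        elif kind == "sink":
            _, _, name, args = stmt
            if name in SINKS and any(a in tainted for a in args):
                findings.append(line)
    return findings


program = [
    ("assign", 1, "$name", ["$_GET"]),                      # $name = $_GET['name'];
    ("call",   2, "$safe", "htmlspecialchars", ["$name"]),  # $safe = htmlspecialchars($name);
    ("sink",   3, "echo", ["$safe"]),                       # echo $safe;  (sanitized)
    ("sink",   4, "echo", ["$name"]),                       # echo $name;  (XSS hole)
]
print(find_xss(program))  # prints [4]
```

Only the second echo is reported, because the first one goes through a sanitizer.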

Dynamic Analysis

More work on the taint analysis in HPHP (in C++).


Sgrep currently only helps to find expressions. People would like to find more complex patterns, for instance all the methods that take specific arguments in a class that inherits from a specific class. One could then write this sgrep query:

class X extends MySpecificClass {
   function M(..., $X:MySpecificType, ...) { ... }
}

That is, sgrep patterns would be generalized to the full PHP language.


DONE spatch (syntactical patch), whose goal is to help refactor PHP code by writing sgrep patterns annotated with - and + (as in a patch) to transform code generically.

As with sgrep, spatch patterns could be generalized to the full PHP language.
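The following toy shows the flavor of a - / + pattern applied generically. It is only a regex-based sketch: the real spatch matches against the parsed syntax tree, and the function names, the patch text, and the convention that single uppercase letters are metavariables are assumptions made for this example.

```python
import re

# Assumption for this sketch: a single uppercase letter is a metavariable.
METAVAR = re.compile(r"\b[A-Z]\b")


def compile_spatch(patch):
    """Turn a two-line '-'/'+' patch into (compiled regex, replacement template)."""
    minus = plus = None
    for line in patch.strip().splitlines():
        if line.startswith("-"):
            minus = line[1:].strip()
        elif line.startswith("+"):
            plus = line[1:].strip()
    if minus is None or plus is None:
        raise ValueError("patch needs one '-' line and one '+' line")

    parts, seen, pos = [], set(), 0
    for m in METAVAR.finditer(minus):
        parts.append(re.escape(minus[pos:m.start()]))  # literal text
        name = m.group()
        if name in seen:
            parts.append("(?P=%s)" % name)             # same metavariable: same text
        else:
            parts.append("(?P<%s>[^,()]+)" % name)     # one simple argument
            seen.add(name)
        pos = m.end()
    parts.append(re.escape(minus[pos:]))

    # In the '+' line, each metavariable becomes a reference to its captured text.
    template = METAVAR.sub(lambda m: r"\g<%s>" % m.group(), plus)
    return re.compile("".join(parts)), template


def apply_spatch(patch, code):
    """Rewrite every occurrence of the '-' pattern in code using the '+' line."""
    regex, template = compile_spatch(patch)
    return regex.sub(template, code)


patch = """
- foo(X, Y)
+ bar(Y, X)
"""
print(apply_spatch(patch, "foo($a, 42); other(1); foo($b, $c);"))
# prints: bar(42, $a); other(1); bar($c, $b);
```

A regex cannot cope with nested calls, comments, or varying whitespace, which is precisely why working on the syntax tree is the better approach.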


Port to the web using Ocsigen

Tabular code-metric statistics, per directory or per team, possibly with their evolution visualized via sparklines.
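As a rough sketch of what such a table could look like in plain text, the snippet below renders one sparkline per directory. The directory names and line counts are made-up placeholder data, and the function names are hypothetical.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def sparkline(values):
    """Map a series of numbers onto block characters, scaled to its own range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BARS[0] * len(values)  # flat series: all bars at the lowest height
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)


def metrics_table(history):
    """Format {directory: [metric over time]} as aligned text rows."""
    width = max(len(name) for name in history)
    rows = []
    for name in sorted(history):
        series = history[name]
        rows.append("%-*s  %6d  %s" % (width, name, series[-1], sparkline(series)))
    return "\n".join(rows)


# Placeholder data: lines of code per directory at successive snapshots.
history = {
    "lib/core": [1200, 1350, 1500, 1480, 1700],
    "www/ajax": [400, 400, 390, 380, 300],
}
print(metrics_table(history))
```

Each row shows the directory, the latest value of the metric, and its evolution.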

SEMI Integrate some of the dynamic analyses we have (e.g. coverage) with the visualizer, so that we could visually track, for instance, a request to home.php and see all the code involved in serving it. Not sure it would be useful, but it would be sexy.

SEMI Extend the visualizer to make more analyses accessible to the developer, for instance to visually see dead code, to see all the callers of a function, to navigate the call graph, etc.

Make the visualizer really fast when zooming in, so that it feels like Google Maps. We could pre-render the treemap at different levels and provide smooth transitions from one level to the next. Again, not sure this is very useful, but it would be sexy.

Source to source optimizations

Inlining, type specialization, adding type hints, and method devirtualization.

JavaScript support

Doing all the kinds of work above, but for JavaScript instead of PHP (I've already started the JavaScript parser).

C++ Support

Doing all the kinds of work above, but for C++ (again, I already have a good start on the C++ parser).