Feature MatchGraph #310

merged 3 commits into from
Jan 27, 2016


@commial commial commented Jan 27, 2016

Introduce MatchGraph, which aims to be the counterpart of MatchExpr for graphs.
It is composed of MatchGraphJoker instances, which are jokers (wildcards) for nodes and can carry custom restrictions.

From the corresponding docstring:

    If j1, j2 and j3 are MatchGraphJoker, one can quickly build a matcher for
    the pattern:
                                    +---------+
                                    |  (j1)   |
                                    +----+----+
                                         |
                                    +----v----+
                                    |  (j2)   |<---+
                                    +----+--+-+    |
                                         |  +------+
                                    +----v----+
                                    |  (j3)   |
                                    +---------+
    Using:
    >>> matcher = j1 >> j2 >> j3
    >>> matcher += j2 >> j2
    Or:
    >>> matcher = j1 >> j2 >> j2 >> j3

Then, one can iterate over solutions using matcher.match(graph), an iterator over dictionaries mapping each joker to its matched graph node.

Restrictions can be applied on jokers, for instance:

  • restrict_in: if set to False, the corresponding node can have more predecessors than in the pattern (useful, for instance, for pattern heads)
  • restrict_out: likewise, for successors
  • filt: a boolean function taking the candidate node as argument and applying a pre-filter.

In the previous example, to only catch basic blocks starting with a PUSH in j1, one can write:

j1 = MatchGraphJoker(restrict_in=False,
                     filt=lambda block: len(block.lines) and
                                        block.lines[0].name == "PUSH")
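
To make the three restrictions concrete, here is a minimal, self-contained sketch of how a candidate node could be checked against a joker. It is not the code from this PR: `arity_ok` and `candidate_fits` are hypothetical helpers, the predecessor and successor counts are passed in as plain integers, and the default of `True` for `restrict_in` / `restrict_out` is an assumption.

```python
def arity_ok(pattern_count, candidate_count, restricted):
    """Compare the number of neighbours in the pattern and in the graph.

    Restricted: the counts must be equal.
    Unrestricted: the candidate may have extra neighbours.
    """
    if restricted:
        return candidate_count == pattern_count
    return candidate_count >= pattern_count


def candidate_fits(candidate, graph_preds, graph_succs,
                   pattern_preds, pattern_succs,
                   restrict_in=True, restrict_out=True, filt=None):
    """Return True if `candidate` may be bound to a joker (hypothetical helper)."""
    # Cheap pre-filter first: reject the node before looking at its edges.
    if filt is not None and not filt(candidate):
        return False
    return (arity_ok(pattern_preds, graph_preds, restrict_in) and
            arity_ok(pattern_succs, graph_succs, restrict_out))


# A pattern head with one successor in the pattern, tested against a graph
# node that has three predecessors and one successor.
head_unrestricted = candidate_fits("blk", 3, 1, 0, 1, restrict_in=False)
head_restricted = candidate_fits("blk", 3, 1, 0, 1)
```

With `restrict_in=False` the extra predecessors are tolerated, which is what makes the option useful for pattern heads.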

Operating on MatchGraphJokers is a quick way to build a MatchGraph instance, specifying only edges (j1 >> j2 means an edge from j1 to j2).
As MatchGraph inherits from DiGraph, one can use .dot() to visualize the built pattern.
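
As an illustration of how such a `>>` syntax can be built with operator overloading, here is a toy sketch. `Joker` and `Pattern` are hypothetical stand-ins, not the MatchGraphJoker and MatchGraph classes of this PR, and the real implementation may be organised differently.

```python
class Pattern:
    """Toy pattern: a set of edges plus the last joker used in a chain."""

    def __init__(self, last, edges):
        self.last = last
        self.edges = set(edges)

    def __rshift__(self, joker):
        # pattern >> joker: add an edge from the last chained joker.
        return Pattern(joker, self.edges | {(self.last, joker)})

    def __add__(self, other):
        # pattern + pattern: union of edges; also used by `+=`.
        return Pattern(other.last, self.edges | other.edges)

    def edge_names(self):
        return sorted((src.name, dst.name) for src, dst in self.edges)


class Joker:
    """Toy node joker, identified by its name only."""

    def __init__(self, name):
        self.name = name

    def __rshift__(self, other):
        # joker >> joker: start a new pattern holding a single edge.
        return Pattern(other, {(self, other)})


j1, j2, j3 = Joker("j1"), Joker("j2"), Joker("j3")

# Two ways of writing the same pattern.
matcher = j1 >> j2 >> j3
matcher += j2 >> j2
chained = j1 >> j2 >> j2 >> j3
```

Because `>>` is left-associative, `j1 >> j2 >> j2 >> j3` is evaluated as `((j1 >> j2) >> j2) >> j3`, so each step only has to add one edge from the previous joker.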

The algorithm used for pattern matching is naive, but oriented toward providing solutions quickly. In other words, it may do a lot of work between two yielded solutions, but a first solution should appear quickly (like the difference between a depth-first search and a breadth-first search).
This choice has been made because it seems that a common use case is retrieving just one match. For instance, in a DiGraphSimplifier, one should only take the first match if the graph is modified between two matches (the remaining solutions might become inconsistent).
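
The "first solution quickly" behaviour can be sketched with a lazy, depth-first generator. The code below is a simplified illustration under stated assumptions, not the algorithm from this PR: graphs are plain lists of edges, `match` is a hypothetical function, and joker restrictions are ignored.

```python
def match(pattern_edges, graph_edges):
    """Lazily yield dicts {pattern node: graph node}, depth-first."""
    pattern_nodes = sorted({n for edge in pattern_edges for n in edge})
    graph_nodes = sorted({n for edge in graph_edges for n in edge})
    graph_edge_set = set(graph_edges)

    def consistent(sol):
        # Every pattern edge with both ends bound must exist in the graph.
        return all((sol[a], sol[b]) in graph_edge_set
                   for a, b in pattern_edges if a in sol and b in sol)

    def extend(sol, remaining):
        if not remaining:
            yield dict(sol)  # copy, so later backtracking cannot alter it
            return
        head, rest = remaining[0], remaining[1:]
        for cand in graph_nodes:
            if cand in sol.values():
                continue  # one graph node per pattern node
            sol[head] = cand
            if consistent(sol):
                yield from extend(sol, rest)
            del sol[head]

    yield from extend({}, pattern_nodes)


pattern = [("j1", "j2"), ("j2", "j2"), ("j2", "j3")]
graph = [("a", "b"), ("b", "b"), ("b", "c"), ("c", "d")]

# Only the work needed to reach the first solution is performed.
first = next(match(pattern, graph), None)
```

A caller that rewrites the graph after each match would call `next(match(...), None)` again on the updated graph rather than keep consuming the old iterator.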

For graphs of reasonable size, such as the CFGs of program functions, the algorithm is fast enough to be usable.

serpilliere added a commit that referenced this pull request Jan 27, 2016
@serpilliere serpilliere merged commit c73fa6e into cea-sec:master Jan 27, 2016
@commial commial deleted the feature-matchgraph branch January 27, 2016 14:52