Skip to content

Convert regular expressions into efficient matrix branching programs

License

Notifications You must be signed in to change notification settings

5GenCrypto/regex-fsm

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

70 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

regex-fsm

Build Status License

The regex-fsm tool can be used to convert certain regular expressions to efficient matrix branching programs. These programs are suitable for use in obfuscation tools like the 5GenCrypto obfuscator.

Table of Contents

Build

cabal sandbox init
cabal install -j --depedencies-only

Development

cabal build

Nix build

To build with nix

nix-build

To develop regex-fsm with nix (highly recommended)

nix-shell -A regex-fsm.env

Usage

$ regex-fsm --regex '(0|1)*' --inputLength 10 --output output.json --verbose --chunks 1 --alphabet 01

Construction

AST

Rep (Union (Lit '0') (Lit '1'))

Epsilon NFA

ENFA
  { trans =
      fromList
	[ ( 1 , fromList [ ( Just '0' , fromList [ 2 ] ) ] )
	, ( 2 , fromList [ ( Nothing , fromList [ 5 ] ) ] )
	, ( 3 , fromList [ ( Just '1' , fromList [ 4 ] ) ] )
	, ( 4 , fromList [ ( Nothing , fromList [ 5 ] ) ] )
	, ( 5 , fromList [ ( Nothing , fromList [ 1 , 3 ] ) ] )
	]
  , start = 5
  , final = fromList [ 5 ]
  , states = fromList [ 1 , 2 , 3 , 4 , 5 ]
  }

Epsilon Closure

fromList
  [ ( 1 , fromList [ 1 ] )
  , ( 2 , fromList [ 1 , 2 , 3 , 5 ] )
  , ( 3 , fromList [ 3 ] )
  , ( 4 , fromList [ 1 , 3 , 4 , 5 ] )
  , ( 5 , fromList [ 1 , 3 , 5 ] )
  ]

DFA

DFA
  { trans =
      fromList
	[ ( ( fromList [ 1 , 2 , 3 , 5 ] , '0' )
	  , fromList [ 1 , 2 , 3 , 5 ]
	  )
	, ( ( fromList [ 1 , 2 , 3 , 5 ] , '1' )
	  , fromList [ 1 , 3 , 4 , 5 ]
	  )
	, ( ( fromList [ 1 , 3 , 4 , 5 ] , '0' )
	  , fromList [ 1 , 2 , 3 , 5 ]
	  )
	, ( ( fromList [ 1 , 3 , 4 , 5 ] , '1' )
	  , fromList [ 1 , 3 , 4 , 5 ]
	  )
	, ( ( fromList [ 1 , 3 , 5 ] , '0' ) , fromList [ 1 , 2 , 3 , 5 ] )
	, ( ( fromList [ 1 , 3 , 5 ] , '1' ) , fromList [ 1 , 3 , 4 , 5 ] )
	]
  , start = fromList [ 1 , 3 , 5 ]
  , finals =
      fromList
	[ fromList [ 1 , 2 , 3 , 5 ]
	, fromList [ 1 , 3 , 4 , 5 ]
	, fromList [ 1 , 3 , 5 ]
	]
  }

Minimized DFA

DFA
  { trans =
      fromList
	[ ( ( fromList [ 1 , 3 , 5 ] , '0' ) , fromList [ 1 , 3 , 5 ] )
	, ( ( fromList [ 1 , 3 , 5 ] , '1' ) , fromList [ 1 , 3 , 5 ] )
	]
  , start = fromList [ 1 , 3 , 5 ]
  , finals = fromList [ fromList [ 1 , 3 , 5 ] ]
  }

Matrix Branching Program

[ fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 1 ) , ( "1" , 1 ) ]
, fromList [ ( "0" , 0 ) , ( "1" , 0 ) ]
]

JSON

Suitable for use with the 5Gen obfuscation tool

{
  "steps": [{"0":[[1]],"1":[[1]],"position":"0"},
	    {"0":[[1]],"1":[[1]],"position":"1"},
	    {"0":[[1]],"1":[[1]],"position":"2"},
	    {"0":[[1]],"1":[[1]],"position":"3"},
	    {"0":[[1]],"1":[[1]],"position":"4"},
	    {"0":[[1]],"1":[[1]],"position":"5"},
	    {"0":[[1]],"1":[[1]],"position":"6"},
	    {"0":[[1]],"1":[[1]],"position":"7"},
	    {"0":[[1]],"1":[[1]],"position":"8"},
	    {"0":[[0]],"1":[[0]],"position":"9"}]
 ,"outputs": [["false","true"]]
}

Premultiplication

A further optimization before matrix encoding is premultiplication. This will decrease multilinearity, but can increase time spent encoding if the matrix count increases overall. To premultiply matrices, specify a chunk size that is a multiple of the input size.

Chunks

$ regex-fsm --regex '(0|1)*' --inputLength 10 --output output.json --verbose --chunks 2 --alphabet 01

Result

[ fromList [ ( "00" , 1 ) , ( "01" , 1 ) , ( "10" , 1 ) , ( "11" , 1 ) ]
, fromList [ ( "00" , 1 ) , ( "01" , 1 ) , ( "10" , 1 ) , ( "11" , 1 ) ]
, fromList [ ( "00" , 1 ) , ( "01" , 1 ) , ( "10" , 1 ) , ( "11" , 1 ) ]
, fromList [ ( "00" , 1 ) , ( "01" , 1 ) , ( "10" , 1 ) , ( "11" , 1 ) ]
, fromList [ ( "00" , 0 ) , ( "01" , 0 ) , ( "10" , 0 ) , ( "11" , 0 ) ]
]

Benchmarks

To execute criterion benchmarks of various compiler phases.

$ nix-build
$ ./result/bin/benchmarks -o index.html
$ open index.html

Tests

To run the entire pipeline once against the 5Gen obfuscator tool, execute obfuscator-tests with a security parameter (defaults to 40).

$ nix-build
$ ./result/bin/obfuscator-tests --secParam 8

To run the entire test suite (takes a long time and is very resource intensive).

$ nix-build
$ ./result/bin/obfuscator-tests --runTestSuites

To run unit tests, simply nix-build, unit tests are run automatically.

$ nix-build
...
Linking dist/build/tests/tests ...
running tests
Running 1 test suites...
Test suite tests: RUNNING...
Test suite tests: PASS
Test suite logged to: dist/test/regex-fsm-0.1.0.0-tests.log
1 of 1 test suites (1 of 1 test cases) passed.

LICENSE

BSD3 (c) Galois Inc.

About

Convert regular expressions into efficient matrix branching programs

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Haskell 88.8%
  • Nix 9.9%
  • C++ 1.1%
  • Makefile 0.2%