The KMC Kleenex compiler

This is the compiler for Kleenex, a language for expressing streaming string transformations. It compiles Kleenex programs into transducers (backed by a C runtime) and can also simulate them directly for quick debugging.

Build

To clone, run git clone --recursive https://github.com/diku-kmc/kleenexlang.git.

To build, run cabal build. You can then run the binary with cabal run kexc. To install it to, e.g., $HOME/.local/bin, run:

$ cabal install --install-method=copy --overwrite-policy=always --installdir=$HOME/.local/bin/

Use

First write a Kleenex program:

> cat add-commas.kex
main := (num /[^0-9]/ | other)*
num := digit{1,3} ("," digit{3})*
digit := /[0-9]/
other := /./

Next compile a transducer using the kexc executable:

> kexc compile add-commas.kex --out add-commas

Finally, pipe input to the transducer:

> echo "2016" | ./add-commas
2,016
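For intuition, the transformation performed by add-commas.kex can be sketched in Python with a regular-expression substitution. This is a hypothetical, non-streaming equivalent for illustration only (the function name add_commas is ours), not how the compiled Kleenex transducer works internally:

```python
import re

def add_commas(text: str) -> str:
    """Insert thousands separators into every maximal run of digits,
    mimicking add-commas.kex on simple inputs."""
    def group(m: re.Match) -> str:
        s = m.group(0)
        # Split the digit run into chunks of three from the right,
        # mirroring digit{1,3} ("," digit{3})* in the Kleenex source.
        parts = []
        while len(s) > 3:
            parts.append(s[-3:])
            s = s[:-3]
        parts.append(s)
        return ",".join(reversed(parts))
    # \d+ grabs a maximal digit run, roughly corresponding to Kleenex's
    # greedy disambiguation preferring the longest num match.
    return re.sub(r"\d+", group, text)

print(add_commas("2016"))     # -> 2,016
print(add_commas("1234567"))  # -> 1,234,567
```

Note one simplification: in the Kleenex program, num must be followed by a non-digit (/[^0-9]/), whereas this sketch formats every digit run unconditionally.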

Interpreter interface

The compiler has several interpreters for simulating the different transducer stages generated from Kleenex programs. Input data is read from standard input and written to standard output. This allows for faster debugging without having to invoke a C compiler for each edit cycle.

Invoke the interpreter via the simulate subcommand. You probably want --quiet to avoid compiler output interleaved with the output of the simulated Kleenex program. If you use the default simulator (others can be specified with --sim), adding --sb=false avoids running an unnecessary analysis phase. The following example illustrates usage:

> kexc --sb=false --quiet simulate bench/kleenex/src/apache_log.kex < test/data/apache_log/example.log

Test suite

A number of test suites are included.

  • To run the unit tests: cabal test.
  • To test the C runtime: cd crt_test && make. Note that this uses the Valgrind tool.
  • To run the end-to-end blackbox tests: cd test/test_compiled && make.

Benchmark suite

The repository includes a benchmark suite that compares the performance of string transformation programs written in Kleenex with equivalent programs written using other regular expression-based libraries and tools.

To run the benchmarks and generate the plots, first cd bench and then:

  1. generate the test data: make generate-test-data
  2. install the external benchmark dependencies (libraries, etc.): make install-benchmark-dependencies
  3. build the benchmark programs (not Kleenex programs): make build-benchmark-programs
  4. (optional) check that the benchmark programs are equivalent: make -k equality-check
  5. build the Kleenex programs: ./compiletime.sh -f
  6. run all the benchmark programs N times with M warm-up rounds: ./runningtime.sh -r <N> -w <M> -f
  7. generate the plots: ./mkplots.py
  8. the plots are placed in bench/plots (unless otherwise specified to mkplots.py)