Tiny regular expression matcher for arbitrary data types.
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
src
tests
.gitignore
.travis.yml
Cargo.toml
license.txt
readme.adoc

readme.adoc

nanore

Build Status Documentation

nanore is a tiny (≃ 0.2K SLOC) regular expression matcher for arbitrary data types written in Rust.

  • O(|regexp||sequence|) time and O(|regexp|) space. No exponential blowup.

  • Support online matching.

  • Matching states can be copied at any time.

  • Support regex weighting.

  • Support path marking.

Example

Basics

use nanore::*;

let re = RegExRoot::<_, ()>::new(
    atom(|_, e| *e % 2 != 0) * atom(|_, e| *e % 2 == 0) + rep(atom(|_, e| *e > 0))
);
let mut m = Matcher::new(&re);

assert!(m.is_match());
m.feed(&1);
assert!(m.is_match());
m.feed(&2);
assert!(m.is_match());
m.feed(&3);
assert!(m.is_match());
m.feed(&0);
assert!(!m.is_match());

Constructs: * (concatenation), + (alternation), rep(), opt(), eps(), atom(), any(), val(), weight(), mark().

Mark & Path

#[derive(Clone, Copy, PartialEq, Eq)]
enum Marker { Foo, Bar }

let re = RegExRoot::new(
    rep(mark(Marker::Foo) * val('a') + mark(Marker::Bar) * val('b'))
);
let mut m = Matcher::new(&re);

m.feed(&'a');
m.feed(&'b');
m.feed(&'a');
m.feed(&'b');
assert!(m.is_match());
assert!(m.path() == [(0, Marker::Foo), (1, Marker::Bar), (2, Marker::Foo), (3, Marker::Bar)]);

Weight

let re = RegExRoot::new(
    rep(mark(Marker::Foo) * val('a')) * rep(weight(-1) * mark(Marker::Bar) * val('a'))
);
let mut m = Matcher::new(&re);

m.feed(&'a');
m.feed(&'a');
m.feed(&'a');
m.feed(&'a');
assert!(m.is_match());
assert!(m.path() == [(0, Marker::Bar), (1, Marker::Bar), (2, Marker::Bar), (3, Marker::Bar)]);

Application: Find longest fibonacci sequence

#[derive(Clone, Copy, PartialEq, Eq)]
enum Marker { Bgn, End }

let xs = [1, 1, 2, 3, 5, 3, 2, 3, 5, 8, 13, 21, 34];
//        ^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^

let re = RegExRoot::new(
    rep(weight(1) * any()) * mark(Marker::Bgn) *
    any() * any() * rep(atom(|i, x| *x == xs[i - 2] + xs[i - 1])) *
    mark(Marker::End) * rep(weight(1) * any())
);
let mut m = Matcher::new(&re);
m.feed_iter(&xs);

assert!(m.path() == [(6, Marker::Bgn), (13, Marker::End)]);

Reference

Weighted RegExp Matching

nanore uses (the subset of) the method in this paper. Note that the idea is simple and elegant, but there are some non-trivial parts due to ε-transitions in implicitly generated ε-NFA (see empty, final and shift). nanore handles normal transitions and ε-transitions separately, which seems a bit different from this paper (see shift() and propagate() in nanore).

関数型的正規表現マッチ

The excellent article about the paper above, in Japanese.