Skip to content
/ nanore Public

Tiny regular expression matcher for arbitrary data types.

License

Notifications You must be signed in to change notification settings

y-fujii/nanore

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

nanore

Build Status Documentation

nanore is a tiny (≃ 0.2K SLOC) regular expression matcher for arbitrary data types written in Rust.

  • O(|regexp||sequence|) time and O(|regexp|) space. No exponential blowup.

  • Support online matching.

  • Matching states can be copied at any time.

  • Support regex weighting.

  • Support path marking.

Example

Basics

use nanore::*;

let re = RegExRoot::<_, ()>::new(
    atom(|_, e| *e % 2 != 0) * atom(|_, e| *e % 2 == 0) + rep(atom(|_, e| *e > 0))
);
let mut m = Matcher::new(&re);

assert!(m.is_match());
m.feed(&1);
assert!(m.is_match());
m.feed(&2);
assert!(m.is_match());
m.feed(&3);
assert!(m.is_match());
m.feed(&0);
assert!(!m.is_match());

Constructs: * (concatenation), + (alternation), rep(), opt(), eps(), atom(), any(), val(), weight(), mark().

Mark & Path

#[derive(Clone, Copy, PartialEq, Eq)]
enum Marker { Foo, Bar }

let re = RegExRoot::new(
    rep(mark(Marker::Foo) * val('a') + mark(Marker::Bar) * val('b'))
);
let mut m = Matcher::new(&re);

m.feed(&'a');
m.feed(&'b');
m.feed(&'a');
m.feed(&'b');
assert!(m.is_match());
assert!(m.path() == [(0, Marker::Foo), (1, Marker::Bar), (2, Marker::Foo), (3, Marker::Bar)]);

Weight

let re = RegExRoot::new(
    rep(mark(Marker::Foo) * val('a')) * rep(weight(-1) * mark(Marker::Bar) * val('a'))
);
let mut m = Matcher::new(&re);

m.feed(&'a');
m.feed(&'a');
m.feed(&'a');
m.feed(&'a');
assert!(m.is_match());
assert!(m.path() == [(0, Marker::Bar), (1, Marker::Bar), (2, Marker::Bar), (3, Marker::Bar)]);

Application: Find longest fibonacci sequence

#[derive(Clone, Copy, PartialEq, Eq)]
enum Marker { Bgn, End }

let xs = [1, 1, 2, 3, 5, 3, 2, 3, 5, 8, 13, 21, 34];
//        ^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^

let re = RegExRoot::new(
    rep(weight(1) * any()) * mark(Marker::Bgn) *
    any() * any() * rep(atom(|i, x| *x == xs[i - 2] + xs[i - 1])) *
    mark(Marker::End) * rep(weight(1) * any())
);
let mut m = Matcher::new(&re);
m.feed_iter(&xs);

assert!(m.path() == [(6, Marker::Bgn), (13, Marker::End)]);

Reference

Weighted RegExp Matching

nanore uses (the subset of) the method in this paper. Note that the idea is simple and elegant, but it has some non-trivial parts due to ε-transitions in implicitly generated ε-NFA (see empty, final and shift). nanore handles normal transitions and ε-transitions separately, which seems a bit different from this paper (see shift() and propagate() in nanore).

関数型的正規表現マッチ

The excellent article about the paper above, in Japanese.

About

Tiny regular expression matcher for arbitrary data types.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages