Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

A malloc-free, wchar_t based regular expression matching library

branch: master

Fetching latest commit…

Octocat-spinner-32-eaf2f5

Cannot retrieve the latest commit at this time

Octocat-spinner-32 .my-git-push
Octocat-spinner-32 Makefile
Octocat-spinner-32 README
Octocat-spinner-32 TODO
Octocat-spinner-32 aov-rx.c
Octocat-spinner-32 aov-rx.h
Octocat-spinner-32 stress.c
README
aov-rx README
=============

aov-rx is a malloc-free [1], wide-char based regular expression matching
library. This piece of code is designed to be part of MPDM [2] in the
future, but can also be used separately.

Please see the TODO file to know the completion status of this project.

Released under the GPL.

Angel Ortega <angel@triptico.com>

 [1] Free as in 'without', not as in 'free()'
 [2] http://triptico.com/software/mpdm.html (Minimum Profit Data Manager)

Goals
-----

 - Not using dynamic memory,
 - using wide chars (wchar_t) natively,
 - returns the matching substring as offset + size,
 - source code being readable, and
 - being completely reentrant and thread-safe.

Features
--------

 - Start (^) and end of string ($) anchors
 - . (match any character)
 - Full quantifier support:
   - Curly bracket with optional minimum and maximum:
     - Exactly n matches ({n})
     - At least n matches ({n,})
     - At most m matches ({,m})
     - At least n and at most m matches ({n,m})
   - ? (match 0 or 1 occurrences, same as {0,1})
   - * (match 0 or many occurrences, same as {0,})
   - + (match 1 or many occurrences, same as {1,})
 - Alternate sets (str1|str2|str3)
 - Sub-regexes (between parentheses, also optionally
   quantifiable)
 - Bracket-wrapped character sets, positive and
   negative ([0-9], [^0-9])

Features that will (probably) be, but not soon
----------------------------------------------

 - Optional multiline matching (i.e. ^ and $ matching lines inside
   the string instead of the start and the end of the string)

Features that (probably) will never be
--------------------------------------

 - Backreferences

Usage
-----

aov-rx consists only of one function:

 wchar_t *aov_match(wchar_t *rx, wchar_t *tx, int *size);

And its usage is pretty straightforward:

 #include <stdio.h>
 #include <wchar.h>
 #include "aov-rx.h"
 
 int main(int argc, char *argv[])
 {
    int s;
    wchar_t *res;
 
    res = aov_match(L"/[0-9]+/", L"The number is 123, indeed", &s);
 
    if (s) {
        /* res points to 123, and s contains 3 */
        ...
    }
    ...

---
Angel Ortega
http://triptico.com
Something went wrong with that request. Please try again.