begriffs/flexicode

Utilities for Unicode in Flex

charclass

Outputs a regex that matches the UTF-8 byte sequences of every codepoint matched by an ICU Unicode regex.

# all Chinese characters
./charclass '\p{Han}'

# horizontal whitespace
./charclass '\h'
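To make the idea concrete, here is a minimal Python sketch of the general technique: enumerate the codepoints in a set and emit the UTF-8 bytes of each one as a regex alternative. It is not how charclass itself is implemented (the tool is written in C and uses ICU). The helper names `utf8_alternation` and `horizontal_whitespace` are hypothetical, and the sketch assumes `\h` is roughly "tab plus the Zs general category". Unlike the sample output shown in this README, it does not merge adjacent bytes into ranges.

```python
import sys
import unicodedata

def utf8_alternation(codepoints):
    """Join the UTF-8 byte sequence of each codepoint with '|'."""
    alternatives = []
    for cp in codepoints:
        encoded = chr(cp).encode("utf-8")
        # Write every byte as a \xNN escape, the form Flex accepts.
        alternatives.append("".join(f"\\x{b:02x}" for b in encoded))
    return "|".join(alternatives)

def horizontal_whitespace():
    """Assumed approximation of \\h: tab plus every Zs codepoint."""
    yield 0x09
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)) == "Zs":
            yield cp

print(utf8_alternation(horizontal_whitespace()))
```

The exact output depends on the Unicode version bundled with your Python, so treat it as an illustration rather than a replacement for charclass.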

The \p{...} syntax is especially powerful because it matches Unicode properties, such as a script or a general category.

To use the regexes, give them aliases in your Flex file:

/* from charclass '\h' */
whitespace \x09|\x20|\xc2\xa0|\xe1\x9a\x80|\xe2\x80[\x80-\x8a]|\xe2\x80\xaf|\xe2\x81\x9f

%%

{whitespace}  { /* ... */ }
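As a quick sanity check, the alias above can be tried outside Flex. The sketch below copies the pattern into a Python bytes regex (the escapes used here mean the same thing in both syntaxes) and confirms that it matches the UTF-8 encoding of the codepoints its byte sequences decode to. The `matches` helper and the `covered` list are illustrative names, not part of flexicode.

```python
import re

# Pattern copied verbatim from the Flex alias above.
WHITESPACE = re.compile(
    rb"\x09|\x20|\xc2\xa0|\xe1\x9a\x80|\xe2\x80[\x80-\x8a]|\xe2\x80\xaf|\xe2\x81\x9f"
)

def matches(codepoint: int) -> bool:
    """True if the whole UTF-8 encoding of codepoint matches the pattern."""
    return WHITESPACE.fullmatch(chr(codepoint).encode("utf-8")) is not None

# Codepoints that the byte sequences in the pattern decode to.
covered = [0x09, 0x20, 0xA0, 0x1680, *range(0x2000, 0x200B), 0x202F, 0x205F]

assert all(matches(cp) for cp in covered)
assert not matches(ord("a"))   # ordinary letter
assert not matches(0x200B)     # ZERO WIDTH SPACE is just past the range
```

Matching on bytes rather than decoded characters mirrors what a Flex scanner does, since Flex reads its input one byte at a time.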

Installation

Requires a C99 compiler, ICU, and pkg-config.

./configure
make
