# tokenary

Build tokenizers for JavaScript.

## Basic Usage (CSV tokenizer)

```js
const {
    tokenary,
    reducer: { ifChar, single, consume, everythingUntil },
    token: { makeToken, prettyPrint }
} = require('tokenary');

const TokenType = {
    comma: 'COMMA',
    value: 'VALUE',
    newline: 'NEWLINE',
};

const tokenizeCSV = tokenary([
    ifChar({
        ',': single(makeToken(TokenType.comma)),
        '\n': single(makeToken(TokenType.newline))
    }),
    consume(everythingUntil(',', '\n'))(makeToken(TokenType.value))
]);

const testCSV =
`1,Up
2,Left
3,Right`;

const tokens = tokenizeCSV(testCSV);
console.log(prettyPrint(tokens));
/* prints:
[
  <Token type='VALUE' lexeme='1' offset=0>,
  <Token type='COMMA' lexeme=',' offset=1>,
  <Token type='VALUE' lexeme='Up' offset=2>,
  <Token type='NEWLINE' lexeme='\n' offset=4>,
  <Token type='VALUE' lexeme='2' offset=5>,
  <Token type='COMMA' lexeme=',' offset=6>,
  <Token type='VALUE' lexeme='Left' offset=7>,
  <Token type='NEWLINE' lexeme='\n' offset=11>,
  <Token type='VALUE' lexeme='3' offset=12>,
  <Token type='COMMA' lexeme=',' offset=13>,
  <Token type='VALUE' lexeme='Right' offset=14>
]
*/
```

See examples for more.

## API

### Classes

* TokenError

### Objects

* predicate : object — Functions that return true or false.
* reducer : object
* token : object

### Functions

* tokenary(reducers, [settings]) ⇒ function — Creates a tokenizing function.

### Typedefs

* TokState : reducer.Reducer
* TokenCreator : tokState.TokState
* Reducer ⇒ TokState
* TokState : tokState.TokState
* Token
* TokenCreator ⇒ Token | undefined
* Token : token.Token
* TokState

### TokenError

**Kind**: global class

#### new TokenError(message, lexeme, offset, state)

| Param | Type | Description |
| --- | --- | --- |
| message | string | The error message |
| lexeme | string | The lexeme this error is for |
| offset | number | The index of the first lexeme character in the text |
| state | TokState | The state of the tokenizer when the error was created |

### predicate : object

Functions that return true or false.

**Kind**: global namespace

#### predicate.is(truth) ⇒ Predicate

Checks if `actual` is strictly equal (`===`) to `truth`.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| truth | any |

#### predicate.isOneOf(...truths) ⇒ Predicate

Checks if `actual` is any of the values in `truths`.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| ...truths | any |

#### predicate.matches(regex) ⇒ Predicate

Checks if `actual` matches the regular expression.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| regex | RegExp |

#### predicate.not(predicate) ⇒ Predicate

Logical not operator on a predicate.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| predicate | Predicate |

#### predicate.or(...predicates) ⇒ Predicate

Logical or operator on predicates.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| ...predicates | Predicate |

#### predicate.nor(...predicates) ⇒ Predicate

Logical nor operator on predicates. Short for `not(or(...predicates))`.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| ...predicates | Predicate |

#### predicate.and(...predicates) ⇒ Predicate

Logical and operator on predicates.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| ...predicates | Predicate |

#### predicate.nand(...predicates) ⇒ Predicate

Logical nand operator on predicates. Short for `not(and(...predicates))`.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| ...predicates | Predicate |

#### predicate.xor(predicate1, predicate2) ⇒ Predicate

Logical xor operator on two predicates.

**Kind**: static method of predicate

| Param | Type |
| --- | --- |
| predicate1 | Predicate |
| predicate2 | Predicate |

#### predicate.Predicate ⇒ boolean

**Kind**: static typedef of predicate

| Param | Type |
| --- | --- |
| actual | any |

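A `Predicate` takes a single value (`actual`) and returns a boolean, so the combinators above compose in the usual way. A minimal sketch, assuming the `predicate` namespace is exported from the package root alongside `reducer` and `token`:

```js
const { predicate: { is, matches, or, not } } = require('tokenary');

// True for any decimal digit.
const isDigit = matches(/[0-9]/);
// True for an explicit sign character.
const isSign = or(is('+'), is('-'));
// True for anything that could start a signed number.
const startsNumber = or(isDigit, isSign);
// True for everything else.
const isOther = not(startsNumber);

startsNumber('-'); // true
isOther('-');      // false
```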
### reducer : object

**Kind**: global namespace

#### reducer.keywords(keywordMap, [settings]) ⇒ Reducer

Extracts keywords from the text.

**Kind**: static method of reducer

| Param | Type | Description |
| --- | --- | --- |
| keywordMap | Object.<string, TokenCreator> | Map of keywords to check for |
| [settings] | object | |
| [settings.charset] | RegExp | Charset allowed for a keyword |
| [settings.firstChar] | RegExp | Charset allowed for first character of keyword |
| [settings.noMatch] | TokenCreator | Token creator to use on invalid keywords |

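As a hedged sketch of how `keywords` might be used, based only on the parameters documented above: pick out boolean literals and fall back to a generic identifier token via `noMatch` (the BOOLEAN and IDENTIFIER type names are invented for illustration).

```js
const { tokenary, reducer: { keywords }, token: { makeToken } } = require('tokenary');

// Recognise `true` / `false` as BOOLEAN tokens; any other word that fits
// the charset is handed to the `noMatch` token creator instead.
const tokenizeWords = tokenary([
    keywords({
        true: makeToken('BOOLEAN'),
        false: makeToken('BOOLEAN')
    }, {
        firstChar: /[a-zA-Z_]/,   // what a keyword may start with
        charset: /[a-zA-Z0-9_]/,  // what the rest of a keyword may contain
        noMatch: makeToken('IDENTIFIER')
    })
]);
```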
#### reducer.ifThen(predicate) ⇒ function

If the predicate is true, runs the reducer.

**Kind**: static method of reducer

| Param | Type | Description |
| --- | --- | --- |
| predicate | Predicate | Condition to be met |

#### reducer.ifChar(reducerMap) ⇒ Reducer

If a matching character is found, runs the given reducer.

**Kind**: static method of reducer

| Param | Type | Description |
| --- | --- | --- |
| reducerMap | Object.<string, Reducer> | Character-to-reducer map to check |

#### reducer.single(tokenCreator) ⇒ Reducer

Creates a token from the single current character.

**Kind**: static method of reducer

| Param | Type |
| --- | --- |
| tokenCreator | TokenCreator |

#### reducer.consume(reducer) ⇒ function

Creates a token from the characters passed over by the given reducer. Expects the given reducer not to return any tokens.

**Kind**: static method of reducer

| Param | Type |
| --- | --- |
| reducer | Reducer |

#### reducer.sequence(reducers) ⇒ Reducer

Pipes the output of a sequence of reducers together (left to right).

**Kind**: static method of reducer

| Param | Type |
| --- | --- |
| reducers | Array.<Reducer> |

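An illustrative sketch (not from the original readme) of how `sequence` and `consume` might be combined: chain smaller reducers, then emit everything they advanced over as one token. The STRING type name and the exact behaviour of `everythingUntil` inside a `sequence` are assumptions here.

```js
const {
    reducer: { consume, sequence, char, everythingUntil },
    token: { makeToken }
} = require('tokenary');

// Advance over an opening quote, the body, and the closing quote,
// then emit the whole span passed over as a single STRING token.
const quotedString = consume(sequence([
    char('"'),
    everythingUntil('"'),
    char('"')
]))(makeToken('STRING'));
```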
#### reducer.everything() ⇒ Reducer

Creates a token from every character that follows.

**Kind**: static method of reducer

#### reducer.everythingUntil(chars) ⇒ Reducer

Creates a token from every character that follows until one of the given characters is reached.

**Kind**: static method of reducer

| Param | Type | Description |
| --- | --- | --- |
| chars | Array.<string> | Characters to possibly match |

#### reducer.char(char) ⇒ Reducer

Advances past a single character; throws an error if it is not the expected character.

**Kind**: static method of reducer

| Param | Type | Description |
| --- | --- | --- |
| char | string | Expected character |

#### reducer.untilRegexFails(regex) ⇒ Reducer

Runs the regex until it fails on all characters advanced past.

**Kind**: static method of reducer

| Param | Type |
| --- | --- |
| regex | RegExp |

#### reducer.whitespace() ⇒ Reducer

Advances past all contiguous whitespace characters.

**Kind**: static method of reducer

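A hedged sketch of how these low-level reducers could appear in a `tokenary` reducer list. The NUMBER and WORD type names are invented for illustration, and the idea that `untilRegexFails` tests the regex against the accumulated run of characters is an assumption based on its description above.

```js
const {
    tokenary,
    reducer: { whitespace, consume, untilRegexFails },
    token: { makeToken }
} = require('tokenary');

// Skip runs of whitespace, then treat runs of digits as NUMBER tokens
// and runs of letters as WORD tokens.
const tokenize = tokenary([
    whitespace(),
    consume(untilRegexFails(/^[0-9]+$/))(makeToken('NUMBER')),
    consume(untilRegexFails(/^[a-zA-Z]+$/))(makeToken('WORD'))
]);
```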
### token : object

**Kind**: global namespace

#### token.makeToken(type) ⇒ TokenCreator

Creates a token object.

**Kind**: static method of token

| Param | Type | Description |
| --- | --- | --- |
| type | string | Type name of token |

#### token.makeNothing() : TokenCreator

Creates nothing.

**Kind**: static method of token

#### token.makeError(message) ⇒ TokenCreator

Throws an error.

**Kind**: static method of token
**Throws**: TokenError

| Param | Type |
| --- | --- |
| message | string |

#### token.stringifyToken(token) ⇒ string

Formats a token for printing.

**Kind**: static method of token

| Param | Type | Description |
| --- | --- | --- |
| token | Token | Token to stringify |

#### token.prettyPrint(tokens) ⇒ string

Formats an array of tokens for printing.

**Kind**: static method of token

| Param | Type | Description |
| --- | --- | --- |
| tokens | Array.<Token> | Tokens to pretty print |

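A small sketch of the token helpers. The call shape follows the TokenCreator typedef further down (lexeme, offset), and the printed format is taken from the `prettyPrint` output in the usage example above, so treat the exact string as approximate.

```js
const { token: { makeToken, stringifyToken } } = require('tokenary');

// makeToken returns a TokenCreator; the tokenizer calls it with the
// lexeme and its offset in the source text.
const makeValue = makeToken('VALUE');
const tok = makeValue('Up', 2);      // => { type: 'VALUE', lexeme: 'Up', offset: 2 }

console.log(stringifyToken(tok));    // e.g. <Token type='VALUE' lexeme='Up' offset=2>
```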
### tokenary(reducers, [settings]) ⇒ function

Creates a tokenizing function.

**Kind**: global function

| Param | Type | Description |
| --- | --- | --- |
| reducers | Array.<Reducer> | Main reducers |
| [settings] | object | Settings for the tokenizer |
| [settings.catcher] | TokenCreator | Token creator to use if an error is found |

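A hedged sketch of the `settings.catcher` option: since `makeError` produces a TokenCreator that throws a TokenError, it is one plausible catcher. The `csvReducers` array is a hypothetical stand-in for the reducer list from the CSV example above.

```js
const { tokenary, token: { makeError } } = require('tokenary');

// The catcher token creator is used if an error is found while tokenizing.
const tokenizeStrict = tokenary(csvReducers, {
    catcher: makeError('Unexpected character')
});
```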
### TokState : reducer.Reducer

**Kind**: global typedef

#### TokState.create(text, [current], [tokens]) ⇒ TokState

Creates a TokState object.

**Kind**: static method of TokState

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | | The text represented by the state |
| [current] | number | 0 | Current offset in text |
| [tokens] | Array.<Token> | | Tokens created so far |

#### TokState.advance(state) ⇒ TokState

Increments TokState.current by 1 (creates new object).

**Kind**: static method of TokState

| Param | Type |
| --- | --- |
| state | TokState |

### TokenCreator : tokState.TokState

**Kind**: global typedef

### Reducer ⇒ TokState

**Kind**: global typedef

| Param | Type | Description |
| --- | --- | --- |
| state | TokState | The state to modify |

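A Reducer is just a function from one TokState (`{ text, current, tokens }`) to the next, so custom reducers can be written by hand. An illustrative sketch, not from the readme:

```js
// Skip a single ';' without emitting a token; otherwise return the state unchanged.
const skipSemicolon = state =>
    state.text[state.current] === ';'
        ? { ...state, current: state.current + 1 }   // same effect as TokState.advance
        : state;
```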
### TokState : tokState.TokState

**Kind**: global typedef

#### TokState.create(text, [current], [tokens]) ⇒ TokState

Creates a TokState object.

**Kind**: static method of TokState

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | | The text represented by the state |
| [current] | number | 0 | Current offset in text |
| [tokens] | Array.<Token> | | Tokens created so far |

#### TokState.advance(state) ⇒ TokState

Increments TokState.current by 1 (creates new object).

**Kind**: static method of TokState

| Param | Type |
| --- | --- |
| state | TokState |

### Token

**Kind**: global typedef

**Properties**

| Name | Type | Description |
| --- | --- | --- |
| type | string | Type of token |
| lexeme | string | The text this token represents |
| offset | number | The offset of the first character of the lexeme |

### TokenCreator ⇒ Token | undefined

**Kind**: global typedef

| Param | Type | Description |
| --- | --- | --- |
| lexeme | string | The text this token represents |
| offset | number | The offset of the first character of the lexeme |
| [state] | TokState | State of tokenizer when created (optional) |

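Because a TokenCreator is just a function of `(lexeme, offset, [state])` that returns a plain Token object (or `undefined`), custom creators are easy to write. An illustrative sketch; the NUMBER type and the extra `value` field are inventions of this example, not part of the library:

```js
// Like makeToken('NUMBER'), but also attaches the parsed numeric value.
const makeNumber = (lexeme, offset) => ({
    type: 'NUMBER',
    lexeme,
    offset,
    value: Number(lexeme)   // extra field beyond the documented Token shape
});
```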
### Token : token.Token

**Kind**: global typedef

### TokState

**Kind**: global typedef

**Properties**

| Name | Type | Description |
| --- | --- | --- |
| text | string | The text represented by the state |
| current | number | Current offset in text |
| tokens | Array.<Token> | Tokens created so far |

#### TokState.create(text, [current], [tokens]) ⇒ TokState

Creates a TokState object.

**Kind**: static method of TokState

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | | The text represented by the state |
| [current] | number | 0 | Current offset in text |
| [tokens] | Array.<Token> | | Tokens created so far |

#### TokState.advance(state) ⇒ TokState

Increments TokState.current by 1 (creates new object).

**Kind**: static method of TokState

| Param | Type |
| --- | --- |
| state | TokState |

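A short sketch of the TokState helpers described above; the export path for `TokState` is an assumption, as the readme does not show it.

```js
// A fresh state starts at offset 0; `tokens` holds whatever has been created so far.
const state = TokState.create('1,Up');   // text '1,Up', current 0

// advance returns a new state object rather than mutating the old one.
const next = TokState.advance(state);    // next.current === 1, state.current still 0
```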