js-tokenizer

Fast JavaScript tokenizer. Not published on npm. Single file. MIT License. If any of the points below is a limitation for you, please use SWC instead.

1) About 2.5 times faster than lydell/js-tokens (7 s vs 17 s for 1000 iterations over jquery-3.6.3.js).
2) Does not emit whitespace or line separators.
3) Does not support automatic semicolon insertion.
4) A / after } is always treated as a regular expression literal, never as division.
5) Does not support JSX.
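
To make item 4 concrete, here is a small sketch of the two situations in which a / can directly follow a }. Both statements are valid JavaScript. Going by the rule stated in item 4, only the second case would be tokenized as intended, while the division in the first case would be read as the start of a regular expression literal. The variable names are illustrative and are not part of the tokenizer.

```javascript
// Case 1: `}` closes an object literal, so the following `/` is division.
// Item 4 says the tokenizer would instead treat this `/` as a regex start.
const half = { valueOf() { return 10; } } / 2;

// Case 2: `}` closes a block, so the following `/` starts a regex literal.
// This is the reading the tokenizer always picks.
let matched = false;
if (true) {} /ab+c/.test('abbc') && (matched = true);

console.log(half, matched);
```

If your code base relies on the first pattern, SWC is the safer choice.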

Each token is emitted as an array in one of the following forms:

['Punctuator',               startIndex, endIndex, string]
['SingleLineComment',        startIndex, endIndex, string]
['MultiLineComment',         startIndex, endIndex, string]
['RegularExpressionLiteral', startIndex, endIndex, string]
['StringLiteral',            startIndex, endIndex, string]
['NumericLiteral',           startIndex, endIndex, string]
['NoSubstitutionTemplate',   startIndex, endIndex, string]
['TemplateHead',             startIndex, endIndex, string]
['TemplateMiddle',           startIndex, endIndex, string]
['TemplateTail',             startIndex, endIndex, string]
['IdentifierName',           startIndex, endIndex, string]
['PrivateIdentifier',        startIndex, endIndex, string]
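
Because every token shares the same four-element shape, it can be unpacked with array destructuring. The following is a minimal sketch of two consumer-side helpers, assuming only the shape listed above. `sampleTokens`, `withoutComments` and `countByType` are hypothetical names, and the sample data is written by hand for illustration rather than captured from the tokenizer.

```javascript
// Hand-written sample in the [type, startIndex, endIndex, string] shape.
// The index values are illustrative only; this sketch never relies on
// whether endIndex is inclusive or exclusive.
const sampleTokens = [
    ['SingleLineComment', 0, 7, '// demo'],
    ['IdentifierName', 8, 11, 'let'],
    ['IdentifierName', 12, 13, 'x'],
    ['Punctuator', 14, 15, '='],
    ['NumericLiteral', 16, 17, '1'],
    ['Punctuator', 17, 18, ';'],
];

// Hypothetical helper: drop both kinds of comment tokens.
function withoutComments(tokens) {
    return Array.from(tokens).filter(
        ([type]) => type !== 'SingleLineComment' && type !== 'MultiLineComment'
    );
}

// Hypothetical helper: count how many tokens of each type were seen.
function countByType(tokens) {
    const counts = new Map();
    for (const [type] of tokens) {
        counts.set(type, (counts.get(type) ?? 0) + 1);
    }
    return counts;
}

console.log(withoutComments(sampleTokens).map(([, , , text]) => text));
console.log(countByType(sampleTokens));
```

Both helpers accept any iterable of tokens, so they should also work on the result of `tokenize(code)` as long as it yields arrays in this shape.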

Usage

import { createTokenizer } from './tokenizer.js';

const tokenize = createTokenizer();
const code = `console.log('Hello, world!');`;

for (const token of tokenize(code)) {
    console.log(token);
}
