szmarczak/js-tokenizer

js-tokenizer

Fast JavaScript tokenizer. Not published on npm. Single file. MIT License. If any of the points below is a limitation for you, please use SWC instead.

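1) about 2.4 times as fast as lydell/js-tokens (7 s vs 17 s for 1000 iterations over jquery-3.6.3.js)
2) does not emit whitespace or line separators
3) does not support automatic semicolon insertion
4) a / after } is always treated as the start of a regular expression literal, so a division operator placed directly after } (for example, following an object literal) is not recognized as division
5) does not support JSX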
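
The timing in point 1 is quoted for 1000 iterations over a single file. A minimal sketch of how such a measurement could be set up is shown here. The names `timeTokenizer` and `placeholderTokenize` are hypothetical and not part of this repository, and the stand-in tokenizer exists only so that the sketch runs on its own. This is not necessarily how the figures above were obtained.

```javascript
// Hypothetical harness: time `iterations` full passes of a tokenizer over `code`.
// `tokenize` is any function that returns an iterable of tokens.
function timeTokenizer(tokenize, code, iterations = 1000) {
    let tokenCount = 0;
    const start = performance.now();
    for (let i = 0; i < iterations; i++) {
        // Drain the iterable so that lazy (generator-based) tokenizers do the work.
        for (const _token of tokenize(code)) {
            tokenCount++;
        }
    }
    const elapsedMs = performance.now() - start;
    return { elapsedMs, tokenCount };
}

// Stand-in tokenizer so the sketch is self-contained; NOT the real tokenizer.
// Its index values are dummies. In practice, pass the function returned by
// createTokenizer() and the contents of jquery-3.6.3.js instead.
function* placeholderTokenize(code) {
    for (const word of code.split(/\s+/).filter(Boolean)) {
        yield ['IdentifierName', 0, 0, word];
    }
}

const result = timeTokenizer(placeholderTokenize, 'a b c', 10);
console.log(`${result.tokenCount} tokens in ${result.elapsedMs.toFixed(1)} ms`);
```

Counting tokens inside the loop keeps the iteration from being optimized away and gives a quick sanity check that both tokenizers under comparison see the same input.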

Each token is emitted as an array of the form [type, startIndex, endIndex, string]:

['Punctuator',               startIndex, endIndex, string]
['SingleLineComment',        startIndex, endIndex, string]
['MultiLineComment',         startIndex, endIndex, string]
['RegularExpressionLiteral', startIndex, endIndex, string]
['StringLiteral',            startIndex, endIndex, string]
['NumericLiteral',           startIndex, endIndex, string]
['NoSubstitutionTemplate',   startIndex, endIndex, string]
['TemplateHead',             startIndex, endIndex, string]
['TemplateMiddle',           startIndex, endIndex, string]
['TemplateTail',             startIndex, endIndex, string]
['IdentifierName',           startIndex, endIndex, string]
['PrivateIdentifier',        startIndex, endIndex, string]
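
Because every token shares the same four-element shape, it can be destructured directly. The sketch below tallies tokens by type. The helper `countByType` is hypothetical, and the sample tuples are written by hand for illustration: their index values assume an exclusive endIndex, which this README does not specify.

```javascript
// Hand-written sample tuples for the source text: console.log('Hello, world!');
// The indices are illustrative and ASSUME an exclusive endIndex.
const sampleTokens = [
    ['IdentifierName', 0, 7, 'console'],
    ['Punctuator', 7, 8, '.'],
    ['IdentifierName', 8, 11, 'log'],
    ['Punctuator', 11, 12, '('],
    ['StringLiteral', 12, 27, "'Hello, world!'"],
    ['Punctuator', 27, 28, ')'],
    ['Punctuator', 28, 29, ';'],
];

// Hypothetical helper: count how many tokens of each type appear.
function countByType(tokens) {
    const counts = new Map();
    for (const [type] of tokens) {
        counts.set(type, (counts.get(type) ?? 0) + 1);
    }
    return counts;
}

const counts = countByType(sampleTokens);
console.log(Object.fromEntries(counts));
```

The same helper accepts the iterable returned by the tokenizer, since it only relies on the first element of each tuple.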

Usage

import { createTokenizer } from './tokenizer.js';

const tokenize = createTokenizer();
const code = `console.log('Hello, world!');`;

for (const token of tokenize(code)) {
    console.log(token);
}
