keithamus/csslex


csslex

This aims to be a very small and very fast spec-compliant CSS lexer (or scanner, or tokenizer, depending on your favourite nomenclature).

It is not the fastest, nor is it the smallest, but it chooses to trade size for speed and speed for correctness. Smaller lexers exist but they sacrifice speed and correctness. Faster lexers exist but they sacrifice code size, and the ability to easily run in the browser. More clearly written lexers exist, but usually at the sacrifice of both speed and size. For details on how fast, how small, and how correct, see below.

What is this good for?

The applications are quite limited. If you know what CSS is, and you know what a lexer/scanner/tokenizer is, then you probably know why you would want this. If you don't know those things or how you could use them, then this probably won't be helpful for you.

How do I import this?

If you're using Node.js, run npm i csslex, which will install the package into your node_modules folder. Then import it with:

import { lex, types, value } from "csslex";

If you're using Deno, then you can try the following line:

import { lex, types, value } from "https://deno.land/x/csslex/mod.ts";

If you're using a Browser, you can import using unpkg or esm.sh:

import { lex, types, value } from "https://esm.sh/csslex";

How do I use this?

If you can read TypeScript, this signature will be helpful:

type Token = [type: (typeof types)[keyof typeof types], start: number, end: number];
declare function lex(css: string): Generator<Token>;

The main lex function takes a css string and creates an iterable of "Tokens". Each "Token" is a 3-tuple (an array that always has 3 elements). The first item in the array is a number representing the token's type, the second is the start position of that token in the css string, and the third is the end position of that token in the string.

So for example:

import { lex, types, value } from "https://esm.sh/csslex";
Array.from(lex("margin: 1px"));
// -> [
//   [types.IDENT, 0, 6],
//   [types.COLON, 6, 7],
//   [types.WHITESPACE, 7, 8],
//   [types.DIMENSION, 8, 11],
// ]

If you want to know the raw value of a token, simply take your original string and call .slice(start, end). However, you can also pass the string and a token tuple to value, which will additionally do extra work like normalising escape characters and giving you structural values:

import { lex, types, value } from "https://esm.sh/csslex";
value("margin: 1px", [types.IDENT, 0, 6]); // -> "margin"
value("margin: 1px", [types.COLON, 6, 7]); // -> ":"
value("margin: 1px", [types.DIMENSION, 8, 11]); // -> { type: "integer", value: 1, unit: "px" }
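The raw-slice route described above needs no library at all. As a minimal sketch, this uses the start/end offsets from the token example (offsets copied from above; type codes are omitted since only positions matter here):

```javascript
const css = "margin: 1px";

// [start, end] offsets for each token, taken from the example above
const spans = [
  [0, 6],  // IDENT
  [6, 7],  // COLON
  [7, 8],  // WHITESPACE
  [8, 11], // DIMENSION
];

// The raw text of any token is simply css.slice(start, end)
const raw = spans.map(([start, end]) => css.slice(start, end));
console.log(raw); // → ["margin", ":", " ", "1px"]
```

Because tokens carry positions rather than copies of the text, slicing is zero-cost until you actually need a token's contents.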

Test Coverage

This uses css-tokenizer-tests, which provides a set of difficult inputs intended to test the edge cases of the spec.

It also uses "snapshot testing" to avoid regressions: it tokenizes the postcss-parser-tests series of css files, as well as open-props.

Spec Conformance

@romainmenke maintains a comparison of CSS tokenizers with scores pertaining to each. csslex aims to always achieve a perfect score here, so if you visit the scores page and it does not have a perfect score, please file an issue!

Size Differentials

This package aims to be among the smallest minified css tokenizer codebases. Here's a comparison with popular alternatives:

| Name                    | Minified | Gzipped |
| ----------------------- | -------- | ------- |
| @csstools/tokenizer     | 4.1kb    | 1.1kb   |
| csslex (this)           | 4.7kb    | 1.9kb   |
| @csstools/css-tokenizer | 15.5kb   | 3.4kb   |
| css-tokenize            | 19.1kb   | 5.7kb   |
| parse-css               | 16kb     | 4.1kb   |
| css-tree                | 157.9kb  | 45kb    |

Speed differentials

You can run node bench.js to get some benchmark numbers. Here are some I ran on the machine I developed the library on:

| Name                    | ops/sec                                    |
| ----------------------- | ------------------------------------------ |
| css-tree                | 3,080 ops/sec ±0.43% (96 runs sampled)     |
| csslex (this)           | 2,314 ops/sec ±0.45% (93 runs sampled)     |
| @csstools/css-tokenizer | 1,622 ops/sec ±0.76% (96 runs sampled)     |
