utf8-codec

A JavaScript-only (ESM/CJS) UTF-8 codec that is abstract-encoding compatible, well tested, and pretty efficient. Bonus: it doesn't use Node.js' Buffer object, and it comes with TypeScript types.

Usage

import { encode, encodingLength, decode } from 'utf8-codec' // require works too!

const str = 'Hello World / こんにちは世界'
const bytes = encode(
  str,
  new Uint8Array(encodingLength(str)), // own buffer supplied, optional
  0 // offset, at which to write the str, optional
)
str === decode(bytes, 0, bytes.length)
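The optional buffer and offset arguments make it possible to pack several strings into one preallocated Uint8Array. The sketch below illustrates that pattern. So that it runs without installing anything, it defines stand-in encode, encodingLength, and decode functions on top of TextEncoder/TextDecoder that only mirror the call shape shown above; they are assumptions for illustration and not this package's implementation. With the package installed, the import from the usage example would replace them.

```javascript
// Stand-ins that mirror the call shape from the usage example.
// Assumption: these are NOT the utf8-codec implementation, only placeholders
// so the sketch is runnable on its own.
const textEncoder = new TextEncoder()
const textDecoder = new TextDecoder()

function encodingLength (str) {
  return textEncoder.encode(str).length
}

function encode (str, buffer = new Uint8Array(encodingLength(str)), offset = 0) {
  buffer.set(textEncoder.encode(str), offset) // write at the requested offset
  return buffer
}

function decode (buffer, start = 0, end = buffer.length) {
  return textDecoder.decode(buffer.subarray(start, end))
}

// Pack two strings back to back into a single buffer.
const first = 'Hello World'
const second = 'こんにちは世界'
const firstLength = encodingLength(first)
const secondLength = encodingLength(second)

const shared = new Uint8Array(firstLength + secondLength)
encode(first, shared, 0)
encode(second, shared, firstLength) // continue right after the first string

// Read each string back from its own byte range.
const decodedFirst = decode(shared, 0, firstLength)
const decodedSecond = decode(shared, firstLength, firstLength + secondLength)
```

Knowing the byte length before writing is what allows the shared buffer to be allocated once, with no intermediate copies per string.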

Why?

The TextEncoder and TextDecoder classes exist to encode and decode UTF-8 strings, but they are not optimized for processing larger amounts of bytes and don't offer APIs to figure out ahead of time how many bytes will be written or read. Surprisingly, this algorithm is even faster.
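To illustrate what "figuring out ahead of time how many bytes will be written" can look like, here is a minimal sketch that counts UTF-8 bytes by walking a string's UTF-16 code units without allocating an output buffer. The function name utf8ByteLength is hypothetical, and this is not necessarily how this package's encodingLength is implemented. It assumes lone surrogates are counted as the 3-byte replacement character U+FFFD, which matches TextEncoder's behavior.

```javascript
// Hypothetical helper for illustration; not the package's actual code.
function utf8ByteLength (str) {
  let bytes = 0
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i)
    if (unit < 0x80) {
      bytes += 1 // ASCII
    } else if (unit < 0x800) {
      bytes += 2 // U+0080..U+07FF
    } else if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < str.length) {
      const next = str.charCodeAt(i + 1)
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4 // valid surrogate pair, one code point above U+FFFF
        i++ // skip the low surrogate
      } else {
        bytes += 3 // lone high surrogate, counted as U+FFFD
      }
    } else {
      bytes += 3 // rest of the BMP, or a lone surrogate counted as U+FFFD
    }
  }
  return bytes
}

const needed = utf8ByteLength('Hello World / こんにちは世界')
const buffer = new Uint8Array(needed) // exact size, no over-allocation
```

The count takes a single pass over the string and produces no intermediate bytes, which is the property a separate length function is meant to provide.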

Other implementations found at the time either did not handle the edge cases properly or did not implement both directions of the codec.

License

MIT
