erikbrinkman/mhtml-stream

MHTML Stream

Zero-dependency library for parsing MHTML data as streams using modern WHATWG streams and async iterators. Because it relies only on modern cross-platform JavaScript standards, it works out of the box in all JavaScript environments, with only a little tweaking necessary for module definitions.

Usage

import { parseMhtml } from "mhtml-stream";

for await (const { headers, content } of parseMhtml(...)) {
  // ... : an async iterable of ArrayBuffers. This is very similar to the
  //   interface of a ReadableStream, but is a little more platform agnostic
  //   given that Node handles streams significantly differently.

  // headers : a key-value object with the header information

  // content : a Uint8Array of the raw data. If you want it as a string,
  //   `new TextDecoder().decode(content)` should work if the contents were
  //   UTF-8 / ASCII encoded.

  // NOTE: in many MHTML files, the initial file is empty and contains headers
  // for how to parse each individual included file.
}
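
Since the parser takes an async iterable of ArrayBuffers rather than a stream, some environments may need a small adapter when the data arrives as a WHATWG ReadableStream (for example, a fetch response body). The following is a minimal sketch of such an adapter. `streamToBuffers` is a hypothetical helper name, not part of mhtml-stream, and it assumes the stream yields Uint8Array chunks.

```javascript
// Hypothetical helper (not part of mhtml-stream): adapt a WHATWG
// ReadableStream of Uint8Array chunks into an async iterable of ArrayBuffers.
async function* streamToBuffers(stream) {
  const reader = stream.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) return;
      // Copy each chunk so the yielded ArrayBuffer holds exactly its bytes.
      yield Uint8Array.from(value).buffer;
    }
  } finally {
    // Release the lock even if the consumer stops iterating early.
    reader.releaseLock();
  }
}

// Possible usage, assuming `response` came from fetch():
//   for await (const { headers, content } of
//       parseMhtml(streamToBuffers(response.body))) {
//     console.log(headers, new TextDecoder().decode(content));
//   }
```

In environments where a ReadableStream can be iterated directly with `for await`, this adapter may be unnecessary.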

Notes

  • As far as I can tell, header folding behavior is not well defined when it comes to whether whitespace should be added when unfolding. The parser currently treats the first whitespace character of a continuation line as the fold marker and preserves any others.
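
To make the folding note concrete, here is a rough sketch of that unfolding rule as I read it. This is an illustration only, not the library's actual implementation. `unfoldHeaderLines` is a hypothetical name, and the sketch assumes the header block has already been split into lines with the line terminators removed.

```javascript
// Illustrative sketch only: a line starting with a space or tab continues
// the previous header. The first whitespace character is treated as the
// fold marker and dropped; any further whitespace is kept.
function unfoldHeaderLines(lines) {
  const unfolded = [];
  for (const line of lines) {
    const isContinuation = unfolded.length > 0 && /^[ \t]/.test(line);
    if (isContinuation) {
      unfolded[unfolded.length - 1] += line.slice(1);
    } else {
      unfolded.push(line);
    }
  }
  return unfolded;
}

// The tab is consumed as the fold marker; the space after it survives.
const example = unfoldHeaderLines([
  "Content-Type: multipart/related;",
  '\t boundary="x"',
  "Subject: test",
]);
// example[0] is 'Content-Type: multipart/related; boundary="x"'
```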
