mantoufan/yzhanHTMLParser

yzhanHTMLParser

A streaming HTML parser based on the HTML Standard.

Demo

You can edit the HTML code and view the result in real time.
Online Demo

Setup

Node.js

npm i yzhanhtmlparser
import yzhanHTMLParser from 'yzhanhtmlparser'

Browser

<script src="https://cdn.jsdelivr.net/npm/yzhanhtmlparser@latest/docs/yzhanhtmlparser.min.js"></script>

Usage

Parse

const code = `<html lang="en">
<head>
<title>Page Name</title>
<meta charset="UTF-8"/>
</head>
<body>
<h1 class="text" id=a>Hello World</h1>
<input type="button" disabled/>
</body>
</html>`
const parseResult = yzhanHTMLParser.parse(code)

Streaming Usage

data.html

<html lang="en">
<head>
<title>Page Name</title>
<meta charset="UTF-8"/>
</head>
<body>
<h1 class="text" id=a>Hello World</h1>
<input type="button" disabled/>
</body>
</html>

index.js

const fs = require('node:fs')
const yzhanHTMLParser = require('yzhanhtmlparser')

// The parser is used as a stream: the file stream is piped straight into it
const htmlParser = new yzhanHTMLParser.HtmlParser()
fs.createReadStream('./data.html', {
  encoding: 'utf8',
  highWaterMark: 1 // read only 1 byte at a time
}).pipe(htmlParser)

// Log each item the parser emits
htmlParser.on('data', token => {
  console.log(token)
})

Put index.js and data.html in the same folder, then run:

node index.js

The output is printed line by line:

{"type":"openTag","content":"html","attributes":{"lang":"en"}}
{"type":"char","content":"\n"}
{"type":"openTag","content":"head"}
...
{"type":"selfClosingTag","content":"input","attributes":{"type":"button","disabled":"disabled"}}
{"type":"char","content":"\n"}
{"type":"closeTag","content":"body"}
{"type":"char","content":"\n"}
{"type":"closeTag","content":"html"}

Utility Methods

isMatched

Checks whether every HTML tag in the input is closed.

yzhanHTMLParser.isMatched('<html></html>') // true
yzhanHTMLParser.isMatched('<html>') // false
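
One way to picture a tag-matching check is a stack over the token stream. The following is a minimal sketch of that general technique, not the library's actual implementation: `tokensAreMatched` is a hypothetical helper, and it assumes tokens shaped like the streaming output shown earlier (`openTag`, `closeTag`, `selfClosingTag`, `char`).

```javascript
// Hypothetical helper, not part of yzhanHTMLParser.
// Returns true when every openTag has a closeTag in the right order.
function tokensAreMatched(tokens) {
  const openTags = []
  for (const token of tokens) {
    if (token.type === 'openTag') {
      openTags.push(token.content)
    } else if (token.type === 'closeTag') {
      // The closing tag must match the most recently opened tag
      if (openTags.pop() !== token.content) return false
    }
    // selfClosingTag and char tokens do not affect nesting
  }
  // Anything left on the stack was never closed
  return openTags.length === 0
}

console.log(tokensAreMatched([
  { type: 'openTag', content: 'html' },
  { type: 'closeTag', content: 'html' }
])) // true
console.log(tokensAreMatched([
  { type: 'openTag', content: 'html' }
])) // false
```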

buildDOMTree

yzhanHTMLParser.buildTree('<div><a>123</a></div>')
// {"type":"Document","children":[{"type":"Element","tagName":"div","children":[{"type":"Element","tagName":"a","children":["123"]}]}]}
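
The returned tree can be walked recursively. As a small sketch, the snippet below copies the object from the comment above into a literal and collects its text; `textContent` is a hypothetical helper, and the code assumes that text children are plain strings and that elements carry a `children` array, as in that output.

```javascript
// Tree literal copied from the example output above
const tree = {
  type: 'Document',
  children: [{
    type: 'Element',
    tagName: 'div',
    children: [{ type: 'Element', tagName: 'a', children: ['123'] }]
  }]
}

// Hypothetical helper: concatenates all text found under a node
function textContent(node) {
  if (typeof node === 'string') return node
  return (node.children || []).map(textContent).join('')
}

console.log(textContent(tree)) // 123
```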

Development

Unit Testing

npm test

Unit Testing Coverage

npm run test:coverage

Build

npm run build

Preview

npm run dev
