基于JSON Schema的HTML解析器 #30

chenshenhai · 2018-06-03T17:06:44Z

前言

写这篇文章主要是近年来前端服务端渲染框架层出不穷，开发的选择眼花缭乱。最近有个想法，把HTML的DOM树抽象成用JSON schema 语言描述，可以用于HTML渲染描述中间层抹平框架的差异。

注意：只是作为中间层抹平差异，不是代替框架 。更好地抽象处理HTML的DOM节点在前端渲染，或者服务端渲染。

如果觉得文章啰嗦，我已经实现了该Node模块 html-schema-parser，可以直接安装使用。
或点击 github 查看源码https://github.com/chenshenhai/html-schema-parser/

具体的渲染设想如下：

输入 JSON Schema

{
  tag: 'div',
  attribute: {
    style: 'background:#f0f0f0;'
  },
  content: [
    {
      tag: 'ul',
      content: [
        {
          tag: 'li',
          content: ['item 001']
        }, {
          tag: 'li',
          content: ['item 002']
        }
      ]
    }
  ],
}

输出HTML

<div style="background:#f0f0f0;">
    <ul>
        <li>item 001</li>
        <li>item 002</li>
        <li>item 003</li>
    </ul>
</div>

HTML的基本要素

如果要实现以上解析功能，就先抽象出前端、后端的HTML渲染的元素。

tag类型 tag
tag属性 attribute
tag内容（文本和子节点）content

Schema解析实现

设定tag

合法的tag

const legalTags = [
  'html', 'head', 'title', 'base', 'link', 'meta', 'style', 'script',
  'noscript', 'body', 'section', 'nav', 'article', 'aside', 'h1', 'h2',
  'h3', 'h4', 'h5', 'h6', 'hgroup', 'header', 'footer', 'address', 'main',
  'p', 'hr', 'pre', 'blockquote', 'ol', 'ul', 'li', 'dl', 'dt', 'dd',
  'figure', 'figcaption', 'div', 'a', 'em', 'strong', 'small', 's', 'cite',
  'q', 'dfn', 'abbr', 'data', 'time', 'code', 'var', 'samp', 'kbd', 'sub',
  'sup', 'i', 'b', 'u', 'mark', 'ruby', 'rt', 'rp', 'bdi', 'bdo', 'span', 'br',
  'wbr', 'ins', 'del', 'img', 'iframe', 'embed', 'object', 'param', 'video',
  'audio', 'source', 'track', 'canvas', 'map', 'area', 'svg', 'math',
  'table', 'caption', 'colgroup', 'col', 'tbody', 'thead', 'tfoot', 'tr',
  'td', 'th', 'form', 'fieldset', 'legend', 'label', 'input', 'button',
  'select', 'datalist', 'optgroup', 'option', 'textarea', 'keygen',
  'output', 'progress', 'meter', 'details', 'summary', 'command', 'menu'
]

不闭合的tag

const notClosingTags = {
  'area': true,
  'base': true,
  'br': true,
  'col': true,
  'hr': true,
  'img': true,
  'input': true,
  'link': true,
  'meta': true,
  'param': true,
  'embed': true
}

解析tag的schema

解析schema入口

function parseSchema (schema) {
    let html = ''
    
    if (isJSON(schema)) {
      // 如果是JSON类型，就直接用tag类型解析
      html = parseTag(schema)
    } else if (isArray(schema)) {
      // 如果是JSON类型，就直接用tag上下文类型解析
      html = parseContent(schema)
    }
    return html
}

解析tag类型

function parseTag (schema) {
  let html = ''
  if (isJSON(schema) !== true) {
    return html
  }
  let tag = schema.tag || 'div'
  const content = schema.content
  // 检查是否为合法的tag, 如果不是，默认为div
  if (tags.legalTags.indexOf(tag) < 0) {
    tag = 'div'
  }
  const attrStr = parseAttribute(schema.attribute)
  // 判断是否为闭合tag
  if (tags.notClosingTags[tag] === true) {
    // 如果是闭合的tag
    html = `<${tag} ${attrStr} />`
  } else {
    // 如果是非闭合tag
    html = `<${tag} ${attrStr} >${parseContent(content)}</${tag}>`
  }
  return html
}

解析tag内容

// 内容的类型为 Array
// Array的内容为字符串，或者tag类型的JSON
function parseContent (content) {
  let html = ''
  if (isArray(content) !== true) {
    return html
  }
  for (let i = 0; i < content.length; i++) {
    const item = content[i]
    if (isJSON(item) === true) {
      html += parseTag(item)
    } else if (isString(item)) {
      html += item
    }
  }
  return html
}

参考

https://github.com/caolan/pithy

The text was updated successfully, but these errors were encountered:

chenshenhai added the HTML label Jun 9, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

基于JSON Schema的HTML解析器 #30

基于JSON Schema的HTML解析器 #30

chenshenhai commented Jun 3, 2018 •

edited

Loading

基于JSON Schema的HTML解析器 #30

基于JSON Schema的HTML解析器 #30

Comments

chenshenhai commented Jun 3, 2018 • edited Loading

前言

HTML的基本要素

Schema解析实现

设定tag

解析tag的schema

参考

chenshenhai commented Jun 3, 2018 •

edited

Loading