GitbookIO/brightml

Brightml

Smart utility rendering markdown-ready HTML.

Install

$ npm install brightml

Use

Clean all HTML at once:

var brightml = require('brightml');

var HTMLString = '<table><tr><td>Title 1</td><td>Title 2</td></tr><tr><td>Data 1</td><td>Data 2</td></tr></table>';

var cleanHTML = brightml.clean(HTMLString);
//  cleanHTML is :
//  <table>
//    <thead>
//      <tr>
//        <th>Title 1</th>
//        <th>Title 2</th>
//      </tr>
//    </thead>
//    <tbody>
//      <tr>
//        <td>Data 1</td>
//        <td>Data 2</td>
//      </tr>
//    </tbody>
//  </table>

Or call the module's functions individually, as required:

var brightml = require('brightml');

var HTMLString = '<table><tr><td>Title 1</td><td>Title 2</td></tr><tr><td>Data 1</td><td>Data 2</td></tr></table>';

brightml.parse(HTMLString);
brightml.formatTables();
var cleanHTML = brightml.render();
//  cleanHTML is :
//  <table>
//    <thead>
//      <tr>
//        <th>Title 1</th>
//        <th>Title 2</th>
//      </tr>
//    </thead>
//    <tbody>
//      <tr>
//        <td>Data 1</td>
//        <td>Data 2</td>
//      </tr>
//    </tbody>
//  </table>
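
When only tables need attention, the table-related steps can be chained on their own. The sketch below assumes the steps can be combined freely between parse and render, as the example above suggests. `cleanTablesOnly` is a hypothetical helper name, and it receives the brightml module as a parameter so that it relies only on functions documented in this README.

```javascript
// Hypothetical helper: run only the table-related steps documented in this README.
// `brightml` is expected to be the module returned by require('brightml').
function cleanTablesOnly(brightml, html) {
    brightml.parse(html);            // load the HTML string
    brightml.removeNestedTables();   // flatten nested tables first
    brightml.formatTables();         // enforce the <thead>/<tbody> layout
    brightml.cleanTableCells();      // drop <p> tags inside <th>/<td>
    return brightml.render();        // serialize the result back to a string
}
```

With brightml installed, this would be called as cleanTablesOnly(require('brightml'), HTMLString).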

What it does

Calling brightml.clean(html) performs the following operations, in order.

brightml.parse(HTMLString)

Convert the HTML string to a DOM using cheerio.

brightml.retrieveFootNotes()

For cross-referenced links, retrieve the footnotes and endnotes and place them before the next <h1> tag, so that notes stay within their chapter section.

The footnotes are then formatted as follows:

<h1>Footnotes</h1>
<p>
  See how to properly format a footnote<sup id="footnote-ref"><a href="#footnote">1</a></sup>.
</p>
<!-- Some more content -->
<p>
  <sup id="footnote">
    Footnotes are in a paragraph and a sup tag. Link to go back to reference is at the end of the footnote.
    <a href="#footnote-ref"></a>
  </sup>
</p>

brightml.setAnchorsId()

Try to set each <a> tag's id attribute on its direct parent, if possible.

brightml.cleanElements()

  • Remove empty tags.
  • Remove forbidden HTML tags and place their HTML content in a <p> instead.
  • Remove forbidden HTML attributes.
  • Remove disallowed link schemes from HTML attributes.

This operation uses the rules.js file to determine which tags/attributes/schemes are allowed.
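
The scheme check in the last bullet can be pictured as follows. This is a minimal sketch, not the code in rules.js: the `ALLOWED_SCHEMES` list and the `isAllowedLink` helper are hypothetical, and the real allowed values are whatever rules.js defines.

```javascript
// Hypothetical allow-list; the real one lives in rules.js.
var ALLOWED_SCHEMES = ['http', 'https', 'mailto'];

// Return true when `value` has no scheme (relative link or #anchor)
// or when its scheme is in the allow-list.
function isAllowedLink(value) {
    // A URL scheme is a letter followed by letters, digits, "+", "-" or ".",
    // and is terminated by ":".
    var match = /^\s*([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(value);
    if (!match) {
        return true;
    }
    return ALLOWED_SCHEMES.indexOf(match[1].toLowerCase()) !== -1;
}
```

An attribute whose value fails this kind of check would be the one removed by the step.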

brightml.cleanImagesInTitles()

Move <img> tags found inside titles to right after the enclosing <h> tag.
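
The effect can be illustrated on a plain string. This is only a sketch under simplifying assumptions: brightml works on a cheerio DOM, whereas the hypothetical `moveImagesAfterTitles` helper below uses regular expressions and is only suitable for flat, well-formed headings.

```javascript
// Hypothetical string-based illustration of the step.
// For each <h1>..<h6>, pull out the <img> tags and re-emit them after the heading.
function moveImagesAfterTitles(html) {
    var headingPattern = /<(h[1-6])\b([^>]*)>([\s\S]*?)<\/\1>/gi;
    var imagePattern = /<img\b[^>]*>/gi;
    return html.replace(headingPattern, function (match, tag, attrs, inner) {
        var images = inner.match(imagePattern) || [];
        var text = inner.replace(imagePattern, '');
        return '<' + tag + attrs + '>' + text + '</' + tag + '>' + images.join('');
    });
}
```

Headings without images pass through unchanged.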

brightml.normalizeTitlesId()

Set an id attribute on each <h> tag. The id is derived from the title's content.

Every reference to the previous id is updated accordingly.

<h1 id="some-id">A great title</h1>
<a href="#some-id">Back to a great title</a>

will become:

<h1 id="a_great_title">A great title</h1>
<a href="#a_great_title">Back to a great title</a>
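
One way to picture the transformation is shown below. It is a sketch consistent with the single example above, not brightml's actual rule set: `titleToId` and `renameTitleId` are hypothetical names, and how the library treats punctuation, accents, or duplicate titles is not specified here.

```javascript
// Hypothetical slug function: lower-case the title, replace every run of
// characters other than a-z and 0-9 with "_", and trim "_" from both ends.
function titleToId(title) {
    return title
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '_')
        .replace(/^_+|_+$/g, '');
}

// Rewrite the id of one heading and every href="#oldId" that pointed to it.
function renameTitleId(html, oldId, title) {
    var newId = titleToId(title);
    return html
        .split('id="' + oldId + '"').join('id="' + newId + '"')
        .split('href="#' + oldId + '"').join('href="#' + newId + '"');
}
```

Calling renameTitleId with the first snippet above, 'some-id' and 'A great title' yields the second snippet.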

brightml.removeNestedTables()

Replace nested <table> tags with a warning message followed by their content in a simple <td> tag.

brightml.formatTables()

Ensure every <table> element has the same structure.

Schema used:

<!-- Move caption before <table> if any -->
<caption></caption>

<table>
  <!-- Ensure the first row contains <th> tags in a <thead> element -->
  <thead>
    <tr>
      <th>Title 1</th>
      <th>Title 2</th>
    </tr>
  </thead>
  <!-- Ensure all remaining rows are inside a <tbody> element -->
  <tbody>
    <tr>
      <td>Row 1 - Data 1</td>
      <td>Row 1 - Data 2</td>
    </tr>
    <tr>
      <td>Row 2 - Data 1</td>
      <td>Row 2 - Data 2</td>
    </tr>
  </tbody>
</table>
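
The target layout can also be expressed as a small builder working on rows of cell text. This is a sketch of the schema only, not of how brightml rewrites an existing DOM. `renderTable` is a hypothetical helper, it ignores captions, and it assumes the cell text is already safe to embed in HTML.

```javascript
// Hypothetical helper: build the normalized layout from rows of cell text.
// The first row becomes the header; all remaining rows go into <tbody>.
// Cell text is inserted as-is (no escaping).
function renderTable(rows) {
    function renderRow(cells, cellTag) {
        var html = cells.map(function (text) {
            return '<' + cellTag + '>' + text + '</' + cellTag + '>';
        }).join('');
        return '<tr>' + html + '</tr>';
    }
    var head = renderRow(rows[0], 'th');
    var body = rows.slice(1).map(function (cells) {
        return renderRow(cells, 'td');
    }).join('');
    return '<table><thead>' + head + '</thead><tbody>' + body + '</tbody></table>';
}
```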

brightml.cleanTableCells()

Ensure that no <th> or <td> tag contains a <p> tag, to prevent line breaks.
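
A string-based illustration of this step follows. It rests on assumptions: the hypothetical `unwrapParagraphsInCells` helper uses regular expressions rather than the cheerio DOM that brightml works on, and joining several paragraphs with a single space is a choice made for this sketch, not documented behavior.

```javascript
// Hypothetical string-based illustration: strip <p> wrappers inside <td>/<th>.
// Consecutive paragraphs are joined with a single space (an assumption).
function unwrapParagraphsInCells(html) {
    // \b keeps the pattern from matching <thead> as a <th> tag.
    var cellPattern = /<(t[dh])\b([^>]*)>([\s\S]*?)<\/\1>/gi;
    return html.replace(cellPattern, function (match, tag, attrs, inner) {
        var flat = inner
            .replace(/<p\b[^>]*>/gi, '')
            .replace(/<\/p>/gi, ' ')
            .trim();
        return '<' + tag + attrs + '>' + flat + '</' + tag + '>';
    });
}
```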

brightml.render()

Return the current state of the HTML passed to brightml.parse(HTMLString).
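
Put together, the operations in this section can be written out as an explicit pipeline. The sketch below assumes that brightml.clean(html) does nothing beyond the steps listed in this README. `CLEAN_STEPS` and `cleanStepByStep` are hypothetical names, and the module is passed in as a parameter.

```javascript
// Intermediate steps in the order listed in this section; all names come from this README.
var CLEAN_STEPS = [
    'retrieveFootNotes',
    'setAnchorsId',
    'cleanElements',
    'cleanImagesInTitles',
    'normalizeTitlesId',
    'removeNestedTables',
    'formatTables',
    'cleanTableCells'
];

// Hypothetical step-by-step equivalent of brightml.clean(html).
function cleanStepByStep(brightml, html) {
    brightml.parse(html);
    CLEAN_STEPS.forEach(function (step) {
        brightml[step]();
    });
    return brightml.render();
}
```

Removing entries from the list gives a pipeline that skips the corresponding operations.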
