A simple and efficient C++ HTML parser that follows the WHATWG HTML specification.
Features:
- full HTML/XML parsing with browser-like DOM tree corrections
- HTML5 compatible
- browser-like encoding detection
- HTML content cleanup
- simple DOM access API
- pretty printing
Made with ❤️ in Paris 🇫🇷.
Usage example:
Document document;
// Parse document
document.parse(
"<html><head><title>test</title></head><body><h1>Test</h1></body></html>",
71,
"utf8",
nullptr
);
// Clean document
document.clean(CleanFlags::SPACE);
// Access root node
Node root = document.root();
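The second argument in the call above, 71, matches the byte length of the HTML string. A minimal sketch, independent of the library, of deriving that value from the string instead of hard-coding it. The names `kHtml` and `html_length` are hypothetical and chosen for illustration only.

```cpp
#include <cstddef>
#include <string_view>

// Same markup as in the example above. kHtml is a hypothetical name.
constexpr std::string_view kHtml =
    "<html><head><title>test</title></head><body><h1>Test</h1></body></html>";

// Byte length of the input, computed from the string itself so it cannot
// drift out of sync when the markup changes.
constexpr std::size_t html_length() {
    return kHtml.size();
}

// Compile-time check that the computed length equals the literal used above.
static_assert(html_length() == 71, "length must match the hard-coded value");
```

Assuming `parse` takes a pointer to the input and its byte count, as the example above suggests, the call could then read `document.parse(kHtml.data(), kHtml.size(), "utf8", nullptr);`.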
Building requires CMake.
Only static libraries are built.
$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX="$HOME/my-install-dir" ..
$ make -j
$ make install
Dependencies:
- compact_enc_det: head
- icu4c: 65-1
They are downloaded and built automatically as part of the build.
eclair-html-parser is distributed under the MIT license.