Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
PHP port of html5lib, currently unmaintained.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
html5lib - php flavour This is an implementation of the tokenization and tree-building parts of the HTML5 specification in PHP. Potential uses of this library can be found in web-scrapers and HTML filters. Warning: This is a pre-alpha release, and as such, certain parts of this code are not up-to-snuff (e.g. error reporting and performance). However, the code is very close to spec and passes 100% of tests not related to parse errors. Nevertheless, expect to have to update your code on the next upgrade. Usage notes: <?php require_once '/path/to/HTML5/Parser.php'; $dom = HTML5_Parser::parse('<html><body>...'); $nodelist = HTML5_Parser::parseFragment('<b>Boo</b><br>'); $nodelist = HTML5_Parser::parseFragment('<td>Bar</td>', 'table'); Documentation: HTML5_Parser::parse($text) $text : HTML to parse return : DOMDocument of parsed document HTML5_Parser::parseFragment($text, $context) $text : HTML to parse $context : String name of context element return : DOMDocument of parsed document Developer notes: * To setup unit tests, you need to add a small stub file test-settings.php that contains $simpletest_location = 'path/to/simpletest/'; This needs to be version 1.1 (or, until that is released, SVN trunk) of SimpleTest. * We don't want to ultimately use PHP's DOM because it is not tolerant of certain types of errors that HTML 5 allows (for example, an element "foo@bar"). But the current implementation uses it, since it's easy. Eventually, this html5lib implementation will get a version of SimpleTree; and may possibly start using that by default. vim: et sw=4 sts=4