Low-level PHP extension for HTML5

Gumbo PHP

Gumbo PHP is a low-level extension for HTML5 parsing.

Gumbo PHP builds a DOMDocument using the Gumbo HTML5 parser. It is intended to avoid the problems that arise when parsing HTML5 markup or pages with inline JavaScript.

use Layershifter\Gumbo\Parser;

$document = Parser::load('<a>Apples and bananas.</a>');
var_dump($document->saveHTML());

string(33) "<a>Apples and bananas.</a>
"
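
Because the result is described as a DOMDocument, the usual DOM API should apply to it. The following is a minimal sketch under that assumption. The helpers `load_html()` and `link_texts()` are hypothetical names used only for illustration, and the fallback to libxml's `DOMDocument::loadHTML()` exists only so the sketch runs when the extension is not installed.

```php
<?php
// Sketch only: assumes Parser::load() returns a standard DOMDocument,
// as the description above suggests.

/**
 * Hypothetical helper: parse HTML into a DOMDocument.
 * Uses Gumbo PHP when the extension is available; otherwise falls back
 * to libxml's DOMDocument::loadHTML() so the sketch still runs.
 */
function load_html($html)
{
    if (class_exists('Layershifter\\Gumbo\\Parser')) {
        return \Layershifter\Gumbo\Parser::load($html);
    }

    $document = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $document;
}

/** Hypothetical helper: collect the text content of every <a> element. */
function link_texts(DOMDocument $document)
{
    $texts = [];
    foreach ($document->getElementsByTagName('a') as $anchor) {
        $texts[] = $anchor->textContent;
    }

    return $texts;
}

$document = load_html('<a>Apples and bananas.</a>');
print_r(link_texts($document));
```

The code avoids scalar type declarations so that it stays compatible with PHP 5.6 as well as PHP 7.0.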

Requirements

The following versions of PHP are supported.

  • PHP 5.6
  • PHP 7.0

Install

Building the gumbo-php extension requires the PHP-devel package, which should provide the phpize utility.

$ git clone https://github.com/layershifter/gumbo-php.git
$ cd gumbo-php
$ phpize
$ ./configure
$ make
$ make install

This builds a 'gumbo.so' shared extension. Load it in php.ini using:

[gumbo]
extension = gumbo.so
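
To confirm that the configuration above took effect, a short script along these lines may help. It is a sketch that assumes the module registers itself under the name "gumbo" (matching the gumbo.so file name), which this README does not state explicitly. `extension_status()` is a hypothetical helper name.

```php
<?php
// Assumption: the module name is "gumbo", matching gumbo.so.
// Adjust the name if `php -m` lists the extension differently.

/** Hypothetical helper: describe whether a named extension is loaded. */
function extension_status($name)
{
    // php_ini_loaded_file() returns false when no php.ini was loaded.
    $ini = php_ini_loaded_file() ?: 'none';

    return extension_loaded($name)
        ? sprintf('%s extension is loaded (php.ini: %s)', $name, $ini)
        : sprintf('%s extension is NOT loaded; check php.ini: %s', $name, $ini);
}

echo extension_status('gumbo'), PHP_EOL;
```

The CLI and the web server often read different php.ini files, so it can be worth running the check in both environments.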

Known issues

  • double encoding of entities (#6)
$doc = \Layershifter\Gumbo\Parser::load('<h1>Hello&nbsp;world!</h1>');
var_dump($doc->saveHTML());

string "<h1>Hello&amp;nbsp;world!</h1>"
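
Until the issue is resolved, one possible workaround is to post-process the serialized output. The sketch below is not part of Gumbo PHP; `undo_double_encoding()` is a hypothetical helper that removes one level of encoding from sequences that look like double-encoded entities. It also rewrites text where the author deliberately wrote an escaped entity (for example, documentation showing the literal characters `&nbsp;`), so it is only suitable when that case cannot occur.

```php
<?php
/**
 * Hypothetical helper: turn "&amp;name;", "&amp;#123;" and "&amp;#x1F;"
 * back into "&name;", "&#123;" and "&#x1F;".
 * A bare "&amp;" that is not followed by an entity body is left untouched.
 */
function undo_double_encoding($html)
{
    return preg_replace(
        '/&amp;(#[0-9]+|#x[0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);/',
        '&$1;',
        $html
    );
}

echo undo_double_encoding('<h1>Hello&amp;nbsp;world!</h1>'), PHP_EOL;
```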

Testing

$ composer install
$ composer test

Sponsors

SORGE - website tracking tool

License

This library is released under the Apache 2.0 license. Please see License File for more information.