php-simply-html


Add, delete, modify, and read HTML tags using CSS selectors.

Extract all text, links, and the heading summary from an HTML file.

It relies on the PHP DOM extension and Symfony CssSelector.
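To illustrate the kind of DOM work that sits underneath a selector such as `head style`, here is a minimal sketch that uses only the PHP DOM extension. It is not the library's implementation: the XPath expression is hand-written as a stand-in for what a CSS-to-XPath converter would produce, and the sample markup is invented for the example.

```php
<?php
// Sketch only: plain PHP DOM extension, no GlHtml involved.
// The markup below is invented sample data.
$html = '<html><head><style>p{color:red}</style><title>t</title></head>'
      . '<body><p>Hi</p></body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Hand-written XPath standing in for the CSS selector "head style".
$nodes = $xpath->query('//head//style');

// Copy the matches to an array so removal does not interfere with iteration.
foreach (iterator_to_array($nodes) as $node) {
    $node->parentNode->removeChild($node);
}

$result = $doc->saveHTML();
echo $result;
```

The library wraps this kind of query-then-mutate sequence behind calls that take a CSS selector directly.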

Installation

This library can be found on Packagist.

The recommended way to install it is through Composer.

Edit your composer.json and add:

{
    "require": {
       "glicer/simply-html": "dev-master"
    }
}

Install the dependencies:

php composer.phar install

How to modify HTML?

<?php
// Must point to Composer's autoload file.
require 'vendor/autoload.php';

use GlHtml\GlHtml;

// Read the contents of index.html.
$html = file_get_contents("index.html");

$dom = new GlHtml($html);

// Delete all style tags inside head.
$dom->delete('head style');

// Prepare a stylesheet link tag.
$style = '<link href="solver.css" type="text/css" rel="stylesheet"></link>';

// Add the link tag to the first head element.
$dom->get("head")[0]->add($style);

// Replace the first span with an empty h1.
$dom->get("span")[0]->replaceMe("<h1></h1>");

// Write the result to a new HTML file.
file_put_contents("result.html", $dom->html());

How to get all the text in an HTML file?

<?php
// Must point to Composer's autoload file.
require 'vendor/autoload.php';

use GlHtml\GlHtml;

// Read the contents of index.html.
$html = file_get_contents("index.html");

$dom = new GlHtml($html);

// Array of sentences (strings).
$sentences = $dom->getSentences();

print_r($sentences);

How to get all the links in an HTML file?

<?php
// Must point to Composer's autoload file.
require 'vendor/autoload.php';

use GlHtml\GlHtml;

// Read the contents of index.html.
$html = file_get_contents("index.html");

$dom = new GlHtml($html);

// Array of URL strings.
$links = $dom->getLinks();

print_r($links);
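For comparison, a rough equivalent can be sketched with the PHP DOM extension alone. This is an assumption-laden illustration, not what `getLinks()` does internally: it only collects `href` values from `a` elements, whereas the library may also consider other tags, and the sample markup is invented.

```php
<?php
// Sketch only: plain PHP DOM extension, not GlHtml's own code.
// The markup below is invented sample data.
$html = '<html><body>'
      . '<a href="https://example.com/a">A</a>'
      . '<p>text</p>'
      . '<a href="/b">B</a>'
      . '<a>no href</a>'
      . '</body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Keep the href of every anchor that actually has one, in document order.
$links = [];
foreach ($xpath->query('//a[@href]') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

Anchors without an `href` attribute are skipped by the `[@href]` predicate, so the resulting array contains only usable URL strings.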

How to extract HTML headings (h1, h2, ..., h6)?

<?php
// Must point to Composer's autoload file.
require 'vendor/autoload.php';

use GlHtml\GlHtml;

// Read the contents of index.html.
$html = file_get_contents("index.html");

$dom = new GlHtml($html);

// Array of GlHtmlSummary objects.
$summary = $dom->getSummary();

// Print the text and level of the first heading.
echo $summary[0]->getNode()->getText() . ' ' . $summary[0]->getLevel();

// Extract the headings as a tree.
$summaryTree = $dom->getSummaryTree();
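The idea of a heading summary (each heading paired with its level) can be sketched with the PHP DOM extension alone. This is a hedged illustration, not the library's implementation: the `level`/`text` array shape is a hypothetical stand-in for `GlHtmlSummary`, and the sample markup is invented.

```php
<?php
// Sketch only: plain PHP DOM extension, not GlHtml's own code.
// The markup below is invented sample data.
$html = '<html><body>'
      . '<h1>Title</h1><p>intro</p>'
      . '<h2>Part one</h2><h3>Detail</h3>'
      . '<h2>Part two</h2>'
      . '</body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Collect h1..h6 in document order; the level is the digit in the tag name.
$headings = [];
foreach ($xpath->query('//h1|//h2|//h3|//h4|//h5|//h6') as $node) {
    $headings[] = [
        'level' => (int) substr($node->nodeName, 1), // hypothetical field name
        'text'  => trim($node->textContent),         // hypothetical field name
    ];
}

// Print an indented outline, two spaces per level below h1.
foreach ($headings as $heading) {
    echo str_repeat('  ', $heading['level'] - 1) . $heading['text'] . PHP_EOL;
}
```

A flat list like this is enough to build a table of contents; nesting each heading under the nearest preceding heading of a lower level would give a tree.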

Running Tests

Launch from the command line:

vendor\bin\phpunit

License

MIT

Contact

Authors: Emmanuel ROECKER & Rym BOUCHAGOUR

Web Development Blog - http://dev.glicer.com