ppajer/WebScraper

WebScraper

A straightforward web scraper written in PHP, with support for parallel processing and HTML5.

Installation

To start using this package, add it to your composer.json file and run composer install, then include the generated autoload.php in your project. Alternatively, download the package and its dependencies and include them directly in your project.

Dependencies

Usage

The scraper takes two inputs: an array of Request Options that define the resources to gather, and an array of Extraction Rules that specify what data to look for in those resources. In the example below, the rules are supplied as a path to a JSON file. For more information on Request Options or Extraction Rules, read the respective docs.

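// Load the Composer autoloader.
require 'autoload.php';

// Extraction Rules, given here as a path to a JSON file.
$rules = 'path/to/rules.json';
// Request Options: 'foo' is a label, 'URL' is the resource to fetch.
$options = [
	'foo' => ['URL' => 'https://...']
];

// The rules go to the constructor, the request options to start().
$scraper = new WebScraper($rules);
$result = $scraper->start($options);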
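
When several resources need to be gathered, the Request Options array can be built from a list of URLs before it is handed to the scraper. The following is a minimal sketch under stated assumptions, not the package's documented approach: buildRequestOptions is a hypothetical helper that is not part of WebScraper, the labels and example.com URLs are placeholders, and it assumes each top-level entry describes one resource and needs only the 'URL' key, as in the example above. The shape of the returned result is not documented here, so the sketch only prints it.

```php
<?php
// Hypothetical helper (not part of the package): turn a label => URL map
// into the Request Options shape used in the example above.
function buildRequestOptions(array $urls): array
{
    $options = [];
    foreach ($urls as $label => $url) {
        $options[$label] = ['URL' => $url];
    }
    return $options;
}

// Placeholder labels and URLs, for illustration only.
$options = buildRequestOptions([
    'home'  => 'https://example.com/',
    'about' => 'https://example.com/about',
]);

// Run the scraper only if the autoloader is present, so the sketch
// also executes where the package is not installed.
if (file_exists('autoload.php')) {
    require 'autoload.php';

    $scraper = new WebScraper('path/to/rules.json');
    $result  = $scraper->start($options);

    // Inspect whatever the scraper returned.
    print_r($result);
}
```

Keeping the array construction in a separate function makes it easy to feed the scraper from a file or database query without changing the call to start().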
