scotteh/php-goose is no longer maintained, so I created this alternative, which supports recent PHP versions.
There may still be some issues, but so far it's working OK. Feel free to contribute.
- Extracts title, description, canonical URL, main image, and cleaned article text
- Minimal dependencies; works in any PHP app (framework-agnostic)
- DOMDocument + XPath heuristics similar to Goose/Readability techniques
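
To give a feel for the DOMDocument + XPath approach mentioned in the list above, here is a minimal sketch of metadata extraction. It is not this package's internal code: `extractMetadata()` is a hypothetical helper, and the specific XPath queries and their fallback order are assumptions chosen for illustration. Only built-in PHP DOM and libxml functions are used.

```php
<?php
// Illustrative sketch only; NOT the library's implementation.
// extractMetadata() is a hypothetical helper name.

function extractMetadata(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so buffer parser warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Try each query in order and return the first non-empty string value.
    // XPath's string() yields '' when the node set is empty.
    $first = function (string ...$queries) use ($xpath): ?string {
        foreach ($queries as $query) {
            $value = trim($xpath->evaluate("string($query)"));
            if ($value !== '') {
                return $value;
            }
        }
        return null;
    };

    return [
        'title'       => $first('//meta[@property="og:title"]/@content', '//title'),
        'description' => $first('//meta[@name="description"]/@content', '//meta[@property="og:description"]/@content'),
        'canonical'   => $first('//link[@rel="canonical"]/@href'),
        'image'       => $first('//meta[@property="og:image"]/@content'),
    ];
}

// Usage with a small inline document:
$sample = '<html><head><title>Hello</title>'
    . '<link rel="canonical" href="https://example.com/a"></head>'
    . '<body><p>Text</p></body></html>';
$meta = extractMetadata($sample);
echo $meta['title'], PHP_EOL;
```

Fields with no matching node come back as `null`. Extracting the cleaned article text takes content-scoring heuristics on top of this and is not shown here.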
use Iserter\Goose\Goose;
$goose = new Goose();
$article = $goose->extract('https://example.com/some-article');
echo $article->getTitle();

You can also pass raw HTML:
$article = $goose->extract($html, 'https://iserter.com');

When developing locally, add a path repository for this package to your root composer.json and require dev-main.
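
A path repository entry in the root composer.json might look like the following sketch. The `../php-goose` path is an assumption about where the local checkout lives; adjust it to your own directory layout.

```json
{
    "repositories": [
        {
            "type": "path",
            "url": "../php-goose"
        }
    ]
}
```

Composer then resolves the package from that directory, symlinking it into `vendor/` by default where the platform allows.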
composer require iserter/php-goose dev-main

License: MIT