v2 #187

Merged: 14 commits, Jun 18, 2023
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,19 @@ Parts regarding the [documentation website](https://phpscraper.de), the [test pa

This project adheres to [Semantic Versioning](http://semver.org/).

## 2.0.0 (2023-06-01)

- [#187](https://github.com/spekulatius/PHPScraper/issues/187): Prepare v2: Improve typing, bringing PHPStan to `--level=9`. For details, see the [upgrade guide](https://github.com/spekulatius/PHPScraper/blob/master/UPGRADING.md#from-1x-to-2x).
- [#188](https://github.com/spekulatius/PHPScraper/issues/188): Support PHPStan for Windows Users
- [#185](https://github.com/spekulatius/PHPScraper/issues/185): Adding PHP 8.3 to test pipeline
- [#184](https://github.com/spekulatius/PHPScraper/issues/184): Adding PHPStan GitHub Action. Thank you @nadar!
- [#183](https://github.com/spekulatius/PHPScraper/issues/183): Switch from Goutte to BrowserKit
- [#182](https://github.com/spekulatius/PHPScraper/issues/182): Drop PHP 7.3 and 7.4
- [#174](https://github.com/spekulatius/PHPScraper/issues/174): Fix local testing
- [#173](https://github.com/spekulatius/PHPScraper/issues/173): Fix README example
- [#171](https://github.com/spekulatius/PHPScraper/issues/171): Various PHPStan improvements
- [#169](https://github.com/spekulatius/PHPScraper/issues/169): Adding `<meta charset=...>` extraction

## 1.0.2 (2022-12-15)

- [#167](https://github.com/spekulatius/PHPScraper/issues/167): Updating CHANGELOG.md
12 changes: 8 additions & 4 deletions README.md
@@ -24,11 +24,11 @@
</p>
</p>

PHPScraper is a universal web-util for PHP. The main goal is to get stuff done instead of getting distracted with selectors, preparing & converting data structures, etc. Instead, you can just *go* to a website and get the relevant information for your project.
PHPScraper is a versatile web-utility for PHP. Its primary objective is to streamline the process of extracting information from websites, allowing you to focus on accomplishing tasks without getting caught up in the complexities of selectors, data structure preparation, and conversion.

Under the hood, it uses

- [Goutte](https://github.com/FriendsOfPHP/Goutte) to access the web
- [BrowserKit](https://symfony.com/doc/current/components/browser_kit.html) (formerly [Goutte](https://github.com/FriendsOfPHP/Goutte)) to access the web
- [League/URI](https://github.com/thephpleague/uri) to process URLs
- [donatello-za/rake-php-plus](https://github.com/donatello-za/rake-php-plus) to extract and analyze keywords

@@ -210,13 +210,17 @@ The future development is organized into [milestones](https://github.com/spekula
- Organize code better (move websites into separate repos, etc.)
- Add support for feeds and some typical file types.

### v2: [Expand the functionality and cover more 'types'](https://github.com/spekulatius/PHPScraper/milestone/5)
### v2: Service Upgrade

- Switch from Goutte to [Symfony BrowserKit](https://symfony.com/doc/current/components/browser_kit.html). Goutte has been archived.

### v3: [Expand the functionality and cover more 'types'](https://github.com/spekulatius/PHPScraper/milestone/5)

- Expand to parse a wider range of types, elements, embeds, etc.
- Improve performance with caching and concurrent fetching of assets
- Minor improvements for parsing methods

### v3: [Expand to provide more guidance on building custom scrapers on top of PHPScraper](https://github.com/spekulatius/PHPScraper/milestone/6)
### v4: [Expand to provide more guidance on building custom scrapers on top of PHPScraper](https://github.com/spekulatius/PHPScraper/milestone/6)

TBC.

10 changes: 8 additions & 2 deletions UPGRADING.md
@@ -1,6 +1,6 @@
# Upgrading PHPScraper

This document will help you upgrading PHPScraper from one version to a new one.
This document will help you upgrade PHPScraper from an earlier version to a later one.

## From `0.x` to `1.x`

@@ -16,4 +16,10 @@ This document will help you upgrading PHPScraper from one version to a new one.
```diff
-$web = new \spekulatius\phpscraper;
+$web = new \Spekulatius\PHPScraper\PHPScraper;
```

## From `1.x` to `2.x`

- Support for PHP 7.x was dropped. PHP 8.0 is the minimum for v2.
- The publicly accessible function `parseXML` was renamed to `parseXml`.
- The codebase has been analysed with PHPStan and hardened manually. Due to this, some return types have changed. See the [v2 pull request](https://github.com/spekulatius/PHPScraper/pull/187/files) for details.
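
The `parseXML` to `parseXml` rename can be applied to calling code mechanically. The following is a minimal sketch of such a rewrite, not part of PHPScraper: the helper name `renameParseXmlCalls` is hypothetical, and it assumes the method is only called with `->` (or `?->`) on an object.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical migration helper (not part of PHPScraper): rewrites
 * `->parseXML(` calls in a source string to the v2 name `->parseXml(`.
 *
 * Assumption: calls use the object operator; other call styles are left as-is.
 */
function renameParseXmlCalls(string $source): string
{
    // Capture the operator and the opening parenthesis, including any
    // whitespace around them, so the original formatting is preserved.
    // The pattern is case-sensitive, so already-migrated calls are untouched.
    $result = preg_replace('/(->\s*)parseXML(\s*\()/', '${1}parseXml${2}', $source);

    // preg_replace() returns null on a regex error; fall back to the input.
    return $result ?? $source;
}
```

For example, `renameParseXmlCalls('$data = $web->parseXML($url);')` returns `$data = $web->parseXml($url);`, while a string that already uses `parseXml` comes back unchanged.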
7 changes: 3 additions & 4 deletions composer.json
@@ -1,6 +1,6 @@
{
"name": "spekulatius/phpscraper",
"description": "PHPScraper, built with simplicity in mind. See tests/ for examples.",
"description": "PHPScraper, built with simplicity in mind. See tests/ for more examples.",
"keywords": [
"PHP scraper",
"PHP scraping",
@@ -46,9 +46,8 @@
}
},
"scripts": {
"test": "./vendor/phpunit/phpunit/phpunit --cache-result --cache-result-file=/tmp --order-by=defects --colors=always --stop-on-failure",
"ct": "while true; do composer run test; sleep 30; done",
"phpstan": "vendor/bin/phpstan -v"
"test": "./vendor/phpunit/phpunit/phpunit --cache-result --cache-result-file=.tmp/phpunit --order-by=defects --colors=always --stop-on-failure",
"phpstan": "vendor/bin/phpstan analyse"
},
"funding": [
{