Skip to content

Conversation

@Alkarex
Copy link
Member

@Alkarex Alkarex commented Jun 30, 2025

Alkarex and others added 20 commits June 25, 2025 10:42
Example of valid feed not working in SimplePie: https://haveibeenpwned.com/feed/breaches/
Regression due to simplepie#445
The final character `>` of a feed is encoded as `3E 00` in UTF-16LE, so calling `trim()` was removing the `\x00`, breaking the multibyte encoding and making the feed invalid.
Downstream issue FreshRSS/FreshRSS#7690
…or in FileClient (simplepie#909)

This PR renames the `SimplePie\Exception\HttpException` to `SimplePie\HTTP\ClientException` to avoid confusion (simplepie#905) that this exception is thrown on an 4xx or 5xx response from the server. Because the `ClientException` is always caught internally, I marked the class as `@internal`.

Additionally, change the `FileClient` to only throw the exception on non-HTTP errors to match PSR-18 and our `Psr18Client`. This is not a BC break since this API is not part of any release yet.

Also add unit tests to show that the responses (`File` or `Psr7Response` class) always contain the status code, even if it is 429. Plus integration tests using a local HTTP server.
Slight refactor of simplepie#916 to cover all paths.
Missing paths included the fsock method without gzip (e.g. deflate or plain).
As per https://stackoverflow.com/a/26554272 and https://docs.phpdoc.org/guide/guides/inheritance.html, phpDocumentor will inherit the doc comments from the interface. Keeping them in implementation is thus mostly pointless noise. The only benefit is that the docs are immediately available even when not using an IDE or LSP but keeping them in sync is too annoying to be worth it.
This is remnant of parsing HTML with regex, not used since 878c83c
Yet another remnant of parsing HTML with regex, not used since 878c83c
This has been unused since 408dd74
This will be needed for PHPStan to pass with the next commit.
`Registry::call()` is widely used in SimplePie to allow extensibility. Unfortunately, the pattern hides the actual method from PHPStan so it cannot determine the return value type, and there is no annotation we can add to make it work. , This results in an inadequate analysis.

Thankfully, most cases allow us to determine the method statically so we can create a PHPStan extension doing just that.

We use `SimplePieUtils` instead of `Utils` namespace since this extension might be useful for some SimplePie consumers and do not want to subject them to symbol conflicts. However, we do not currently want to commit to making this a proper Composer package so use it on your own risk, stability not guaranteed.

We may want to add more type checking extensions in the future.
Without this, PHPUnit will complain:

    1 test triggered 1 PHPUnit deprecation:

    1) SimplePie\Tests\Integration\HTTP\ClientsTest::testRedirectsChain
    Data Provider method SimplePie\Tests\Integration\HTTP\ClientsTest::clientsProvider() is not static
The built-in PHP server used by `donatj/mock-webserver` will happily include warnings in the response body.
If that happens before `Location` header is sent, such as with the implicit nullable types deprecation warning on PHP 8.4 from `Mf2\Parser` being pulled in by static autoloader, the redirect will never happen and later assertions will fail without any hint as to why.
Alternative of simplepie#914 with support for XML feeds using HTML entities

Simple feeds with XML preamble + DOCTYPE are crashing (see example in test) due to regression from simplepie@162a3d3
Its implementation is buggy, as it creates a second DOCTYPE declaration even when there is an existing one, which is completely invalid.

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514
Downstream PR: #35

Note that the culprit feature *added support for html entities in xml* seems to come from a misunderstanding of
https://www.rssboard.org/rss-encoding-examples
HTML entities are only allowed in XML when there is a DTD declaring them. SimplePie is even allowing Atom documents with undeclared HTML entities - which I am not sure is on purpose.

Downstream issue: FreshRSS/FreshRSS#7687
- Item: Simplify base path

  `Item::get_base()` calls `Item::get_permalink()` → `Item::get_link()` → `Item::get_links()`, which called `Item::get_base()`.
  The cycle was overly confusing and only get broken in second `Item::get_links()` call.
  Furthermore, the second call of `Item::get_permalink()` would return enclosure URL, which makes little sense to use as a base URI.
  (`Item::get_permalink()` was always like this: bce917d.)

  Let’s introduce a new `Item::get_own_base()` method restricted to `xml:base` and feed base, and use it instead of `Item::get_base()`.

  Also clarify in docs that `Item::get_base()` uses the enclosure link.

- Item: Support relative URIs everywhere

  Original RSS specification requires URLs to include scheme:
  https://www.rssboard.org/rss-specification#comments
  but relative URLs are somewhat commonly used:
  https://www.rssboard.org/news/151/relative-links

  We already construct some IRIs with `Item::get_own_base()` as base so that we support relative URLs.
  Let’s do that with the rest of the IRIs as well.

- SimplePie: Support self link as feed base URL

  If a RSS feed lacks the mandatory `/rss/channel/link` element and does not specify `xml:base`,
  the only base we can use for resolving relative URIs is the address of the feed itself.
  However, that might not be reliable when the feed is served from different location,
  or even available if loaded using `SimplePie::set_raw_data()`.

  Let’s try using `self` link if present, since it should provide more reliable location
  than the URI the feed was obtained from.

These changes were combined into a single PR because the tests are a bit tangled.
@Alkarex
Copy link
Member Author

Alkarex commented Jun 30, 2025

@Alkarex Alkarex merged commit d807572 into freshrss Jun 30, 2025
20 checks passed
@Alkarex Alkarex deleted the merge-upstream branch June 30, 2025 21:14
Alkarex added a commit to Alkarex/FreshRSS that referenced this pull request Jun 30, 2025
Alkarex added a commit to FreshRSS/FreshRSS that referenced this pull request Jul 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants