New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split SimplePie\File class into HTTP client and response #774
Split SimplePie\File class into HTTP client and response #774
Conversation
looks good @Art4, before I merge can someone confirm that the tests are running? I'm seeing |
@mblaney Github seems to have some problems at the moment. Github Actions is experiencing degraded performance. See https://www.githubstatus.com/ |
I will try to review it later today. |
Thank you, I appreciate it. I just found another error and stabilized it with a test. |
Thank you for your comment. As I explained in #782 this PR was made under the assumption that new external dependencies will be a BC break (not allowed in 1.x, but allowed in 2.x), so I'm with you that directly require PSR-17/PSR-18 implementations would require much less code. This decision is up to @mblaney and the team and would impact #777.
I'm not sure if I unterstand you correctly. (I do not want to exclude the possibility that I fail in understanding because of the language barrier.) Do you want to implement (some of) the PSR-7/PSR-18 interfaces directly in/with I've implemented PSR-7, PSR-17 and PSR-18 already a few times in different projects. The last time was for Requests, which then resulted in its own project. Implementing these PSRs in SimplePie would require us to implement 12 interfaces with 57 methods.
For the interop layer in SimplePie I condensed them to 3 interfaces with 9 methods. And they will be removed again in SimplePie 2.
I cannot understand your point why maintaining this implementation concerns you more than maintaining a PSR implementation?
Yes, you are right. 👍 I just realized that we cannot accept the internal
No, I prefer option 3 (for SimplePie 2):
That's why I want to keep Update: (Sorry, just noted your follow up comment.)
Yes, that is my intention. Nearly all code I write for the forward compatibility layer will be removed again in SimplePie 2.
No, the migration should be directly custom |
* @license http://www.opensource.org/licenses/bsd-license.php BSD License | ||
*/ | ||
|
||
namespace SimplePie\Exception; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any thoughts on having it under SimplePie\HTTP
? That way we can remove just a single namespace later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong opinions about the place for exceptions. We will potentially get more Exceptions while fixing #755.
@mblaney Is there any chance that this will be merged soon? |
I am still reviewing this. Since my head is too small to look at everything at once, I am extracting some of the unrelated cleanups into separate pull requests and rebased and squashed the fixes into the original commits. I have also added some notes about the changes into the commit messages: master...jtojnar:simplepie:split-file-into-client-and-response I have also switched to SPDX syntax since the long comments make it difficult to fit multiple files on the screen at the same time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it took so long, I have finally managed to review this in detail it deserves.
You did a great job with this and convinced me that this approach is a reasonable interim solution.
While reviewing, I did some rebase surgery to make it easier to handle with my attention span, and while at it fixed some issues. You can see that in #815.
* While header names are not case-sensitive, get_headers() will preserve the | ||
* exact case in which headers were originally specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not true for the FileClient. I do not feel like it is important for now so I have removed it from #815.
src/HTTP/Response.php
Outdated
declare(strict_types=1); | ||
/** | ||
* SimplePie | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have credited PHP Framework Interoperability Group in #815.
if (strpos($header, ',') === false) { | ||
$header = [$header]; | ||
} else { | ||
$header_lines = explode(',', $header); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that this might disrupt some headers. Instead, in #815 I have decided to add support for HTTP\Parser
to produce PSR-7 compatible headers, use $parsed_headers
internally in File
as you did initially, and recompute $parsed_headers
when $headers
change (which should not really happen outside of tests but better safe than sorry).
* @see http://tools.ietf.org/html/rfc3986#section-4.1 | ||
* @return string the original (first requested) URL before following redirects (expect 301) | ||
*/ | ||
public function get_permanent_uri(): string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not wholly convinced the ResponseInterface
-like object should contain RequestInterface
-specific properties like URI but I guess it is fine for now.
@@ -103,7 +113,7 @@ public function find($type = \SimplePie\SimplePie::LOCATOR_ALL, &$working = null | |||
return $this->file; | |||
} | |||
|
|||
if ($this->file->method & \SimplePie\SimplePie::FILE_SOURCE_REMOTE) { | |||
if (preg_match('/^http(s)?:\/\//i', $this->file->get_requested_uri())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have extracted the regex to Misc::is_remote_uri()
in #815.
/** | ||
* Get a HTTP client | ||
*/ | ||
private function get_http_client(): Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be worth it to add support for singletons maintained by Registry
. Or maybe we could just create ClientFactory
and get it in constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not worth it. This method is private and will be removed in v2 again. Moving this to Registry will lead to more BC issues.
$orig_timeout = $this->timeout; | ||
$this->timeout = 1; | ||
$file = $this->get_http_client()->request(Client::METHOD_GET, $this->feed_url, $headers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is weird side-channel signalling. It might be nicer to pass $this->timeout/10
(or 1
for whatever reason you have decided to change it) as an argument to the get_http_client()
factory. Went with that in #815
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This request will check if the cache has been changed. The timeout was to 1 sec. I assume that the intention was to not wait to long for a response for this check.
It then was changed to $this->timeout/10
in fe7e333, but this could lead to a float instead of int, if e.g. $this->timeout
is set to 5
. That's why I changed it back to 1 sec. Alternative we could use ceil()
but this will be in most cases 1
or (if $this->timeout = 11-20) 2
. So I decided to use 1
directly.
About the side-channel signalling I decided to use this instead of a new argument because the timeout is moved to PSR-18 HTTP client and $this->timeout
will be deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the context. Peeking further into history, it looks like it was 1
since the introduction of timeouts along with SimplePie_File
in 5bf1814.
But I think it is still useful to keep it proportional to the master timeout to allow people with slow connection to increase it. Even if we assume majority of cached resources will not get outdated, waiting for server to reply HTTP 304 Not Modified
can still take time (domain name resolution can be particularly slow). And if it turns out that the cached version is outdated, the web server will try to send an updated version. Having too low timeout can mean the request will be terminated, only to be started again with the original timeout.
So I do not actually think reducing the timeout makes much sense at all. The only explanation I can come up with is that it prevents essentially doubling the timeout when fetching a cached resource from an unresponsive server without force_cache_fallback
due to re-running the failed request.
I have split the change to a separate commit in #815, including the explanation.
try { | ||
$response = $this->get_http_client()->request(Client::METHOD_GET, $href, $headers); | ||
} catch (HttpException $th) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This terminates the loop early, skipping $done[] = $href;
. Fortunately, the success case is short so should be fine to move it inside the try
block. Fixed in #815
try { | ||
$response = $this->get_http_client()->request(Client::METHOD_GET, $value, $headers); | ||
} catch (HttpException $th) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, this skips unset($array[$key]);
. Fixed in #815
* @see http://tools.ietf.org/html/rfc3986#section-4.1 | ||
* @return string the final requested url after following redirects | ||
*/ | ||
public function get_requested_uri(): string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in #815, I renamed this to get_final_requested_uri
so it is clearer what this does.
I propose we go with #815 instead. This change set is complex enough to warrant not squashing everything into a single megacommit, where refactorings like the switch to |
sorry @Art4 you're right, this PR should've been dealt with before merging smaller patches. I'm happy with it so didn't need to comment, but was waiting for approval from @jtojnar before merging. I'll wait for the two of you to decide how to continue before merging anything else. |
@mblaney I just wanted to kindly let you know that I have already shared all my thoughts and comments in this and other PRs/issues. I believe I have covered everything that needs to be said. I think it's now time for you to make a decision. Once you have made a decision, I can move forward with the next steps in #731. Thank you so much for your time and consideration. |
@Art4 Just to be clear, are you okay with merging #815 in this PR’s stead? Or do you want to address the comments mentioned in above (e.g.#774 (comment)) in this PR? |
@jtojnar I think I have addressed all the comments relevant to this PR. Some issues in this PR are addressed in #777, because they can't be fixed here. All open comments are either "not important for now", "fine for now" or you have already suggested improvements in #815. As long as this is not explicitly requested, I don't see the need to replicate the improvements here, but would rather discuss them in #815 (and yes, I still have some thoughts on them). The discussion is otherwise even more scattered than it already is because of the many open issues. @mblaney This PR is the base of or mentioned by #777, #813, #815, #812 and #810. These issues are partly the basis for further issues like #826 and #833. Closing this PR will be quite confusing in retrospect. So if you are happy with this PR for the most part, I recommend that we merge this PR as a base and discuss the other changes in the other PRs and issues. I would also like to see a decision from you in #782 because it has far reaching meanings on issues like #777. And I'd appreciate any thoughts or reactions from you to #731. |
I would argue that having a clean commit history with the details in commit messages (as in #815) is more important than GitHub issues, as it facilitates git archaeology. Here, we at most get a single big squashed commit without any rationale for the changes, which will be more confusing when running Additionally, the git repo is a more permanent record – it already survived move from the previous hosting provider while the issues did not. You could also the contents of this PR with commits from #815 using the following:
That would allow us to merge this PR (#774) with a cleaner and more descriptive commit history. |
And here we are again at the point where a decision has to be made. @mblaney, please take over. 🙂 |
Thanks @Art4 and @jtojnar for the updates. I appreciate yours thoughts on commit history @jtojnar, but I would prefer authors of a PR completed their work, so will retract my earlier comment and now merge #774. I hope that's ok, if you want to open a new PR or modify #815 for any of the changes you've made it would be good to discuss them separately. |
That would be nice but the quality of the software should not suffer for it. Building software is a collaborative effort anyway, so it should not be much of a problem, as long as authorship is properly credited. Unfortunately, as it is now the history is a complete mess making any debugging using And even worse, there are also correctness errors in this PR (e.g. #774 (comment)). IMO the Edit: I have updated #815. Not much we can do about deeper history but at least blame of relevant changes will end there. |
Hi 👋
At the moment we have
SimplePie\File::__construct()
, where a request is handled usingcurl
,fsockopen
orfile_get_contents
. The response is saved in the class properties likebody
,headers
, and so on.This PR is a first step to allow other HTTP libraries in SimplePie. So this PR splits the File class into 2 parts:
request()
method. The client internally calls theFile
class like before.File
class now implements aResponse
interface, that allows getting all the data like before through methods.The concept and interfaces are interoperable with PSR-18 and PSR-7. The next step will be an implementation of
SimplePie\HTTP\Client
that works with a PSR-18 client under the hood.refs #520