-
-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support of http_header property in site config #61
Conversation
Well, nevermind for the first issue. It was a non-tracked manual change in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This look interesting, thanks for that !
I'll soon test it
* @param bool $skipTypeVerification Avoid mime detection which means, force GET instead of potential HEAD | ||
* | ||
* @return array With keys effective_url, body & headers | ||
*/ | ||
public function fetch($url, $skipTypeVerification = false) | ||
public function fetch($url, $http_header = array(), $skipTypeVerification = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use httpHeader
instead?
All variables are lowerCamelCase.
@@ -341,6 +345,8 @@ public function parseLines(array $lines) | |||
} elseif ((substr($command, -1) == ')') && preg_match('!^([a-z0-9_]+)\((.*?)\)$!i', $command, $match) && $match[1] == 'replace_string') { | |||
array_push($config->find_string, $match[2]); | |||
array_push($config->replace_string, $val); | |||
} elseif ((substr($command, -1) == ')') && preg_match('!^([a-z0-9_]+)\(([a-z0-9_-]+)\)$!i', $command, $match) && $match[1] == 'http_header' && in_array($match[2], array('user-agent'))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of in_array($match[2], array('user-agent'))
can you use a simple 'user-agent' === $match[2]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used in_array()
to make addition of new headers easier. But I can replace it with a strict check if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will let in_array()
for #62
3f9b341
to
51a6852
Compare
'.wikipedia.org' => 'Mozilla/5.2', | ||
'.fok.nl' => 'Googlebot/2.1', | ||
'getpocket.com' => 'PHP/5.2', | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove that, we should be sure that these values are in the config file for each domain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, skip it, I've checked, they are all already defined in the site config 👍
* @param bool $skipTypeVerification Avoid mime detection which means, force GET instead of potential HEAD | ||
* | ||
* @return array With keys effective_url, body & headers | ||
*/ | ||
public function fetch($url, $skipTypeVerification = false) | ||
public function fetch($url, $httpHeader = array(), $skipTypeVerification = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move that new parameter as third one? If we inject it in the second arg, it'll require to release a new major version because the signature of this public method will change. I know, it'll require to put the false
argument for $skipTypeVerification
every where you'll have to inject $httpHeader
but it'll avoid BC
{ | ||
$ua = $this->config['ua_browser']; | ||
|
||
if (array_key_exists('user-agent', $httpHeader) && !empty($httpHeader['user-agent'])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the array_key_exists()
, empty()
already perform the isset()
.
'.wikipedia.org' => 'Mozilla/5.2', | ||
'.fok.nl' => 'Googlebot/2.1', | ||
'getpocket.com' => 'PHP/5.2', | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, skip it, I've checked, they are all already defined in the site config 👍
Yeah it's ok, you should have an other option to inject it in the |
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
A static array is added to whitelist authorized headers during parsing. Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
These default options are no longer needed with the use of http_header(user-agent) Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
51a6852
to
b25abc6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR adds the ability to graby to use user-agent provided in site configs instead of
default_ua
anduser_agents
fromHttpClient
.http_header is extracted from site config and provided to
HttpClient#fetch()
andHttpClient#getUserAgent()
to extract the user-agent, if any. In case of missing value we fall back to the original behavior: check host fromuser_agents
or usedefault_ua
.There are some issues however, at this time:
default_ua
is not used anymore as long as ahttp_header(user-agent)
is currently provided inglobal.txt
. A solution could be to ignore the user-agent provided by global, any thoughts?HttpClient
does not have access to SiteConfig, some tests began to fail because of the deletion of defaultuser_agents
array (e9f2981). I fixed them by providing a custom array withuser_agents
to theHttpClient
instance in the test. Not sure if it's okay for you.Fixes #49