Hyphe gets a random user agent for each crawl task from a web service.
For some websites, one might need to set a fixed user agent for the crawler.
For instance, a website protected by Cloudflare requires a cookie that is only valid for the user agent used to generate it.
Therefore, for such websites, one needs to:
- visit the website in a web browser and solve the captcha, if any
- get the cookie that was created and the user agent of the web browser used
- set both the cookie and the user agent in the crawl config panel of this web entity in Hyphe
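
The steps above amount to replaying the browser's cookie and user agent together on every request. Here is a minimal sketch of that pairing, using only the Python standard library. It is not Hyphe's actual crawler code: `build_request`, the cookie string, and the user agent string are hypothetical placeholders.

```python
from urllib.request import Request


def build_request(url, cookie, user_agent):
    """Build a request that sends a browser's cookie and user agent together.

    Hypothetical helper for illustration; not part of Hyphe.
    """
    return Request(url, headers={"Cookie": cookie, "User-Agent": user_agent})


# Placeholder values standing in for what was copied from the browser session.
req = build_request(
    "https://example.com/",
    cookie="session=PLACEHOLDER",
    user_agent="Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
)
```

If the user agent sent by the crawler differs from the one that generated the cookie, the protected site is expected to reject the cookie, which is why both values have to be configurable.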
So far, setting the cookie is possible, but not the user agent.
One enhancement would be to add the user agent as a per-crawl parameter, the same way as the cookie.
A user agent set at the crawl level would take precedence over the automatic random mechanism.
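
The proposed precedence rule could look roughly like the sketch below. It rests on assumptions: the option key `user_agent`, the function `resolve_user_agent`, and the `DEFAULT_USER_AGENTS` pool (a stand-in for the web service that supplies random user agents) are all hypothetical names, not existing Hyphe identifiers.

```python
import random

# Hypothetical stand-in for the pool returned by the random user agent web service.
DEFAULT_USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
]


def resolve_user_agent(crawl_options, fallback_pool=DEFAULT_USER_AGENTS):
    """Return the per-crawl user agent if one is set, else a random one."""
    # "user_agent" is an assumed option name; empty or blank values are ignored.
    custom = (crawl_options.get("user_agent") or "").strip()
    if custom:
        return custom
    return random.choice(fallback_pool)
```

With this shape, crawls that do not set the option keep the current random behaviour, so the change would be backward compatible.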