-
Notifications
You must be signed in to change notification settings - Fork 139
Conversation
@codemanki I think that if this is going to be on by default then there should be at least a minor version bump. There also needs to be a warning in the README to inform people that sending a cookie that was acquired with a different user agent is a bad idea. The user agent should always be stored with the session cookie. This could very well be a breaking change if anybody is already using v3 and is storing session cookies for reuse. |
@pro-src so you are saying that it should be actually a flag that is off by default? |
@codemanki, what about adding a method to apply a random user agent? // Somewhere in the custom defaults function
cloudscraper.setUserAgent = setUserAgent;
// Pseudo definition
function setUserAgent(ua) {
if (ua === 0) {
ua = this.defaultParam.headers['User-Agent'];
var old = ua;
while (ua === old) ua = randomUA();
}
if (!ua || typeof ua !== 'string') {
throw new TypeError('Expected userAgent to be a non-empty string, got ' +
typeof (ua) + ' instead.';
}
this.defaultParam.headers['User-Agent'] = ua;
}
// Usage
var cloudscraper = require('cloudscraper');
cloudscraper.setUserAgent(0); |
I'm saying it should be off by default but it doesn't necessarily have to be a flag. At least a minor bump, preferably a major bump before we switch the default behavior. Maybe set a milestone for v4 that is to change this to on by default? |
Then I would just leave it as it is and add a notice to the readme, saying that UA randomization is a nice to have but should be handled manually by user. |
@codemanki Actually, I'd rather have this on by default and release it now in the patch version then to not have it all, lol. No convenience methods necessary. However we could just document the method as only randomly returning one of the ~5 most popular user agents. Because they won't be aware of the method unless they read the readme anyway... |
@pro-src but if it is on by default, then users should also keep a track of selected UA. But on other hand, if cloudscraper is used in non-advanced way, users won't probably save cookies to reuse them and etc. So should it be a method as you suggested, or a flag? :) |
@codemanki I think a method would better so we can change the user agent on demand. If getting a captcha, you might be able to circumvent it just by changing the UA. See this issue: Anorov/cloudflare-scrape#190 So, we only really need either a flag or method but if you wanted to you could add both. I'm neutral on it. |
@codemanki Did you decide on what to do with this? |
@pro-src oh snap. yes, i want to continue with a flag, as it looks like we are currently using this kind of strategy for configuration. 👍 ? |
@codemanki 🤔 If a flag is introduced, the UA would have to be set during the first |
@codemanki See my #62 (comment) about adding a flag that should apply to all requests. |
I'm just noting this here since changing the UA can potentially bypass a captcha. var cloudscraper = require('cloudscraper');
var errors = require('cloudscraper/errors');
cloudscraper.get('http://example-site.dev')
.catch(function(error) {
if (error instanceof errors.CaptchaError) {
cloudscraper.defaultParams.headers['User-Agent'] = 'new UA';
return cloudscraper.get('http://example-site.dev');
}
throw error;
})
.then(doSomethingWithBody) |
@pro-src hey. sorry, i was out for some time, now i'm back. I will take a look at your comments and will adjust my PR |
@codemanki Thanks for the update! |
To reopen this PR. Do we want to set a new UA on errors? Or on each request? |
I like the idea of having an option that causes the UA to be changed if we get a captcha. However, you can already do that using the current API.
Changing the UA on a per-request basis is definitely a bad idea. I really don't want to delay this PR any further, can we just merge it as is? I do have a couple concerns:
The first concern could normally be resolved by a major version bump but users have to update to the latest version otherwise cloudscraper won't work for the latest challenge. Once the parsing methods are separated from the main library, we should consider backporting so breaking changes aren't forced on users. The second concern would be addressed by updating the README but maybe later we could add a method to get the cookies and UA together? |
eff7f15
to
2874089
Compare
@codemanki I just realized that we haven't updated the project markdown to mention saving the UA with the cookies. Probably should do that... |
Fixes #77
@pro-src i first created a
randomizeUserAgent
flag on defaultParams, but then decided to just randomize the UA by default, as that flag was brining more confusion than actual value