This repository has been archived by the owner on May 9, 2020. It is now read-only.

Merge pull request #198 from pro-src/v4.0.0
* 4.0.0
codemanki committed Apr 22, 2019
2 parents 6ade80e + 77e8929 commit 79760a7
Showing 24 changed files with 1,599 additions and 340 deletions.
13 changes: 11 additions & 2 deletions .travis.yml
@@ -1,3 +1,4 @@
sudo: false

language: node_js

@@ -7,10 +8,18 @@ node_js:
- 8
- 6

sudo: false
matrix:
include:
- node_js: node
env: BROTLI=1
- node_js: 6
env: BROTLI=1
before_install: npm i --save-only request brotli

before_install: npm i --save-only request
install: npm i
after_success: npm run coverage

notifications:
webhooks: https://www.travisbuddy.com/?insertMode=update
on_success: never
on_success: never
8 changes: 7 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,11 @@
## Change Log

### v4.0.0 (22/04/2019)
- Randomize the `User-Agent` header using a random Chrome user agent
- reCAPTCHA solving support
- Optional (non-mandatory) Brotli support
- Various code changes and improvements

### v3.9.1 (11/04/2019)
- Fix for the timeout parsing

@@ -11,7 +17,7 @@

### v3.7.0 (07/04/2019)
- [#182](https://github.com/codemanki/cloudscraper/pull/182) Usage examples have been added.
- [#169](https://github.com/codemanki/cloudscraper/pull/169) Cloudscraper now automatically parses out timeout for a CF challenge. `cloudflareTimeout` still can be used, but will be deprecated soon
- [#169](https://github.com/codemanki/cloudscraper/pull/169) Cloudscraper now automatically parses out timeout for a CF challenge.

### v3.6.0 (03/04/2019)
- [#180](https://github.com/codemanki/cloudscraper/pull/180) Update code to parse latest CF challenge
11 changes: 10 additions & 1 deletion README.md
@@ -127,6 +127,10 @@ Cloudscraper wraps request and request-promise, so using cloudscraper is pretty
.catch(function (err) {
});
```

## Recaptcha
Cloudscraper can hand a reCAPTCHA challenge page to a solver that you provide through the `onCaptcha` option. Take a look at [this example](https://github.com/codemanki/cloudscraper/blob/master/examples/solve-recaptcha.js).
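
The handler from that example can be dry-run without any network access. The sketch below is an illustration only, not part of cloudscraper: `solveReCAPTCHA` is a hypothetical stub standing in for a real solving service, and `makeMockResponse` is a hypothetical helper that imitates only the fields the example handler reads (`response.request.uri.href`, `captcha.siteKey`, `captcha.form`, and `captcha.submit`).

```javascript
// Hypothetical solver stub. A real implementation would send the URL and
// sitekey to a solving service and call back with the token it returns.
function solveReCAPTCHA (url, siteKey, callback) {
  if (!siteKey) return callback(new Error('missing sitekey'));
  callback(null, 'token-for-' + siteKey);
}

// Handler with the same shape as onCaptcha in examples/solve-recaptcha.js.
function onCaptcha (options, response, body) {
  const captcha = response.captcha;
  solveReCAPTCHA(response.request.uri.href, captcha.siteKey, function (error, token) {
    // On failure, pass the error to submit so the request is rejected.
    if (error) return void captcha.submit(error);
    captcha.form['g-recaptcha-response'] = token;
    captcha.submit();
  });
}

// Hypothetical mock of the response object, for dry runs only.
// It records every call to captcha.submit in `calls`.
function makeMockResponse (siteKey) {
  const calls = [];
  return {
    calls: calls,
    request: { uri: { href: 'https://example.com/' } },
    captcha: {
      siteKey: siteKey,
      form: {},
      submit: function (error) { calls.push(error); }
    }
  };
}

const mock = makeMockResponse('abc');
onCaptcha({}, mock, '');
console.log(mock.captcha.form);
```

To use the handler for real, pass it to cloudscraper the way the linked example does, with `.defaults({ onCaptcha })`.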

## Defaults method

`cloudscraper.defaults` is a very convenient way of extending the cloudscraper requests with any of your settings.
@@ -151,12 +155,16 @@ var options = {
jar: requestModule.jar(), // Custom cookie jar
headers: {
// User agent, Cache Control and Accept headers are required
// User agent is populated by a random UA.
'User-Agent': 'Ubuntu Chromium/34.0.1847.116 Chrome/34.0.1847.116 Safari/537.36',
'Cache-Control': 'private',
'Accept': 'application/xml,application/xhtml+xml,text/html;q=0.9, text/plain;q=0.8,image/png,*/*;q=0.5'
},
// Cloudflare requires a delay of 4 seconds, so wait for at least 5.
// Cloudscraper automatically parses out timeout required by Cloudflare.
// Override cloudflareTimeout to adjust it.
cloudflareTimeout: 5000,
// Reduce Cloudflare's timeout to cloudflareMaxTimeout if it is excessive
cloudflareMaxTimeout: 30000,
// followAllRedirects - follow non-GET HTTP 3xx responses as redirects
followAllRedirects: true,
// Support at most this many challenges in a row. If CF returns more, throw an error
@@ -227,3 +235,4 @@ Current Cloudflare implementation requires browser to respect the timeout of 5 s
* [request-promise](https://github.com/request/request-promise)



22 changes: 11 additions & 11 deletions errors.js
@@ -9,17 +9,17 @@
// 1. There is a non-enumerable errorType attribute.
// 2. The error constructor is hidden from the stacktrace.

var EOL = require('os').EOL;
var original = require('request-promise-core/errors');
var http = require('http');
const EOL = require('os').EOL;
const original = require('request-promise-core/errors');
const http = require('http');

var BUG_REPORT = format([
const BUG_REPORT = format([
'### Cloudflare may have changed their technique, or there may be a bug.',
'### Bug Reports: https://github.com/codemanki/cloudscraper/issues',
'### Check the detailed exception message that follows for the cause.'
]);

var ERROR_CODES = {
const ERROR_CODES = {
// Non-standard 5xx server error HTTP status codes
'520': 'Web server is returning an unknown error',
'521': 'Web server is down',
@@ -48,22 +48,22 @@ ERROR_CODES[1006] =
ERROR_CODES[1007] =
ERROR_CODES[1008] = 'Access Denied: Your IP address has been banned';

var OriginalError = original.RequestError;
const OriginalError = original.RequestError;

var RequestError = create('RequestError', 0);
var CaptchaError = create('CaptchaError', 1);
const RequestError = create('RequestError', 0);
const CaptchaError = create('CaptchaError', 1);

// errorType 4 is a CloudflareError so this constructor is reused.
var CloudflareError = create('CloudflareError', 2, function (error) {
const CloudflareError = create('CloudflareError', 2, function (error) {
if (!isNaN(error.cause)) {
var description = ERROR_CODES[error.cause] || http.STATUS_CODES[error.cause];
const description = ERROR_CODES[error.cause] || http.STATUS_CODES[error.cause];
if (description) {
error.message = error.cause + ', ' + description;
}
}
});

var ParserError = create('ParserError', 3, function (error) {
const ParserError = create('ParserError', 3, function (error) {
error.message = BUG_REPORT + error.message;
});

19 changes: 19 additions & 0 deletions examples/solve-recaptcha.js
@@ -0,0 +1,19 @@
#!/usr/bin/env node

function solveReCAPTCHA (url, sitekey, callback) {
// Here you do some magic with the sitekey provided by cloudscraper
}

function onCaptcha (options, response, body) {
const captcha = response.captcha;
// solveReCAPTCHA is a function you must implement yourself: pass it the page href and the sitekey, and it calls back with the solved response token
solveReCAPTCHA(response.request.uri.href, captcha.siteKey, (error, gRes) => {
if (error) return void captcha.submit(error);
captcha.form['g-recaptcha-response'] = gRes;
captcha.submit();
});
}

const cloudscraper = require('..').defaults({ onCaptcha });
var uri = process.argv[2];
cloudscraper.get({ uri: uri, headers: { cookie: 'captcha=1' } }).catch(console.warn).then(console.log); // eslint-disable-line promise/catch-or-return