
ignoreHTTPSErrors is not working when request interception is on #1159

Closed
Khady opened this issue Oct 25, 2017 · 37 comments

@Khady

Khady commented Oct 25, 2017

Steps to reproduce

Tell us about your environment:

What steps will reproduce the problem?

Just launch this code to see the problem:

// https.js
'use strict';

const puppeteer = require('puppeteer');

const URL = "https://halva.khady.info/";

(async() => {
  const args = [
    "--disable-setuid-sandbox",
    "--no-sandbox",
  ];
  const options = {
    args,
    headless: true,
    ignoreHTTPSErrors: true,
  };
  const browser = await puppeteer.launch(options);
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on("request", (request) => {
    if (request.resourceType === "image") {
      request.abort();
    } else {
      request.continue();
    }
  });
  await page.goto(URL, { timeout: 8000, waitUntil: "load" });
  const html = await page.content();
  console.log(html);
  await page.close();
  await browser.close();
})();

What is the expected result?

With the request interception disabled:

  // await page.setRequestInterception(true);
  // page.on("request", (request) => {
  //   if (request.resourceType === "image") {
  //     request.abort();
  //   } else {
  //     request.continue();
  //   }
  // });
$ nodejs misc/https_headless.js 
<html><head><title>potkw</title></head>
<body>
Khady Khady roule roule

</body></html>

What happens instead?

$ nodejs misc/https_headless.js 
(node:23556) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Navigation Timeout Exceeded: 8000ms exceeded
(node:23556) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
^C
@ali-habibzadeh

Same issue here

@Khady
Author

Khady commented Nov 10, 2017

If you can use puppeteer without chrome headless, a solution seems to be the --ignore-certificate-errors flag. Unfortunately it doesn't work with chrome headless.

@Khady
Author

Khady commented Nov 10, 2017

And actually the problem is here even if ignoreHTTPSErrors is set to false:

'use strict';

const puppeteer = require('puppeteer');

const URL = "https://halva.khady.info/";

(async () => {
  const args = [
    "--disable-setuid-sandbox",
    "--no-sandbox",
  ];
  const options = {
    args,
    headless: true,
    ignoreHTTPSErrors: false,
  };
  const browser = await puppeteer.launch(options);
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on("request", (request) => {
    if (request.resourceType === "image") {
      request.abort();
    } else {
      request.continue();
    }
  });
  await page.goto(URL, { timeout: 8000, waitUntil: "load" });
  const html = await page.content();
  console.log(html);
  await page.close();
  await browser.close();
})();
$ nodejs bhttps.js 
(node:321) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Navigation Timeout Exceeded: 8000ms exceeded
(node:321) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
^C

@vc1

vc1 commented Nov 21, 2017

#1441 same as yours

use this

args: [
    '--ignore-certificate-errors',
    '--ignore-certificate-errors-spki-list'
]

@Khady
Author

Khady commented Nov 22, 2017

This works only when you launch puppeteer with headless: false. See #1159 (comment)

BTW, according to this documentation, the --ignore-certificate-errors-spki-list flag does nothing unless you also use the --user-data-dir flag.
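For reference, a minimal sketch of how those flags could be combined in a launch call (the profile directory is a placeholder, and per the comments above this only worked headful at the time):

```javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false, // per this thread, the flag only took effect with headless: false
    userDataDir: '/tmp/pptr-profile', // placeholder; required for the spki-list flag
    args: [
      '--ignore-certificate-errors',
      // '--ignore-certificate-errors-spki-list=<BASE64_SPKI_HASH>', // optional allow-list
    ],
  });
  // ... navigate as usual ...
  await browser.close();
})();
```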

@boligolov

boligolov commented Dec 20, 2017

I have '--ignore-certificate-errors' in the arguments and ignoreHTTPSErrors set to true, but still receive an ERR_CERT_AUTHORITY_INVALID error from Chromium. URL for example: 'https://ostin.com/ru/ru/'

@jsteel

jsteel commented Dec 22, 2017

I have all those ignore flags on and interception on. I only get an ERR_CERT_AUTHORITY_INVALID when running the script against the box I'm on if it uses a self-signed cert. If I hit the same box from another server, it works fine.

@aslushnikov
Contributor

Filed upstream: crbug.com/801426

@aslushnikov aslushnikov added the P1 label Jan 12, 2018
pocketjoso added a commit to pocketjoso/penthouse that referenced this issue Mar 11, 2018
This however does not resolve a current issue where chrome headless
hangs on self-signed certificate errors, even with these flags.
It only works with `headless: false`, but should hopefully be fixed soon:

puppeteer/puppeteer#1159 (comment)
@maZahaca

Is there any progress?

@kevinmu

kevinmu commented Mar 23, 2018

I ran into the same problem, worked around it by doing the following:

const puppeteer = require('puppeteer');

async function visit() {
    const browser = await puppeteer.launch({
        args: [
            '--disable-setuid-sandbox',
            '--no-sandbox',
            '--ignore-certificate-errors',
        ],
        ignoreHTTPSErrors: true,
        headless: true,
    });

    // this initial visit is meant to bypass the certificate errors; don't intercept anything yet.
    var url = "SOME_URL_ON_THE_SAME_DOMAIN.com"
    const page = await browser.newPage();
    await page.setViewport({width: 1300, height: 1000});
    await page.goto(url); 

    // intercept requests; at this point we've already bypassed the certificate errors,
    // so the subsequent page visit will load.
    await page.setRequestInterception(true);
    // page.on() is synchronous, so no await is needed here
    page.on('request', interceptedRequest => {
        console.log(interceptedRequest.url());
        interceptedRequest.continue();
    });

    // visit the actual page you want to do stuff on.
    var actualURL = "SOME_URL_ON_THE_SAME_DOMAIN.com/ACTUAL_PAGE"
    await page.goto(actualURL, {waitUntil: 'networkidle0'});
    await browser.close();
}

However, even with this work-around, I still think that the fix for this issue is super high priority. Thanks!

@leem32

leem32 commented Apr 5, 2018

This is a major problemo. Hope it gets fixed soon.

@wrightm

wrightm commented Apr 19, 2018

Any update on this front?

Looks like there is a solution using the chrome driver https://bugs.chromium.org/p/chromium/issues/detail?id=721739 and chrome 65/66.

Capybara.register_driver :headless_chromium do |app|
      capabilities = Selenium::WebDriver::Remote::Capabilities.chrome(
        acceptInsecureCerts: true,
        binary: '/usr/bin/chromium-browser',
        chromeOptions: {
          'args' => ['--headless', '--disable-web-security', '--incognito',
                     '--no-sandbox', '--disable-gpu', '--window-size=1920,1080']
        })
      Capybara::Selenium::Driver.new(
        app,
        browser: :chrome,
        desired_capabilities: capabilities
      )
    end

I'll try and find sometime to figure how to apply this to puppeteer/chrome dev tools. If anyone gets there before me please post it up.
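A rough Puppeteer translation of those Capybara/Selenium options might look like the sketch below. This is untested: `acceptInsecureCerts` is mapped to Puppeteer's `ignoreHTTPSErrors` launch option, and the executable path simply mirrors the binary from the Ruby example.

```javascript
const puppeteer = require('puppeteer');

(async () => {
  // Sketch only: mirrors the Capybara/Selenium capabilities above.
  const browser = await puppeteer.launch({
    executablePath: '/usr/bin/chromium-browser', // same binary as in the Ruby example
    ignoreHTTPSErrors: true, // counterpart of acceptInsecureCerts
    headless: true,
    args: [
      '--disable-web-security',
      '--incognito',
      '--no-sandbox',
      '--disable-gpu',
      '--window-size=1920,1080',
    ],
  });
  // ... drive the page here ...
  await browser.close();
})();
```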

@c094728

c094728 commented Apr 25, 2018

I'm having the same problem and the work-around given by kevinmu didn't work for me. It runs fine on my Linux VirtualBox machine but times out when running in a Docker container.
I'm using a node:9 Docker image. If I remove the {waitUntil: ['networkidle0']} it runs without timing out on Docker, but then when I retrieve all the anchors from the page, most are missing because the AJAX calls have not yet finished updating the page. I can put in a timed wait, but that is not reliable.
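One less brittle alternative to a timed wait is to wait for the content the script actually needs instead of for network idle. A sketch, assuming the anchors are what the AJAX calls add (the URL and anchor-count threshold are illustrative):

```javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  const page = await browser.newPage();
  await page.goto('https://example.com/', { waitUntil: 'load' }); // placeholder URL
  // Wait until the anchors the AJAX calls insert are actually in the DOM,
  // rather than relying on networkidle0 or a fixed sleep.
  await page.waitForFunction(
    () => document.querySelectorAll('a').length >= 1, // threshold is illustrative
    { timeout: 30000 }
  );
  const anchors = await page.$$eval('a', els => els.map(a => a.href));
  console.log(anchors);
  await browser.close();
})();
```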

@vincentlong889

Is there any progress?

@aslushnikov
Contributor

This bug will be fixed once Chromium completes migration onto Network Service.

As a workaround, the network service can be enabled via a runtime flag: --enable-features=NetworkService.

Beware! NetworkService is not completed yet and doesn't pass all pptr tests.

Still, the following works for me:

const puppeteer = require('puppeteer');

const URL = "https://halva.khady.info/";

(async() => {
  const browser = await puppeteer.launch({
    args: ['--enable-features=NetworkService'],
    headless: true,
    ignoreHTTPSErrors: true,
  });
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on("request", (request) => {
    if (request.resourceType === "image")
      request.abort();
    else
      request.continue();
  });
  await page.goto(URL, { timeout: 8000, waitUntil: "load" });
  const html = await page.content();
  console.log(html);
  await page.close();
  await browser.close();
})();

@vincentlong889

vincentlong889 commented Jun 15, 2018

When I use this arg: --enable-features=NetworkService,
then --proxy-server=xxx is not working!

It also seems that when I use '--no-sandbox', the proxy is not working:
https://groups.google.com/a/chromium.org/forum/#!topic/network-service-dev/qr7yCQeoT4o

browser = await puppeteer.launch({
      ignoreHTTPSErrors: true,
      args: [
        '--no-sandbox',
        '--disable-dev-shm-usage',
        '--enable-features=NetworkService',
        '--proxy-server=' + proxyUrl
      ]
    });

@garris

garris commented Jun 15, 2018

@aslushnikov thanks for the info — I will give this a try tomorrow. Good to know a fix is on the roadmap!

@fudali

fudali commented Jul 3, 2018

@aslushnikov

Thanks for the workaround with --enable-features=NetworkService.
It works for me, but I also can't use a proxy now.

@vincentlong889

Using google-chrome-unstable instead of puppeteer's bundled Chromium, setRequestInterception works:

const browser = await puppeteer.launch({
      executablePath: 'google-chrome-unstable',
      ignoreHTTPSErrors: true,
      args: ['--disable-dev-shm-usage']
    });

@bluepeter

A side effect of using --enable-features=NetworkService is that browser caching no longer appears to work.

@leem32

leem32 commented Aug 3, 2018

Just tried --enable-features=NetworkService with Puppeteer 1.6.2 and the accompanying Chromium, but it's still not working.

I tested using the URL http://finishline.com, which hangs when using await page.setRequestInterception(true); there was no difference with --enable-features=NetworkService versus without it. Puppeteer just hangs until the goto timeout ends.

browser launch:
browser = await puppeteer.launch({ args: ['--enable-features=NetworkService'], headless: true, timeout: 30000, ignoreHTTPSErrors: true });

goto:
await page.goto(url, { timeout: 60000, waitUntil: 'load' })

@fudali

fudali commented Aug 21, 2018

Why is it closed? Isn't it gonna be resolved?

@ntzm
Contributor

ntzm commented Aug 29, 2018

NetworkService seems to break setting and getting cookies as well

@maZahaca

maZahaca commented Sep 3, 2018

Wow Wow Wow. The original issue from Chromium was just fixed! Yay!

Let's get this stuff moving...

aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Sep 3, 2018
This roll includes:
- https://crrev.com/588420 - DevTools: teach request interception to work with
  Security.setIgnoreCertificateErrors

Fixes puppeteer#1159.
@maZahaca

maZahaca commented Sep 4, 2018

Hi @aslushnikov, any ETA for this to be released?

aslushnikov added a commit that referenced this issue Sep 4, 2018
This roll includes:
- https://crrev.com/588420 - DevTools: teach request interception to work with
  Security.setIgnoreCertificateErrors

Fixes #1159.
@aslushnikov
Contributor

@maZahaca: pptr 1.8.0 will be released on September 6.

@vincentlong889

Great job!

@bluepeter

Note: this is still an issue for those of us trying to run on Linux. See: https://bugs.chromium.org/p/chromium/issues/detail?id=877075

@alexandrzavalii

alexandrzavalii commented Mar 17, 2019

As @bluepeter mentioned, it's still an issue.
It's not working on Google Cloud, which is Linux-based.
Any workarounds?

@webhype

webhype commented May 26, 2019

Why is this issue being "aggressively closed?" Nobody there uses Mac OS? Not one person tested this on Unix-style systems?

Not one of the above examples, with or without any special args[], works on Mac OS, not for secure sites nor for https sites. Puppeteer 1.17 & Chrome 76

@aslushnikov
Contributor

aslushnikov commented Jun 2, 2019

Why is this issue being "aggressively closed?" Nobody there uses Mac OS? Not one person tested this on Unix-style systems?

@webhype The core issue was fixed and we added a test to make sure it doesn't regress. We run all our tests continuously on Mac, Linux and Windows - so yes, we fix across the variety of environments, including UNIX-style systems.

Not one of the above examples, with or without any special args[], works on Mac OS, not for secure sites nor for https sites. Puppeteer 1.17 & Chrome 76

The https://halva.khady.info/ is down - so the example doesn't work any more. I just tried with https://expired.badssl.com/ instead and it worked.

If something doesn't work for you - please file a new issue with a script that reproduces the problem. We'll be happy to help!
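A self-contained reproduction along the lines he describes, using https://expired.badssl.com/ (which serves an expired certificate), could look like this sketch:

```javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ ignoreHTTPSErrors: true });
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on('request', request => request.continue());
  // With the fix that shipped in pptr 1.8.0, this navigation should succeed
  // despite the expired certificate, even with interception enabled.
  await page.goto('https://expired.badssl.com/', { waitUntil: 'load' });
  console.log(await page.title());
  await browser.close();
})();
```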

@rodoabad

@aslushnikov puppeteer-firefox has this issue still. Is there a separate thread for it?

@joelharkes

setRequestInterception(true) still seems to be an issue in Docker as well, when using the following setup:

  const executablePath = inDocker ? "/usr/bin/chromium-browser" : undefined;
  return launch({
    headless: true,
    args: ["--no-sandbox", "--disable-setuid-sandbox", "--disable-dev-shm-usage"],
    executablePath,
    defaultViewport: { width: 1000, height: 1600 },
    // dumpio: false,
    // ignoreHTTPSErrors: true
  });

@daddydrac

I am having same problem with https site, while running puppeteer in docker

@javierfuentesm

@aslushnikov puppeteer-firefox has this issue still. Is there a separate thread for it?

Did you find a solution? I still have the same problem with firefox

@JohnDotOwl

What the shit, after so many bloody years, still the same issue.
