Client certificate authentication on pages. #1319

Closed
konstantinblaesi opened this issue Nov 8, 2017 · 28 comments

@konstantinblaesi

I've looked through open/closed issues and the puppeteer API documentation. Is an API planned/possible? Is there an alternative solution? Will puppeteer just use some global chrome profile + certificate store and use the client certificates installed on the host?

@aslushnikov
Contributor

I don't understand. Can you please elaborate on the use case?

@konstantinblaesi
Author

@aslushnikov I mean websites with mandatory or optional authentication via client certificates. I think on Windows, Google Chrome uses the certificates from the certificate store, but wouldn't it make sense to have a Chrome/puppeteer API for that, similar to how it supports basic authentication?

@konstantinblaesi
Author

@aslushnikov This Chromium headless bug probably needs to be solved first.

@aslushnikov
Contributor

I mean websites with mandatory or optional authentication via client certificates.

@konstantinblaesi can you please give an example of such a website? Honestly, it looks like I have no idea about the topic.

@jasonparallel

@aslushnikov This would support sites that use PKI for authentication (vs. user/pass or OAuth). For testing work it would be helpful to be able to specify which cert Chrome passes during the SSL handshake when the server requests a client certificate.

I can't think of a public site that uses PKI for authentication. It is somewhat common for intranet sites.

@dapriett

dapriett commented Mar 8, 2018

Here is how I got around client cert authentication, hope it helps others. Basically just intercept the request, then fire the request off yourself using your favorite HTTP client lib, and respond to the intercepted request with the response info.

'use strict';

const puppeteer = require('puppeteer');
const request = require('request');
const fs = require('fs');

(async () => {
    const browser = await puppeteer.launch();
    let page = await browser.newPage();

    // Enable Request Interception
    await page.setRequestInterception(true);

    // Client cert files
    const cert = fs.readFileSync('/path/to/cert.crt.pem');
    const key = fs.readFileSync('/path/to/cert.key.pem');

    page.on('request', interceptedRequest => {
        // Intercept Request, pull out request options, add in client cert
        const options = {
            uri: interceptedRequest.url(),
            method: interceptedRequest.method(),
            headers: interceptedRequest.headers(),
            body: interceptedRequest.postData(),
            cert: cert,
            key: key
        };

        // Fire off the request manually (example is using the 'request' lib)
        request(options, function(err, resp, body) {
            // Abort interceptedRequest on error
            if (err) {
                console.error(`Unable to call ${options.uri}`, err);
                return interceptedRequest.abort('connectionrefused');
            }

            // Return retrieved response to interceptedRequest
            interceptedRequest.respond({
                status: resp.statusCode,
                contentType: resp.headers['content-type'],
                headers: resp.headers,
                body: body
            });
        });

    });

    await page.goto('https://client.badssl.com/');
    await browser.close();
})();

@aslushnikov
Contributor

Dupe of #540.

@camjackson

camjackson commented Oct 12, 2018

So I know that 'thanks' and 'me too' comments are terrible, but I really just have to say THANK YOU to @dapriett for that solution! The entire internet seems to be filled with people asking how to get around the whole cert selection thing with puppeteer and various other browser automation libraries, and the above comment solves it beautifully! 👌

Note that request also supports sending the cert as a single PFX/PKCS12 file, optionally with a passphrase. You can see the docs here.
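
As a rough sketch of the PFX variant, the snippet below builds the options for request with a single bundle instead of separate cert and key files. The helper name buildPfxOptions and the file path in the comment are placeholders, not part of the original workaround. Passing pfx and passphrase through agentOptions is based on the TLS section of the request README, so verify it against the version you use.

```javascript
'use strict';

// Hypothetical helper: same request options as the workaround above,
// but with a PFX/PKCS12 bundle instead of separate cert/key files.
// `pfxBuffer` would typically come from fs.readFileSync('/path/to/cert.pfx').
function buildPfxOptions(interceptedRequest, pfxBuffer, passphrase) {
    const agentOptions = { pfx: pfxBuffer };
    if (passphrase !== undefined) {
        agentOptions.passphrase = passphrase; // only needed if the bundle is encrypted
    }
    return {
        uri: interceptedRequest.url(),
        method: interceptedRequest.method(),
        headers: interceptedRequest.headers(),
        body: interceptedRequest.postData(),
        agentOptions
    };
}
```

Inside the 'request' listener, the result of buildPfxOptions(interceptedRequest, pfx, passphrase) would be passed to request(...) in place of the options object that carries cert and key.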

@camjackson

One more update to the above solution by @dapriett - there is a bug with puppeteer where gzipped, intercepted responses are not decompressed correctly. See #1707.

This is easily fixed though - when calling request, just add gzip: true to the options. This will cause request to do the unzipping for you, so the browser will receive the uncompressed response. I suppose technically you should probably then strip the Content-Encoding: gzip header, but it hasn't caused any problems for me so far.
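
A minimal sketch of how the gzip fix could be folded into the interception handler. The helper names buildProxyOptions and sanitizeResponseHeaders are hypothetical. gzip: true is the option described above; encoding: null (which makes request return the body as a Buffer) is an extra assumption added here so that binary assets such as images are not re-encoded as text. Dropping Content-Length along with Content-Encoding is also an assumption, on the grounds that the upstream value describes the compressed payload.

```javascript
'use strict';

// Hypothetical helper: builds the options object passed to the `request` library.
function buildProxyOptions(interceptedRequest, cert, key) {
    return {
        uri: interceptedRequest.url(),
        method: interceptedRequest.method(),
        headers: interceptedRequest.headers(),
        body: interceptedRequest.postData(),
        cert,
        key,
        gzip: true,     // let `request` decompress gzipped responses
        encoding: null  // assumption: return the body as a Buffer, not a string
    };
}

// Hypothetical helper: drop headers that described the compressed upstream payload.
function sanitizeResponseHeaders(headers) {
    const cleaned = {};
    for (const [name, value] of Object.entries(headers)) {
        const lower = name.toLowerCase();
        if (lower === 'content-encoding' || lower === 'content-length') {
            continue;
        }
        cleaned[name] = value;
    }
    return cleaned;
}
```

In the request callback, the response would then be returned with interceptedRequest.respond({ status: resp.statusCode, headers: sanitizeResponseHeaders(resp.headers), body: body }).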

@isakoala

Thanks @dapriett, this solves the authentication, but when routing through request, pages don't load properly for me, with or without certs. I haven't tried alternative http libraries yet. Using puppeteer@1.11.0 and request@2.88.0.

[Screenshot: 2019-01-31 at 12.08.08]

@camjackson

@isakoala Looks like the stylesheet for the page isn't being loaded, so you're just seeing raw, unstyled HTML. Or possibly there are XHRs not loading either. I would just try adding a bunch of console logs within the request interception to see which requests are being made, which are getting responses back, etc.
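
The logging suggested above could be sketched roughly as follows. attachRequestLogging is a hypothetical helper name; the 'request', 'response' and 'requestfailed' page events and the url(), method(), status() and failure() accessors are standard Puppeteer API. The helper only logs, so the interception handler still has to respond to, continue or abort each request.

```javascript
'use strict';

// Hypothetical helper: logs the lifecycle of every request made by a page.
// `page` is expected to be a Puppeteer Page; `log` defaults to console.log.
function attachRequestLogging(page, log = console.log) {
    page.on('request', req => log(`>> ${req.method()} ${req.url()}`));
    page.on('response', res => log(`<< ${res.status()} ${res.url()}`));
    page.on('requestfailed', req => {
        const failure = req.failure(); // null unless the request actually failed
        log(`!! ${req.url()} ${failure ? failure.errorText : 'unknown error'}`);
    });
}
```

Calling attachRequestLogging(page) right after browser.newPage() makes it easier to spot requests that appear with ">>" but never get a matching "<<" line.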

@konstantinblaesi
Author

FYI we've observed that when servers request certificate authentication from puppeteer/chromium, it gets stuck when the request interception feature is enabled. We're basically hit by #3471.

@isakoala

@camjackson indeed most stylesheets, all images and some other script files aren't loading. In my case, the latter was fatal because all post login content on the site I'm scraping is loaded using said scripts. Thanks, @konstantinblaesi, I'll keep an eye on that.

@alexandrzavalii

@dapriett sorry about the noob question,
where do I get these files from?

    // Client cert files
    const cert = fs.readFileSync('/path/to/cert.crt.pem');
    const key = fs.readFileSync('/path/to/cert.key.pem');

I only managed to download a .crt file from the broken website.

@konstantinblaesi
Author

This ticket is about a client authenticating to the webserver using certificates instead of credentials (e.g. basic auth). You or your organization will probably generate these files.

@alexandrzavalii

@konstantinblaesi
Got it! Thanks!
I don't own the website, which has an insecure SSL certificate.
Locally, puppeteer works fine in headless mode, but whenever I deploy it to Google Cloud, I get stuck at await page.goto('bad_website.com'). If I use google.com it works.

Another weird thing is that the same code works on AWS Lambda.

I know it's really difficult to tell what the issue is, but any suggestions would be greatly appreciated.

@konstantinblaesi
Author

You can always tell puppeteer/chromium to ignore HTTPS errors by setting ignoreHTTPSErrors: true when launching the browser; see
https://github.com/GoogleChrome/puppeteer/blob/v1.13.0/docs/api.md#puppeteerlaunchoptions
You should probably check which CA signed the certificate of the website you're trying to crawl and see whether that CA is installed on the system you're running puppeteer on.
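
One way to check which CA signed a site's certificate is to inspect it from Node directly. The following is only a sketch: describeServerCert is a hypothetical helper, it uses Node's built-in tls module rather than anything from puppeteer, and the trustedByNode flag reflects Node's own CA bundle, which may differ from what Chromium trusts on the same machine.

```javascript
'use strict';

const tls = require('tls');

// Hypothetical helper: connects to host:port and resolves with details
// about the certificate the server presents.
function describeServerCert(host, port = 443) {
    return new Promise((resolve, reject) => {
        const socket = tls.connect(
            // rejectUnauthorized: false lets us inspect certs that fail validation
            { host, port, servername: host, rejectUnauthorized: false },
            () => {
                const cert = socket.getPeerCertificate();
                resolve({
                    subject: cert.subject,
                    issuer: cert.issuer, // the CA that signed the certificate
                    validTo: cert.valid_to,
                    trustedByNode: socket.authorized,
                    error: socket.authorizationError || null
                });
                socket.end();
            }
        );
        socket.once('error', reject);
    });
}
```

For example, describeServerCert('bad_website.com').then(console.log) would print the issuer fields, which can then be compared with the CA certificates installed in the deployment environment.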

@alexandrzavalii

alexandrzavalii commented Mar 18, 2019

@konstantinblaesi Thanks for your help!
Sorry, I forgot to mention that I am already using ignoreHTTPSErrors: true with puppeteer 1.13.0, and I have no luck.

    const browser = await puppeteer.launch({
      headless: true,
      ignoreHTTPSErrors: true,
      args: [
        '--incognito',
        '--disable-gpu',
        '--disable-dev-shm-usage',
        '--disable-setuid-sandbox',
        '--no-first-run',
        '--no-sandbox',
        '--no-zygote',
        '--single-process', // <- this one doesn't work on Windows
        '--ignore-certificate-errors',
        '--ignore-certificate-errors-spki-list',
        '--user-data-dir',
        '--enable-features=NetworkService'
      ]
    });

Every time I run await page.goto('bad_website.com') on Google Cloud Functions, I get a timeout error: TimeoutError: Navigation Timeout Exceeded:

I'm not sure how to check which CAs are installed on Google Cloud.

The documentation mentions that page.goto throws if the timeout is exceeded during navigation, but it doesn't say what might cause that.

https://github.com/GoogleChrome/puppeteer/blob/v1.13.0/docs/api.md#pagegotourl-options

@camjackson

Hi @alexandrzavalii, as @konstantinblaesi said, this issue is about client cert authentication, which is a method for logging into websites by using an SSL certificate, rather than a username and password. It's an approach sometimes used by corporates for managing internal employee identities, but almost never used on the regular public internet.

It sounds like your issue is more to do with SSL cert validation on a normal website, which is a completely different issue. Topic aside, GitHub issues are really for bug and issue tracking rather than user support, so you might be better off asking a question somewhere like Stack Overflow. Good luck! 🙂

@badeball

Thanks for your intercept example, @dapriett. My use case isn't about certificate authentication specifically, but simply self-signed certificates created for the purpose of testing a webserver application (that cannot be served over non-HTTPS due to reasons).

It bothers me to no end that people seem to equate the problems of

  1. Using a provided, trusted certificate to request a webpage
  2. Ignoring all errors related to certificates

I obviously want my test to resemble how the software is used as closely as possible and I don't get that by disabling certificate verification.

Some initial experimentation with request interception seemed to indicate that it did interfere with cookies and websockets 😞

@filozof6

Quoting @dapriett's request-interception workaround (Mar 8, 2018 comment above):

This does not work for me. Even though I intercept the request, the browser just reloads and repeats the same request all over again :(

@dcr007

dcr007 commented Apr 24, 2020

@dapriett, @camjackson - Are you referring to a private key to validate the cert?
const key = fs.readFileSync('/path/to/cert.key.pem');
How do I get this key?

@konstantinblaesi
Author

@dcr007 These are most likely public (cert) and private key (key) files of your client. They can be generated with openssl or similar tools.

@dcr007

dcr007 commented Apr 24, 2020

@konstantinblaesi - thanks for responding! I managed to export the public cert for the website from Chrome and then ran openssl x509 -inform DER -in publicCertificate.cer -out mynewcert.pem -text
How do I export the private key for the certificate on macOS? I tried digging into my keychains but couldn't figure it out; any pointers would be highly appreciated.
I'm currently getting this error from Chromium while running the test:

"Identity Provider could not process the authentication request received. Delete your browser cache and stored cookies, and restart your browser. If you still experience issues after doing this, please contact your administrator."

@konstantinblaesi
Author

konstantinblaesi commented Apr 24, 2020

@dcr007 unfortunately I cannot be of any further help, because I have no practical experience with puppeteer and client certificate auth. I would suggest asking via https://github.com/puppeteer/puppeteer#q-i-have-more-questions-where-do-i-ask and https://gitter.im/puppeteer-chat/Lobby

@ldrahnik

My workaround is to use headful Puppeteer running in Docker without a GUI, using xvfb:

Docker

  1. Install Google Chrome & xvfb
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
RUN apt-get update
RUN apt-get install -yq gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget x11vnc x11-xkb-utils xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps
RUN apt-get install -yq xvfb google-chrome-stable
  2. Add SSL certificate to Google Chrome
RUN apt-get install -yq libnss3-tools
RUN mkdir -p $HOME/.pki/nssdb
RUN pk12util -d sql:$HOME/.pki/nssdb -i /tmp/cert.pfx -w /tmp/cert.pfx.pass
  3. Set up auto-selection of the added SSL certificate in Google Chrome (policy.json)
{ "AutoSelectCertificateForUrls": ["{\"pattern\":\"*\",\"filter\":{\"ISSUER\":{\"CN\":\"XY name\"}}}"]}

mkdir -p /etc/opt/chrome/policies/managed
cp policy.json /etc/opt/chrome/policies/managed
  4. Run the Node.js part with rendering redirected to xvfb
CMD xvfb-run --server-args="-screen 0 1024x768x24" node index.js

Nodejs

  1. Install the Puppeteer npm package
npm install puppeteer
  2. Run Puppeteer
const browser = await puppeteer.launch({ executablePath: '/usr/bin/google-chrome-stable', headless: false, args: ['--no-sandbox', '--disable-setuid-sandbox'] });

Tested on render.com. Example project. More detailed Medium story
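
The policy.json from the Docker steps above embeds a JSON-encoded string inside a JSON document, which is easy to get wrong by hand. A small sketch that generates it instead; buildAutoSelectPolicy is a hypothetical helper, and the pattern and issuer CN are the placeholder values from the example above.

```javascript
'use strict';

// Hypothetical helper: builds the contents of a Chrome managed-policy file
// that auto-selects a client certificate by issuer common name.
function buildAutoSelectPolicy(urlPattern, issuerCommonName) {
    const rule = {
        pattern: urlPattern,
        filter: { ISSUER: { CN: issuerCommonName } }
    };
    // Each list entry is itself a JSON-encoded string, as in policy.json above.
    return { AutoSelectCertificateForUrls: [JSON.stringify(rule)] };
}

// 'XY name' is the placeholder issuer CN used in the example above.
const policyJson = JSON.stringify(buildAutoSelectPolicy('*', 'XY name'));
console.log(policyJson);
```

The printed string can be written to policy.json and copied into /etc/opt/chrome/policies/managed as in the Docker steps.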

@aallvi

aallvi commented Nov 30, 2023

Quoting @dapriett's request-interception workaround (Mar 8, 2018 comment above):

This code is not working for me :( The page goes blank after I accept the use of the certificate with:

page.on('dialog', async dialog => {
    await dialog.accept();
});

@igorovisk

When I try doing the interception, I get an error:
Error: Request is already handled!
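
This error typically means that continue(), respond() or abort() was called on a request that had already been resolved, for example when two 'request' listeners both handle it. A possible guard, sketched under the assumption of a recent Puppeteer version that provides isInterceptResolutionHandled(); respondOnce and buildResponse are hypothetical names.

```javascript
'use strict';

// Hypothetical wrapper: responds only if no other handler has resolved the request.
// `buildResponse` is a caller-supplied async function returning the object
// passed to respond(), e.g. { status, headers, body }.
async function respondOnce(interceptedRequest, buildResponse) {
    const response = await buildResponse(interceptedRequest);
    // Check after the async work, since another listener may have acted meanwhile.
    if (interceptedRequest.isInterceptResolutionHandled()) {
        return false;
    }
    await interceptedRequest.respond(response);
    return true;
}
```

The return value indicates whether this handler was the one that answered the request, which can help when tracking down a second listener.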
