initial headless chrome implementation #1211

Closed

roryrjb commented Jul 31, 2017

Hey all,

This is an initial implementation for a major re-write of Nightmare to use
headless Chrome instead of relying on Electron.

The aim here is to keep the API the same as much as possible, but things
have changed, mostly internal functions and APIs because they were tied to Electron.

On one hand, Electron affords more flexibility and control. On the other,
headless Chrome allows Nightmare to be used in many different scenarios
without much setup and without as much baggage, at least directly. Chrome
will still need to be set up, but in a number of scenarios it may already be present.
There are also potential performance improvements, although I haven't run any "scientific" tests yet, so it would be good to get
some feedback.

As Chrome can be instantiated in different ways with different options and
the core of Nightmare (at least in this PR) is much simpler, Chrome isn't installed,
configured or spawned directly. Bearing in mind that I have only tested on
Linux so far, this is the command you'll need.

$ google-chrome --headless --disable-gpu --remote-debugging-port=9222

I have tested on a vanilla Ubuntu Server 16.04 install (without X, i.e. no GUI environment) and without any special config,
and using the .deb provided by Google for the current Chrome stable 60.0.3112.78 and everything just works.
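
For anyone wanting to confirm that the Chrome process started with the command above is reachable, here is a minimal sketch using only Node's built-in http module. It assumes Chrome is listening on port 9222 as in that command and that its /json HTTP endpoint returns the list of debuggable targets. The helper names (pickPageSocketUrl, listTargets) are made up for illustration and are not part of this PR.

```javascript
const http = require('http');

// Hypothetical helper: given the parsed target list from /json, return the
// WebSocket debugger URL of the first "page" target, or null if there is none.
function pickPageSocketUrl(targets) {
  const page = targets.filter(t => t.type === 'page')[0];
  return page ? page.webSocketDebuggerUrl : null;
}

// Hypothetical helper: fetch and parse the target list from a running Chrome.
// Calls back with (err) on network or parse failure, else (null, targets).
function listTargets(port, callback) {
  http.get({host: 'localhost', port: port, path: '/json'}, res => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', chunk => { body += chunk; });
    res.on('end', () => {
      let targets;
      try {
        targets = JSON.parse(body);
      } catch (err) {
        return callback(err);
      }
      callback(null, targets);
    });
  }).on('error', callback);
}

// Usage (requires Chrome to be running with --remote-debugging-port=9222):
// listTargets(9222, (err, targets) => {
//   if (err) return console.error('Chrome not reachable:', err.message);
//   console.log(pickPageSocketUrl(targets));
// });
```

If the request fails with a connection error, Chrome is most likely not running as a separate process yet.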

The core functionality for Chrome, replacing all the Electron logic, is provided by
the Chrome DevTools Protocol (documented in the Chrome DevTools Protocol Viewer), specifically using this module, which
exposes the API in a direct way. This is a new API for me, but there is a lot in there and a lot of potential for adding new features to Nightmare, as well as improving existing features.

As part of this PR I have also attempted to add some linting to the repo. This is something
I always do when working on my own projects, and there wasn't one single style in Nightmare,
so I've basically just used the dominant style in lib/nightmare.js and stuck with that.
The basic config is in package.json and uses xo. I've converted as much as possible.

It was tempting to use ES6 everywhere but I have ensured compatibility with Node 4+
and tested with Node 4.x, Node 6.x and Node 8.x.

[Image: peek 2017-07-31 18-58]

cc @stevenmiller888 @schickling

Related issues: #1092, #224, #675

@stevenmiller888

@casesandberg

I appreciate the linting but can you please split it out into a different PR? It's pretty difficult to see what has changed.

@kensoh

kensoh commented Aug 1, 2017

@roryrjb this is really cool news for NightmareJS users!

Adding some notes on invoking and killing Chrome. As I understand it, from Node.js you can use the chrome-launcher package from the Lighthouse project. It takes care of finding the path to launch Chrome, and of killing it, on Linux, macOS and Windows.

However, should you want to do it through a shell command, on macOS use /Applications/Google Chrome.app/Contents/MacOS/Google Chrome instead of google-chrome. For killing, I was using pkill google-chrome on Linux and pkill Google\ Chrome on macOS. On Windows, start "" "%full_path_to_chrome.exe%" switches can be used to start and taskkill /IM chrome.exe /T /F to kill.

This doesn't take into account that some users want to use the bleeding-edge Chrome Canary version. In a web automation project I was working on, I did a lot of this stuff to invoke Chrome, get the websocket URL, etc. because I didn't want a Node.js dependency.
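
The per-platform commands above could be collected in one place, roughly like the sketch below. The function name chromeCommands and its windowsChromePath parameter are hypothetical; the binary names, the kill commands and the headless flags are the ones quoted in this thread. Nothing is executed here, the function only returns argument lists that could be handed to something like child_process.spawn. The Windows start "" wrapper is left out on the assumption that it is only needed when going through a shell.

```javascript
// Hypothetical helper: map a Node `process.platform` value to the start and
// kill commands quoted in this thread, as argv-style arrays.
function chromeCommands(platform, windowsChromePath) {
  // Flags taken from the google-chrome command in the PR description.
  const flags = ['--headless', '--disable-gpu', '--remote-debugging-port=9222'];

  switch (platform) {
    case 'linux':
      return {
        start: ['google-chrome'].concat(flags),
        kill: ['pkill', 'google-chrome']
      };
    case 'darwin':
      return {
        start: ['/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'].concat(flags),
        // As an argv entry the space needs no backslash escape.
        kill: ['pkill', 'Google Chrome']
      };
    case 'win32':
      // The thread gives no concrete Windows path, so the caller supplies it.
      if (!windowsChromePath) {
        throw new Error('A full path to chrome.exe is required on Windows');
      }
      return {
        start: [windowsChromePath].concat(flags),
        kill: ['taskkill', '/IM', 'chrome.exe', '/T', '/F']
      };
    default:
      throw new Error('Unsupported platform: ' + platform);
  }
}
```

Calling chromeCommands(process.platform) on Linux or macOS returns both lists. Chrome Canary is not covered, as noted above.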

@roryrjb
Author

roryrjb commented Aug 3, 2017

@casesandberg absolutely, I don't want to add any roadblocks to getting this PR merged and moving towards headless Chrome in general. I've pushed an additional update to remove the linting additions.

@stevenmiller888

@casesandberg Do you have some time to review this? :)

@casesandberg

@roryrjb I am trying to run the example using node example.js and the console returns an empty line instead of the link href. Am I doing something wrong?

@johnferro

@casesandberg Do you have Chrome started as a separate process? When I tried it without that, it failed silently, although it logged no line at all rather than an actual empty line, which might be your case.

@johnferro

As a general note, this requires node >= 4.5 because chrome-remote-interface requires that (cyrus-and/chrome-remote-interface#111), even though it only specifies >= 4. Since the minor versions are in Nightmare's package.json anyway, it might make sense to bump it to 4.5 because of this.
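
Until the version in package.json is bumped, a runtime guard along these lines could make the failure less silent. This is only a sketch: the function name meetsMinimumNode is hypothetical, and the 4.5 threshold is simply the one mentioned in this comment.

```javascript
// Hypothetical helper: true when a "major.minor.patch" version string is at
// least 4.5, the minimum suggested in this thread.
function meetsMinimumNode(version) {
  const parts = version.split('.').map(Number);
  const major = parts[0];
  const minor = parts[1];
  return major > 4 || (major === 4 && minor >= 5);
}

// process.versions.node has no leading "v", e.g. "8.2.1".
if (!meetsMinimumNode(process.versions.node)) {
  console.warn('Node >= 4.5 is assumed here; found ' + process.versions.node);
}
```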

@johnferro

johnferro commented Aug 11, 2017

@roryrjb In one of the projects I'm working on, I've swapped out the Nightmare version we were using for this version and have been seeing how things work. One area I had problems with was setting the viewport size. I was able to set it on creation but not after that. Looking at some examples from chrome-remote-interface (https://medium.com/@dschnr/using-headless-chrome-as-an-automated-screenshot-tool-4b07dffba79a?1), I replaced the current viewport function with the following and got it to work. I was wondering if you were able to get it working with the Browser.setWindowBounds method currently used on this branch?

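exports.viewport = function (width, height, done) {
  debug('.viewport()');

  // Device metrics to emulate; a deviceScaleFactor of 0 leaves the
  // scale factor override disabled.
  var device = {
    width: width,
    height: height,
    deviceScaleFactor: 0,
    mobile: false,
    fitWindow: false
  };

  // Override the device metrics first, then resize the visible area to match.
  this.chrome.Emulation.setDeviceMetricsOverride(device)
    .then(() => {
      return this.chrome.Emulation.setVisibleSize({width: width, height: height});
    })
    .then(() => {
      // Signal completion to Nightmare's action queue.
      done();
    });
};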

@neekolas

I am getting an error trying to add a custom action. Looking at the PR, it seems like the scope of what actions can do has been reduced significantly. Probably need to update docs and/or create a new actions system based on interacting with the CDP instance directly.

nightmare_1   | /app/node_modules/nightmare/lib/nightmare.js:390
nightmare_1   |       Nightmare.childActions[name] = childfn;
nightmare_1   |                                    ^
nightmare_1   |
nightmare_1   | TypeError: Cannot set property 'noScripts' of undefined
nightmare_1   |     at Function.Nightmare.action (/app/node_modules/nightmare/lib/nightmare.js:390:36)
nightmare_1   |     at Object.<anonymous> (/app/crawlers/nightmare.js:39:11)

@stevenvachon

Maybe this lib should somehow support both. I mean, testing an Electron app might be easier? I don't know as I've never tried this.

@shellscape

I hope I'm not the one walking in and pooping in the cereal... but how will this project differentiate from puppeteer, given that they offer nearly identical functionality, chrome dev team support, and utilize headless chrome as well? One of the draws of nightmare was that it used electron, which has many advantages over chrome (and headless chrome) due to some limitations in the Chrome DevTools Protocol. I can't seem to find a discussion, post, or list of reasonings for the change.

@TimNZ

TimNZ commented Sep 4, 2017

@shellscape Very much agree.

@matthewmueller
Contributor

@shellscape @TimNZ curious what features are missing from the remote devtools protocol that are present in electron?

The downsides I see with browser automation using electron are:

  • security issues with node.js baked into the runtime
  • hard to deploy on linux

But I'm very interested to hear what we'd be giving up with this move 🙂

@shellscape

@matthewmueller Code execution and object serialization and inspection immediately come to mind. Try passing an object of any complexity back to the app using the DevTools Protocol and you'll start to feel some of the pain. Other than that I can't recall specifically what else I had in mind (my post up there was just about 6 months ago), but the limitations of the DevTools Protocol and the baked-in advantages of Electron over it are pretty thoroughly documented on the interwebs and the googs.
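
To give a feel for the serialization pain mentioned above, here is a rough analogy rather than actual protocol code. It assumes that a value returned "by value" over the DevTools Protocol ends up JSON-like, and approximates that with a plain JSON round trip. The roundTrip helper and the sample object are made up for illustration.

```javascript
// Rough analogy only: approximate "return by value" with a JSON round trip.
function roundTrip(value) {
  return JSON.parse(JSON.stringify(value));
}

// A sample object mixing plain data with things JSON cannot represent.
const original = {
  title: 'example',
  when: new Date(0),
  greet: function () { return 'hi'; },
  tags: new Set(['a', 'b'])
};

const copy = roundTrip(original);

// Plain strings survive unchanged.
console.log(copy.title);
// The Date comes back as an ISO string, not a Date instance.
console.log(typeof copy.when);
// The function property is dropped entirely.
console.log('greet' in copy);
// The Set collapses to an empty object.
console.log(JSON.stringify(copy.tags));
```

Anything beyond plain data (functions, class instances, collections) is lost or flattened in this model, which is the kind of limitation being described.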

@TimNZ

TimNZ commented Jan 20, 2018

Forget all that.

There is Nightmare, and there is Puppeteer, designed for different scenarios.

Moot anyway, since the Nightmare team seem to have abandoned this project.

@matthewmueller
Contributor

@shellscape Good to know! I think the object serialization stuff has been sorted out here: https://github.com/GoogleChrome/puppeteer/blob/62597bf89780a7fde91a350e5eabf3be15bde02d/lib/ExecutionContext.js#L78, though I haven't tested it yet (maybe things like DOM nodes don't come through)

I'll be updating this project, so we're trying to decide if we should go the headless Chrome route or not. Maybe it makes sense to support both – I'm mostly just trying to gather information about what we can do with Electron that can't be done with headless Chrome.

@matthewmueller
Contributor

Hi @roryrjb – thanks for sharing your proof of concept! Since puppeteer is already doing a great job with headless chrome, we're going to stick with electron and put this transition on hold. Thanks for your hard work though!

10 participants