Turn SlimerJS into an headless browser #80

laurentj · 2013-09-13T13:05:15Z

First solution: port the patch from bug 446591 to XULRunner (which means we would need to compile and ship our own XULRunner). That would be a huge amount of work.

Second solution: since Gecko 23, there seems to be a new method, nsIAppShellService::createWindowlessBrowser(). We should investigate whether we can use this method to load our webpages.

drasill · 2013-10-03T17:00:45Z

I'd love it :)

ariya · 2013-10-12T21:30:56Z

+1

allquixotic · 2013-11-07T23:14:25Z

+1

kogeva · 2013-11-07T23:42:54Z

+1

laurentj · 2013-11-08T07:49:49Z

I started working on this feature using this new API. Unfortunately, there are still some huge issues in its implementation, for our use case, which prevent us from using it. We have to wait for improvements to this new API.

allquixotic · 2013-11-25T14:56:31Z

Has anyone communicated with Mozilla regarding the limitations of the createWindowlessBrowser API? Are they considering addressing it so that we don't need an X11 server (headless or otherwise) to run SlimerJS on a GNU/Linux dedicated server? It probably will not be improved unless we let them know our needs...

matanox · 2014-01-06T14:16:23Z

What are the actual downsides of just using Xvfb to get headlessness?
In my naive example code it works just fine, and the running time seems to be the same with and without Xvfb (typically ~2.5 seconds, but ~12 seconds every few trials; I'm not sure why it varies, but the variance is the same either way; I'm using XULRunner, not Firefox).

Isn't it just a harmless way to go, emulating a screen and letting Gecko do its stuff as it normally would? Or has anyone run into specific problems, or is anyone aware of problems inherent to this combination?

Here is the code, which you may find familiar from the documentation:

var webpageModule = require("webpage");

// Create a page object, load the URL, and report the page title.
var page = webpageModule.create();
page.open("http://slimerjs.org")
    .then(function (status) {
        if (status === "success") {
            console.log("The title of the page is: " + page.title);
        } else {
            console.log("Sorry, the page is not loaded");
        }
        // Release the page and quit SlimerJS once the result is printed.
        page.close();
        phantom.exit();
    });

allquixotic · 2014-01-06T14:36:08Z

The downsides of having it headed on what should be a headless box are:

Administrative burden: Having to maintain and keep track of instances of Xvfb (or some other X server);
Setting the DISPLAY environment variable properly and keeping track of which Xvfb is running for which user and access rights;
Security: making sure that unauthorized users are not able to access the Xvfb server as X11 clients;
Memory/CPU usage: Having the browser actually draw the graphics elements takes up more memory than having the draw commands simply be a no-op. A buffer has to be allocated in RAM where the pixels can be drawn to, and because Xvfb renders in software, all of the graphics elements of the webpage have to be rendered on the CPU. This can be pretty expensive if you're doing fancy things with HTML5. Rendering fonts in software takes a fair bit of CPU.

Startup time is not very much of a concern; rather, the main concern is being able to quickly and efficiently start up a headless browser environment, without needing to make sure an X11 server of some kind is running and the DISPLAY environment variable is set, and without the associated performance overhead. It would be ideal if you could render screenshots "on demand", so that the actual rendering to a pixmap only takes place when you explicitly request it from code, instead of going on all the time. You can achieve much shorter runtimes for automated tests by making all the drawing operations a no-op, and if you are doing this on a large enough scale, it can actually save significant heat, power, CI test time, etc.

matanox · 2014-01-06T14:44:27Z

Thanks for the explanation. But wouldn't accomplishing exactly that require changing Gecko itself?
In addition, some JavaScript may rely on the result of rendering, e.g. commands that explore the dimensions of elements after they have been laid out, or am I going too quickly here?

allquixotic · 2014-01-06T14:54:51Z

I think Laurent's hope was that the nsIAppShellService::createWindowlessBrowser() method would implement the changes in Gecko itself that we actually need in order to achieve this functionality. At the very least, even if it did rendering in RAM for each frame, it wouldn't require it to be rendered in an X server, which requires inter-process communication (sending X11 drawing commands over a UNIX domain socket is how it's currently implemented). It would even be a considerable speed-up, for small systems running SlimerJS, to just do all the rendering in-process, compared to the overhead of the IPC.

The ideal headless DOM implementation here would keep track of things like dimensions, JavaScript state, CSS state, and basically anything you can read back from the DOM or from the JavaScript engine, but it wouldn't perform actual rendering. As far as I understand it, you only need a model of the state machine in order to perform automated tests faithfully as a normal web browser would execute them; you don't need to actually render it.

...And rendering is fairly expensive, especially in pure software as Xvfb does it (on a modern desktop, the CPU and performance cost of rendering is offset significantly by offloading various operations to the GPU, but of course this consumes its fair share of power and heat because you're still doing a lot of work to render the frames).

To see how expensive rendering is, imagine doing the following in a tight loop: create a bunch of iframes with random dimensions (width x height); fill them with a random color; then remove them; and repeat. Regardless of whether you are rendering or not, you have to keep track of which elements are on the page, their size, what their contents are, what their properties are, etc. But if you're doing rendering, you also have to figure out pixel RGBA values for the entire "frame" of the iframe, which includes its border, its internal contents, etc., drawing scrollbars as needed, and then you have to re-render any elements on the page that are either interacting with the iframe (above or below it, transparent stuff, etc.) or that got moved (up/down/left/right) by the new presence of the iframe. Without rendering, you can simply skip all of that, because it doesn't matter. It's enough to simply know that the content of the iframe is a red fill, for instance.
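
For illustration, the loop described here could look roughly like the sketch below. `churnIframes` is a hypothetical helper name, not part of SlimerJS or PhantomJS; it uses only standard DOM calls and takes the page's `document` as a parameter. Real engines batch painting, so a synchronous loop like this mostly exercises DOM and layout bookkeeping; pixels are only drawn if the engine gets a chance to paint between iterations.

```javascript
// Hypothetical helper: repeatedly add and remove randomly sized, color-filled iframes.
// `doc` is any DOM-like document, for example `document` inside a loaded page.
function churnIframes(doc, iterations) {
    var colors = ["red", "green", "blue"];
    var created = 0;
    for (var i = 0; i < iterations; i++) {
        var frame = doc.createElement("iframe");
        // Random dimensions between 1 and 500 pixels.
        frame.width = 1 + Math.floor(Math.random() * 500);
        frame.height = 1 + Math.floor(Math.random() * 500);
        frame.style.backgroundColor = colors[Math.floor(Math.random() * colors.length)];
        doc.body.appendChild(frame);
        // In a browser, reading a layout property forces layout to be computed now.
        void frame.offsetWidth;
        doc.body.removeChild(frame);
        created++;
    }
    return created;
}
```

In both cases the engine has to track each iframe's size and fill color; only a rendering engine additionally has to turn that state into pixels.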

An example of a JS and DOM engine that does a fair bit of JavaScript and webpage processing without any rendering is jsdom, which runs on Node.js. The support for web standards isn't complete there, though, compared to the completeness that we see in Gecko and WebKit today. Getting all the web standards implemented and nailed down is a really hard problem.

So the next step for SlimerJS would be to leverage Gecko's excellent support of web standards, but find a way to gut the rendering completely. This could have a real impact on, for instance, the hardware spec requirements of a large continuous integration server where each source code checkin requires running numerous automated unit tests using SlimerJS.

As a bonus -- and I think this is either required or highly recommended by the WebDriver interface -- it would be great if SlimerJS could still do normal rendering to an image on demand, which would basically involve flipping a switch to tell Gecko, "hey, start rendering now!" and then capture the output of that rendering into a bitmap, and then do a little compression (JPEG, PNG, etc) and provide it to the programmer as a binary output stream in their programming language of choice, either via WebDriver or the Phantom API.

One thing we should keep in mind is that, since SlimerJS is very unlikely to take advantage of hardware acceleration when it's used, it is not useful to benchmark the performance of a website based on SlimerJS's performance. To do that, you would need to run a production release of Firefox or Chrome on Windows, Mac, and ideally an iPad and a Nexus 7, and count the amount of time it takes for page loads, scrolling, etc. and make sure it's smooth. You simply can't judge fairly from SlimerJS, whether you do rendering or not: if you don't do rendering, then your performance is going to be unrealistically fast; if you do rendering, it's in software, and software rendering is always slow, so you won't be looking at a realistic render time anyway (since almost all clients these days will have hardware acceleration of some sort).

matanox · 2014-01-06T15:18:14Z

Yes, I concur that it's better to drive a popular browser engine rather than a weak imitation such as the one you mention. BTW, Chrome moved from WebKit to Blink, and I think Apple is developing WebKit2 to supersede it, so WebKit may lose relevance over time in this context of headlessness.

Anyway, I am not sure what we include in 'rendering' on this thread (my own fault), so I'm not sure how HTML5 canvas drawing operations can be simulated without calculating what's on each pixel, in case run-time code relies on getting pixels back, e.g. for 'making' a screenshot. Or, say your code draws some stuff on an HTML5 canvas, you simulate a click, and your code needs to determine whether the click location is inside a shape or not (sorry if this sounds a bit like game development, but it can also occur in 'normal' applications). Could you clarify a bit the relationship between 'only having a model' and the actual computation of pixel values based on abstract drawing instructions?

Or did you maybe mean that there'd be a queue of rendering instructions that is left unhandled until an instruction to fetch the state of the virtual display is encountered?

Thanks!

allquixotic · 2014-01-06T15:31:03Z

Actually, with something like HTML5 canvas, you may indeed be correct that the entire model would need to be rendered under some circumstances, in order to compute per-pixel values so that they can be read back. In the ordinary context of e.g. HTML4 forms, rendering can almost always be skipped entirely.

Your last paragraph brought up the interesting idea of applying "lazy evaluation" (a concept from functional programming) to the headless rendering model. This would be a great optimization. In the case of taking a screenshot, the entire evaluation chain for rendering the current state would be called. In the case of a certain API reading back a specific pixel value, the renderer might be able to figure out a minimal set of drawing operations that need to be invoked in order to determine that pixel's value. But when neither of these conditions is hit, you'd simply have a set of potentially applicable drawing operations in a queue that could be flushed when the state is updated via DOM manipulation or similar. The drawing operations in the queue would be something like a set of function pointers with their applicable arguments, which, in "normal" operation, would never actually get invoked; but if the drawing is required for some functionality, then the queue would get emptied and each function executed.
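
As a rough sketch of that idea (purely illustrative, and not an API of Gecko, SlimerJS, or PhantomJS): `LazyCanvas`, `renderPending`, and `paintRect` are made-up names, pixels are stored as color strings for simplicity, and this simplest variant runs the whole queue on read-back rather than working out a minimal set of operations.

```javascript
// Hypothetical lazily evaluated drawing surface: operations are queued, not painted.
function LazyCanvas(width, height) {
    this.width = width;
    this.height = height;
    this.pixels = new Array(width * height).fill("transparent");
    this.pending = [];   // queued drawing operations (function + arguments)
    this.executed = 0;   // number of queued operations that were actually run
}

// Record a filled rectangle without touching any pixel.
LazyCanvas.prototype.fillRect = function (x, y, w, h, color) {
    this.pending.push({ fn: paintRect, args: [x, y, w, h, color] });
};

// Run every queued operation in order, then empty the queue.
LazyCanvas.prototype.renderPending = function () {
    for (var i = 0; i < this.pending.length; i++) {
        var op = this.pending[i];
        op.fn.apply(this, op.args);
        this.executed++;
    }
    this.pending = [];
};

// Reading a pixel back is what forces the deferred rendering to happen.
LazyCanvas.prototype.getPixel = function (x, y) {
    this.renderPending();
    return this.pixels[y * this.width + x];
};

// The actual per-pixel work, clipped to the canvas bounds.
function paintRect(x, y, w, h, color) {
    var rowEnd = Math.min(y + h, this.height);
    var colEnd = Math.min(x + w, this.width);
    for (var row = Math.max(0, y); row < rowEnd; row++) {
        for (var col = Math.max(0, x); col < colEnd; col++) {
            this.pixels[row * this.width + col] = color;
        }
    }
}

var canvas = new LazyCanvas(100, 100);
canvas.fillRect(0, 0, 50, 50, "red");   // queued only, nothing painted yet
canvas.getPixel(10, 10);                // forces the queued fill to run
```

A test that never reads pixels back or takes a screenshot would leave `executed` at zero, which is the saving being discussed.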

Not sure if something like this is already implemented anywhere (PhantomJS?) or if it would be such a huge change in code that no one has attempted it just for the sake of optimizing headless browser implementations.

Obviously, from the perspective of "if it works, use it", SlimerJS with Xvfb or PhantomJS without Xvfb is more than fine, but in that case, it's equally easy to run Firefox in Xvfb and automate it with Selenium WebDriver. This works for code that is not particularly performance-sensitive, or where a little extra CPU/RAM is not a hindrance to the project. But I could certainly see rendering as a potential bottleneck on something like an ARM SoC, where a "pure" model-tracking DOM/JS implementation would fly, but if you add in drawing commands for every frame, it would be as slow as the browser on a low-end Android phone.

To sum it up, since SlimerJS is ultimately taking the performance and memory cost of rendering due to its headed-ness, I see absolutely no reason for this project to be used in lieu of Firefox itself, unless your goal is to use a large amount of code written for PhantomJS's API and run it on Gecko. I'm looking at this from the perspective of someone who has never written a single line of PhantomJS API, though; I always use Selenium WebDriver with PhantomJS/SlimerJS. Firefox, Chrome, IE, and PhantomJS all support WebDriver, as does SlimerJS, so it's really the most flexible browser automation solution we have. SlimerJS is not going to be noticeably faster or lighter-weight or render noticeably differently than Firefox until its rendering pipeline can be turned off when it's not needed, hence why I can't really see what the use case is right now.

mariomash · 2014-02-06T15:59:57Z

+1

ragingSloth · 2014-06-13T17:59:54Z

+1

alfredwesterveld · 2014-07-18T22:38:28Z

+1

anback · 2014-08-04T09:39:02Z

+1

nchicong · 2014-10-14T11:36:38Z

+1

dirkluijk · 2014-10-22T17:29:32Z

👍 I also want to use it with Selenium WebDriver, in which case any performance improvement would be great, so that it would be more suitable than Firefox.

professorplumb · 2014-10-26T17:10:45Z

👍

vbauer · 2014-11-13T17:28:32Z

👍

moiseevigor · 2014-12-03T16:42:38Z

👍

JLarky · 2014-12-11T07:01:19Z

👍

allquixotic · 2014-12-11T13:13:16Z

@matanster Uh, SlimerJS is already built on the Gecko codebase....

rmsphd · 2015-04-18T16:03:25Z

👍

ghost · 2015-04-25T09:16:17Z

👍

nicolsondsouza · 2015-07-02T15:15:16Z

👍

hadim · 2015-07-04T20:30:44Z

+1

nicolai86 · 2015-07-05T14:31:05Z

👍

ecLAllanon · 2015-12-04T07:19:57Z

👍

tostercx · 2016-01-06T11:56:38Z

+1

devanshah1 · 2016-04-08T11:26:37Z

+1

boyomarinov · 2016-04-22T14:12:35Z

+1

laurentj · 2016-05-02T11:00:54Z

Please stop commenting with "+1". Instead, use the "add your reaction" (smiley) button above the issue description ;-) Thank you

brendandahl · 2017-02-18T02:13:46Z

For those interested, I've started some work on this in https://bugzilla.mozilla.org/show_bug.cgi?id=1338004 . It's still at a very early stage, but I have a simple SlimerJS script that snapshots a page working. Also, very early perf numbers show this shaving off around 0.1-0.2s on this very simple script, and I also see a React benchmark go from ~24fps to 40fps. If anyone has some exceptionally slow SlimerJS tests, I'd be curious to see them.

Branch:
https://github.com/brendandahl/gecko/tree/headless-slimerjs
Linux 64 Build:
https://queue.taskcluster.net/v1/task/W7FVxw9tQtWZqYAESasJtg/runs/0/artifacts/public/build/target.tar.bz2
Use MOZ_HEADLESS environment variable e.g.:
DISPLAY=77 MOZ_HEADLESS=1 /home/bdahl/Downloads/firefox/firefox -app /home/bdahl/projects/slimerjs/src/application.ini -no-remote -profile /home/bdahl/projects/gecko/obj.debug.noindex/tmp/scratch_user slimer.js

laurentj · 2017-02-20T07:18:55Z

@brendandahl this is very good news! However, SlimerJS has some issues with Fx>52 and I cannot launch your test (even with an "official" nightly). I will fix this issue and then test your build :-)

laurentj · 2017-03-10T08:29:18Z

@brendandahl Awesome! It works with the latest release of SlimerJS, 0.10.3!! (after changing the max version to 54.* in application.ini)

export DISPLAY=77
export MOZ_HEADLESS=1
export SLIMERJSLAUNCHER=/home/laurent/tmp/fxheadless/firefox 
slimerjs test.js

jefleponot · 2017-03-10T08:44:00Z

@laurentj,

Just one question 👍
SlimerJS 0.10.3 is recommended for testing with Firefox 38 to 54 -> ok

Which version do you recommend for testing with Firefox 17 to 37?

I'm asking because of CasperJS tests... Do you have a long-term stable version?

laurentj · 2017-03-10T08:47:32Z

@jefleponot SlimerJS 0.9.x

laurentj · 2017-04-11T09:07:43Z

The latest nightlies of Firefox support the headless mode, even if there are some crashes with some unit tests of SlimerJS.

jefleponot · 2017-04-11T09:22:17Z

Hi @laurentj
Thanks for the news
Do you have a command line example?

Does that mean that SlimerJS will not be maintained anymore?

Thanks in advance

laurentj · 2017-04-11T17:00:11Z

No more crashes with the build 2017-04-11 of Firefox nightly 55.0a1

laurentj · 2017-04-11T17:02:20Z

@jefleponot

> Do you have a command line example?

See one of my latest comments here

> Does that mean that SlimerJS will not be maintained anymore?

Why??? Firefox does not have the features of SlimerJS. But SlimerJS needs Firefox to run.

brendandahl · 2017-04-11T19:05:54Z

Just to note, headless is currently only supported on Linux. If you want to follow along for other platforms see:
MacOS:
https://bugzilla.mozilla.org/show_bug.cgi?id=1355147
Windows:
https://bugzilla.mozilla.org/show_bug.cgi?id=1355150

mykmelez · 2017-06-15T23:51:13Z

Note that Firefox 55 (currently in beta) will support headless browsing on Linux, per this announcement:

https://groups.google.com/forum/#!topic/firefox-dev/TEhvuBXcJCg

Beta builds are available from https://www.mozilla.org/en-US/firefox/channel/desktop/. As @brendandahl noted, Windows and macOS support is still in progress, and you can follow along on the bugs he references.

puravida · 2017-06-16T00:24:06Z

Aside from the obvious resource savings of running headless, are there other benefits as well? Would this be more or less likely to crash (or properly load complex/problematic pages) than running a full instance? Are there other downsides to running Firefox headless? If this has all been discussed, ad nauseam, before then please do point me there with a link. ;)

In the past, I spent more than 500 hours evaluating every available script, open source and paid, for capturing screenshots to use as a backup for my own, proprietary method. From that testing, I concluded that running headless had some serious issues with certain kinds of pages, such as pages that format based on viewport dimensions and such. The problem is that running headless meant there were no viewport dimensions reported, so some pages were crunched or had elements extending off the page (improper page height) or the backgrounds did not extend fully, etc. This is a major reason that CutyCapt, PhantomJS, and similar failed to meet my needs as a backup solution.

Lastly, has anyone run any metrics yet to judge the gains (less ram, less cpu) from running headless?

Just curious... Thanks!

drasill · 2017-06-16T14:45:32Z

I don't know how it works on other platforms, but the major point on Linux is that non-headless mode needs an Xorg server running, which is quite complicated on servers / Docker instances.

puravida · 2017-06-16T15:07:37Z

@drasill Thanks for the note. You are so right. Managing the "window manager" is one of my top headaches, and I would LOVE to eliminate that piece of the process.

However, the viewport issues are too problematic, since I have to render ANY page as accurately as possible. I've captured screenshots of more than 500 million URLs (conservatively), and I think I've probably actually seen a million of them, haha! So, I know that the other headless methods were only good for specific use-cases where the web page(s) was/were known to be compatible.

However, as technology and methods advance, it would be GREAT to see a headless implementation that could overcome those limitations. :)

mykmelez · 2017-06-16T16:47:41Z

> Aside from the obvious resource savings of running headless, are there other benefits as well? Would this be more or less likely to crash (or properly load complex/problematic pages) than running a full instance? Are there other downsides to running Firefox headless? If this has all been discussed, ad nauseam, before then please do point me there with a link. ;)

I don't expect it to be less likely to crash, as it still exercises all of the rendering pipeline at the moment. However, future optimizations might avoid rendering and compositing (until you take a screenshot, anyway), which would reduce crashes in that code and improve performance. (Out-of-process compositing will also help with crashiness, for both headed and headless Firefox.)

Headless mode may also be more consistent, especially if you script Firefox on multiple platforms, since parts of the headless "widget backend" are cross-platform. A possible downside of that is that headless doesn't produce exactly the same results as headed with a platform-specific widget backend.

> The problem is that running headless meant there were no viewport dimensions reported, so some pages were crunched or had elements extending off the page (improper page height) or the backgrounds did not extend fully, etc. This is a major reason that CutyCapt, PhantomJS, and similar failed to meet my needs as a backup solution.

Headless mode for Firefox should report viewport dimensions (i.e. window.innerWidth/innerHeight) correctly for any window you open (or that SlimerJS opens for you). Screen dimensions are hardcoded to 1366x768, per HeadlessScreenHelper::GetScreenRect, but can be configured at runtime via the MOZ_HEADLESS_WIDTH and MOZ_HEADLESS_HEIGHT environment variables.
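
As a hedged illustration of how those variables might be set from a launcher script: the sketch below assumes a Node.js wrapper, and `headlessEnv` is a hypothetical helper name. Only the three `MOZ_HEADLESS*` variable names come from this thread; the `slimerjs` command on the PATH and the `test.js` script in the commented usage are assumptions.

```javascript
// Hypothetical helper: copy an environment and add the headless settings.
// MOZ_HEADLESS, MOZ_HEADLESS_WIDTH and MOZ_HEADLESS_HEIGHT are the variables
// named in this thread; the rest of this sketch is an assumption.
function headlessEnv(baseEnv, width, height) {
    var env = {};
    Object.keys(baseEnv).forEach(function (key) {
        env[key] = baseEnv[key];
    });
    env.MOZ_HEADLESS = "1";
    env.MOZ_HEADLESS_WIDTH = String(width);    // environment values must be strings
    env.MOZ_HEADLESS_HEIGHT = String(height);
    return env;
}

// Assumed usage from Node.js (needs `slimerjs` on the PATH and a `test.js` script):
// require("child_process").spawn("slimerjs", ["test.js"], {
//     env: headlessEnv(process.env, 1920, 1080),
//     stdio: "inherit"
// });
```

Copying the base environment rather than mutating it keeps the settings scoped to the one child process.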

puravida · 2017-06-16T17:53:35Z

@mykmelez Nice. That is very insightful. Thank you.

> Headless mode for Firefox should report viewport dimensions (i.e. window.innerWidth/innerHeight) correctly for any window you open (or that SlimerJS opens for you). Screen dimensions are hardcoded to 1366x768, per HeadlessScreenHelper::GetScreenRect, but can be configured at runtime via the MOZ_HEADLESS_WIDTH and MOZ_HEADLESS_HEIGHT environment variables.

This would be awesome! Other scripts claimed to send the viewport, but really did not. If SlimerJS could hook into the ability to adjust it, e.g. via MOZ_HEADLESS_XXX, then that would be a game-changer.

Thanks for the insight...

mykmelez · 2017-08-16T23:31:50Z

Support for Windows and Mac has since landed in Firefox 56, which is currently in Beta.

@laurentj Is there anything to change in SlimerJS code to resolve this issue, or is it sufficient to document that you can run SlimerJS in headless mode on Firefox 56+ by specifying the right Firefox command-line flag/environment variable?

birtles · 2017-08-21T23:20:26Z

I suppose at the very least the MaxVersion in application.ini needs to be updated?

brendanhowell · 2017-10-16T07:32:39Z

This is awesome! Will I be able to use my old scripts that render to PDF or is this only for PNG?

jdfreder mentioned this issue Mar 11, 2014

Add support for Firefox JS testing ipython/ipython#5323

Merged

laurentj added the needs gecko patch label May 7, 2014

allquixotic mentioned this issue Jan 25, 2016

Use Nightmare.js 1.8.2 Zirak/SO-ChatBot#266

Open

EndangeredMassa mentioned this issue Apr 13, 2017

Add Support for Headless Chrome testiumjs/testium-core#32

Open

laurentj added Doc needed and removed needs gecko patch labels Oct 4, 2017

laurentj added this to the SlimerJS 1.0 milestone Oct 4, 2017

laurentj closed this as completed in 641eae8 Oct 13, 2017
