Turn SlimerJS into a headless browser #80

Closed
laurentj opened this issue Sep 13, 2013 · 55 comments

Comments

@laurentj
Owner

@laurentj laurentj commented Sep 13, 2013

First solution: port the patch from bug 446591 into XULRunner (which means we would need to compile and ship our own XULRunner). That would be a huge amount of work.

Second solution: since Gecko 23, there seems to be a new method, nsIAppShellService::createWindowlessBrowser(). We should investigate whether we could use this method to load our web pages.

@drasill

@drasill drasill commented Oct 3, 2013

I'd love it :)

@ariya

@ariya ariya commented Oct 12, 2013

+1

@allquixotic

@allquixotic allquixotic commented Nov 7, 2013

+1

@kogeva

@kogeva kogeva commented Nov 7, 2013

+1

@laurentj
Owner Author

@laurentj laurentj commented Nov 8, 2013

I started working on this feature using this new API. Unfortunately, its implementation still has some major issues for our use case, which prevent us from using it. We have to wait for improvements to this new API.

@allquixotic

@allquixotic allquixotic commented Nov 25, 2013

Has anyone communicated with Mozilla regarding the limitations of the createWindowlessBrowser API? Are they considering addressing it so that we don't need an X11 server (headless or otherwise) to run SlimerJS on a GNU/Linux dedicated server? It probably will not be improved unless we let them know our needs...

@matanster

@matanster matanster commented Jan 6, 2014

What are the actual downsides of just using Xvfb to get headless operation?
In my naive example code it works just fine, and the running time seems to be the same (typically ~2.5 seconds, but every few trials ~12 seconds; I'm not sure why it varies, but the variance is the same with and without Xvfb; this is with XULRunner, not Firefox).

Isn't it a harmless way to go, emulating a screen and letting Gecko do its stuff as it normally would? Or has anyone run into specific problems, or is anyone aware of problems inherent to this combination?

Here is the code, which you may find familiar from the documentation:

var webpage = require("webpage");

// Create a page object and load the SlimerJS home page.
var page = webpage.create();
page.open("http://slimerjs.org")
    .then(function (status) {
        // open() reports "success" when the page has loaded.
        if (status === "success") {
            console.log("The title of the page is: " + page.title);
        } else {
            console.log("Sorry, the page is not loaded");
        }
        page.close();
        phantom.exit();
    });

@allquixotic

@allquixotic allquixotic commented Jan 6, 2014

The downsides of having it headed on what should be a headless box are:

  • Administrative burden: Having to maintain and keep track of instances of Xvfb (or some other X server);
  • Setting the DISPLAY environment variable properly and keeping track of which Xvfb is running for which user and access rights;
  • Security: making sure that unauthorized users are not able to access the Xvfb server as X11 clients;
  • Memory/CPU usage: Having the browser actually draw the graphics elements takes up more memory than having the draw commands simply be a no-op. A buffer has to be allocated in RAM where the pixels can be drawn to, and because Xvfb renders in software, all of the graphics elements of the webpage have to be rendered on the CPU. This can be pretty expensive if you're doing fancy things with HTML5. Rendering fonts in software takes a fair bit of CPU.

Startup time is not much of a concern; rather, the main concern is being able to quickly and efficiently start up a headless browser environment, without needing to make sure an X11 server of some kind is running and the DISPLAY environment variable is set, and without the associated performance overhead. It would be ideal if you could render screenshots "on demand", with the actual rendering to a pixmap taking place only when you request it from code, instead of going on all the time. You can achieve much shorter runtimes for automated tests by making all the drawing operations a no-op, and if you are doing this on a large enough scale, it can save significant heat, power, CI test time, etc.

@matanster

@matanster matanster commented Jan 6, 2014

Thanks for the deliberation. But wouldn't accomplishing exactly that require changing Gecko itself?
In addition, some JavaScript may rely on the result of rendering, e.g. commands that explore the dimensions of elements after they have been laid out, or am I going too fast here?

@allquixotic

@allquixotic allquixotic commented Jan 6, 2014

I think Laurent's hope was that the nsIAppShellService::createWindowlessBrowser() method would implement the changes in Gecko itself that we actually need in order to achieve this functionality. At the very least, even if it did rendering in RAM for each frame, it wouldn't require it to be rendered in an X server, which requires inter-process communication (sending X11 drawing commands over a UNIX domain socket is how it's currently implemented). It would even be a considerable speed-up, for small systems running SlimerJS, to just do all the rendering in-process, compared to the overhead of the IPC.

The ideal headless DOM implementation here would keep track of things like dimensions, JavaScript state, CSS state, and basically anything you can read back from the DOM or from the JavaScript engine, but it wouldn't perform actual rendering. As far as I understand it, you only need a model of the state machine in order to perform automated tests faithfully as a normal web browser would execute them; you don't need to actually render it.

...And rendering is fairly expensive, especially in pure software as Xvfb does it (on a modern desktop, the CPU cost of rendering is offset significantly by offloading various operations to the GPU, but of course this still consumes its fair share of power and heat because you're still doing a lot of work to render the frames).

To see how expensive rendering is, imagine doing the following in a tight loop: create a bunch of iframes with random dimensions (width x height); fill them with a random color; then remove them; and repeat. Regardless of whether you are rendering or not, you have to keep track of which elements are on the page, their size, what their contents are, what their properties are, etc. But, if you're doing rendering, you also have to figure out pixel RGBA values for the entire "frame" of the iframe, which includes its border, its internal contents, etc., drawing scrollbars as needed, and then you have to re-render any elements on the page that are either interacting with the iframe (above or below it, transparent stuff, etc.), or things that got moved (up/down/left/right) by the new presence of the iframe. Without rendering, you can simply skip all of that, because it doesn't matter. It's enough to know that the contents of the iframe are a red fill, for instance.
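
The loop described above could be sketched roughly as follows. This is only an illustration: the function name `churnIframes`, the 500-pixel upper bound, and the injectable `random` parameter are assumptions made for the example, not anything from SlimerJS. Note also that a real browser normally paints only after the script yields, so measuring paint cost would need one iteration per animation frame; that scheduling is left out here.

```javascript
// Sketch of the iframe churn loop. `doc` is any object exposing the DOM
// methods used below; in a real browser you would pass `document`.
// `random` is injectable so the loop can be made deterministic in tests.
function churnIframes(doc, iterations, random) {
  random = random || Math.random;
  var created = 0;
  for (var i = 0; i < iterations; i++) {
    var frame = doc.createElement("iframe");
    // Random dimensions between 1 and 500 CSS pixels (arbitrary bounds).
    frame.style.width = (1 + Math.floor(random() * 500)) + "px";
    frame.style.height = (1 + Math.floor(random() * 500)) + "px";
    // Random fill color, zero-padded to six hex digits.
    var rgb = Math.floor(random() * 0x1000000).toString(16);
    frame.style.backgroundColor = "#" + ("000000" + rgb).slice(-6);
    // The engine must track the element either way; only a rendering
    // engine also has to paint it and repaint whatever it displaced.
    doc.body.appendChild(frame);
    doc.body.removeChild(frame);
    created++;
  }
  return created;
}
```

In a browser console, `churnIframes(document, 1000)` would run the loop against the current page.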

An example of a JS and DOM engine that does a fair bit of JavaScript and webpage processing without any rendering is jsdom, which runs on NodeJS. The support for web standards isn't complete there, though, compared to the completeness that we see in Gecko and WebKit today. It's a really hard problem to solve to get all the web standards implemented and nailed down.

So the next step for SlimerJS would be to leverage Gecko's excellent support of web standards, but find a way to gut the rendering completely. This could have a real impact on, for instance, the hardware spec requirements of a large continuous integration server where each source code checkin requires running numerous automated unit tests using SlimerJS.

As a bonus -- and I think this is either required or highly recommended by the WebDriver interface -- it would be great if SlimerJS could still do normal rendering to an image on demand, which would basically involve flipping a switch to tell Gecko, "hey, start rendering now!" and then capture the output of that rendering into a bitmap, and then do a little compression (JPEG, PNG, etc) and provide it to the programmer as a binary output stream in their programming language of choice, either via WebDriver or the Phantom API.

One thing we should keep in mind is that, since SlimerJS is very unlikely to take advantage of hardware acceleration when it's used, it is not useful to benchmark the performance of a website based on SlimerJS's performance. To do that, you would need to run a production release of Firefox or Chrome on Windows, Mac, and ideally an iPad and a Nexus 7, and count the amount of time it takes for page loads, scrolling, etc. and make sure it's smooth. You simply can't judge fairly from SlimerJS, whether you do rendering or not: if you don't do rendering, then your performance is going to be unrealistically fast; if you do rendering, it's in software, and software rendering is always slow, so you won't be looking at a realistic render time anyway (since almost all clients these days will have hardware acceleration of some sort).

@matanster

@matanster matanster commented Jan 6, 2014

Yes, I concur it's better to drive a popular browser engine rather than a weak imitation such as the one you mention. BTW, Chrome moved from WebKit to Blink, and I think Apple is developing WebKit2 to supersede it, so WebKit may lose relevance over time in this context of headlessness.

Anyway, I'm not sure what we are including in 'rendering' on this thread (my own fault), so I'm not sure how HTML5 canvas drawing operations can be simulated without calculating what's on each pixel, in case run-time code relies on getting pixels back, e.g. for 'making' a screenshot. Or, say your code draws some stuff on an HTML5 canvas, you simulate a click, and your code needs to determine whether the click location is inside a shape or not (sorry if this sounds a bit like game development, but it can also occur in 'normal' applications). Could you clarify a bit the relationship between 'only having a model' and the actual computation of pixel values based on abstract drawing instructions?

Or did you maybe mean that there'd be a queue of rendering instructions that is left unhandled until an instruction to fetch the state of the virtual display is encountered?

Thanks!

@allquixotic

@allquixotic allquixotic commented Jan 6, 2014

Actually, with something like HTML5 canvas, you may indeed be correct that the entire model would need to be rendered under some circumstances, in order to compute per-pixel values so that they can be read back. In the ordinary context of e.g. HTML4 forms, rendering can almost always be skipped entirely.
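
As a rough illustration of where the line falls: the "is this click inside a shape?" case can be answered from the model alone when the application keeps the shape's vertices, whereas reading pixels back would still need real rasterization. The sketch below uses the standard ray-casting test; the function name `pointInPolygon` and the `[x, y]` array representation are assumptions made for the example.

```javascript
// Ray-casting point-in-polygon test: no pixels are computed, only geometry.
// `point` is [x, y]; `vertices` is an array of [x, y] pairs in order.
function pointInPolygon(point, vertices) {
  var inside = false;
  for (var i = 0, j = vertices.length - 1; i < vertices.length; j = i++) {
    var xi = vertices[i][0], yi = vertices[i][1];
    var xj = vertices[j][0], yj = vertices[j][1];
    // Does a horizontal ray going right from `point` cross edge (j -> i)?
    var crosses = (yi > point[1]) !== (yj > point[1]) &&
      point[0] < (xj - xi) * (point[1] - yi) / (yj - yi) + xi;
    if (crosses) {
      inside = !inside; // an odd number of crossings means "inside"
    }
  }
  return inside;
}
```

For a 10 x 10 square with one corner at the origin, a click at (5, 5) is reported as inside and a click at (15, 5) as outside, without any drawing having taken place.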

Your last paragraph brought up the interesting idea of applying "lazy evaluation" (a concept from functional programming) to the headless rendering model. This would be a great optimization. In the case of taking a screenshot, the entire evaluation chain for rendering the current state would be called. In the case of a certain API reading back a specific pixel value, the renderer might be able to figure out a minimal set of drawing operations that need to be invoked in order to determine that pixel's value. But when neither of these conditions is hit, you'd simply have a set of potentially applicable drawing operations in a queue that could be discarded when the state is updated via DOM manipulation or similar. The drawing operations in the queue would be something like a set of function pointers with their applicable arguments, which, in "normal" operation, would never actually get invoked, but if the drawing is required for some functionality, then the queue would get emptied and each function executed.
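
A minimal sketch of that queue, under heavy simplifying assumptions: `LazyCanvas`, `paintRect`, and the sparse pixel map are all invented for illustration and have nothing to do with how Gecko is structured. It also runs the whole queue on readback rather than working out the minimal set of operations for one pixel.

```javascript
// Deferred drawing queue: operations are recorded, not executed, until a
// readback (a pixel read here; a screenshot in the real case) forces them.
function LazyCanvas() {
  this.pending = [];  // recorded operations: { fn, args }
  this.pixels = {};   // sparse "framebuffer": "x,y" -> color
  this.executed = 0;  // number of operations that actually ran
}

// Record a rectangle fill without touching any pixels.
LazyCanvas.prototype.fillRect = function (x, y, w, h, color) {
  this.pending.push({ fn: paintRect, args: [x, y, w, h, color] });
};

// A state change that makes earlier drawing irrelevant simply drops the
// queue, so those operations never run at all.
LazyCanvas.prototype.clear = function () {
  this.pending = [];
  this.pixels = {};
};

// Run every queued operation, in the order it was recorded.
LazyCanvas.prototype.flush = function () {
  while (this.pending.length > 0) {
    var op = this.pending.shift();
    op.fn.apply(this, op.args);
    this.executed++;
  }
};

// Reading a pixel back is what forces the queued work to happen.
LazyCanvas.prototype.getPixel = function (x, y) {
  this.flush();
  return this.pixels[x + "," + y] || null;
};

// The actual (expensive) per-pixel work, only reached through flush().
function paintRect(x, y, w, h, color) {
  for (var i = x; i < x + w; i++) {
    for (var j = y; j < y + h; j++) {
      this.pixels[i + "," + j] = color;
    }
  }
}
```

A test that only calls `fillRect` and `clear` never pays for `paintRect`; the cost appears only when `getPixel` is called.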

Not sure if something like this is already implemented anywhere (PhantomJS?) or if it would be such a huge change in code that no one has attempted it just for the sake of optimizing headless browser implementations.

Obviously, from the perspective of "if it works, use it", SlimerJS with Xvfb or PhantomJS without Xvfb is more than fine, but in that case, it's equally easy to run Firefox in Xvfb and automate it with Selenium WebDriver. This works for code that is not particularly performance-sensitive, or where a little extra CPU/RAM is not a hindrance to the project. But I could certainly see rendering as a potential bottleneck on something like an ARM SoC, where a "pure" model-tracking DOM/JS implementation would fly, but if you add in drawing commands for every frame, it would be as slow as the browser on a low-end Android phone.

To sum it up, since SlimerJS is ultimately taking the performance and memory cost of rendering due to its headed-ness, I see absolutely no reason for this project to be used in lieu of Firefox itself, unless your goal is to use a large amount of code written for PhantomJS's API and run it on Gecko. I'm looking at this from the perspective of someone who has never written a single line of PhantomJS API, though; I always use Selenium WebDriver with PhantomJS/SlimerJS. Firefox, Chrome, IE, and PhantomJS all support WebDriver, as does SlimerJS, so it's really the most flexible browser automation solution we have. SlimerJS is not going to be noticeably faster or lighter-weight or render noticeably differently than Firefox until its rendering pipeline can be turned off when it's not needed, hence why I can't really see what the use case is right now.

@mariomash

@mariomash mariomash commented Feb 6, 2014

+1

@ragingSloth

@ragingSloth ragingSloth commented Jun 13, 2014

+1

@alfredwesterveld

@alfredwesterveld alfredwesterveld commented Jul 18, 2014

+1

@anback

@anback anback commented Aug 4, 2014

+1

@nchicong

@nchicong nchicong commented Oct 14, 2014

+1

@dirkluijk

@dirkluijk dirkluijk commented Oct 22, 2014

👍 I also want to use it with Selenium WebDriver, in which situation any performance improvement would be great, so that it would be more suitable than Firefox.

@professorplumb

@professorplumb professorplumb commented Oct 26, 2014

👍

@vbauer

@vbauer vbauer commented Nov 13, 2014

👍

@moiseevigor

@moiseevigor moiseevigor commented Dec 3, 2014

👍

@JLarky

@JLarky JLarky commented Dec 11, 2014

👍

@allquixotic

@allquixotic allquixotic commented Dec 11, 2014

@matanster Uh, SlimerJS is already built on the Gecko codebase....

@rmsphd

@rmsphd rmsphd commented Apr 18, 2015

👍

@ghost

@ghost ghost commented Apr 25, 2015

👍

@nicolsondsouza

@nicolsondsouza nicolsondsouza commented Jul 2, 2015

👍

@hadim

@hadim hadim commented Jul 4, 2015

+1

@nicolai86

@nicolai86 nicolai86 commented Jul 5, 2015

👍

@ecLAllanon

@ecLAllanon ecLAllanon commented Dec 4, 2015

👍

@tostercx

@tostercx tostercx commented Jan 6, 2016

+1

@devanshah1

@devanshah1 devanshah1 commented Apr 8, 2016

+1

@boyomarinov

@boyomarinov boyomarinov commented Apr 22, 2016

+1

@laurentj
Owner Author

@laurentj laurentj commented May 2, 2016

Please stop commenting with "+1". Instead, use the "add your reaction" (smiley) button above the issue description ;-) Thank you

@brendandahl

@brendandahl brendandahl commented Feb 18, 2017

For those interested, I've started some work on this in https://bugzilla.mozilla.org/show_bug.cgi?id=1338004 . It's still at a very early stage, but I have a simple SlimerJS script that snapshots a page working. Also, very early perf numbers show this shaving off around 0.1-0.2 s on this very simple script, and I also see a React benchmark go from ~24 fps to 40 fps. If anyone has some exceptionally slow SlimerJS tests, I'd be curious to see them.

@laurentj
Owner Author

@laurentj laurentj commented Feb 20, 2017

@brendandahl this is very good news! However, SlimerJS has some issues with Fx>52 and I cannot launch your test (even with an "official" nightly). I will fix this issue and then test your build :-)

@laurentj
Owner Author

@laurentj laurentj commented Mar 10, 2017

@brendandahl Awesome! It works with the latest release of SlimerJS, 0.10.3!! (after changing the max version to 54.* in application.ini)

export DISPLAY=77
export MOZ_HEADLESS=1
export SLIMERJSLAUNCHER=/home/laurent/tmp/fxheadless/firefox
slimerjs test.js

@jefleponot
Contributor

@jefleponot jefleponot commented Mar 10, 2017

@laurentj,

Just one question 👍
SlimerJS 0.10.3 is recommended for testing Firefox 38 to 54 -> ok

Which version do you recommend for testing Firefox 17 to 37?

I'm asking because of CasperJS tests... Do you have a long-term stable version?

@laurentj
Owner Author

@laurentj laurentj commented Mar 10, 2017

@jefleponot SlimerJS 0.9.x

@laurentj
Owner Author

@laurentj laurentj commented Apr 11, 2017

The latest nightlies of Firefox support the headless mode, even if there are some crashes with some unit tests of SlimerJS.

@jefleponot
Contributor

@jefleponot jefleponot commented Apr 11, 2017

Hi @laurentj
Thanks for the news.
Do you have a command-line example?

Does that mean that SlimerJS will not be maintained anymore?

Thanks in advance

@laurentj
Owner Author

@laurentj laurentj commented Apr 11, 2017

No more crashes with the build 2017-04-11 of Firefox nightly 55.0a1

@laurentj
Owner Author

@laurentj laurentj commented Apr 11, 2017

@jefleponot

> Do you have a command-line example?

See one of my latest comments here.

> Does that mean that SlimerJS will not be maintained anymore?

Why? Firefox does not have the features of SlimerJS, but SlimerJS needs Firefox to run.

@brendandahl

@brendandahl brendandahl commented Apr 11, 2017

Just to note, headless is currently only supported on Linux. If you want to follow along for other platforms see:
MacOS:
https://bugzilla.mozilla.org/show_bug.cgi?id=1355147
Windows:
https://bugzilla.mozilla.org/show_bug.cgi?id=1355150

@mykmelez

@mykmelez mykmelez commented Jun 15, 2017

Note that Firefox 55 (currently in beta) will support headless browsing on Linux, per this announcement:

https://groups.google.com/forum/#!topic/firefox-dev/TEhvuBXcJCg

Beta builds are available from https://www.mozilla.org/en-US/firefox/channel/desktop/. As @brendandahl noted, Windows and macOS support is still in progress, and you can follow along on the bugs he references.

@puravida

@puravida puravida commented Jun 16, 2017

Aside from the obvious resource savings of running headless, are there other benefits as well? Would this be more or less likely to crash (or properly load complex/problematic pages) than running a full instance? Are there other downsides to running Firefox headless? If this has all been discussed, ad nauseam, before then please do point me there with a link. ;)

In the past, I spent more than 500 hours evaluating every available script, open source and paid, for capturing screenshots to use as a backup for my own, proprietary method. From that testing, I concluded that running headless had some serious issues with certain kinds of pages, such as pages that format based on viewport dimensions and such. The problem is that running headless meant there were no viewport dimensions reported, so some pages were crunched or had elements extending off the page (improper page height) or the backgrounds did not extend fully, etc. This is a major reason that CutyCapt, PhantomJS, and similar failed to meet my needs as a backup solution.

Lastly, has anyone run any metrics yet to judge the gains (less ram, less cpu) from running headless?

Just curious... Thanks!

@drasill

@drasill drasill commented Jun 16, 2017

I don't know how it works on other platforms, but the major point on Linux is that non-headless operation needs a running Xorg server, which is quite complicated to set up on servers and in Docker instances.

@puravida

@puravida puravida commented Jun 16, 2017

@drasill Thanks for the note. You are so right. Managing the "window manager" is one of my top headaches, and I would LOVE to eliminate that piece of the process.

However, the viewport issues are too problematic, since I have to render ANY page as accurately as possible. I've captured screenshots of more than 500 million URLs (conservatively), and I think I've probably actually seen a million of them, haha! So, I know that the other headless methods were only good for specific use-cases where the web page(s) was/were known to be compatible.

However, as technology and methods advance, it would be GREAT to see a headless implementation that could overcome those limitations. :)

@mykmelez

@mykmelez mykmelez commented Jun 16, 2017

> Aside from the obvious resource savings of running headless, are there other benefits as well? Would this be more or less likely to crash (or properly load complex/problematic pages) than running a full instance? Are there other downsides to running Firefox headless? If this has all been discussed, ad nauseam, before then please do point me there with a link. ;)

I don't expect it to be less likely to crash, as it still exercises all of the rendering pipeline at the moment. However, future optimizations might avoid rendering and compositing (until you take a screenshot, anyway), which would reduce crashes in that code and improve performance. (Out-of-process compositing will also help with crashiness, for both headed and headless Firefox.)

Headless mode may also be more consistent, especially if you script Firefox on multiple platforms, since parts of the headless "widget backend" are cross-platform. A possible downside of that is that headless doesn't produce exactly the same results as headed with a platform-specific widget backend.

> The problem is that running headless meant there were no viewport dimensions reported, so some pages were crunched or had elements extending off the page (improper page height) or the backgrounds did not extend fully, etc. This is a major reason that CutyCapt, PhantomJS, and similar failed to meet my needs as a backup solution.

Headless mode for Firefox should report viewport dimensions (i.e. window.innerWidth/innerHeight) correctly for any window you open (or that SlimerJS opens for you). Screen dimensions are hardcoded to 1366x768, per HeadlessScreenHelper::GetScreenRect, but can be configured at runtime via the MOZ_HEADLESS_WIDTH and MOZ_HEADLESS_HEIGHT environment variables.
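
As a rough sketch of how those variables might be wired up from a Node.js launcher (not from inside a SlimerJS script): the helper names `headlessEnv` and `launchSlimer` are hypothetical, and the sketch assumes `slimerjs` is on the PATH and that Firefox reads `MOZ_HEADLESS`, `MOZ_HEADLESS_WIDTH` and `MOZ_HEADLESS_HEIGHT` as described above. Any `SLIMERJSLAUNCHER` setting is expected to already be present in the base environment.

```javascript
// True for integers greater than zero.
function isPositiveInt(n) {
  return Number.isInteger(n) && n > 0;
}

// Return a copy of `baseEnv` with headless mode enabled and, optionally,
// the headless screen size overridden. The input object is not modified.
function headlessEnv(baseEnv, width, height) {
  var env = {};
  Object.keys(baseEnv).forEach(function (key) {
    env[key] = baseEnv[key];
  });
  env.MOZ_HEADLESS = "1";
  if (width !== undefined || height !== undefined) {
    if (!isPositiveInt(width) || !isPositiveInt(height)) {
      throw new RangeError("width and height must both be positive integers");
    }
    // Environment variables are strings, so convert explicitly.
    env.MOZ_HEADLESS_WIDTH = String(width);
    env.MOZ_HEADLESS_HEIGHT = String(height);
  }
  return env;
}

// Hypothetical launcher: starts `slimerjs <script>` with the given
// environment. Assumes the `slimerjs` command is available on the PATH.
function launchSlimer(script, env) {
  var spawn = require("child_process").spawn;
  return spawn("slimerjs", [script], { env: env, stdio: "inherit" });
}
```

A call such as `launchSlimer("test.js", headlessEnv(process.env, 1920, 1080))` would then run the script against a 1920x1080 headless screen, if the assumptions above hold.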

@puravida

@puravida puravida commented Jun 16, 2017

@mykmelez Nice. That is very insightful. Thank you.

> Headless mode for Firefox should report viewport dimensions (i.e. window.innerWidth/innerHeight) correctly for any window you open (or that SlimerJS opens for you). Screen dimensions are hardcoded to 1366x768, per HeadlessScreenHelper::GetScreenRect, but can be configured at runtime via the MOZ_HEADLESS_WIDTH and MOZ_HEADLESS_HEIGHT environment variables.

This would be awesome! Other scripts claimed to send the viewport, but really did not. If slimerjs could hook into the ability to adjust, like via MOZ_HEADLESS_XXX, then that would be a game-changer.

Thanks for the insight...

@mykmelez

@mykmelez mykmelez commented Aug 16, 2017

Support for Windows and Mac has since landed in Firefox 56, which is currently in Beta.

@laurentj Is there anything to change in SlimerJS code to resolve this issue, or is it sufficient to document that you can run SlimerJS in headless mode on Firefox 56+ by specifying the right Firefox command-line flag/environment variable?

@birtles
Contributor

@birtles birtles commented Aug 21, 2017

I suppose at the very least the MaxVersion in application.ini needs to be updated?

@laurentj laurentj added this to the SlimerJS 1.0 milestone Oct 4, 2017
@laurentj laurentj closed this in 641eae8 Oct 13, 2017
@brendanhowell

@brendanhowell brendanhowell commented Oct 16, 2017

This is awesome! Will I be able to use my old scripts that render to PDF or is this only for PNG?
