Turn SlimerJS into a headless browser #80
Comments
I'd love it :) |
+1 |
+1 |
+1 |
I started to work on this feature by using this new API. Unfortunately, there are still some huge issues in its implementation, for our case, which prevent us from using it. We have to wait for improvements to this new API. |
Has anyone communicated with Mozilla regarding the limitations of the createWindowlessBrowser API? Are they considering addressing it so that we don't need an X11 server (headless or otherwise) to run SlimerJS on a GNU/Linux dedicated server? It probably will not be improved unless we let them know our needs... |
What may actually be the downsides of just using xvfb for obtaining headlessness? Isn't it just a harmless way to go, emulating a screen and letting Gecko just do its stuff as it normally would, or has anyone run into specific problems, or aware of problems inherent to this combination? Here is the code, which you may find familiar from the documentation:
|
The downsides of having it headed on what should be a headless box are:
Startup time is not much of a concern; rather, the main concern is being able to quickly and efficiently start up a headless browser environment, without needing to make sure an X11 server of some kind is running and the DISPLAY environment variable is set, and without the associated performance overhead. It would be ideal if you could render screenshots "on-demand", but only have the actual rendering to a pixmap take place when you actually request it from code, instead of having it go on all the time. You can achieve much shorter runtimes of automated tests by making all the drawing operations a no-op, and if you are doing this on a large enough scale, it can actually save significant heat, power, CI test time, etc. |
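To make the DISPLAY requirement concrete, here is a minimal Python sketch of the kind of wrapper a CI job ends up needing today. It assumes a `slimerjs` launcher and the `xvfb-run` helper are on the PATH; the function name `build_slimerjs_command` is made up for this illustration and is not part of SlimerJS.

```python
import os
import shutil


def build_slimerjs_command(script, env=None, which=shutil.which):
    """Return the argv list needed to run `script` with SlimerJS.

    If DISPLAY is set, SlimerJS is invoked directly. Otherwise the command
    is wrapped in `xvfb-run`, which starts a temporary Xvfb server and sets
    DISPLAY for the child process.
    """
    env = os.environ if env is None else env
    command = ["slimerjs", script]  # assumed launcher name on the PATH
    if env.get("DISPLAY"):
        return command
    if which("xvfb-run") is None:
        raise RuntimeError("no DISPLAY set and xvfb-run is not installed")
    # --auto-servernum picks a free X display number instead of failing
    # when the default one is already taken.
    return ["xvfb-run", "--auto-servernum"] + command
```

The resulting list can be handed to `subprocess.run(..., check=True)`. A truly headless mode would make the whole fallback branch unnecessary.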
Thanks for the deliberation. But wouldn't accomplishing exactly that require changing Gecko itself? |
I think Laurent's hope was that the nsIAppShellService::createWindowlessBrowser() method would implement the changes in Gecko itself that we actually need in order to achieve this functionality. At the very least, even if it did rendering in RAM for each frame, it wouldn't require it to be rendered in an X server, which requires inter-process communication (sending X11 drawing commands over a UNIX domain socket is how it's currently implemented). It would even be a considerable speed-up, for small systems running SlimerJS, to just do all the rendering in-process, compared to the overhead of the IPC. The ideal headless DOM implementation here would keep track of things like dimensions, JavaScript state, CSS state, and basically anything you can read back from the DOM or from the JavaScript engine, but it wouldn't perform actual rendering. As far as I understand it, you only need a model of the state machine in order to perform automated tests faithfully as a normal web browser would execute them; you don't need to actually render it. ...And rendering is fairly expensive, especially in pure software as Xvfb does (the CPU and performance cost of rendering is offset significantly by offloading to the GPU of various operations on a modern desktop, but of course this consumes its fair share of power and heat because you're still doing a lot of work to render the frames). To see how rendering is expensive, imagine doing the following in a tight loop: create a bunch of iframes with random dimensions (width x height); fill them with a random color; then remove them; and repeat. Regardless of whether you are rendering or not, you have to keep track of which elements are on the page, their size, what their contents are, what their properties are, etc. 
But, if you're doing rendering, you also have to figure out pixel RGBA values for the entire "frame" of the iframe, which includes its border, its internal contents, etc., drawing scrollbars as needed, and then you have to re-render any elements on the page that are either interacting with the iframe (above or below it, transparent stuff, etc), or things that got moved (up/down/left/right) by the new presence of the iframe. Without rendering, you can simply skip all of that, because it doesn't matter. It's enough to simply know that the content of the iframe is a red fill, for instance. An example of a JS and DOM engine that does a fair bit of JavaScript and webpage processing without any rendering is jsdom, which runs on Node.js. The support for web standards isn't complete there, though, compared to the completeness that we see in Gecko and WebKit today. It's a really hard problem to solve to get all the web standards implemented and nailed down. So the next step for SlimerJS would be to leverage Gecko's excellent support of web standards, but find a way to gut the rendering completely. This could have a real impact on, for instance, the hardware spec requirements of a large continuous integration server where each source code check-in requires running numerous automated unit tests using SlimerJS. As a bonus -- and I think this is either required or highly recommended by the WebDriver interface -- it would be great if SlimerJS could still do normal rendering to an image on demand, which would basically involve flipping a switch to tell Gecko, "hey, start rendering now!" and then capture the output of that rendering into a bitmap, and then do a little compression (JPEG, PNG, etc) and provide it to the programmer as a binary output stream in their programming language of choice, either via WebDriver or the Phantom API. 
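As a rough illustration of the iframe thought experiment, the toy Python model below runs the same add/remove loop with and without rasterizing. It is only a sketch: the `Page` class and `churn` function are invented here, every box is placed at the origin, and layout, overlap, borders and scrollbars are ignored, so the pixel counter is a crude proxy for rendering cost and not a measurement of Gecko.

```python
import random


class Page:
    """Tracks element state; optionally rasterizes every change."""

    def __init__(self, width, height, render):
        self.width = width
        self.height = height
        self.render = render
        self.elements = {}          # id -> (x, y, w, h, color)
        self.pixels_written = 0     # crude proxy for rendering cost
        self.framebuffer = (
            [[(255, 255, 255)] * width for _ in range(height)] if render else None
        )

    def add(self, elem_id, x, y, w, h, color):
        # The model is always updated, rendering or not.
        self.elements[elem_id] = (x, y, w, h, color)
        if self.render:
            self._fill(x, y, w, h, color)

    def remove(self, elem_id):
        x, y, w, h, _ = self.elements.pop(elem_id)
        if self.render:
            # Repaint the vacated area with the page background.
            self._fill(x, y, w, h, (255, 255, 255))

    def _fill(self, x, y, w, h, color):
        for row in range(y, min(y + h, self.height)):
            for col in range(x, min(x + w, self.width)):
                self.framebuffer[row][col] = color
                self.pixels_written += 1


def churn(page, iterations, seed=0):
    """Create a randomly sized, randomly colored box, then remove it; repeat."""
    rng = random.Random(seed)
    for i in range(iterations):
        w, h = rng.randint(1, 200), rng.randint(1, 200)
        color = (rng.randrange(256), rng.randrange(256), rng.randrange(256))
        page.add(i, 0, 0, w, h, color)
        page.remove(i)
    return page
```

Both `churn(Page(200, 200, render=False), 50)` and `churn(Page(200, 200, render=True), 50)` end with the same (empty) element model, but only the rendering page has touched any pixels.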
One thing we should keep in mind is that, since SlimerJS is very unlikely to take advantage of hardware acceleration when it's used, it is not useful to benchmark the performance of a website based on SlimerJS's performance. To do that, you would need to run a production release of Firefox or Chrome on Windows, Mac, and ideally an iPad and a Nexus 7, and count the amount of time it takes for page loads, scrolling, etc. and make sure it's smooth. You simply can't judge fairly from SlimerJS, whether you do rendering or not: if you don't do rendering, then your performance is going to be unrealistically fast; if you do rendering, it's in software, and software rendering is always slow, so you won't be looking at a realistic render time anyway (since almost all clients these days will have hardware acceleration of some sort). |
Yes, I concur it's better to drive a popular browser engine rather than a weak imitation such as the one you mention. BTW, Chrome moved from WebKit to Blink, and I think Apple is developing WebKit2 to supersede it, so WebKit may possibly lose relevance over time in this context of headlessness. Anyway, I am not sure what we include in 'rendering' here on this thread (my own fault), so I'm not sure how HTML5 canvas drawing operations can be simulated without calculating what's on each pixel, in case run-time code relies on getting pixels back, e.g. for 'making' a screenshot. Or, say your code draws some stuff on an HTML5 canvas and you simulate a click and need your code to determine whether the click location is inside a shape or not (sorry if this sounds a bit like game development, but it can also occur in 'normal' applications). Could you clarify just a bit about the relationship between 'only having a model' and the actual computation of pixel values based on abstract drawing instructions? Or did you maybe mean that there'd be a queue of rendering instructions that will be left unhandled until an instruction to fetch the state of the virtual display is encountered? Thanks! |
Actually, with something like HTML5 canvas, you may indeed be correct that the entire model would need to be rendered under some circumstances, in order to compute per-pixel values so that they can be read back. In the ordinary context of e.g. HTML4 forms, rendering can almost always be skipped entirely. Your last paragraph brought up the interesting idea of applying "lazy evaluation" (a concept from functional programming) to the headless rendering model. This would be a great optimization. In the case of taking a screenshot, the entire evaluation chain for rendering the current state would be called. In the case of a certain API reading back a specific pixel value, the renderer might be able to figure out a minimal set of drawing operations that need to be invoked in order to determine that pixel's value. But when neither of these conditions is hit, you'd simply have a set of potentially-applicable drawing operations in a queue that could be flushed when the state is updated via DOM manipulation or similar. The drawing operations in the queue would be something like a set of function pointers with their applicable arguments, which, in "normal" operation, would never actually get invoked, but if the drawing is required for some functionality, then the queue would get emptied and each function executed. Not sure if something like this is already implemented anywhere (PhantomJS?) or if it would be such a huge change in code that no one has attempted it just for the sake of optimizing headless browser implementations. Obviously, from the perspective of "if it works, use it", SlimerJS with Xvfb or PhantomJS without Xvfb is more than fine, but in that case, it's equally easy to run Firefox in Xvfb and automate it with Selenium WebDriver. This works for code that is not particularly performance-sensitive, or where a little extra CPU/RAM is not a hindrance on the project. 
But I could certainly see rendering as being a potential bottleneck on something like an ARM SoC, where a "pure" model-tracking DOM/JS implementation would fly, but if you add in drawing commands for every frame, it would be as slow as the browser in a low-end Android phone. To sum it up, since SlimerJS is ultimately taking the performance and memory cost of rendering due to its headed-ness, I see absolutely no reason for this project to be used in lieu of Firefox itself, unless your goal is to use a large amount of code written for PhantomJS's API and run it on Gecko. I'm looking at this from the perspective of someone who has never written a single line of PhantomJS API, though; I always use Selenium WebDriver with PhantomJS/SlimerJS. Firefox, Chrome, IE, and PhantomJS all support WebDriver, as does SlimerJS, so it's really the most flexible browser automation solution we have. SlimerJS is not going to be noticeably faster or lighter-weight or render noticeably differently than Firefox until its rendering pipeline can be turned off when it's not needed, which is why I can't really see what the use case is right now. |
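The lazy-evaluation idea discussed above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Gecko or PhantomJS work: the `LazyCanvas` class is invented for illustration, the only drawing operation is an opaque rectangle fill, and the "minimal set of operations" for a single pixel is therefore just the most recent fill that covers it.

```python
class LazyCanvas:
    """Queues drawing operations and runs them only when a frame is needed."""

    def __init__(self, width, height, background=(255, 255, 255)):
        self.width = width
        self.height = height
        self._pixels = [[background] * width for _ in range(height)]
        self._pending = []      # (function, args) pairs not yet executed
        self.ops_executed = 0

    def fill_rect(self, x, y, w, h, color):
        # Record the operation instead of touching any pixels.
        self._pending.append((self._do_fill_rect, (x, y, w, h, color)))

    def get_pixel(self, x, y):
        # Only the most recent opaque fill covering (x, y) matters, so scan
        # the queue backwards without executing anything.
        for _, (rx, ry, rw, rh, color) in reversed(self._pending):
            if rx <= x < rx + rw and ry <= y < ry + rh:
                return color
        return self._pixels[y][x]

    def screenshot(self):
        # A full frame is requested: run every queued operation in order.
        for func, args in self._pending:
            func(*args)
            self.ops_executed += 1
        self._pending.clear()
        return [row[:] for row in self._pixels]

    def _do_fill_rect(self, x, y, w, h, color):
        for row in range(max(y, 0), min(y + h, self.height)):
            for col in range(max(x, 0), min(x + w, self.width)):
                self._pixels[row][col] = color
```

With this design, a test that never asks for pixels never pays for rasterization; `screenshot()` is the "start rendering now" switch. Real content (transparency, text, canvas paths) would make the single-pixel shortcut much harder than it looks here.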
+1 |
+1 |
+1 |
+1 |
+1 |
👍 I also want to use it with Selenium WebDriver, in which situation any performance improvement would be great, so that it would be more suitable than Firefox. |
👍 |
👍 |
👍 |
👍 |
@matanster Uh, SlimerJS is already built on the Gecko codebase.... |
👍 |
👍 |
👍 |
+1 |
👍 |
👍 |
+1 |
+1 |
+1 |
Please stop commenting with "+1". Instead, use the "add your reaction" (smiley) button above the issue description ;-) Thank you |
For those interested, I've started some work on this in https://bugzilla.mozilla.org/show_bug.cgi?id=1338004 . It's still in a very early stage, but I have a simple SlimerJS script that snapshots a page working. Also, very early perf testing shows this shaving off around 0.1-0.2 s on this very simple script, and I also see a React benchmark go from ~24 fps to 40 fps. If anyone has some exceptionally slow SlimerJS tests, I'd be curious to see them.
|
@brendandahl this is very good news! However, SlimerJS has some issues with Fx>52 and I cannot launch your test (even with an "official" nightly). I will fix this issue and then test your build :-) |
@brendandahl Awesome! It works with the latest release of SlimerJS, 0.10.3!! (after changing the max version to 54.* in application.ini)
|
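For anyone who wants to script the application.ini tweak mentioned above, here is a small Python sketch. It assumes the file follows the usual XULRunner layout, with a `[Gecko]` section holding `MinVersion` and `MaxVersion`; the helper name `set_gecko_max_version` is hypothetical. Note that `configparser` drops comments when it rewrites a file, so editing by hand may be preferable.

```python
import configparser
import io


def set_gecko_max_version(ini_text, max_version):
    """Return `ini_text` with [Gecko] MaxVersion set to `max_version`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case (MaxVersion, not maxversion)
    parser.read_string(ini_text)
    if not parser.has_section("Gecko"):
        parser.add_section("Gecko")
    parser.set("Gecko", "MaxVersion", max_version)
    out = io.StringIO()
    # Write "Key=Value" without spaces, matching the usual file style.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

Typical use would be to read the application.ini shipped with SlimerJS, pass its text through `set_gecko_max_version(text, "54.*")`, and write the result back.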
Just one question 👍 Which version do you recommend for testing with Firefox 17 to 37? I ask because of CasperJS tests... Do you have a long-term stable version? |
@jefleponot SlimerJS 0.9.x |
The latest nightlies of Firefox support headless mode, even if there are some crashes with some unit tests of SlimerJS. |
Hi @laurentj, does that mean that SlimerJS will not be maintained anymore? Thanks in advance |
No more crashes with the 2017-04-11 build of Firefox Nightly 55.0a1 |
See one of my latest comments here
Why??? Firefox does not have the features of SlimerJS. But SlimerJS needs Firefox to run. |
Just to note, headless is currently only supported on Linux. If you want to follow along for other platforms see: |
Note that Firefox 55 (currently in beta) will support headless browsing on Linux, per this announcement: https://groups.google.com/forum/#!topic/firefox-dev/TEhvuBXcJCg Beta builds are available from https://www.mozilla.org/en-US/firefox/channel/desktop/. As @brendandahl noted, Windows and macOS support is still in progress, and you can follow along on the bugs he references. |
Aside from the obvious resource savings of running headless, are there other benefits as well? Would this be more or less likely to crash (or properly load complex/problematic pages) than running a full instance? Are there other downsides to running Firefox headless? If this has all been discussed, ad nauseam, before, then please do point me there with a link. ;) In the past, I spent more than 500 hours evaluating every available script, open source and paid, for capturing screenshots to use as a backup for my own, proprietary method. From that testing, I concluded that running headless had some serious issues with certain kinds of pages, such as pages that format based on viewport dimensions and such. The problem is that running headless meant there were no viewport dimensions reported, so some pages were crunched or had elements extending off the page (improper page height) or the backgrounds did not extend fully, etc. This is a major reason that CutyCapt, PhantomJS, and similar failed to meet my needs as a backup solution. Lastly, has anyone run any metrics yet to judge the gains (less RAM, less CPU) from running headless? Just curious... Thanks! |
I don't know how it works on other platforms, but the major point on Linux is that non-headless needs an Xorg server started, which is quite complicated on servers / Docker instances. |
@drasill Thanks for the note. You are so right. Managing the "window manager" is one of my top headaches, and I would LOVE to eliminate that piece of the process. However, the viewport issues are too problematic, since I have to render ANY page as accurately as possible. I've captured screenshots of more than 500 million URLs (conservatively), and I think I've probably actually seen a million of them, haha! So, I know that the other headless methods were only good for specific use-cases where the web page(s) was/were known to be compatible. However, as technology and methods advance, it would be GREAT to see a headless implementation that could overcome those limitations. :) |
I don't expect it to be less likely to crash, as it still exercises all of the rendering pipeline at the moment. However, future optimizations might avoid rendering and compositing (until you take a screenshot, anyway), which would reduce crashes in that code and improve performance. (Out-of-process compositing will also help with crashiness, for both headed and headless Firefox.) Headless mode may also be more consistent, especially if you script Firefox on multiple platforms, since parts of the headless "widget backend" are cross-platform. A possible downside of that is that headless doesn't produce exactly the same results as headed with a platform-specific widget backend.
Headless mode for Firefox should report viewport dimensions (i.e. window.innerWidth/innerHeight) correctly for any window you open (or that SlimerJS opens for you). Screen dimensions are hardcoded to |
@mykmelez Nice. That is very insightful. Thank you.
This would be awesome! Other scripts claimed to send the viewport, but really did not. If SlimerJS could hook into the ability to adjust it, like via MOZ_HEADLESS_XXX, then that would be a game-changer. Thanks for the insight... |
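A hedged sketch of what such a hook could look like from a Python test harness: it only builds the environment for the child process. `MOZ_HEADLESS=1` is the environment switch for Firefox's headless mode; treating `MOZ_HEADLESS_WIDTH` and `MOZ_HEADLESS_HEIGHT` as the screen-size overrides is an assumption about the Firefox build in use, and the `headless_env` helper is invented for this example.

```python
import os


def headless_env(width=None, height=None, base=None):
    """Return a copy of the environment with headless settings added."""
    env = dict(os.environ if base is None else base)
    env["MOZ_HEADLESS"] = "1"
    # Assumed override variables; leave them unset to keep the defaults.
    if width is not None:
        env["MOZ_HEADLESS_WIDTH"] = str(width)
    if height is not None:
        env["MOZ_HEADLESS_HEIGHT"] = str(height)
    return env
```

The returned dictionary would be passed as `env=` to `subprocess.run` when launching Firefox or SlimerJS. Whether `window.innerWidth`/`innerHeight` then match what a page expects still needs to be checked against the pages being captured.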
Support for Windows and Mac has since landed in Firefox 56, which is currently in Beta. @laurentj Is there anything to change in SlimerJS code to resolve this issue, or is it sufficient to document that you can run SlimerJS in headless mode on Firefox 56+ by specifying the right Firefox command-line flag/environment variable? |
I suppose at very least the |
This is awesome! Will I be able to use my old scripts that render to PDF or is this only for PNG? |
First solution: port the patch of bug 446591 into XulRunner (so we need to compile and provide our own XulRunner). It would be a huge amount of work.
Second solution: since Gecko 23, it seems there is a new method nsIAppShellService::createWindowlessBrowser(). We should investigate whether we could use this method to load our webpages.