
Benchmarks are misleading #19

Closed
twalpole opened this issue Jan 16, 2019 · 17 comments

@twalpole

twalpole commented Jan 16, 2019

After looking at the reason for the speed improvements, I think the benchmarks are misleading/invalid. The majority of the speedup comes from cuprite setting Capybara.default_max_wait_time to 0 for a large part of the Capybara specs, which means a large number of them aren't actually testing anything and makes the timing non-comparable to Selenium headless Chrome. If Selenium headless Chrome is run with the same modified wait times on Travis, its time drops from around 14 minutes to under 10 (but again it's not valid to do so, since it means a large number of tests aren't actually doing anything). Assuming a similar percentage speedup for the invalid tests, running Selenium headless Chrome on the hardware you benchmarked on should land in the 7 minute range, just like cuprite.
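For anyone unfamiliar with what the wait time does: Capybara finders and matchers retry until Capybara.default_max_wait_time expires, so setting it to 0 makes every waiting spec give up on the first check. A minimal illustration (the path and selector here are made up):

```ruby
require "capybara/rspec"

RSpec.describe "waiting behaviour", type: :feature do
  it "stops retrying when the wait time is zero" do
    Capybara.default_max_wait_time = 0

    visit "/slow_page" # hypothetical page that adds #appears-later via JS

    # Capybara normally keeps polling until the wait time expires; with 0 it
    # checks once and gives up, so the spec finishes fast but no longer
    # exercises the driver's ability to wait for asynchronous changes.
    expect(page).to have_css("#appears-later")
  end
end
```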

@route
Member

route commented Jan 16, 2019

@twalpole so then only cuprite vs poltergeist is a valid comparison

@route
Member

route commented Jan 16, 2019

@twalpole in fact I don't see where it's set to 0 for selenium the way it is in poltergeist/cuprite

@twalpole
Author

twalpole commented Jan 16, 2019

@route Capybara resets the time to 1 at https://github.com/teamcapybara/capybara/blob/master/lib/capybara/spec/spec_helper.rb#L26, which is called after every test. Cuprite resets it to 0 in a before block at https://github.com/machinio/cuprite/blob/master/spec/spec_helper.rb#L122, which technically means it's not actually valid to claim compatibility with Capybara, since the tests aren't using the wait times Capybara expects (same with Poltergeist, which I hadn't noticed does the same thing). It would be interesting to see the timing for Cuprite on your hardware without the wait time change, since there really isn't anything that should make it much faster than selenium locally.
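Roughly, the two hooks in question look like this (paraphrased from the linked spec helpers; treat the exact values as approximate):

```ruby
# Capybara's shared spec helper resets the wait time after every example:
RSpec.configure do |config|
  config.after do
    Capybara.default_max_wait_time = 1
  end
end

# Cuprite's (and Poltergeist's) own spec_helper then overrides it before each example:
RSpec.configure do |config|
  config.before do
    Capybara.default_max_wait_time = 0
  end
end
```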

@route
Member

route commented Jan 16, 2019

@twalpole oh wow, I didn't notice it either, even in poltergeist :) haha, ok let's see what we have

@route
Member

route commented Jan 16, 2019

If I remove this line from the spec helper, here's what I get for cuprite:

Finished in 8 minutes 44 seconds (files took 1.22 seconds to load)
1533 examples, 0 failures, 147 pending

yeah, that's much closer to selenium:

Finished in 9 minutes 3 seconds (files took 5.98 seconds to load)

@twalpole
Author

That's more like what I would expect, with the tests being skipped, etc. -- they really should be approximately equal in speed when run with the same settings, just with Cuprite able to support more features.

@route
Member

route commented Jan 16, 2019

I'm surprised that poltergeist is in third place:

Finished in 11 minutes 49 seconds (files took 0.54019 seconds to load)

@twalpole
Author

twalpole commented Jan 16, 2019

Not really a surprise -- the Capybara test suite is not like a real project's test suite. It purposely does a lot of things a user would/should not do, in order to test edge-case behaviors. This means timings for its test suite really aren't relevant to real-world project timing.

@route
Member

route commented Jan 16, 2019

yeah, but still, all three ran in the same environment, right? And yet phantomjs is slower than chrome

@twalpole
Author

twalpole commented Jan 16, 2019

yeah -- chrome has come a long way in the time since PhantomJS development stopped -- speedups in the browser (and in headless mode) should have made it similar in speed, given that a large part of the Capybara test suite's slowness is it intentionally waiting for things to happen/not happen.

@route
Member

route commented Jan 17, 2019

Updated README, thanks for pointing this out!

route closed this as completed on Jan 17, 2019
@lawso017

@route Thank you for a fantastic library -- we have been working on upgrading a large Rails app from 4.2 and capybara-webkit to 5.0, and have found Cuprite/Ferrum to be pretty close to a drop-in replacement for capybara-webkit. We use too many JS features for selenium, and Apparition started with a ton of test failures mostly due to timing issues.

The one issue I'm trying to understand is why our test suite has slowed by 50-100% as we re-enter the modern era of browser-based testing. We have ~2,000 tests, and with Capybara 2.18/capybara-webkit/Ruby 2.5 we ran in ~20 minutes.

With Capybara 3.30/Cuprite using headless Chrome and a whitelist, we started at 40 minutes with Ruby 2.5. Upgrading to Ruby 2.6 improved that to 33 minutes, but that's still a 50% performance penalty.
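For completeness, the driver registration is essentially the standard Cuprite setup with a URL whitelist; a sketch (option names as in Cuprite's README at the time, hosts are placeholders, not our real config):

```ruby
require "capybara/cuprite"

Capybara.register_driver(:cuprite) do |app|
  Capybara::Cuprite::Driver.new(
    app,
    headless: true,            # headless Chrome
    window_size: [1200, 800],
    url_whitelist: ["http://127.0.0.1", "http://localhost"] # placeholder hosts
  )
end

Capybara.javascript_driver = :cuprite
```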

Benchmarks have been hard to come by online; are you aware of any reason why headless CDP would be significantly slower than the relatively ancient capybara-webkit? I was trying to dig into the CDP implementation of Ferrum versus Apparition to better understand how the two libraries deal with waiting for asynchronous browser events -- it seems that Apparition is not working for us because it is trying to be too fast and not waiting long enough for basic things like the page's application.js file to load... while Cuprite passes all of our capybara-webkit tests, albeit relatively slowly.

Any thoughts greatly appreciated! Trying to have the best of both worlds :-)

@route
Member

route commented Jan 27, 2020

@lawso017 You are correct about the waiting: Cuprite does wait for certain events before proceeding, and this happens for many methods, not only goto. If you click or evaluate JS, that can start a page navigation, and of course we have to wait until the page fully loads.
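As a rough illustration with Ferrum's public API (a sketch, not Cuprite's actual internals):

```ruby
require "ferrum"

browser = Ferrum::Browser.new(timeout: 10)

# goto blocks until the page reports that it has loaded
browser.goto("https://example.com")

# A click can kick off a navigation, so a Cuprite-style driver has to wait
# for the resulting lifecycle events before running the next command,
# e.g. by waiting for the network to go idle.
browser.at_css("a")&.click
browser.network.wait_for_idle

browser.quit
```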

As for speed, I guess we had the same issues after switching from Poltergeist. I saw the time almost double (when running all tests sequentially), from 10m to 19m, and investigated it, merging some improvements into Ferrum that helped. What I can say now is that CDP as a protocol is not slow, in spite of the many messages passing between client and server. I thought network interception might be a reason for slowing down tests, but it looks like that is insignificant too, though it has some impact. Comparing whole test suites led me to conclude that Poltergeist gains its speed on resets between tests and on subsequent requests to the application (which may involve caching?), but comparing tests one by one there is no clear winner: Chrome is as fast as (or sometimes even faster than) Poltergeist.

Anyway, after spending some time on speed improvements and comparing results on CircleCI with parallel builds, the difference was only 1-3m compared to Poltergeist, so we decided that a modern browser is better than an outdated one even if it is slower. I'm afraid it's not that simple to fix; it requires a lot of time and energy, and it may be Chrome-related, which makes it even harder because they barely answer even simple issues like ChromeDevTools/devtools-protocol#125 and ChromeDevTools/devtools-protocol#145.

So for now I've stopped investigating speed issues and started working on features to make Cuprite/Ferrum the best tools for working with CDP in Ruby, but I only have 2 hands lol :)

I may revisit the performance issue in the future, after implementing the important features.

@lawso017

@route thank you for that context -- I also noticed that disconnect between individual tests running quickly but the overall suite being relatively slow by comparison. That's interesting, and I'll keep an eye on it for future exploration! In the meantime we're happy to be testing with a modern browser again.

@route
Member

route commented Jan 29, 2020

@lawso017 Surprisingly, I found that Capybara.server = :puma adds 2.5 minutes to the build for our application; check whether that's the case for you. I'm investigating it now. You may reduce your build time with :webrick lol
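For anyone who wants to try the same comparison, it's a one-line change in the Capybara setup (the puma options shown are just the common defaults):

```ruby
# spec/rails_helper.rb (or wherever Capybara is configured)

# Recent Capybara versions default to puma:
Capybara.server = :puma, { Silent: true }

# To compare against webrick, as discussed above:
# Capybara.server = :webrick
```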

@lawso017

lawso017 commented Feb 3, 2020

@route I have been unable to replicate a speed improvement using Capybara.server = :webrick -- in our environment it's a couple of minutes slower than puma.

I have observed something else of interest, though... we are building on CircleCI using two containers, and I was seeing some sporadic failures due to timeouts with the default 5 sec browser timeout. As I increased the timeout, however, the rspec job became much slower... profiling the run, it looks like increasing the timeout causes a slower overall run for some reason.

Here's an example comparing two successful runs:

10 sec timeout:
  Top 10 slowest examples (296.26 seconds, 29.8% of total time)
  Top 10 slowest examples (247.2 seconds, 21.8% of total time)
  => 27.1 sec avg across the 20 slowest examples, 19:54 total time

15 sec timeout:
  Top 10 slowest examples (368.18 seconds, 28.6% of total time)
  Top 10 slowest examples (373.79 seconds, 30.0% of total time)
  => 37.1 sec avg across the 20 slowest examples, 22:49 total time

Looking at Ferrum, it seems like the key line is simply data = pending.value!(@browser.timeout) in browser/client.rb's command method.

It does not seem like increasing the timeout should reduce the responsiveness of the call to pending.value!, but that is what appears to be happening... and I've not used concurrent-ruby before.
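To check my understanding of that call, here is a simplified sketch of the pattern with concurrent-ruby (not Ferrum's actual code):

```ruby
require "concurrent"

# The caller blocks on value!(timeout) and wakes up as soon as the IVar is
# set; the timeout only bounds how long it waits for a value that never
# arrives (in which case it comes back empty and Ferrum raises its own
# timeout error), so a larger timeout should not slow down the happy path.
pending = Concurrent::IVar.new

reader = Thread.new { pending.value!(30) }

sleep 0.1
pending.set({ "id" => 1, "result" => {} }) # simulate the CDP response arriving
puts reader.value.inspect                  # prints almost immediately, not after 30s
```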

I would have expected that increasing the timeout would allow for occasional slow responses without generating a Timeout error, but not result in generally slower performance overall. In my case, increasing the timeout makes our test suite take longer: a run with a 30 sec timeout topped out at 34:58 total time.

Curious if you've seen anything like that in your experience...

@route
Member

route commented Feb 4, 2020

@lawso017 I haven't found an issue with Capybara.server, so that means the problem is in our application then.

In Ferrum/Cuprite there are a few places related to the timeout, but in general, if a test only passes with an increased timeout, it usually means the test is not written properly -- though it could also be a bug somewhere. I've seen some cases even in our own application where I had to rewrite tests a bit, but I can't remember the details now. Run FERRUM_DEBUG=true bundle exec rspec spec/file for one of the suspicious tests with the increased timeout and email me the log file; I'll save you some time, I can usually find the issue pretty quickly.
