RackTest::Node#visible_text is slow #1307

Closed
pcreux opened this Issue May 15, 2014 · 11 comments

Projects

None yet

2 participants

@pcreux
pcreux commented May 15, 2014

Hello!

Running the following code on a webpage containing about 1000 lines take 0.3 seconds on my machine:

expect(page).to have_text "Download"

While this take "no" time:

text = Nokogiri::XML::parse(page.body).xpath("//body").text
expect(text).to have_text "Download"

I nailed down the performance bottleneck to RackTest::Node#visible_text.

class Capybara::RackTest::Node < Capybara::Driver::Node
  def visible_text
    now = Time.now
    r = Capybara::Helpers.normalize_whitespace(unnormalized_text)
    puts "Time in rack_text/node#visible_text #{Time.now - now}"
    r
  end
  # ...

would return:

Time in rack_text/node#visible_text 0.3498

I don't have time to dig deeper into it right now. Is this a known issue? Anybody ran into it before?

I'm using Ruby 2.1.2 with github capybara.

@pcreux
pcreux commented May 17, 2014

I worked more on this tonight.

The method Capybara::Node::Simple#visible? is called ~1475 times when running page.text. 0.47 seconds are spent running the xpath query of that method in total.

The xpath query seems being the issue here. Replacing it with a good old true makes page.text run in 0.04 seconds.

@twalpole
Collaborator

visible? is called on every element to check whether or not that element is actually visible (with caveats) and therefore should be checked for the desired text. For performance In your tests you could be checking the portion of the page you expect the text to be in for that text instead of the entire page, and thereby running visible? for a lot fewer elements. On the other hand if you don't care about visibility you could just do expect(page).to have_text(:all, 'Download')

@twalpole twalpole closed this May 17, 2014
@pcreux
pcreux commented May 17, 2014

Thanks Thomas,

I was just surprised by how slow it is. I thought this performance issue
appeared recently as I did not run into it before.

Using rack-test, capybara checks that the element is visible using inlined
styles. I assume that most elements are made invisibles via CSS classes
these days ("hide" for instance) thus making this check not reliable. I
don't think that the resulting decrease in performance is worth it though.
What is your opinion on this? Would it be worth escalating this to the
capybara mailing list?

Have a great week-end,

φ http://pcreux.com

On Sat, May 17, 2014 at 1:46 PM, Thomas Walpole notifications@github.comwrote:

visible? is called on every element to check whether or not that element
is actually visible (with caveats) and therefore should be checked for the
desired text. For performance In your tests you could be checking the
portion of the page you expect the text to be in for that text instead of
the entire page, and thereby running visible? for a lot fewer elements. On
the other hand if you don't care about visibility you could just do
expect(page).to have_text(:all, 'Download')


Reply to this email directly or view it on GitHubhttps://github.com/jnicklas/capybara/issues/1307#issuecomment-43423660
.

@twalpole
Collaborator

@pcreux - Yes, when using racktest it checks using inline styles (since its not reasonable to process and interpret css in capybara - hence the "with caveats" in my previous post). We try to provide as much parity as reasonably possible between different drivers, and reasonable detection of visible text is part of that parity. I don't think most people are paying a huge performance hit for it, since they really shouldn't be checking a whole page for the text. If you know you expect text in a message area, then find and check that message area for the text - then the performance hit is minimal. That being said, if you can speed up the visibility check query, without reducing the current functionality, then PRs would gladly be looked at (for instance their really isnt the need to check ancestors when walking the entire tree, since its already done). If you really don't want to have visibility checking in your tests then you can set Capybara.ignore_hidden_elements and Capybara.visible_text_only to false, or as stated before pass :all to the have_text matcher, and have basically 0 performance hit

@pcreux
pcreux commented May 19, 2014

Hey Thomas,

I gave it a shot without much success. The #visible? method is not always called while traversing the tree so it has to check ancestors-or-self.

@twalpole
Collaborator

@pcreux give my visible_optimization branch a try - https://github.com/twalpole/capybara/tree/visible_optimization - and let me know if that makes a noticeable difference - if so I'll look at pulling it in

@pcreux
pcreux commented May 20, 2014

I confirm a 100% speed increase on my side:

twalpole/capybara @visible_optimization
expect(page).to have_content
13.6 (±22.1%) i/s - 65 in   5.070680s
jnicklas/capybara @master
expect(page).to have_content
5.4 (±18.5%) i/s -  27 in   5.089715s

On Mon, May 19, 2014 at 11:31 AM, Thomas Walpole
notifications@github.comwrote:

@abotalov https://github.com/abotalov It's possible I screwed something
up there, but I don't think so - what makes you think it would?
unnormalized_text aborts traversing through children if a node is hidden -
so it would never look at "Invisible text"


Reply to this email directly or view it on GitHubhttps://github.com/jnicklas/capybara/issues/1307#issuecomment-43539538
.

@pcreux
pcreux commented May 20, 2014

It is still an order of magnitude slower than asserting against the page body:

expect(page.text).to include
30784.1 (±16.3%) i/s -     144342 in   5.007497s
@twalpole
Collaborator

@pcreux - yes its going to be slower, it needs to check for visibility which is going to take time. Feel free to attempt further speedups - without losing the current functionality

@pcreux
pcreux commented May 20, 2014

Yes, thank you. I'll see if I can run the xpath query once, then traverse
the tree instead of traversing the tree and run the xpath query on every
single node.

φ http://pcreux.com

On Tue, May 20, 2014 at 9:25 AM, Thomas Walpole notifications@github.comwrote:

@pcreux https://github.com/pcreux - yes its going to be slower, it
needs to check for visibility which is going to take time. Feel free to
attempt further speedups - without losing the current functionality


Reply to this email directly or view it on GitHubhttps://github.com/jnicklas/capybara/issues/1307#issuecomment-43648827
.

@twalpole
Collaborator

@pcreux try pr #1309 first

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment