Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

DocumentFragment#search and #xpath #370

Closed
flavorjones opened this Issue · 11 comments

3 participants

@flavorjones
Owner

DocumentFragment#search and #xpath should work properly. The current 'test_fragment_search' should be replaced with the following test cases:

  def test_fragment_css
    frag = Nokogiri::XML::Document.new.fragment '<p id="content">hi</p>'

    p_tag = frag.css('#content').first
    assert_equal 'p', p_tag.name

    assert_equal p_tag, frag.children.first
  end

  def test_fragment_search
    frag = Nokogiri::XML::Document.new.fragment '<p id="content">hi</p>'

    p_tag = frag.search('#content').first
    assert_equal 'p', p_tag.name

    assert_equal p_tag, frag.children.first
  end

  def test_fragment_xpath
    frag = Nokogiri::XML::Document.new.fragment '<p id="content">hi</p>'

    p_tag = frag.xpath('.//*[@id="content"]').first
    assert_equal 'p', p_tag.name

    assert_equal p_tag, frag.children.first
  end

note that #331 and #357 might be related.

@SeanTAllen

I've implemented search w/ tests. However, things get weird with xpath. I emailed the list about that. Once that is worked out on the list, I'll submit the fixes for this plus for the bug that the fragment search uncovered.

@SeanTAllen

a couple of tests are missing... test_fragment_css and test_fragment_search dont check for depth searches of children. will add tests for those.

@SeanTAllen

should this apply to xpaths? if i ask for xpath search current node then searching children is wrong as it would be out of context. that would be an issue for search if it is an xpath.

@flavorjones
Owner

Sean Allen said on the mailing list:


I'm underway on this ( aka issue 370 ) and stumbled across something else I think might be a bug that is getting in the way of xpath support working.
I think xpath support is broken right now so I'm not going to be able to get this working without fixing the other issues...

I dont think this is issue 357 but might be related in some fashion.

require 'rubygems'
require 'nokogiri'

frag = Nokogiri::XML.fragment '<p id="header">header</p><p id="content">content</p><p id="footer">footer</p>'

puts "First we search for p: "
puts frag.xpath( ".//p" )

puts "Then we search for id header: "
puts frag.xpath( ".//*[@id='header']" )

puts "Ok... nothing there, lets see if we can find ids..."
puts frag.xpath( ".//@id" )

puts "Interesting... can we find all p's with an id?..."
puts frag.xpath( ".//p[@id]" )
puts "Nope..."

Is this a libxml bug?

more xpath weirdness...

require 'rubygems'
require 'nokogiri'

xml = "<p id='content'>hi</p>"

frag = Nokogiri::XML.fragment xml

puts "Lets find p..."
puts frag.xpath( ".//p" )
puts "Good.."

puts "Lets find @id..."
puts frag.xpath( ".//@id" )
puts "Good.."

puts "Lets find p on first child..."
puts frag.children.first.xpath( ".//p" )
puts "Not so good.."

puts "Lets find @id on first child..."
puts frag.children.first.xpath( ".//@id" )
puts "Interesting, something seems to be happening in Node.xpath..."
@flavorjones
Owner

Hi Sean,

I appreciate your work on this issue! If I don't get back to you right away, it's because I have a day job, not because I don't care. :)

There are definite weirdnesses going on. I need to track them down. If you want to continue working, please feel free, but understand that I can't give you good guidance without understanding the root issue, and if I understood that I'd fix it. :) :)

I will be working on this.

@SeanTAllen

I've gotten at least as far as knowing that the 'more xpath weirdness' is XPathContext related in some fashion but haven't gotten into the libxml code to try and see why... small demonstration in this gist ( I am incapable of using gh markdown for some reason ):

https://gist.github.com/663525

@SeanTAllen

Follow up question... I have search for css which was my primary immediate concern..
if I do a quick hack to keep xpaths processing 'in the old way' so that they don't get broken, can I pull request over the just search for css and get that in a hopeful soon version of nokogiri? The project I'm working on requires that fix to function so I'm highly interested in getting that out there asap.

I'm also going to need xpath support soon, so I'll keep banging away at that although I might be a little slow as I'm not real familiar with the code at this point ( but I am way more familiar than I was 24-48 hours ago ).

@flavorjones
Owner

If you need something immediately, my recommendation would be to monkeypatch NokogirI::XML::DocumentFragment with your current fix.

Please feel free to send pull requests.

@rtomayko

Another strange behavior that seems somewhat related:

require 'pp'
require 'nokogiri'

doc = Nokogiri::HTML::DocumentFragment.parse("hello world")
pp doc.search('text()')
# => []

doc = Nokogiri::HTML::DocumentFragment.parse("<div>hello world</div>")
pp doc.search('text()')
# => [#<Nokogiri::XML::Text:0x83423adc "hello world">]

DocumentFragment should probably have the same basic behavior as an element node when it contains text, yeah?

@flavorjones
Owner

@rtomayko - Yes, definitely related. You can work around by using more specific xpath:

doc = Nokogiri::HTML::DocumentFragment.parse("hello world")
pp doc.xpath('./text()')
# => [#<Nokogiri::XML::Text:0x3ff06b9e25c8 "hello world">]

doc = Nokogiri::HTML::DocumentFragment.parse("<div>hello world</div>")
pp doc.xpath('./div/text()')
# => [#<Nokogiri::XML::Text:0x3ff06b9e25c8 "hello world">]
pp doc.xpath('.//text()')
# => [#<Nokogiri::XML::Text:0x3ff06b9e25c8 "hello world">]
@flavorjones
Owner

Folding this into #572, the underlying issue is the same.

@flavorjones flavorjones closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.