Wrapper for the nekohtml parser.
Ruby
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
lib
test
.document
.gitignore
LICENSE
README.rdoc
Rakefile
VERSION

README.rdoc

nekohtml

A thin wrapper around NekoHTML as provided by Celerity.

At the moment this gem depends on Celerity to provide the nekohtml jar. Once I can figure out how to make this optional, I'll provide it here if the celerity gem isn't here at install time.

Usage

jruby-1.4.0 > require 'nekohtml'
 => true 
jruby-1.4.0 > html= "<html><head><title>Title of Majesty</title></head></html>" 
 => "<html><head><title>Title of Majesty</title></head></html>" 
jruby-1.4.0 > doc= Nekohtml.parse(html)
 => #<Nekohtml::HtmlDocument:0x3f70119f ... >
jruby-1.4.0 > doc.search("//TITLE")
 => #<Nekohtml::HtmlNodeList:0x1a7b5617 ... >
jruby-1.4.0 > _.first.text
 => "Title of Majesty"

Note that the xpath must use all-caps for tag names. This is a limitation of NekoHTML; I may plunder Celerity's source to see how they/HtmlUnit handle it but for now, that's what you've got.

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Copyright

Copyright © 2010 Alex Young. See LICENSE for details.