A Fast, Enjoyable HTML Parser for Ruby
Hpricot is a very flexible HTML parser, based on Tanaka Akira’s HTree and John Resig’s jQuery, but with the scanner recoded in C. I’ve borrowed (what I believe to be) the best ideas from these wares to make Hpricot heaps of fun to use.
    # need hpricot and open-uri
    require "hpricot"
    require "open-uri"

    # load the Family Guy home page
    doc = Hpricot(open("http://www.fox.com/familyguy/index.htm"))

    # change the CSS class on list element ul
    (doc/"ul.site-nav").set("class", "new-site-nav")

    # remove the header
    (doc/"#header").remove

    # print the altered HTML
    puts doc
A Proper Start
- Installing Hpricot, both stable and development versions.
- An Hpricot Showcase with recipes for most common things.
- Wonder what’s happening? Check the commit list.
- See hpricot.com for interactive demos
- Hpricot mailing list: send an email to firstname.lastname@example.org for information