Hpricot is a very flexible HTML parser, based on Tanaka Akira’s HTree and John Resig’s jQuery, but with the scanner recoded in C. I’ve borrowed (what I believe to be) the best ideas from these wares to make Hpricot heaps of fun to use.
# load the Family Guy home page
require "hpricot"  # need hpricot and open-uri
require "open-uri" # on Ruby 3.0+, call URI.open instead of open below
doc = Hpricot(open("http://www.fox.com/familyguy/index.htm"))
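# change the CSS class on the ul list elements ("newList" is a made-up class name)
(doc/"ul").set("class", "newList")
# remove the header (assumes the page marks it with id="header")
(doc/"#header").remove
# print the altered HTML
puts doc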