Description: Ruby gem for web scraping. It scrapes a given URL and returns its title, meta description, meta keywords, an array with all the links, all the images in it, etc.
Clone URL: git://github.com/jaimeiniesta/metainspector.git
metainspector / CHANGELOG.rdoc

1.1.4

4th June, 2009

  • Simplified code: removed the address setter; just instantiate a new MetaInspector object if you need to scrape a different URL
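
The effect of dropping the setter can be illustrated with a minimal sketch. SketchPage below is a hypothetical stand-in, not MetaInspector's own class; it only shows that with a read-only address, scraping a second URL means creating a second object.

```ruby
# Hypothetical stand-in for a page object; not the gem's actual class.
class SketchPage
  attr_reader :address   # reader only: no address= setter is defined

  def initialize(address)
    @address = address
  end
end

first  = SketchPage.new('http://example.com')
second = SketchPage.new('http://example.org')  # different URL, new object

first.respond_to?(:address=)  # false, the address cannot be reassigned
```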

1.1.3

22nd May, 2009

  • Simplified code: there is no longer any need to call page.scrape!; just initialize the page and go directly to page.address, page.title, page.description, page.keywords or page.links. The page is scraped on the fly
  • Removed page.scraped?, page.scrape!, page.full_doc and page.scraped_doc
  • Added page.document, which returns the whole document scraped with nokogiri
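
As a rough illustration of scraping on the fly, the sketch below fetches a document only when an accessor first needs it and then reuses the cached result. LazyPage, its injected fetcher and the fetch_count counter are assumptions made for this example, and the regex title lookup stands in for a real parser; none of this is the gem's actual implementation.

```ruby
# Hypothetical sketch of on-demand scraping; not the gem's actual code.
class LazyPage
  attr_reader :address, :fetch_count

  # fetcher is any callable that returns the raw HTML for an address.
  # It is injected so the sketch runs without network access.
  def initialize(address, fetcher)
    @address = address
    @fetcher = fetcher
    @fetch_count = 0
  end

  # Fetches the document on first use and caches it afterwards.
  def document
    @document ||= begin
      @fetch_count += 1
      @fetcher.call(@address)
    end
  end

  # Reads the <title> text with a regex, purely for illustration.
  def title
    document[%r{<title>(.*?)</title>}mi, 1]
  end
end

html = '<html><head><title>Example</title></head></html>'
page = LazyPage.new('http://example.com', ->(_url) { html })

page.fetch_count  # 0, nothing has been fetched yet
page.title        # "Example", triggers the single fetch
page.title
page.fetch_count  # 1, the second call reused the cached document
```

Injecting the fetcher keeps the sketch independent of any HTTP library; a real object would fetch the address itself.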

1.1.2

19th May, 2009

  • Using nokogiri instead of hpricot
  • Recover from exceptions

1.1.1

14th May, 2009

  • Simplified the scrape method: metadata that is not found is left as nil, which makes it possible to distinguish an element that was not found from one that was found but empty.
  • Links array is initialized as an empty array
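
The nil-versus-empty distinction can be sketched as follows. The meta_content helper is hypothetical and uses a simple regex that assumes the name attribute comes before content; it is only meant to show the two return values, not how the gem parses pages.

```ruby
# Hypothetical helper; a regex stands in for a real HTML parser here.
# Returns nil when the meta tag is absent, and its content (possibly "")
# when the tag is present.
def meta_content(html, name)
  pattern = /<meta\s+name=["']#{Regexp.escape(name)}["']\s+content=["'](.*?)["']/i
  match = pattern.match(html)
  match && match[1]
end

html = '<meta name="description" content="">'

meta_content(html, 'description')  # "" (tag found, but empty)
meta_content(html, 'keywords')     # nil (tag not found)
```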

1.1.0

14th May, 2009

  • Rewritten to use instance methods instead of class methods.
  • Easier interface, provides #new(address), #scrape!, #scraped?, #title, #description, #keywords, and #links instance methods
  • Added #full_doc method to access the temporary file that contains the raw HTML fetched
  • Added #scraped_doc method to get the whole Hpricot scraped doc
  • Added tests
  • Added samples, including a basic scraping and a little spider

1.0.3

27th June, 2008

  • Initial published version.