Simple web scraping for Google Chrome.

Scraper

A Google Chrome extension for getting data out of web pages and into spreadsheets.

Usage

Highlight a part of the page that is similar to what you want to scrape. Right-click and select the "Scrape selected..." item. The scraper window will appear, showing you the initial results. You can export the table by pressing the "Export to Google Docs..." button, or use the left-hand pane to further refine or customize your scraping.

The "Selector" section lets you change which page elements are scraped. You can specify the query either as a jQuery selector or as an XPath expression.

You may also customize the columns of the table in the "Columns" section. Column expressions must be specified in XPath. Naming the columns is optional.
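To make the relationship between the selector and the columns concrete, here is a minimal sketch in Python. It is not the extension's code: it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, and it assumes that each column expression is evaluated relative to an element matched by the selector. The `scrape` function, the sample markup, and the column names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup standing in for a web page.
PAGE = """
<table>
  <tr><td>Ada</td><td>1815</td></tr>
  <tr><td>Alan</td><td>1912</td></tr>
  <tr><td></td><td></td></tr>
</table>
"""

def scrape(root, selector, columns):
    """Return one dict per element matched by `selector`.

    Assumption: each column XPath is evaluated relative to the matched element.
    """
    rows = []
    for match in root.findall(selector):
        row = {}
        for name, xpath in columns.items():
            cell = match.find(xpath)
            # Missing cells and cells without text both become empty strings.
            row[name] = (cell.text or "").strip() if cell is not None else ""
        rows.append(row)
    return rows

root = ET.fromstring(PAGE)
rows = scrape(root, ".//tr", {"Name": "td[1]", "Born": "td[2]"})
```

Here the selector `.//tr` plays the role of the "Selector" section, and `td[1]` and `td[2]` play the role of the "Columns" section.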

Selecting the "Exclude empty results" filter will prevent any matches that contain no column values from appearing in the table.
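The filter's behaviour can be sketched in a few lines of Python. This is an illustration of the rule described above, not the extension's implementation. The `exclude_empty` function and the sample rows are hypothetical, and treating whitespace-only values as empty is an assumption.

```python
def exclude_empty(rows):
    """Drop rows in which every column value is empty.

    Assumption: a value made up only of whitespace counts as empty.
    """
    return [row for row in rows if any(value.strip() for value in row.values())]

# Hypothetical scraped rows: the last one has no column values at all.
sample = [
    {"Name": "Ada", "Born": "1815"},
    {"Name": "Alan", "Born": ""},
    {"Name": "", "Born": "  "},
]
filtered = exclude_empty(sample)
```

A row with at least one non-empty value, such as the second one, is kept. Only rows with no column values are removed.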

After making any customizations, you must press the "Scrape" button to update the table of results.

Download

Download the extension from http://chrome.google.com/extensions/detail/mbigbapnjcgaffohmbkdlecaccepngjd.

Get the sources from https://github.com/mnmldave/scraper.

Building

You don't need to build this extension to run it. To test it out, navigate to chrome://extensions in Google Chrome and expand "Developer Mode". Then click the "Load unpacked extension..." button and point it to the src directory.

Learn more about extension development from the Google Chrome Extensions page.

A Rakefile is included for packaging the Google Chrome extension into a zip file. It also minifies the JavaScript and CSS.
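If you would rather not use Rake, the packaging step alone can be approximated with Python's standard library. This sketch is not a port of the Rakefile: it performs no minification, and the `package_extension` function and its parameters are hypothetical.

```python
import os
import zipfile

def package_extension(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir.

    Assumption: zip_path lies outside src_dir, so the archive does not
    try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(src_dir):
            for filename in sorted(files):
                full_path = os.path.join(folder, filename)
                # Store the file under its path relative to src_dir.
                archive.write(full_path, os.path.relpath(full_path, src_dir))
    return zip_path
```

Calling `package_extension("src", "scraper.zip")` from the repository root would write the contents of the src directory to a zip file. The output name `scraper.zip` is only an example.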

License

Scraper is open-sourced under a BSD license which you can find in LICENSE.txt.

Credits

Many of the icons used in this extension are from the generous Yusuke Kamiyamane.


Copyright (c) 2010 David Heaton (dave@bit155.com)