Web scraping tools

Scrapers

Some tools to retrieve text or files from remote Web pages.

Grabit.pl

My first Web scraper. It expects one argument: the name of a file containing a newline-delimited list of URLs. When invoked, it launches an interactive prompt that asks which type of file to download, then downloads every file of that type linked from each of the listed Web pages.

To use it:

  1. Put the URLs of all the pages you want to scrape, one per line, into a text file named FOO

  2. Run perl grabit.pl FOO

  3. You will be prompted to choose which type of file you want to grab.

  4. Enjoy!
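
For readers curious about the general approach, here is a minimal Ruby sketch of the two steps described above: reading the newline-delimited URL list, and picking out the links of one file type from a page. It is not the code in grabit.pl. The helper names parse_url_list and links_with_extension are hypothetical, and the regex-based link matching is a simplifying assumption (a real HTML parser would be more robust).

```ruby
require 'uri'

# Hypothetical helper: turn the contents of the list file (FOO above)
# into an array of URLs, ignoring blank lines and stray whitespace.
def parse_url_list(text)
  text.each_line.map(&:strip).reject(&:empty?)
end

# Hypothetical helper: return the absolute URLs of all links in +html+
# whose path ends with the chosen file extension (case-insensitive).
# Assumption: links appear as quoted href="..." attributes.
def links_with_extension(html, base_url, extension)
  suffix = '.' + extension.downcase.sub(/\A\./, '')
  hrefs = html.scan(/href\s*=\s*["']([^"']+)["']/i).flatten

  # Resolve relative links against the page URL; skip unparseable ones.
  absolute = hrefs.map do |href|
    begin
      URI.join(base_url, href)
    rescue URI::Error
      nil
    end
  end

  absolute.compact
          .select { |uri| uri.path.to_s.downcase.end_with?(suffix) }
          .map(&:to_s)
          .uniq
end
```

A full scraper would loop over parse_url_list(File.read("FOO")), fetch each page (for example with Net::HTTP.get from Ruby's standard library), pass the body to links_with_extension along with the type chosen at the prompt, and save each resulting URL to disk.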
