jashmenn/bashpider

bashpider

a spider using nothing but standard unix tools (+ ruby glue)

git clone whatever
cd bashpider
rake data:get_urls # will take a while 
rake crawl:restart # this will run a crawl
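
For a rough idea of what an xargs-and-wget crawl looks like, here is a minimal sketch. It is not taken from the Rakefile: the function name `fetch_all`, the `FETCHER` variable, the wget flags, and the default of 8 parallel jobs are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, not bashpider's actual code.
# FETCHER is an assumed hook so the pipeline can be dry-run (e.g. FETCHER=echo)
# without touching the network.
FETCHER=${FETCHER:-"wget --quiet --tries=1 --timeout=10"}

# fetch_all URLS_FILE [JOBS]
# Reads one URL per line from URLS_FILE and fetches them in parallel.
fetch_all() {
  urls_file=$1
  jobs=${2:-8}
  # -n 1 hands one URL to each fetch; -P runs up to $jobs fetches at once.
  # -P is supported by GNU and BSD xargs but is not part of POSIX.
  # $FETCHER is left unquoted on purpose so its flags split into arguments.
  xargs -n 1 -P "$jobs" $FETCHER < "$urls_file"
}
```

Calling `fetch_all urls.txt 20` would then download every URL in a (hypothetical) `urls.txt` into the current directory, 20 at a time. Letting xargs handle the parallelism means no job-control code is needed in the script itself.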

To monitor the downloads per second, run the following in another window:

rake crawl:watch
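
One way such a monitor could work is to count the files in the download directory at fixed intervals and print the difference. The sketch below assumes that approach; the names `count_files` and `watch_rate`, the directory argument, and the 5-second default interval are hypothetical and may not match what `crawl:watch` really does.

```shell
#!/bin/sh
# Hypothetical sketch, not bashpider's actual code.

# count_files DIR
# Prints the number of regular files under DIR.
# (File names containing newlines would be over-counted.)
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# watch_rate DIR [INTERVAL]
# Every INTERVAL seconds, prints how many new files appeared per second.
# Runs until interrupted with CTRL-C.
watch_rate() {
  dir=$1
  interval=${2:-5}
  prev=$(count_files "$dir")
  while :; do
    sleep "$interval"
    now=$(count_files "$dir")
    # Integer division: rates below one file per second print as 0.
    echo "$(( (now - prev) / interval )) downloads/sec ($now total)"
    prev=$now
  done
}
```

Counting files is crude, since it ignores partial downloads and failed requests, but it needs no cooperation from the crawl process.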

When you feel you've gathered enough data, press CTRL-C in both windows to stop the crawl and the monitor, then type:

rake results:process 

See: post

About

a "crawler" using xargs and wget
