shovel

wget powered go program for multi-threaded web scraping

### Running Shovel:

Install Go (macOS, via Homebrew):

brew install hg

brew install go

Clone the repo:

git clone https://github.com/phact/shovel.git

Run shovel (from inside the cloned directory):

go run shovel.go

Config:

Put your list of URLs in data/urls.txt.

By default, shovel runs 100 wget jobs concurrently. Change the maxFutures int to alter the number of concurrent jobs.
