Permalink
Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
18 lines (12 sloc) 469 Bytes

Go Tor Crawler

It get the site list and download each site contents to a directory called "sites".

How to

  1. Start the Tor proxy:

docker run -it -p 8118:8118 -p 9050:9050 dperson/torproxy

  1. Build and start the crawler:

go get github.com/PuerkitoBio/goquery
go get github.com/metal3d/go-slugify
go get golang.org/x/net/proxy
go install
go-tor-crawler config.json

  1. Go make a coffee, the files will be saved in "sites" directory.