hpxro7/burrow

Burrow

Because gophers don't crawl. They burrow.

An experiment in writing a document crawler in Go.

The API aims to be expressive yet succinct. Take, for example, the task of crawling through HTML documents by following anchor hrefs:

crawl.Through(urlsUsingAnchor).BeginWith(seedUrls, crawledUrlSink)
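
To give a rough idea of what extracting anchor hrefs involves, here is a minimal sketch. It is not burrow's implementation: extractAnchorHrefs and hrefPattern are hypothetical names, and the regular expression is a deliberate simplification that a real crawler would likely replace with a proper HTML parser.

```go
package main

import (
	"fmt"
	"regexp"
)

// hrefPattern matches the href attribute of an <a> tag, quoted with either
// double or single quotes. This is an assumption-laden shortcut: it does not
// handle unquoted attributes, comments, or look-alike attributes such as
// data-href.
var hrefPattern = regexp.MustCompile(`(?i)<a\s[^>]*?href\s*=\s*["']([^"']*)["']`)

// extractAnchorHrefs is a hypothetical helper (not part of burrow) that
// returns the raw href values found in an HTML document, in document order.
func extractAnchorHrefs(doc string) []string {
	var urls []string
	for _, m := range hrefPattern.FindAllStringSubmatch(doc, -1) {
		urls = append(urls, m[1]) // m[1] is the captured href value
	}
	return urls
}

func main() {
	doc := `<p><a href="https://example.com/a">A</a> <A class="x" HREF='/b'>B</A></p>`
	for _, u := range extractAnchorHrefs(doc) {
		fmt.Println(u)
	}
}
```

Relative hrefs such as /b would still need to be resolved against the URL of the page they came from before being crawled.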

Note that the current example server implementation does not persist crawled entities to disk. Instead, it keeps a pool of URLs in memory, which it polls and removes as the request multiplexer sees fit. If scalability is a concern and you expect more than urlSinkSize concurrent requests, I would recommend using an actual crawling engine.
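
A bounded in-memory pool of this kind can be sketched with a buffered channel. The urlSink type and its offer and poll methods below are hypothetical and only illustrate the idea. In particular, rejecting URLs when the pool is full is an assumption, not necessarily how burrow behaves.

```go
package main

import "fmt"

// urlSink is a hypothetical bounded, in-memory pool of crawled URLs.
// Nothing is written to disk, so its contents are lost when the process exits.
type urlSink struct {
	urls chan string
}

// newURLSink creates a sink holding at most size URLs
// (size plays the role of urlSinkSize here).
func newURLSink(size int) *urlSink {
	return &urlSink{urls: make(chan string, size)}
}

// offer adds a URL without blocking and reports false when the sink is full.
func (s *urlSink) offer(u string) bool {
	select {
	case s.urls <- u:
		return true
	default:
		return false
	}
}

// poll removes and returns the oldest URL without blocking.
// The second result is false when the sink is empty.
func (s *urlSink) poll() (string, bool) {
	select {
	case u := <-s.urls:
		return u, true
	default:
		return "", false
	}
}

func main() {
	sink := newURLSink(2)
	for _, u := range []string{"/a", "/b", "/c"} {
		fmt.Println(u, "accepted:", sink.offer(u))
	}
	u, ok := sink.poll()
	fmt.Println("polled:", u, ok)
}
```

With a capacity of 2, the third URL is rejected, which is the scalability limit the note above warns about.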
