crawler


Web crawler tool limited to one domain. When crawling example.com, it crawls all pages within the example.com domain, but does not follow links to external sites such as Facebook or Instagram accounts, or to subdomains such as other.example.com. Given a URL, it outputs a site map showing which static assets each page depends on and the links between pages.
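The domain restriction described above can be sketched with the standard `net/url` package. This is a minimal illustration, not the project's actual code: the helper name `sameHost` and the sample URLs are assumptions made for the example. It resolves each link against the starting URL and only accepts links whose host matches exactly, so subdomains and external sites are rejected.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sameHost reports whether link, resolved against base, stays on exactly
// the same host as base. Subdomains and external sites are rejected.
// Hypothetical helper for illustration; not taken from the crawler source.
func sameHost(base, link string) bool {
	b, err := url.Parse(base)
	if err != nil {
		return false
	}
	l, err := url.Parse(link)
	if err != nil {
		return false
	}
	// Resolve relative links such as "/about" against the base URL.
	abs := b.ResolveReference(l)
	// Skip non-web schemes such as mailto: or javascript:.
	if abs.Scheme != "http" && abs.Scheme != "https" {
		return false
	}
	// Host names are case-insensitive, so compare them ignoring case.
	return strings.EqualFold(abs.Hostname(), b.Hostname())
}

func main() {
	base := "http://example.com/"
	links := []string{
		"/about",
		"http://example.com/contact",
		"http://other.example.com/",
		"https://www.facebook.com/example",
	}
	for _, link := range links {
		fmt.Printf("%-35s follow=%v\n", link, sameHost(base, link))
	}
}
```

Comparing the full host name, rather than checking for a shared suffix, is what keeps other.example.com out of the crawl in this sketch.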

building

The Crawler project was developed using the Go language and it depends on the following Go packages:

  • code.google.com/p/go.net/html

All the above packages can be installed using the command:

go get -u <package_name>

Also, to easily run the project tests you will need the following:

Finally, to download and build the command-line tool, use the following commands:

go get -u github.com/rafaeljusto/crawler
go build -o crawler github.com/rafaeljusto/crawler/app

deploying

To deploy the project you will need the program below.
