This repository has been archived by the owner on Sep 20, 2018. It is now read-only.

🕷️ Crawls websites for URLs, and stores them in a textfile.


thatlittlegit-archive/webcrawler


crawler

crawler is a web crawler for the 20th Birdhouse project. It scans the Web (or, in theory, anything reachable over a protocol that reqwest supports) for URLs, and parses the HTML it fetches with Servo's html5ever.
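To make the idea concrete, here is a minimal sketch of the kind of loop a crawler like this runs: fetch a page, pull out its links, and queue any URL not seen before. It is an illustration under stated assumptions, not this project's code. The names extract_links, crawl, fetch and max_pages are hypothetical, the fetch step is a callback over an in-memory map rather than reqwest, and link extraction is a naive string scan for double-quoted href attributes rather than html5ever. Relative links are not resolved.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Naive stand-in for an HTML parser: collects the values of double-quoted
/// `href="..."` attributes. A real crawler would use a proper parser.
fn extract_links(html: &str) -> Vec<String> {
    let mut links = Vec::new();
    let mut rest = html;
    while let Some(start) = rest.find("href=\"") {
        // Skip past the 6 bytes of `href="` to the start of the value.
        rest = &rest[start + 6..];
        match rest.find('"') {
            Some(end) => {
                links.push(rest[..end].to_string());
                rest = &rest[end + 1..];
            }
            None => break, // unterminated attribute: stop scanning
        }
    }
    links
}

/// Breadth-first crawl starting at `start`. `fetch` is a hypothetical
/// callback returning a page body, or `None` when the fetch fails.
/// At most `max_pages` fetches are attempted.
fn crawl<F>(start: &str, fetch: F, max_pages: usize) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();
    let mut found = Vec::new();
    let mut fetched = 0;

    seen.insert(start.to_string());
    queue.push_back(start.to_string());

    while let Some(url) = queue.pop_front() {
        if fetched >= max_pages {
            break;
        }
        fetched += 1;
        let body = match fetch(&url) {
            Some(body) => body,
            None => continue,
        };
        for link in extract_links(&body) {
            // `insert` returns true only for URLs not seen before.
            if seen.insert(link.clone()) {
                found.push(link.clone());
                queue.push_back(link);
            }
        }
    }
    found
}

fn main() {
    // In-memory "web" used in place of real HTTP requests.
    let mut pages: HashMap<&str, &str> = HashMap::new();
    pages.insert(
        "http://a.example/",
        r#"<a href="http://b.example/">b</a> <a href="http://c.example/">c</a>"#,
    );
    pages.insert(
        "http://b.example/",
        r#"<a href="http://a.example/">a</a> <a href="http://c.example/">c</a>"#,
    );

    let urls = crawl(
        "http://a.example/",
        |url| pages.get(url).map(|body| body.to_string()),
        10,
    );
    for url in &urls {
        println!("{}", url);
    }
}
```

The seen set is what keeps the crawl from looping when pages link back to each other, and the max_pages cap is only there so that the sketch always terminates.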

Usage

Run crawler with cargo, passing the URL to start from as an argument. Setting RUST_LOG=crawler=info is recommended so that crawler reports its status as it crawls. You will also want to redirect stdout to a file.

RUST_LOG=crawler=info cargo run https://github.com >urls
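The redirection in the command above is only useful if status messages and URLs travel on separate streams. The sketch below shows one way such a split can look in Rust. It assumes that URLs go to stdout and status lines go to stderr; the report function and the example URLs are hypothetical and are not taken from this project.

```rust
use std::io::{self, Write};

/// Writes each URL on its own line to `out`, and a status line about it
/// to `log`. Taking generic writers keeps the two streams separable.
fn report<O: Write, L: Write>(urls: &[&str], out: &mut O, log: &mut L) -> io::Result<()> {
    for url in urls {
        writeln!(log, "found {}", url)?; // status: stays on the terminal
        writeln!(out, "{}", url)?; // data: captured by `>urls`
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Placeholder URLs for illustration only.
    let urls = ["http://a.example/", "http://b.example/"];
    report(&urls, &mut io::stdout(), &mut io::stderr())
}
```

Run with stdout redirected to a file, a program structured this way leaves only the URL lines in the file while the status lines remain visible in the terminal.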
