Simple web crawler I implemented as part of an interview process

Avecyclop/webcrawlie

WebCrawlie

Simple web crawler

Build

./gradlew build

Usage

java -jar build/libs/webcrawlie.jar <url>
Results are written to sitemap.txt.

Issues/ToDo
  • Better presentation than a text file (for example, something that can render the site map as a tree)
  • Fix link parsing: the crawler follows any href="/*" match, even when it appears in plain text rather than in a link
  • Parallelize crawling to speed it up
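
One possible direction for the link-parsing item above is to accept an href only when it sits inside an <a ...> tag. The sketch below is not WebCrawlie's code: the class name AnchorLinkExtractor and the method extractLinks are hypothetical, and it assumes the crawler only needs site-relative paths (those starting with "/"). A regex is used to keep the example dependency-free; a real HTML parser would handle malformed markup more reliably.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WebCrawlie: collects site-relative links
// only from href attributes that appear inside an <a ...> tag.
final class AnchorLinkExtractor {
    // "<a", then any attributes within the same tag, then href="/path" or
    // href='/path'. Group 1 is the quote character, group 2 is the path.
    private static final Pattern ANCHOR_HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*([\"'])(/[^\"'>]*)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the distinct paths in the order they first appear.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = ANCHOR_HREF.matcher(html);
        while (matcher.find()) {
            String path = matcher.group(2);
            if (!links.contains(path)) {
                links.add(path);
            }
        }
        return links;
    }
}
```

With this pattern, text such as href="/fake" inside a paragraph is skipped because it is not preceded by an opening <a in the same tag, and absolute URLs are skipped because the captured value must start with "/".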
