unnivm/webcrawler

webcrawler

Small web crawler developed in Java and Spring Boot.

The crawler is multi-threaded and can run several crawls at the same time.
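As a rough illustration of how several crawls could run side by side, here is a minimal sketch that uses a thread pool and tracks each crawl under a generated identifier. It is not taken from this repository: the class `CrawlRegistry`, its method names, the pool size of 4, and the status strings are all hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one possible way to track concurrent crawls by identifier.
class CrawlRegistry {
    // Assumed pool size; the real crawler may size its pool differently.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<?>> tasks = new ConcurrentHashMap<>();

    // Submits a crawl job and returns a unique identifier for it.
    String start(Runnable crawl) {
        String token = UUID.randomUUID().toString();
        tasks.put(token, pool.submit(crawl));
        return token;
    }

    // Reports a coarse state for the crawl registered under the identifier.
    String status(String token) {
        Future<?> task = tasks.get(token);
        if (task == null) return "UNKNOWN";
        if (task.isCancelled()) return "CANCELLED";
        return task.isDone() ? "DONE" : "RUNNING";
    }

    // Requests cancellation; returns false if the identifier is unknown
    // or the crawl has already finished.
    boolean stop(String token) {
        Future<?> task = tasks.get(token);
        return task != null && task.cancel(true);
    }

    // Interrupts any running crawls and releases the pool threads.
    void shutdown() {
        pool.shutdownNow();
    }
}
```

Because `cancel(true)` only interrupts the worker thread, a crawl job in a design like this would need to check for interruption between page fetches to stop promptly.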

How this works:

The crawler is REST-based: crawling a website is started through a REST endpoint.

  1. Start a crawl by sending a "POST" request with a seed website and the depth to crawl:

    http://localhost:8080/start?depth=5&seed=http://www.google.com

    The request generates a "token", which can be used to query the status of the crawling operation.

  2. Use another endpoint to check the status of the crawl; it returns a JSON response:

    http://localhost:8080/status/

  3. This endpoint returns the result of the crawl:

    http://localhost:8080/result/

  4. The following endpoint cancels the current crawling task:

    http://localhost:8080/stop/token
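The endpoints above can be exercised from Java with the standard `java.net.http` API (Java 11 or later). The sketch below only builds the requests. It rests on several assumptions that are not confirmed by this README: the token is appended as the last path segment for `/status/` and `/result/` (as the `/stop/token` example suggests), those three endpoints accept "GET", and the token is URL-safe. The class name `CrawlerRequests` is hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds requests for the crawler's REST endpoints.
class CrawlerRequests {
    static final String BASE = "http://localhost:8080";

    // POST /start?depth=...&seed=... (the seed is percent-encoded so that
    // characters such as '&' inside the seed URL cannot break the query).
    static HttpRequest start(String seed, int depth) {
        String query = "depth=" + depth
                + "&seed=" + URLEncoder.encode(seed, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(BASE + "/start?" + query))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Assumption: the token is the last path segment and the method is GET.
    static HttpRequest status(String token) { return get("/status/" + token); }
    static HttpRequest result(String token) { return get("/result/" + token); }
    static HttpRequest stop(String token)   { return get("/stop/" + token); }

    private static HttpRequest get(String path) {
        return HttpRequest.newBuilder(URI.create(BASE + path)).GET().build();
    }
}
```

Each request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. How the token appears in the `/start` response is not described here, so extracting it is left to the caller.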
