crawler-engine

A search engine that implements PageRank, term frequency, and inverse document frequency (TF-IDF) ranking. Its data comes from a web crawler that uses DFS and BFS to traverse all pages.

search-engine graph dfs-algorithm bfs-algorithm crawler-engine

Updated Mar 29, 2023
Java
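
The breadth-first half of the crawl described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the class name `BfsCrawlSketch`, the method `crawlOrder`, and the in-memory `links` map standing in for fetched pages are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch: breadth-first traversal over an in-memory link graph.
class BfsCrawlSketch {
    // `links` maps a page to its outgoing links (assumed to be already fetched).
    // Returns the pages in the order a breadth-first crawl would visit them.
    static List<String> crawlOrder(Map<String, List<String>> links, String seed) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<String> frontier = new ArrayDeque<>();
        seen.add(seed);
        frontier.add(seed);
        while (!frontier.isEmpty()) {
            String page = frontier.remove();      // FIFO: oldest discovered page first
            order.add(page);
            for (String next : links.getOrDefault(page, List.of())) {
                if (seen.add(next)) {             // add() is false for already-seen pages
                    frontier.add(next);
                }
            }
        }
        return order;
    }
}
```

A real crawler would replace the map lookup with an HTTP fetch and link extraction; the `seen` set is what keeps cycles between pages from looping forever.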

rihenperry / whirlpool-urlfrontier

Mercator-scheme rate limiting and scheduling component of the whirlpool project; handles crawler priority and politeness.

crawler scheduling rate-limiting priority-queue binary-heap mercator crawler-engine

Updated Dec 14, 2021
Java
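
To make the politeness idea concrete, here is a minimal sketch of a per-host queue scheme with a binary heap keyed by the time each host may next be fetched. It is not the whirlpool implementation: the class `PoliteFrontierSketch`, its methods, and the fixed per-host delay are assumptions for illustration, and priority queues in front of the per-host queues are left out.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical sketch: per-host FIFO queues plus a heap ordered by "ready at" time.
class PoliteFrontierSketch {
    private static final class HostSlot {
        final String host;
        long readyAt;                              // earliest time this host may be fetched
        HostSlot(String host, long readyAt) { this.host = host; this.readyAt = readyAt; }
    }

    private final long delayMillis;                // assumed fixed delay between hits on one host
    private final Map<String, Queue<String>> perHost = new HashMap<>();
    private final Map<String, Long> nextAllowed = new HashMap<>();
    private final PriorityQueue<HostSlot> heap =   // java.util.PriorityQueue is a binary heap
        new PriorityQueue<>(Comparator.comparingLong((HostSlot s) -> s.readyAt));

    PoliteFrontierSketch(long delayMillis) { this.delayMillis = delayMillis; }

    void add(String host, String url, long now) {
        Queue<String> queue = perHost.get(host);
        if (queue == null) {
            queue = new ArrayDeque<>();
            perHost.put(host, queue);
            // Respect a delay left over from an earlier fetch of the same host.
            long readyAt = Math.max(now, nextAllowed.getOrDefault(host, 0L));
            heap.add(new HostSlot(host, readyAt));
        }
        queue.add(url);
    }

    // Returns the next URL whose host is ready at time `now`, or null if none is.
    String next(long now) {
        HostSlot slot = heap.peek();
        if (slot == null || slot.readyAt > now) return null;
        heap.poll();
        Queue<String> queue = perHost.get(slot.host);
        String url = queue.remove();
        long readyAt = now + delayMillis;
        nextAllowed.put(slot.host, readyAt);
        if (queue.isEmpty()) {
            perHost.remove(slot.host);             // host drops out until new URLs arrive
        } else {
            slot.readyAt = readyAt;                // safe to mutate: slot is outside the heap
            heap.add(slot);
        }
        return url;
    }
}
```

Time is passed in explicitly rather than read from the system clock so the behaviour is easy to test; a caller would typically pass `System.currentTimeMillis()`.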

Dyzio18 / java-web-bot-library

Java website crawler: a library for analyzing and testing websites

website-crawler crawler-engine web-bot

Updated Dec 30, 2021
Java

fooock / robots.txt

🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API

kotlin java api docker redis crawler spring-boot gradle docker-compose makefile postgresql robots-txt antlr4 spiders robots-parser crawler-engine redis-stream redis-streams

Updated Dec 2, 2020
Java
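
As an illustration of what "checking rules" against a parsed robots.txt can look like, here is a small hand-rolled sketch. It is not the project's parser or API: the class `RobotsRulesSketch` and its methods are hypothetical, only the `User-agent: *` group is honoured, wildcards in paths are not supported, and the longest matching prefix wins, with `Allow` winning ties.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: prefix-based Allow/Disallow rules for the "*" user agent only.
class RobotsRulesSketch {
    private static final class Rule {
        final boolean allow;
        final String prefix;
        Rule(boolean allow, String prefix) { this.allow = allow; this.prefix = prefix; }
    }

    private final List<Rule> rules = new ArrayList<>();

    static RobotsRulesSketch parse(String robotsTxt) {
        RobotsRulesSketch result = new RobotsRulesSketch();
        boolean inStarGroup = false;
        boolean lastWasAgent = false;
        for (String raw : robotsTxt.split("\\R")) {
            int hash = raw.indexOf('#');                       // strip trailing comments
            String line = hash >= 0 ? raw.substring(0, hash) : raw;
            int colon = line.indexOf(':');
            if (colon < 0) continue;                           // blank or malformed line
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (key.equals("user-agent")) {
                if (!lastWasAgent) inStarGroup = false;        // a new group starts here
                if (value.equals("*")) inStarGroup = true;
                lastWasAgent = true;
            } else {
                lastWasAgent = false;
                boolean isRule = key.equals("allow") || key.equals("disallow");
                // An empty Disallow value restricts nothing, so it is skipped.
                if (inStarGroup && isRule && !value.isEmpty()) {
                    result.rules.add(new Rule(key.equals("allow"), value));
                }
            }
        }
        return result;
    }

    // A path is allowed unless its longest matching rule is a Disallow.
    boolean isAllowed(String path) {
        Rule best = null;
        for (Rule rule : rules) {
            if (!path.startsWith(rule.prefix)) continue;
            if (best == null
                    || rule.prefix.length() > best.prefix.length()
                    || (rule.prefix.length() == best.prefix.length() && rule.allow)) {
                best = rule;
            }
        }
        return best == null || best.allow;
    }
}
```

A production parser would also match specific user-agent tokens and handle the `*` and `$` path patterns, which this sketch deliberately omits.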

paganini2008 / greenfinger

A high-performance distributed web crawling framework based on Spring Boot. It provides rich APIs for customizing business logic and can be easily embedded in your system.

java distributed-systems high-performance crawler-engine mircoservice

Updated Oct 8, 2022
Java

runjia1987 / crawler-engine

A crawler engine with HTTP, proxy support, JS-Java interoperability, MQ task consumption, and dynamic execution of crawler scripts. Supports distributed deployment.

rabbitmq proxy nashorn rhino-js crawler-engine js-java-interoperability mq-task-consumption

Updated Dec 23, 2017
Java

Here are 10 public repositories matching this topic...

vivekg13186 / lucas

chiritagabriela / WebCrawlerMTA

shirsho-12 / RakeSearchEngineCOMP250

unnivm / webcrawler

tungtuhoccode / Search-Engine-And-Crawler

rihenperry / whirlpool-urlfrontier

Dyzio18 / java-web-bot-library

fooock / robots.txt

paganini2008 / greenfinger

runjia1987 / crawler-engine
