This tool, written in Java, downloads website source code and stores it in a MySQL database for processing.
Updated Jul 13, 2017 - Java
🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API
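To illustrate what "parsing robots.txt and checking rules" can look like, here is a minimal sketch in plain Java. It is not taken from any repository listed here: the class `RobotsRules` and its methods are hypothetical names chosen for illustration. It assumes simple prefix matching where the longest matching rule wins and `Allow` wins ties. As a simplification, it merges the `*` group with any group that names the crawler, and it does not handle the `*` and `$` wildcards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not the API of any listed project.
final class RobotsRules {
    private final List<String> allowed = new ArrayList<>();
    private final List<String> disallowed = new ArrayList<>();

    // Collects Allow/Disallow prefixes from every group that names "*" or the given agent.
    static RobotsRules parse(String robotsTxt, String userAgent) {
        RobotsRules rules = new RobotsRules();
        String agent = userAgent.toLowerCase(Locale.ROOT);
        boolean applies = false;       // does the current group apply to this agent?
        boolean inAgentLines = false;  // are we inside a run of User-agent lines?
        for (String raw : robotsTxt.split("\\r?\\n")) {
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim(); // drop comments
            int colon = line.indexOf(':');
            if (colon < 0) continue;   // blank or malformed line
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (!inAgentLines) applies = false; // a new group starts here
                inAgentLines = true;
                String v = value.toLowerCase(Locale.ROOT);
                if (v.equals("*") || (!v.isEmpty() && agent.contains(v))) applies = true;
            } else {
                inAgentLines = false;
                if (!applies || value.isEmpty()) continue; // empty Disallow restricts nothing
                if (field.equals("disallow")) rules.disallowed.add(value);
                else if (field.equals("allow")) rules.allowed.add(value);
            }
        }
        return rules;
    }

    // Longest matching prefix wins; Allow wins ties; no matching rule means allowed.
    boolean isAllowed(String path) {
        return longestPrefix(allowed, path) >= longestPrefix(disallowed, path);
    }

    private static int longestPrefix(List<String> prefixes, String path) {
        int best = -1;
        for (String p : prefixes) {
            if (path.startsWith(p) && p.length() > best) best = p.length();
        }
        return best;
    }

    public static void main(String[] args) {
        String txt = "User-agent: *\nDisallow: /private/\nAllow: /private/public/\n";
        RobotsRules rules = RobotsRules.parse(txt, "mybot");
        System.out.println(rules.isAllowed("/private/data"));
    }
}
```

A production crawler would more likely rely on an established parser and follow the group-selection and wildcard rules of the Robots Exclusion Protocol in full.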
🚫🤖 Override /robots.txt to disallow all web crawlers, regardless of the settings stored in the database. Compatible with Liferay 7.0, 7.1, 7.2, 7.3 and 7.4.
Java sitemap generator. This library generates a web sitemap and can ping Google, generate an RSS feed, robots.txt and more, using a friendly, easy-to-use Java 8 functional programming style.
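As a rough idea of what generating a sitemap in a Java 8 functional style involves, the sketch below builds a minimal sitemap document with streams from the JDK alone. It does not use or reproduce the library's API: `SitemapSketch`, `toSitemap`, and `escapeXml` are hypothetical names, and the example URLs are placeholders.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration using only the JDK; not the API of the library described above.
final class SitemapSketch {
    // Escapes the five characters that are special in XML text and attributes.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Maps each URL to a <url><loc>…</loc></url> entry and wraps the result in <urlset>.
    static String toSitemap(List<String> urls) {
        return urls.stream()
                .map(SitemapSketch::escapeXml)
                .map(u -> "  <url><loc>" + u + "</loc></url>")
                .collect(Collectors.joining(
                        "\n",
                        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                                + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n",
                        "\n</urlset>\n"));
    }

    public static void main(String[] args) {
        System.out.println(toSitemap(Arrays.asList(
                "https://example.com/",
                "https://example.com/search?q=a&page=2")));
    }
}
```

Escaping matters here because query strings commonly contain `&`, which would otherwise make the XML invalid.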
NextTypes is a standards-based information storage, processing and transmission system that integrates the characteristics of other systems such as databases, programming languages, communication protocols, file systems, document managers, operating systems, frameworks, file formats and hardware in a single tightly integrated system using a comm…
A set of reusable Java components that implement functionality common to any web crawler