Norconex HTTP Collector is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.
Java HTML Other
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
norconex-collector-http Preparing for release. Nov 27, 2017
README.md Update README.md Aug 10, 2016

README.md

Norconex HTTP Collector

Norconex HTTP Collector Logo

Norconex HTTP Collector is a full-featured web crawler (or spider) that can manipulate and store collected data into a repositoriy of your choice (e.g. a search engine). It very flexible, powerful, easy to extend, and portable. Can be used command-line with file-based configuration on any OS, or can be embedded into Java applications using well documented APIs.

Visit the web site for binary downloads and documentation:

https://www.norconex.com/collectors/collector-http/