Pinned

  1. crawlers Public

    Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.

    Java 172 65

  2. collector-filesystem Public

    Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search …

    Java 21 12

  3. importer Public

    Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc.). In addition, it allo…

    Java 32 22

Repositories

Showing 10 of 22 repositories
  • crawlers Public

    Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.

    Java 172 Apache-2.0 65 25 7 Updated May 14, 2024
  • commons-lang Public

    Generic library shared between several projects.

    Java 12 Apache-2.0 6 0 5 Updated Mar 27, 2024
  • committer-solr Public

    Solr implementation of Norconex Committer. Should also work with any Solr-based product, such as LucidWorks.

    Java 3 Apache-2.0 4 8 2 Updated Mar 12, 2024
  • importer Public

    Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc.). In addition, it allows you to perform any manipulation on the extracted text before using it in your own service or application.

    Java 32 Apache-2.0 22 14 1 Updated Aug 24, 2023
  • collector-core Public

    Collector-related code shared between different collector implementations.

    Java 7 Apache-2.0 15 6 2 Updated Jul 10, 2023
  • commons-maven-parent Public

    Maven parent POM for many Norconex Maven projects.

    JavaScript 0 Apache-2.0 2 0 0 Updated Jul 9, 2023
  • collector-filesystem Public

    Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.

    Java 21 12 9 0 Updated Jul 9, 2023
  • committer-sql Public

    Implementation of Norconex Committer for SQL (JDBC) databases.

    Java 1 Apache-2.0 6 4 1 Updated Jul 7, 2023
  • committer-core Public

    Norconex Committer is a Java library and command-line application used to route content to local or remote target repositories, such as a search engine index.

    Java 4 Apache-2.0 10 5 0 Updated Feb 8, 2023
  • committer-neo4j Public

    Implementation of Norconex Committer for Neo4j.

    Java 2 Apache-2.0 1 2 0 Updated Jan 4, 2022
