NLP framework for Java with integrated NLP model zoo.
Idyl NLP

Idyl NLP is a natural language processing (NLP) framework released under the business-friendly Apache License, version 2.0. The framework features core NLP capabilities such as language detection, sentence extraction, tokenization, and named-entity extraction.

Idyl NLP combines custom implementations with other open-source projects to perform its tasks. In some cases multiple implementations are available, so you can choose which one to use. Idyl NLP stands on the shoulders of giants to provide a capable, flexible, and powerful NLP framework.

If you are looking for commercially supported NLP microservices, look at the NLP Building Blocks. These applications are powered by Idyl NLP.

Visit the Idyl NLP home page at idylnlp.ai.

Idyl NLP Capabilities

Refer to the sample projects for example implementations of the capabilities below. Some of the unit tests in this project also serve as examples.

  • Language Detection
  • Sentence Extraction
  • Tokenization
  • Named-Entity Extraction (supports neural network models on CPU/GPU)
  • Document Classification (supports neural network models on CPU/GPU)

All of these core capabilities except language detection can use custom-trained models, and the framework includes the ability to train and evaluate models. Named-entity extraction and document classification support neural network models as well as maximum entropy and perceptron-based models.
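To make the first stages concrete, here is a minimal sketch of sentence extraction followed by tokenization. It deliberately uses only the JDK's `java.text.BreakIterator` as a stand-in; it does not use Idyl NLP's API or models, and the class and method names (`SegmentationSketch`, `sentences`, `tokens`) are hypothetical. A real pipeline would plug a trained model into each stage.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: JDK rule-based segmentation, not Idyl NLP's implementation.
public class SegmentationSketch {

    // Splits text into trimmed sentences using the JDK sentence rules.
    static List<String> sentences(String text) {
        return segments(BreakIterator.getSentenceInstance(Locale.ENGLISH), text);
    }

    // Splits a sentence into tokens; whitespace-only segments are dropped.
    static List<String> tokens(String sentence) {
        return segments(BreakIterator.getWordInstance(Locale.ENGLISH), sentence);
    }

    // Walks the boundaries reported by the iterator and collects non-blank segments.
    private static List<String> segments(BreakIterator iterator, String text) {
        List<String> result = new ArrayList<>();
        iterator.setText(text);
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; end = iterator.next()) {
            String segment = text.substring(start, end).trim();
            if (!segment.isEmpty()) {
                result.add(segment);
            }
            start = end;
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "George Washington was president. He lived in Virginia.";
        for (String sentence : sentences(text)) {
            System.out.println(sentence + " -> " + tokens(sentence));
        }
    }
}
```

Each stage consumes the previous stage's output, which is the same shape a model-backed pipeline takes: sentences first, then tokens per sentence, then entity extraction over the tokens.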

Idyl NLP Ecosystem Projects

Projects Using Idyl NLP

Usage

Release and snapshot artifacts are available as Maven dependencies:

<dependency>
  <groupId>ai.idylnlp</groupId>
  <artifactId>...</artifactId>
  <version>1.0.0</version>
</dependency>

Simplified NER Pipeline

The following example builds a named-entity extraction pipeline that extracts person entities from English natural language text:

// Build a pipeline for English text.
NerPipeline.NerPipelineBuilder builder = new NerPipeline.NerPipelineBuilder();
NerPipeline pipeline = builder.build(LanguageCode.en);

// Run the pipeline on the input text.
EntityExtractionResponse response = pipeline.run("George Washington was president.");

// Print each extracted entity.
for (Entity e : response.getEntities()) {
    System.out.println(e);
}

This code outputs:

Text: George Washington; Confidence: 0.96; Type: person; Language Code: eng; Span: [0..2);
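The span in this output appears to be token-based and half-open: `[0..2)` covers tokens 0 and 1, while the entity text itself is 17 characters long, so the numbers cannot be character offsets. Under that assumption, the sketch below maps a span back to the text it covers. The class `TokenSpanSketch`, the method `coveredText`, and the whitespace tokenization are hypothetical and used only for illustration; they are not part of Idyl NLP.

```java
import java.util.Arrays;

// Illustrative only: shows how a half-open token span [start..end) selects tokens.
public class TokenSpanSketch {

    // Joins the tokens covered by the half-open span [start..end) with single spaces.
    static String coveredText(String[] tokens, int start, int end) {
        if (start < 0 || end > tokens.length || start > end) {
            throw new IllegalArgumentException("Span out of range: [" + start + ".." + end + ")");
        }
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        // Assumption: whitespace tokenization; a real tokenizer may split differently.
        String[] tokens = "George Washington was president.".split("\\s+");
        System.out.println(coveredText(tokens, 0, 2)); // prints "George Washington"
    }
}
```

Because the end index is exclusive, the number of tokens in a span is simply `end - start`, and adjacent spans can share a boundary index without overlapping.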

Building Idyl NLP

Idyl NLP requires Java 8. To build, run:

mvn clean install

Testing

Unit tests are included. Some tests require data that cannot be made publicly available at this time because of size constraints or licensing. These tests are categorized as ExternalData and are skipped during a regular build. We run them in an in-house build job after each commit. We are working to find a suitable location to host the large test data so that it is available to everyone.

Other tests are categorized as HighMemoryUsage. These tests require a very large amount of memory, so they are disabled during regular builds. We run them on a privately hosted build server.

License

Idyl NLP is available under the Apache License, version 2.0.

Copyright 2018 Mountain Fog, Inc.