Welcome to Apache OpenNLP Models!

The Apache OpenNLP library provides binary models for processing natural language text. This repository is intended for the distribution of model files as Maven artifacts.

Useful Links

For additional information, visit the OpenNLP Home Page.

You can use OpenNLP with any language; further demo models are provided here.

The models are fully compatible with the latest release and can be used for testing or getting started.

Please train your own models for all other use cases.

Documentation, including JavaDocs, code usage, and command-line interface examples, is available here.

You can also follow our mailing lists for news and updates.

Overview

Component	Language	Compatibility	Description	README and Reports
Language Detector	Detects 103 languages	>= 1.8.3	Detects 103 languages, identified by ISO 639-3 codes. Works well with longer texts of at least two sentences in the same language.	README Effectiveness Misclassified
Sentence	fr	>= 1.0.0	Sentence detection model for French	README Evaluation Logs
Sentence	de	>= 1.0.0	Sentence detection model for German	README Evaluation Logs
Sentence	en	>= 1.0.0	Sentence detection model for English	README Evaluation Logs
Sentence	it	>= 1.0.0	Sentence detection model for Italian	README Evaluation Logs
Sentence	nl	>= 1.0.0	Sentence detection model for Dutch	README Evaluation Logs
Parts of Speech	de	>= 1.0.0	Parts of speech model for German	README Evaluation Logs
Parts of Speech	en	>= 1.0.0	Parts of speech model for English	README Evaluation Logs
Parts of Speech	fr	>= 1.0.0	Parts of speech model for French	README Evaluation Logs
Parts of Speech	it	>= 1.0.0	Parts of speech model for Italian	README Evaluation Logs
Parts of Speech	nl	>= 1.0.0	Parts of speech model for Dutch	README Evaluation Logs
Tokens	de	>= 1.0.0	Tokenizer model for German	README Evaluation Logs
Tokens	en	>= 1.0.0	Tokenizer model for English	README Evaluation Logs
Tokens	fr	>= 1.0.0	Tokenizer model for French	README Evaluation Logs
Tokens	it	>= 1.0.0	Tokenizer model for Italian	README Evaluation Logs
Tokens	nl	>= 1.0.0	Tokenizer model for Dutch	README Evaluation Logs

Getting Started

You can import a model artifact directly via Maven, SBT or Gradle, for instance:

Maven

<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-models-langdetect</artifactId>
    <version>${opennlp.models.version}</version>
</dependency>

SBT

libraryDependencies += "org.apache.opennlp" % "opennlp-models-langdetect" % "${opennlp.models.version}"

Gradle

implementation group: "org.apache.opennlp", name: "opennlp-models-langdetect", version: "${opennlp.models.version}"

For more details, please check our documentation.
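
Once the dependency is declared, the model binary is expected to be available on the classpath at runtime. The following is a minimal sketch of looking it up with plain JDK calls only. The resource name `langdetect-183.bin` and the class name `ModelResourceLocator` are assumptions made for illustration and are not taken from this README, so inspect the artifact you depend on for the actual name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

/** Looks up a model binary that a model artifact is assumed to put on the classpath. */
final class ModelResourceLocator {

    // Hypothetical resource name; inspect the jar you depend on for the real one.
    static final String ASSUMED_MODEL_RESOURCE = "langdetect-183.bin";

    private ModelResourceLocator() { }

    /** Returns an open stream for the resource, or empty if it is not on the classpath. */
    static Optional<InputStream> open(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ModelResourceLocator.class.getClassLoader();
        }
        return Optional.ofNullable(loader.getResourceAsStream(resourceName));
    }

    public static void main(String[] args) throws IOException {
        Optional<InputStream> model = open(ASSUMED_MODEL_RESOURCE);
        if (!model.isPresent()) {
            System.out.println("Model resource not found: " + ASSUMED_MODEL_RESOURCE);
            return;
        }
        try (InputStream in = model.get()) {
            // With opennlp-tools on the classpath, the stream would be handed
            // to the matching model class at this point.
            System.out.println("Model resource found: " + ASSUMED_MODEL_RESOURCE);
        }
    }
}
```

For the language detector, the opened stream would typically be passed to the `LanguageDetectorModel` constructor from opennlp-tools, which is left out here so that the sketch runs without additional dependencies.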

Adding a new Model

When adding a new model, make sure to also add it to the expected-models.txt file located in opennlp-models-test.
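
A quick local check can catch a missing entry early. The sketch below assumes that expected-models.txt lists one entry per line, which this README does not confirm; the class name `ExpectedModelsCheck`, the sample entries, and the exact-match rule are illustrative only.

```java
import java.util.Arrays;
import java.util.List;

/** Checks whether a model entry appears in the lines of an expected-models list. */
final class ExpectedModelsCheck {

    private ExpectedModelsCheck() { }

    /**
     * Returns true if any line equals the given entry after trimming whitespace.
     * Assumes one entry per line; adjust if the real file uses another layout.
     */
    static boolean isListed(List<String> lines, String entry) {
        String wanted = entry.trim();
        if (wanted.isEmpty()) {
            return false; // a blank entry never counts as listed
        }
        for (String line : lines) {
            if (line.trim().equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample content standing in for the file; these names are made up.
        List<String> lines = Arrays.asList("model-a.bin", "  model-b.bin  ", "");
        System.out.println(isListed(lines, "model-b.bin"));
        System.out.println(isListed(lines, "model-c.bin"));
    }
}
```

In practice, the lines would come from `Files.readAllLines` applied to the path of expected-models.txt inside opennlp-models-test.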

Contributing

The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project. Every contribution is welcome and needed to make it better. A contribution can be anything from a small documentation typo fix to a new component.

If you would like to get involved, please follow the instructions here.
