DKPro Core 1.8.0

We are pleased to announce the release of DKPro Core, version 1.8.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

Changed minimal system requirements

  • Requires Java 8 (Issue #369)
  • Upgrade Apache UIMA to version 2.8.1 (Issue #662)
  • Upgrade uimaFIT to version 2.2.0 (Issue #664)
  • Upgrade Spring Framework to version 3.2.16 (Issue #815)

Major improvements

  • Extensive automatically generated reference documentation (e.g. Issues #753, #635, #589)
  • New framework for text normalization and transformation (e.g. Issue #537)
  • New validation framework, mainly for improved bug detection in unit tests (Issue #728)
  • Writer components write to console if no target is specified (Issue #700)
  • Renamed some components for a more uniform naming scheme (e.g. Issue #717)
  • Writers refuse to overwrite existing files by default (Issues #669, #564)
  • Dependency parsers and readers consistently create a self-looped ROOT node (Issue #628)
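
The two writer-related changes in the list above (console output when no target is given, and refusing to overwrite by default) can be illustrated with a minimal plain-Java sketch. This is not DKPro Core code: the class `WriterSketch`, the method `write`, and the `overwrite` flag are hypothetical names chosen for illustration, and the actual writer components may implement the behavior differently.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not part of DKPro Core.
final class WriterSketch {

    /**
     * Writes text to the given target. If target is null, the text goes to
     * the console stream instead. Unless overwrite is true, an existing
     * target file is left untouched and FileAlreadyExistsException is thrown.
     */
    static void write(String text, Path target, boolean overwrite, PrintStream console)
            throws IOException {
        if (target == null) {
            // No target specified: fall back to the console.
            console.println(text);
            return;
        }
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        if (overwrite) {
            // Default options: create the file or truncate an existing one.
            Files.write(target, bytes);
        } else {
            // CREATE_NEW fails if the file already exists.
            Files.write(target, bytes, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        }
    }

    public static void main(String[] args) throws IOException {
        write("no target: printed to console", null, false, System.out);

        Path file = Files.createTempDirectory("writer-sketch").resolve("out.txt");
        write("first", file, false, System.out);
        try {
            write("second", file, false, System.out);
        } catch (FileAlreadyExistsException e) {
            System.out.println("refused to overwrite " + file.getFileName());
        }
        // Overwriting has to be requested explicitly.
        write("third", file, true, System.out);
    }
}
```

The point of the design is that destroying existing output becomes an explicit opt-in rather than a silent default.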

Analysis components

  • Added JTok component, Java-based configurable tokenizer and sentence splitter (Issue #695)
  • Added RFTagger component, a tool for the annotation of text with fine-grained part-of-speech tags (Issue #684)
  • Added RegexTokenizer and WhitespaceTokenizer components, simple tokenizers based on regular expressions and whitespace (Issue #552)
  • Added MateTools SRL component (Issue #483)
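
To give a rough idea of how a regex-based tokenizer and a whitespace tokenizer relate, here is a minimal plain-Java sketch. It does not use the DKPro Core components: `TokenizerSketch` and `tokenize` are hypothetical names, and the sketch assumes that tokens are the stretches of text between matches of a boundary pattern, which may differ from how the actual components are configured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the DKPro Core RegexTokenizer.
final class TokenizerSketch {

    /** Boundary pattern that makes the regex tokenizer a whitespace tokenizer. */
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Returns the non-empty stretches of text between boundary matches. */
    static List<String> tokenize(String text, Pattern boundary) {
        List<String> tokens = new ArrayList<>();
        Matcher m = boundary.matcher(text);
        int start = 0;
        while (m.find()) {
            if (m.start() > start) {
                tokens.add(text.substring(start, m.start()));
            }
            start = m.end();
        }
        // Text after the last boundary is the final token.
        if (start < text.length()) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Hello, world!";
        // Whitespace boundaries keep punctuation attached to the words.
        System.out.println(tokenize(text, WHITESPACE));
        // A custom boundary that also treats punctuation as a separator.
        System.out.println(tokenize(text, Pattern.compile("[\\s\\p{Punct}]+")));
    }
}
```

In this sketch the whitespace tokenizer is just the regex tokenizer with a fixed boundary pattern, which is why the two fit naturally into one release note entry.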

Data formats

  • Added Brat format (Issue #656)
  • Added Mallet LDA (Issue #602)
  • Added Reuters-21578 Text Classification (Issue #691)
  • Added RTF (Issue #588)
  • Added Solr (Issue #576)
  • Added UIMA Json (Issue #455)
  • Added Writer for one sentence per line (Issue #673)
  • Improved TEI support (Issues #594, #596)

Types

  • Added MorphologicalFeatures type to support morphological analysis (e.g. Issue #244)
  • Added Div type for generic document structure (e.g. Issue #598)
  • Added id feature on Token and Sentence (e.g. Issue #609)
  • Added MetadataStringField type (Issue #672)

Further highlights in this release include:

  • Upgrade Spring framework to version 3.2.16 (Issue #815)
  • Upgrade GATE to version 8.0 (Issue #387)
  • Upgrade Stanford CoreNLP to version 3.6.0 (Issue #706)
  • Upgrade OpenNLP to version 1.6.0 (Issue #634)
  • Upgrade LanguageTool to version 3.3 (Issue #819)
  • Upgrade MaltParser to version 1.8.1 (Issue #734)
  • Added Bintray as a repository to DKPro Core (Issue #694)

A more detailed overview of the changes and bug corrections in this release can be found here.

When upgrading, please note that you should not mix different versions of DKPro Core components in your projects, as they may not be compatible with each other.