Changelog

Version 2.2.1

SHA: f7e4c3f

added: temponym tagging functionality [Core]
added: English temponym resources [Resources]
fixed: parameter pos set to "no" (POSTagger.NO) works for all languages now (AllLanguagesTokenizer) [Standalone]
fixed: several minor issues

Version 2.1

SHA: 378e476

fixed: TreeTaggerWrapper no longer creates temporary files, which speeds up processing [Core]
fixed: Various improvements to Maven support [Maven]
fixed: Errors in Arabic resources [Resources]
fixed: Values in IntervalTagger that were switched for a while [Core]

Version 2.0

SHA: b9248e0

added: automatically-created resources for 200+ languages [Resources]
added: AllLanguagesTokenizer, a simple, generic, whitespace-based tokenizer that can be used with all languages
fixed: Minor rule improvements for some languages [Resources]

Version 1.9

SHA: d50e129

added: Support for Estonian [Resources]
added: Support for Portuguese (thanks to Zunsik Lim) [Resources]
added: Resource loading is now easier and looks in multiple places [Core]
fixed: StanfordPOSTaggerWrapper would not accept URLs as model (#26) [Core]
fixed: A bug where pattern replacing in rules would mangle some patterns [Core]
fixed: Lots of improvements to German and English resources [Resources]
fixed: Minor improvements for almost all other languages [Resources]

Version 1.8

SHA: b9c5832

fixed: Italian resources have received a major overhaul [Resources]
added: Support for Croatian (thanks to Luka Skukan), including a wrapper for the hunpos preprocessing tool [Resources]
added: Ability to use regular expressions in POS constraints of rules [Resources]
added: Tokenization without the Tree Tagger's Perl script [TreeTaggerWrapper]
fixed: various minor bugfixes in the TreeTaggerWrapper and Standalone code
fixed: Some code pertaining to the invocation of Processors [Core]

Version 1.7

SHA: 5ca451f

added: Support for calculation of BC and AD dates, and dates close to the year 0, including Arabic, Dutch, English, French, German, Italian, Spanish, Vietnamese language resources.
added: A preliminary version of Elena Klyachko's Russian resources [Resources]
fixed: A minor issue with parameter files in TreeTaggerWrapper [Core]

Version 1.6

SHA: d592d19

added: Chinese resources, support in TreeTaggerWrapper as well as the TempEval-2 Reader and Standalone version
added: Better handling of overlapping temporal expressions [Core]
fixed: Made TempEval-3 Reader more robust to non-TE3-inputs
fixed: More stable TreeTaggerWrapper parameter file recognition
fixed: Various minor improvements in the resources for all languages [Resources]
fixed: Some minor fixes for resource recognition in Standalone [Standalone]

Version 1.5

SHA: 183643e

added: French resources, kindly provided by Véronique Moriceau of the LIMSI-CNRS [Resources]
added: Support for the IntervalTagger in HeidelTime Standalone [Standalone]
added: Option to choose either StanfordPOSTagger or TreeTagger as HeidelTime Standalone's preprocessing engine [Standalone]
added: Interval resources to Vietnamese [Resources]
fixed: Improvements in German, English and Vietnamese resources [Resources]
added: Rudimentary Maven support [Meta]

Version 1.4.1

SHA: 07c89a5

fixed: A bug that would prevent HeidelTime Standalone from loading resources under Windows [Standalone]

Version 1.4

SHA: 07c89a5

added: Support for Spanish and Arabic document processing via standalone [Standalone]
added: Some more error handling for unexpected user input [Core]
fixed: Made fixes and alterations to Spanish, Italian, German, Vietnamese and Arabic resources [Resources]
fixed: A bug where underspecified centuries ("UNDEF-century") in resources would break processing [Core]
fixed: Made several improvements to the StanfordPOSTagger to work with more unconventional documents
fixed: Erroneous normalization of underspecified centuries; it now works according to the TIMEX standard [Resources]
fixed: The Windows printResourceInformation.bat script now works with paths that contain spaces [Resources]

Version 1.3

SHA: 1d2fdfa

added: Resources for Spanish, Italian, Vietnamese and Arabic [Resources]
added: TreeTaggerWrapper, a sentence-tokenization, word-tokenization and part-of-speech tagging wrapper for the popular TreeTagger. It replaces the DKPro Analysis Engines as preprocessing component.
added: Support for regular expressions in normalization resources [Core]
added: A new annotator, IntervalTagger that recognizes interval expressions
added: New UIMA Collection Reader and Consumer for our participation in the TempEval-3 challenge: TempEval3Reader and TempEval3Writer
added: A UIMA Analysis Engine (JVnTextProWrapper) that leverages the JVnTextPro tool to produce word- and sentence-tokenization as well as part-of-speech tagging for Vietnamese
added: A switch (-c) that allows passing the path of the config.props file (issue 3 (on Google Code)) [Standalone]
added: Sub-processor priorities to influence when they are being run [Core]
added: Switches (-v/-vv) to control the verbosity of logging messages (issue 4 (on Google Code)) [Standalone]
added: A descriptor parameter as well as command line switch (-locale) to specify a locale for HeidelTime to base relative date calculation on (issue 1 (on Google Code)) [Core/Standalone]
added: setenv and printResourceInformation batch files for Windows
added: An optional setting in the ACETernWriter descriptor to prevent conversion from Timex3 to Timex2
fixed: Charset in Dutch resources [Resources]
fixed: Behaviour when non-hardcoded languages are supplied: The resource folder is assumed to be the name of the language [Core]
fixed: Token boundary detection [Core]
fixed: Typos in English resources [Resources]
fixed: A bug where two overlapping temporal expressions would break XML-conformity (issue 5 (on Google Code)) [Standalone]
fixed: A bug that would break resource loading on the Mac OS platform [Core]
fixed: TempEval2 Reader now works properly with the Italian TempEval2 corpus
fixed: ACE Tern reader recognizes DCTs from the ICAB corpus
fixed: A bug in TempEval2Writer that would break the output of parallelized UIMA workflows
fixed: A bug where the OutputType was ignored when the HeidelTime standalone version was used programmatically (issue 7 (on Google Code)) [Standalone]
fixed: Several issues in various places that made it hard to use HeidelTimeStandalone programmatically (among others, issue 6 (on Google Code)) [Standalone]

Version 1.2

SHA: f1b7a4f

added: links to the journal paper to the readme file
fixed: TIMEX3 SET expressions not being translated to TIMEX2 expressions correctly [ACETernWriter]
removed: dead code

Version 1.1

SHA: f814184

added: support code and English resources for two additional domains: COLLOQUIAL and SCIENTIFIC
fixed: an incorrect DCT recognition regex
fixed: standalone not recognizing parameters when not all upper case

Version 1.0

SHA: 7dc7ab5

conducted major code overhaul, mostly modularization-based refactoring
merged the previously independent heideltime-standalone project into the kit repository
fixed: made logger message source more apparent

Version 0

SHA: 38cad56

added: support to read the ACE Tern 2005 Corpus
fixed: regex recognitions of strings in the form of "1995-1996"
added: recognition logic for holidays together with English and German resources

Initial Version

SHA: 8a1262f

This release represents the state of HeidelTime's development before the use of a revision control system, i.e., the version that was available from the dbs.ifi.uni-heidelberg.de page.
