Releases: digitallinguistics/tags2dlx

v0.4.0

01 Oct 16:26

FIX: use CommonJS in command line script (#43)
FIX: update default punctuation in readme (#42)
DOCS: change command line example to use directory instead of file (#39)

v0.3.0

30 Sep 18:50

This is a breaking release that changes how utterances in a text are determined. Utterances are now delimited by newlines rather than punctuation, because some portions of major corpora (such as the Open American National Corpus) do not include punctuation.

The utteranceSeparators option has been removed, and the punctuation option has been updated so that the default list of punctuation now includes punctuation typically placed at the end of a sentence/utterance.
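To illustrate the newline-based approach described above, here is a minimal sketch of splitting a text into utterances, one per non-empty line. The helper name `splitUtterances` and the `word_TAG` sample format are assumptions made for this example; this is not the library's actual implementation.

```javascript
// Hypothetical helper (not part of the tags2dlx API): treats each
// non-empty line of the input as one utterance.
function splitUtterances(text) {
  return text
    .split(/\r?\n/u)                  // handle both Unix and Windows line endings
    .map(line => line.trim())         // drop leading/trailing whitespace
    .filter(line => line.length > 0); // ignore blank lines
}

// Assumed sample input: tagged words, no sentence-final punctuation.
const sampleText = `the_DT dog_NN barks_VBZ\n\nit_PRP runs_VBZ\n`;

console.log(splitUtterances(sampleText));
```

Because the split no longer depends on punctuation, unpunctuated corpus files still yield one utterance per line.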

v0.2.1

30 Aug 23:48
5311bb0

This release fixes a memory leak caused by running text conversions in parallel. Texts are now converted one at a time, in sequence. This slows the script down significantly but avoids the leak.
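The difference between the two strategies can be sketched as follows. Both function names and the `convert` callback are hypothetical stand-ins for illustration; the library's internal code may look quite different.

```javascript
// Hypothetical: start every conversion at once. All inputs and
// intermediate results are held in memory at the same time.
function convertInParallel(items, convert) {
  return Promise.all(items.map(item => convert(item)));
}

// Hypothetical: convert one item at a time. Each conversion finishes
// before the next one starts, so only one is in flight at any moment.
async function convertInSequence(items, convert) {
  const results = [];
  for (const item of items) {
    results.push(await convert(item));
  }
  return results;
}
```

The sequential version trades throughput for a bounded memory footprint, which matches the trade-off described in this release.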

v0.2.0

30 Aug 23:24
8322f4f

This release adds a command line interface. All options are accepted except for the metadata option. Run tags2dlx --help from the command line to see all options.

v0.1.1

25 Aug 23:38
ab4a004

This release adds citation information for Zenodo to the readme, as well as information about the available options that can be passed to the tags2dlx function.

v0.1.0

25 Aug 22:57
341f92e

This is the initial release of the tags2dlx library.

NEW: convert text to a valid DLx Text object
NEW: tokenize text into utterances
NEW: tokenize utterances into word tokens
NEW: parse words into token and tag
NEW: option: metadata
NEW: option: punctuation
NEW: option: tagName
NEW: option: tagSeparator
NEW: option: utteranceSeparators
DOCS: add project readme
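
As a rough illustration of the word-level steps listed above (tokenizing an utterance into words, then parsing each word into a token and a tag), here is a minimal sketch. The function names, the `_` default separator, and the shape of the returned objects are assumptions for this example only; they are not taken from the library, and the real output is a DLx Text object whose structure is not shown here.

```javascript
// Hypothetical: split a tagged word such as "dog_NN" into its token and tag.
// The separator plays the role of the tagSeparator option; "_" is an assumed default.
function parseWord(word, tagSeparator = `_`) {
  // Split on the LAST separator so tokens that contain the separator stay intact.
  const index = word.lastIndexOf(tagSeparator);
  if (index === -1) return { token: word, tag: null }; // untagged word
  return {
    token: word.slice(0, index),
    tag:   word.slice(index + tagSeparator.length),
  };
}

// Hypothetical: split an utterance on whitespace and parse each word.
function parseUtterance(utterance, tagSeparator = `_`) {
  return utterance
    .split(/\s+/u)
    .filter(Boolean) // discard empty strings from leading/trailing whitespace
    .map(word => parseWord(word, tagSeparator));
}

console.log(parseUtterance(`the_DT dog_NN barks_VBZ`));
```

Splitting on the last occurrence of the separator is a design choice of this sketch: it keeps a token like `well_known` whole when the tag follows a second separator.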