An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Updated Jul 27, 2024 - Python
Bitextor generates translation memories from multilingual websites
Python scripts for preprocessing the Penn Treebank and Chinese Treebank
OpusFilter - Parallel corpus processing toolkit
Utilities for Processing the Switchboard Dialogue Act Corpus
Corpus processing library
Utilities for Processing the Meeting Recorder Dialogue Act Corpus
Plotly Dash NLP project that measures document similarity using Latent Dirichlet Allocation, followed by principal component analysis and K-means clustering, with interactive visualization.
N-gram language model that learns n-gram probabilities from a given corpus and generates new sentences by sampling each next word from its conditional probability given the previously generated words.
A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Spanish poetry
A processor for KyotoCorpus, KWDLC, and AnnotatedFKCCorpus
A parser for annotated MuseScore 3 files.
uniblock, scoring and filtering corpus with Unicode block information (and more).
Measure the similarity of text corpora for 74 languages
Utilities for Processing the HCRC Map Task Corpus
Utilities for Processing the bAbI Tasks Corpus
Sense Tagged Instances For Finnish
Utilities for Processing the BT Oasis Corpus
Utilities for Processing the Dialogue State Tracking Challenge 3 Corpus
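The n-gram language model entry above describes learning conditional probabilities from a corpus and sampling new sentences from them. As a rough illustration of that idea only (not the listed repository's code), here is a minimal bigram sketch using the Python standard library. The function names, the `<s>`/`</s>` boundary markers, and the toy corpus are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count bigram transitions; <s> and </s> are assumed boundary markers."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def conditional_probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev), with no smoothing."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


def generate_sentence(counts, rng, max_len=20):
    """Sample one word at a time from P(next | previous) until </s>."""
    word, output = "<s>", []
    for _ in range(max_len):
        followers = counts[word]
        if not followers:
            break
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)


# Hypothetical toy corpus, for illustration only.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram_model(corpus)
print(conditional_probability(model, "the", "cat"))
print(generate_sentence(model, random.Random(0)))
```

A real model would typically use higher-order n-grams and some form of smoothing so that unseen word pairs do not get zero probability; the sketch leaves both out to keep the counting and sampling steps visible.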