A collection of C++ classes representing fundamental linguistic units and building blocks of language from phonetics to pragmatics.
Updated Feb 23, 2018 - C++
An Interpreter for a simplified version of R.
Laboratory assignments for the course "Computational Linguistics-1" (Faculty of Information Technologies, Novosibirsk State University, 2022).
Cross-platform IDE for CG-3.
Jointly Extracting Relations with Class Ties via Effective Deep Ranking
An unsupervised Chinese word segmentation tool.
Implementations of spectral word embedding methods
Command-line utilities for working with the Format for Linguistic Annotation (FoLiA), powered by libfolia (C++), written by Ko van der Sloot (CLST, Radboud University)
Tools for the 3rd edition of the Constraint Grammar formalism.
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules …
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` which allows you to build, view, manipulate a…