Fast and portable character string processing in R (with the Unicode ICU)
Updated Jul 11, 2024 - C++
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate a…
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
A large scale feature extraction tool for text-based machine learning
A repository to bind mecab for Python 3.5+. Uses neither SWIG nor pybind. (No longer maintained)
OOP-based PWG-DQ user interface (CLI) development in Python
A minimalist single-header library for building pattern-matchers, lexers, and parsers.
Yet another way to type Amharic on a standard English keyboard.
Functions for working with Russian numerals
Markov chain N-gram text generator designed to run fast with large values of N. The goal is fast operation with 6-grams or more.
A C++ program that provides a Google-like summary of a document, given a list of positions of words and phrases to highlight.
New version of the specs pipeline stage based on what's in current CMS pipelines
A Regex📋 implementation in C++ using Thompson's NFA algorithm
UNIX line counting utilities
Example of cleaning unwanted symbols from a text file
Searching for words in the text using the Karp-Rabin algorithm.
A program that removes repeated words from text files.
A Graphics Library that renders in text mode
Text data processing utility (PREProcessor PIPEline), written in C++ using Qt
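One entry in this list searches for words in text using the Karp-Rabin algorithm. As a rough illustration of that technique (not the code from that repository), here is a minimal Python sketch of rolling-hash substring search. The function name `rabin_karp` and the `base` and `mod` defaults are assumptions chosen for this example.

```python
def rabin_karp(text: str, pattern: str,
               base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Return the start indices of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    matches = []
    for i in range(n - m + 1):
        # Equal hashes can be a collision, so confirm with a direct comparison.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return matches


if __name__ == "__main__":
    print(rabin_karp("abracadabra", "abra"))
```

The rolling update lets each window's hash be derived from the previous one in constant time, so the direct string comparison only runs when the hashes agree.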