# text-mining

Here are 17 public repositories matching this topic...

shangjingbo1226 / AutoPhrase

AutoPhrase: Automated Phrase Mining from Massive Text Corpora

text-mining automatic lexicon multi-language phrase compound-words quality-phrases

Updated Jan 27, 2022
C++

bigartm / bigartm

Fast topic modeling platform

python c-plus-plus machine-learning text-mining bigdata topic-modeling python-api bigartm regularizer

Updated Aug 19, 2023
C++

qminer / qminer

Analytic platform for real-time large-scale streams containing structured and unstructured data.

javascript machine-learning text-mining data-mining cpp signal-processing

Updated Apr 15, 2023
C++

bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

nlp natural-language-processing text-mining r rcpp tokenizer conll r-pkg dependency-parser r-package pos-tagging lemmatization udpipe

Updated Mar 1, 2023
C++

bnosac / ruimtehol

R package to Embed All the Things! using StarSpace

nlp natural-language-processing text-mining r similarity embeddings classification starspace

Updated Feb 23, 2024
C++

docwire / docwire

DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality.

Updated Nov 14, 2024
C++

bnosac / tokenizers.bpe

R package for Byte Pair Encoding based on YouTokenToMe

text-mining tokenization bpe byte-pair-encoding

Updated Sep 16, 2023
C++

IDisposable / IFilterExtractor

A simple component to extract just the text from any file that has an IFilter installed. Available as a C++ COM component and as a C# .NET library.

text-mining com text-extraction ifilter

Updated Mar 31, 2017
C++

607011 / txtz

Short string compression

text-mining compression cplusplus text strings educational compression-algorithm educational-project cplusplus-17 shannon-fano shannon-fano-algorithm

Updated Aug 21, 2024
C++

dualword / dualword-pmc

PubMed Central browser

Updated Mar 29, 2023
C++

sidmishraw / scp

A data processing pipeline for text-mining on contents extracted from PDFs using Apriori and Simplicial Complex algorithms

text-mining association-rules document-clustering apriori-algorithm simplicialcomplex pdf-processor docpruner simplicial-complex

Updated Oct 28, 2017
C++

ThomasThorpe / ContextualSummaries

A C++ program that provides a Google-like summary of a document, given a list of positions of words and phrases to highlight.

cli text-mining cpp text-processing contextual-summarization

Updated Sep 24, 2019
C++

analisis-20minutos / herramientas-analisis

Tools for collecting and analyzing the 20minutos news corpus.

nlp data-science natural-language-processing text-mining data-mining web-scrapping

Updated Nov 26, 2018
C++

stygio / pdf-word-counter

A text analysis tool for PDF files.

pdf data-science text-mining cpp text-analysis data-visualization cpp11 pdf-document-processor

Updated Oct 28, 2020
C++

Kuderic / Weakly-Supervised-Topic-Mining

Topic modeling with AutoPhrase and CatE

text-mining data-mining topic-modeling topic-mining

Updated Oct 25, 2022
C++

jolly-fellow / n-gram-text-gen

Markov chain N-gram text generator designed to run fast with large values of N. The goal is fast operation with 6-grams or more.

text-mining text-analysis text-generation n-grams linguistics text-processing language-model

Updated Dec 9, 2018
C++

MelekV1 / MoteurDeRecherche

Search Engine built with C++

search-engine text-mining tf-idf

Updated Jun 5, 2020
C++
