Linguistic search for large annotated text corpora, based on Apache Lucene
Updated Nov 14, 2024 - Java
Reading the data from OPIEC - an Open Information Extraction corpus
Text conversion tool from formats such as Word, HTML, and txt to the corpus formats TEI or FoLiA
Naive Bayes classifier, a classification algorithm. It uses the Bernoulli and multinomial Naive Bayes models to classify text documents as ham or spam.
📖 Probabilistic model and Deep Learning based Korean NLP Engine
A text management tool for linguistic purposes...
This repository contains program source code of a converter that can transform Kiel Corpus files into standardised TEI-XML files.
A search engine for classic Chinese poetry
Uses Markov chains and a corpus of text to respond to conversation
Code for my BSc thesis: Cleaning of Parallel Texts for Machine Translation
⛏️📄 Script to scrape all files linked on a textfiles.com page
QuoVadis: annotation of Entities and Relations, initial Ph.D. work
This is a search engine that searches through a given corpus for queries.