Linguistic search for large annotated text corpora, based on Apache Lucene
Updated Jun 18, 2024 - Java
Source code of a converter that transforms Kiel Corpus files into standardised TEI-XML files.
A search engine that runs queries against a given corpus.
A search engine for classical Chinese poetry
Text conversion tool from formats such as Word, HTML, and txt to the corpus formats TEI or FoLiA
QuoVadis: annotation of Entities and Relations, initial Ph.D. work
Reading the data from OPIEC - an Open Information Extraction corpus
Uses Markov chains and a corpus of text to respond to conversation
A text management tool for linguistic purposes...
⛏️📄 Script to scrape all files linked on a textfiles.com page
📖 Probabilistic model and Deep Learning based Korean NLP Engine
Code for my BSc thesis: Cleaning of Parallel Texts for Machine Translation
A Naive Bayes classifier that uses the Bernoulli and multinomial formulations to classify text documents as ham or spam.