corpus
Here are 38 public repositories matching this topic...
Lermontov Online: DH-inspired project for Lermontov's heritage (HSE, together with Lermontov's Museum)
Updated Apr 12, 2018 - HTML
A text analysis project on a collection of script dialogue between characters in Episodes 4, 5, and 6 of Star Wars
Updated May 11, 2018 - HTML
A garden of file formats from a collection of sources, for use as inputs to fuzzing engines.
Updated Oct 4, 2019 - HTML
Toxic Comment Classification Project constructed by Qimo Li, Chen He and Kun Qiu for the course "Introduction to Natural Language Processing in Python" at Brandeis University.
Updated Dec 20, 2019 - HTML
In this repository, I have used NLP to determine: What are the most frequent words in Herman Melville's novel Moby Dick and how often do they occur?
Updated Aug 10, 2020 - HTML
Predictive texting is a data-driven tool that makes writing quicker and easier by suggesting words as you type. The tool reads the text in the input area and predicts the three most suitable options. Once the prediction is made, the options are displayed as buttons. The user can press a button to insert text, the tool …
Updated Sep 8, 2020 - HTML
Data accompanying the dissertation "Genre Analysis and Corpus Design: 19th Century Spanish-American novels (1830-1910)"
Updated Jan 31, 2021 - HTML
Materials for the summer course "From Corpus to Interpretation: Stylometry with R" («Del corpus a la interpretación: Estilometría con R»), Burgos, 2021
Updated Sep 11, 2021 - HTML
A corpus of chansons de geste
Updated Sep 14, 2021 - HTML
A Text / Speech Summarizer
Updated Nov 6, 2021 - HTML
Data, metadata, tools, and LDA experiments on a corpus of Sanskrit philosophy texts
Updated Nov 28, 2021 - HTML
Arabic Stories Corpus
Updated Dec 16, 2021 - HTML
Scattertext plot comparing Nippon TV (Japan) and Arirang News (South Korea) YouTube videos. See cooperchris17/yt_short_news for more details
Updated Mar 23, 2022 - HTML
Documenting annotations for risk of bias
Updated Aug 3, 2022 - HTML
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
Updated Aug 18, 2022 - HTML
This repository contains Python code to create a corpus of 12,215 terms of service documents scraped from TOSDR, intended for legal, privacy, and natural language processing research.
Updated Mar 14, 2023 - HTML