corpus-linguistics

Here are 326 public repositories matching this topic...

rahonalab / TEITOK-docker

Open Corpus Workbench with TEITOK Docker compose file

linguistics cwb digital-humanities corpus-linguistics cqp digital-philology opencorpusworkbench

Updated May 30, 2019
Dockerfile

PaulCaroline / comm313_S21_Final_Project

Corpus linguistics final project for the course COMM 313: Computational Text Analysis at the University of Pennsylvania. Aims to determine how the anti-vaccination movement has evolved on social media before and during the COVID-19 pandemic.

twitter university sentiment-analysis corpus-linguistics covid19 snscrape

Updated May 8, 2021
Jupyter Notebook

matbahasa / ETA

Easy Text Annotator

visualization nlp information-retrieval annotation linguistics corpus-linguistics annotaton-tool

Updated Feb 1, 2023
JavaScript

Affenmilchmann / lingwiki

(Module in ongoing development) Fetches the parsed content of Wikipedia articles. Created for getting text corpus data quickly and easily, but can be freely used for other purposes too.

parser wikipedia multithreading linguistics corpus-linguistics corpus-data corpus-tools article-extractor wikipedia-corpus

Updated Jan 3, 2023
Python

sonalsinha / Marwari_recordings

Recordings of Marwari speech by Bharti, a speaker of the language. It includes sentences of all kinds collected using the translation method, and narrations about health care and the life cycle.

language documentation data corpus speech corpus-linguistics marwari

Updated Jul 4, 2024

craigmateo / pipeline-corpus

Corpus for linguistic study of natural gas pipeline debates.

corpus-linguistics corpus-data

Updated Apr 6, 2024

chrisdrymon / Treebanks

Treebanks modified from PROIEL and Perseus.

linguistics greek treebank ancient-greek perseus computational-linguistics corpus-linguistics perseus-digital-library treebanks perseusdl proiel

Updated Jun 1, 2018

andcarnivorous / CorpusInfo

A module to quickly create Corpus objects containing TTR, tokenized sentences, lexical density, class frequencies and more.

nlp computational-linguistics corpus-linguistics

Updated Jun 30, 2019
Python

UIUCLearningLanguageLab / CreateWikiCorpus

Extract raw text articles from Wikipedia dump

nlp corpus-linguistics

Updated Jun 21, 2022
Python

JorgeFCS / multimodal-annotation-distance

A tool for determining distances between multimodal annotations.

gesture corpus-linguistics data-processing prosody

Updated Oct 16, 2023
Python

LingConLab / data_oral_khakas_corpus

linguistics corpus-linguistics corpus-data khakas languages-of-russia

Updated Aug 17, 2022
R

CaterinaBi / interrogatives-corpus-work

Paper that Lena Baunaz and I are working on as part of my SNSF-funded 'Focus in diachrony' research project at the University of Cambridge, UK.

python syntax excel nltk data-analysis corpus-linguistics syntactic-analysis corpus-processing wh-movement

Updated Jan 31, 2023
Jupyter Notebook

MevSillire / Corpus_CODIM

All scripts needed to exploit the French corpus and create the associated database for the CODIM Project.

corpus-linguistics

Updated Aug 22, 2023
Jupyter Notebook

zelewskap / BA_heuristics

Heuristics and cognitive biases in public discourse on climate change: linguistic data analysis

annotations transformers bachelor-thesis corpus-linguistics linguistic-analysis corpus-processing data-analysis-python corpus-statistics bachelor-degree corpus-analysis

Updated Jun 30, 2023
Jupyter Notebook

jeroenvansweeveldt / DTA_Thesis

Repository for the MA Digital Text Analysis thesis.

python r ocr xml power-bi tesseract corpus-linguistics historical-corpus-linguistics

Updated Jun 28, 2024
Jupyter Notebook

dlukes / shiny-mda

A Shiny app for visualizing Multi-Dimensional Analysis results

shiny linguistics digital-humanities corpus-linguistics

Updated Feb 14, 2022
R

annadmitrieva / collocations_thesis

Code for my Master's thesis

python nlp natural-language-processing r corpus-linguistics collocations

Updated Jul 24, 2019
Jupyter Notebook

gowribhat / sms-corpus-keyword-analysis

python hadoop jupyter-notebook corpus corpus-linguistics hadoop-mapreduce

Updated Dec 7, 2023
Jupyter Notebook

M-Taghizadeh / Persian_Question_Answering_Voice2Voice_AI

This repository hosts BonyadAI, a Persian question answering AI Model. We developed an initial web crawler and scraper to gather the dataset. The second phase involved building a machine learning model based on word embeddings and NLP techniques. This AI model operates end-to-end, receiving user voice input and providing responses in Persian voice.

python crawler machine-learning natural-language-processing text-to-speech deep-learning word2vec artificial-intelligence question-answering persian speech-to-text corpus-linguistics farsi scraping-python transformer-architecture farsi-datasets large-language-models

Updated Jul 7, 2024
Jupyter Notebook

Ighina / SemanticNetworkVizR

Code to perform semantic network analysis on multiple concepts (defined as multiple word sets, i.e. dictionaries) across multiple texts with R.

text-mining r information-extraction corpus-linguistics network-visualization igraph semantic-analysis

Updated Dec 5, 2018
R
