Assignment 2 for CS 11-731 Machine Translation course.
[ACL 2021, Findings] Cognate Prediction Per Machine Translation
A natural language processing and machine learning project for a low-resource language in Zambia.
A 16M LLM for POS tagging in African languages
Auto-generated stopwords for South African Bantu Languages
Embedding Evaluation Data for South African Languages
Investigating transfer learning in low-resourced languages, specifically in a named entity recognition (NER) task (IJCNLP-AACL 2023). http://arxiv.org/abs/2309.05311
Italian hate speech detection using a transformer.
A web application to test sentence-similarity models of the top 10 Indian Languages
A repository containing the details of a natural language inference dataset in Hindi
Dataset for the paper "A Neural Approach to Multilingual Sentiment Analysis in Low Resource Languages", submitted to Elsevier Expert Systems with Applications
A custom tokenizer for the Balochi language.
FilWordNet web portal: a language resource for Filipino and Philippine English built from text analysis, network science, and natural language processing
[ACL'24 Findings] Teaching Large Language Models an Unseen Language on the Fly
Scripts and files I used throughout my M.Sc. Voice Technology Thesis Project at Rijksuniversiteit Groningen - Campus Fryslân.
Finetuning BERT models on a powerset of different linguistic domains
Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus
Fine-tune LLM for early Middle English lemmatization with data from LAEME.
GlotSparse: Building Corpora in Under-Resourced Languages
Repo associated with the forthcoming paper 'Instruct-global: aligning language models to follow instructions in low-resource languages'. Instruct-global automates the process of generating instruction datasets in low-resource languages (LRLs).