low-resource-nlp

Here are 25 public repositories matching this topic...

devrimcavusoglu / nonwestlit

NONWESTLIT Project Codebase

multilingual dataset low-resource-languages low-resource-nlp

Updated Jul 23, 2024
Python

luciusssss / mc2_corpus

[ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)

multilingual natural-language-processing corpus mongolian tibetan tibetan-nlp uyghur kazakh low-resource-languages low-resource-nlp

Updated Jun 15, 2024
Python

nicolay-r / RuSentNE-LLM-Benchmark

This repository highlights the reasoning capabilities of LLMs ✨ Mistral / LLaMA-3 / Phi-3 / Gemma / Flan-T5 / GPT-4o ✨ in Targeted Sentiment Analysis of mass-media texts in Russian and in English translation 📊

sentiment-analysis leaderboard prompt openai gemma zero-shot mistral reasoning fine-tuning low-resource-languages transformers-library low-resource-nlp gpt4 llm llms chain-of-thought llama3 gpt4o

Updated Sep 6, 2024
Python

luciusssss / ZhuangBench

[ACL'24 Findings] Teaching Large Language Models an Unseen Language on the Fly

low-resource-languages zhuang low-resource-nlp large-language-models llm

Updated Jun 12, 2024
Python

kasunw22 / sinhala-word-embedding-alignment

English-Sinhala multilingual word embedding alignment resources

sinhala procrustes-alignment english-sinhala procrustes-analysis labse low-resource-nlp bilingual-lexicon-induction word-embedding-alignment rcsls-alignment supervised-embedding-alignment unsupervised-embedding-alignment low-resource-word-embedding-alignment sinhala-word-embeddings fasttext-sinhala-word-embedding-alignment vecmap multilingual-embeddings

Updated Jun 8, 2024
Python

mdm-code / manx

Fine-tune an LLM for early Middle English lemmatization with data from LAEME.

nlp deep-learning parsing neural-network lemmatizer nlp-machine-learning lemmatization low-resource-languages middle-english low-resource-nlp low-resource-machine-learning

Updated Jan 25, 2024
Python

GGLAB-KU / turkish-plu

Code for the AACL23 paper "Benchmarking Procedural Language Understanding for Low-Resource Languages: A Case Study on Turkish"

deep-learning retrieval text-generation procedural classification language-model low-resource-languages procedural-text low-resource-nlp procedural-language-understanding

Updated Feb 4, 2024
Python

nicolay-r / RuSentRel-Leaderboard

This is the official leaderboard for the RuSentRel-1.1 dataset, originally described in the paper arXiv:1808.08932.

benchmark sentiment-analysis leaderboard cnn neural-networks attention language-models attention-mechanism relation-extraction classifiers bilstm bert-model low-resource-nlp chatgpt

Updated Dec 28, 2023
Python

pnborchert / MultiRep

Efficient Information Extraction in Few-Shot Relation Classification through Contrastive Representation Learning. NAACL 2024.

information-extraction relation-extraction few-shot fewrel contrastive-learning low-resource-nlp

Updated Jun 18, 2024
Python

vgupta123 / contextualize_scdv

Unsupervised Contextualized Document Representation, to appear in SustaiNLP 2021 at EMNLP 2021

text-classification word-sense-disambiguation multi-label-classification bert multi-sense-embeddings multi-class-classification few-shot-learning scdv sparse-document-vectors low-resource-nlp emnlp2021 sustainlp2021

Updated Sep 26, 2021
Python

Lhtie / Bio-Domain-Transfer

Implementation of the NAACL 2024 main conference paper "Named Entity Recognition Under Domain Shift via Metric Learning for Life Science"

chemical pytorch information-extraction named-entity-recognition nltk biomedical knowledge-transfer few-shot contrastive-learning low-resource-nlp doamin-adaptation transformers-bert

Updated Jun 19, 2024
Python

zjunlp / OntoED

[ACL 2021] OntoED: Low-resource Event Detection with Ontology Embedding

information-extraction event-detection low-resource low-resource-nlp ontoed event-exxtraction

Updated Apr 15, 2022
Python

ruoyuxie / noisy_parallel_data_alignment

Enhanced awesome-align for low-resource languages and noise simulation: https://arxiv.org/abs/2301.09685

ocr noise word-aligner word-alignment noisy-data ocr-text low-resource-languages nueral-machine-translation low-resource-nlp

Updated Mar 4, 2023
Python

HenningBuhl / low-resource-machine-translation

This repository is an open-source collection of various low-resource machine translation experiments.

Updated May 23, 2023
Python

This repository contains the code, data, and associated models of the paper "BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset", accepted in the Proceedings of the Asia-Pacific Chapter of the Association for Computational Linguistics: AACL 2022.

paraphrase-generation bangla-nlp low-resource-nlp bangla-paraphrase