UBC-NLP/SPARROW

The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages

Chiyu Zhang, Khai Duy Doan, Qisheng Liao, Muhammad Abdul-Mageed

Published at the Main Conference of EMNLP 2023

Paper

Code License Data License

Comparison of sociopragmatic meaning (SM) benchmarks with leaderboards. SPARROW is an evaluation benchmark for sociopragmatic meaning understanding. It comprises 169 datasets covering 13 task types across six primary categories (e.g., anti-social language detection, emotion recognition). The datasets encompass 64 languages from 12 language families, written in 16 scripts.



Summary of datasets in SPARROW. Lang: number of languages, LF: number of language families, Scr: number of scripts.

Benchmark and Leaderboard

  • You can access our SPARROW benchmark and leaderboard here.
  • You can find the SPARROW benchmark on Hugging Face Datasets.
  • More guidance for submitting your system is provided here.
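
Since the benchmark is distributed through Hugging Face Datasets, one task could be loaded roughly as sketched below. This is a minimal sketch, not the authors' documented procedure: the repository ID `UBC-NLP/sparrow`, the idea that each task is exposed as a named configuration, and the helper name `load_sparrow_task` are all assumptions, so check the dataset card for the actual identifiers. The optional `loader` argument is also hypothetical; it exists only so the helper can be exercised without network access.

```python
def load_sparrow_task(task_name, split="test", repo_id="UBC-NLP/sparrow", loader=None):
    """Load one SPARROW task split.

    ASSUMPTIONS (not confirmed by this README): `repo_id` is the dataset's
    Hugging Face ID, and `task_name` is a configuration name listed on the
    dataset card.
    """
    if loader is None:
        # Imported lazily so the function can be defined without `datasets` installed.
        from datasets import load_dataset
        loader = load_dataset
    # load_dataset(path, name, split=...) returns a Dataset for the requested split.
    return loader(repo_id, task_name, split=split)
```

Calling `load_sparrow_task("<task-config-name>")` with a configuration name taken from the dataset card would then return the test split of that task. Access to the data may require accepting the dataset's terms on Hugging Face first.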

Multilingual Model for Sociopragmatic Meaning Understanding:

Citation

Please cite us if you use our data or models.

@inproceedings{zhang-etal-2023-skipped,
    title = "The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages",
    author = "Zhang, Chiyu  and
      Khai Duy Doan  and
      Qisheng Liao  and
      Abdul-Mageed, Muhammad",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    year = "2023",
    publisher = "Association for Computational Linguistics",
}
