
nlp-evaluation

Here are 6 public repositories matching this topic...


This repository contains the dataset and code used in our paper, “MENA Values Benchmark: Evaluating Cultural Alignment and Multilingual Bias in Large Language Models.” It provides tools to evaluate how large language models represent Middle Eastern and North African cultural values across 16 countries, multiple languages, and cultural perspectives.

  • Updated Jun 3, 2025
  • Python
