Comprehensive NLP Evaluation System
This repo houses experimental projects inspired by the DeepLearning.AI course 'Building and Evaluating Advanced RAG Applications'.
This repository contains the dataset and code used in our paper, “MENA Values Benchmark: Evaluating Cultural Alignment and Multilingual Bias in Large Language Models.” It provides tools to evaluate how large language models represent Middle Eastern and North African cultural values across 16 countries, multiple languages, and perspectives.
LLM behavior QA: scoring for tone collapse, false consent, and reroute logic.
NLP evaluation tool built for the Udacity Front End Development Nanodegree; a first attempt at writing unit tests with Jest. The app no longer works because the Aylien API trial period has ended.
Framework for testing Generative Dialog Models (GDMs).