CivAgent is an LLM-based Human-like Agent acting as a Digital Player within the Strategy Game Unciv.
Updated Jun 5, 2024
The prompt engineering, prompt management, and prompt evaluation tool for Ruby.
The prompt engineering, prompt management, and prompt evaluation tool for Java.
A compilation of referenced benchmark metrics to evaluate different aspects of knowledge for Large Language Models.
The prompt engineering, prompt management, and prompt evaluation tool for Go.
Evaluate LLMs on reasoning and RAG tasks using custom functions and datasets with LangChain.
The prompt engineering, prompt management, and prompt evaluation tool for Kotlin.
Calibration game is a game to get better at identifying hallucination in LLMs.
A learning project for getting familiar with the LangChain framework, LangSmith for tracing, OpenAI LLMs, and the Pinecone serverless vector database, using Jupyter Notebook and Python.
The prompt engineering, prompt management, and prompt evaluation tool for C# and .NET.
Visualize LLM Evaluations for OpenAI Assistants
A framework for automatically manipulating and evaluating the political ideology of LLMs with two ideology tests: Wahl-O-Mat and Political Compass Test.
Summary Evaluation Tool
The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and NodeJS.