# llm-as-evaluator

Here are 5 public repositories matching this topic...

prometheus-eval / prometheus-eval

Evaluate your LLM's response with Prometheus and GPT4 💯

python evaluation gpt4 llm llmops vllm litellm llm-as-a-judge llm-as-evaluator

Updated Jul 30, 2024
Python

JohnSnowLabs / langtest

Deliver safe & effective language models

nlp artificial-intelligence benchmarks benchmark-framework model-assessment ai-safety mlops responsible-ai ml-safety trustworthy-ai ethics-in-ai ml-testing large-language-models llm ai-testing llm-test llm-evaluation-toolkit llm-as-evaluator llm-testing

Updated Aug 5, 2024
Python

zhaochen0110 / Timo

Timo: Towards Better Temporal Reasoning for Language Models (COLM 2024)

temporal-reasoning sota-model llms rlhf rlaif llm-as-a-judge llm-as-evaluator self-critic-framework

Updated Jul 3, 2024
Python

djokester / groqeval

Use groq for evaluations

groq llm generative-ai mixtral llm-as-a-judge llm-as-evaluator llama3

Updated Jul 11, 2024
Python

rafaelsandroni / antibodies

Antibodies for LLM hallucinations (grouping LLM-as-a-judge, NLI, and reward models)

python nli hallucinations llms hallucination-detection llm-as-a-judge llm-as-evaluator

Updated Jun 13, 2024
Python
