@monoxgas monoxgas commented Jul 22, 2025

LLM Judge

Key Changes:

  • Added a new llm_judge scorer, which uses an inference model to grade outputs against a rubric.

Generated Summary:

  • Added a new module llm_judge.py to implement a scoring system using a language model (LLM) to evaluate outputs against a rubric.
  • Introduced two new classes, JudgeInput and Judgement, to structure the input data and output results respectively.
  • Implemented the core function llm_judge, which evaluates the input based on the rubric and provides metrics, including score and pass/fail status.
  • Updated __init__.py to include the new llm_judge scorer in the module's exports, ensuring it can be easily accessed.
  • Updated pyproject.toml to upgrade the rigging library to version ^3.2.1 for compatibility with the new scoring functionality.
  • Potential impact: Outputs can now be graded by an LLM against a defined rubric, which could improve evaluation accuracy for criteria that are hard to check programmatically.
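
For readers who want a picture of the pattern described above, here is a minimal sketch of a rubric-based judge scorer. It is not the code from this PR: the field names on `JudgeInput` and `Judgement`, the `min_score` parameter, the prompt wording, the JSON reply format, and the `fake_model` stand-in are all assumptions made for illustration, and the real implementation calls the model through rigging rather than a plain callable.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class JudgeInput:
    # Hypothetical fields: what the judge model is shown.
    input: Optional[str]
    output: str
    rubric: str


@dataclass
class Judgement:
    # Hypothetical fields: what the judge model is asked to return.
    reason: str
    passing: bool
    score: float


def build_prompt(judge_input: JudgeInput) -> str:
    """Render the rubric and the candidate output into a single prompt."""
    return (
        "Grade the output against the rubric. Reply with JSON containing "
        '"reason", "passing", and "score".\n'
        f"Rubric: {judge_input.rubric}\n"
        f"Input: {judge_input.input}\n"
        f"Output: {judge_input.output}"
    )


def llm_judge(
    model: Callable[[str], str],
    rubric: str,
    min_score: Optional[float] = None,
) -> Callable[..., Dict[str, float]]:
    """Return a scorer that asks `model` to grade outputs against `rubric`."""

    def scorer(output: str, input: Optional[str] = None) -> Dict[str, float]:
        prompt = build_prompt(JudgeInput(input=input, output=output, rubric=rubric))
        judgement = Judgement(**json.loads(model(prompt)))
        passed = judgement.passing
        # Optionally require a minimum score on top of the model's own verdict.
        if min_score is not None:
            passed = passed and judgement.score >= min_score
        return {"score": judgement.score, "pass": float(passed)}

    return scorer


def fake_model(prompt: str) -> str:
    """Stand-in for a real inference call so the sketch runs offline."""
    return json.dumps({"reason": "Names Paris.", "passing": True, "score": 0.9})


scorer = llm_judge(fake_model, "The answer must name the capital of France.", min_score=0.5)
metrics = scorer("Paris", input="What is the capital of France?")
```

Returning both a numeric score and a pass/fail flag mirrors the metrics mentioned in the summary; keeping the model behind a plain callable makes the scorer easy to test without network access.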

This summary was generated with ❤️ by rigging

@dreadnode-renovate-bot dreadnode-renovate-bot bot added the area/python Changes to Python package configuration and dependencies label Jul 22, 2025
@monoxgas monoxgas merged commit f6980c4 into main Jul 22, 2025
9 checks passed
@monoxgas monoxgas deleted the feat/llm-judge branch July 22, 2025 22:53