Add an enhanced multi-rubric relevancy agent and checker #49
Merged
- Add `akd.agents.relevancy.MultiRubricRelevancyAgent`, which performs several binary classifications.
- Add `akd.tools.relevancy.EnhancedRelevancyChecker`, a tool that uses `MultiRubricRelevancyAgent` under the hood and combines scoring over multiple iterations to produce final scores.
- Adds `akd.agents.relevancy.MultiRubricRelevancyAgent`, which performs a series of binary classifications across multiple evaluation criteria.
- Adds `akd.tools.relevancy.EnhancedRelevancyChecker` as a tool that uses the `MultiRubricRelevancyAgent`. This tool orchestrates multiple iterations and applies a weighted scoring model to produce a final relevancy score.

Summary 📝

This pull request introduces a multi-dimensional relevancy assessment framework. It replaces the previous simple relevancy check with a new `MultiRubricRelevancyAgent` that evaluates content against six distinct rubrics: Topic Alignment, Content Depth, Recency, Methodological Relevance, Evidence Quality, and Scope.

To make use of this agent, an `EnhancedRelevancyChecker` tool has been developed. It determines content relevancy by applying customizable weights to each rubric's score. It supports multiple iterations and ensembling to improve accuracy, and it reports a confidence score for its assessments. Factory functions have been added to create pre-configured settings for specific use cases such as "strict literature search" and "general content".

Details
New `MultiRubricRelevancyAgent`:
- Located in `akd/agents/relevancy.py`, this agent is designed to assess content against six specific dimensions.
- Adds input and output schemas (`MultiRubricRelevancyInputSchema`, `MultiRubricRelevancyOutputSchema`) and Enum labels for each assessment rubric.
- The factory function `create_multi_rubric_relevancy_agent` in `akd/agents/factory.py` configures the agent with a detailed system prompt that guides the LLM to act as an expert literature assessor.

New `EnhancedRelevancyChecker` Tool:
- Located in `akd/tools/relevancy.py`, this tool orchestrates the relevancy checking process.
- Adds configuration classes (`RubricWeights`, `RubricScoringConfig`) that allow fine-tuning the importance of each rubric and the numerical scores assigned to their labels.
- Supports multiple iterations (`n_iter`), optionally swapping the query and content for a more robust check.

Configuration Factories:
- Defined in `akd/tools/factory.py`:
  - `create_strict_literature_config_for_relevancy`: Provides a configuration with weights optimized for academic and scientific literature (e.g., high importance for methodology and evidence quality).
  - `create_general_content_config_for_relevancy`: Provides a configuration tailored for general-purpose content assessment (e.g., higher importance for topic and scope alignment).

Usage
This framework introduces a new way to assess the relevancy of content. The two main components are the `MultiRubricRelevancyAgent`, which performs the core evaluation, and the `EnhancedRelevancyChecker`, a tool that orchestrates the process and calculates a final score.

1. Standalone `MultiRubricRelevancyAgent`

You can use the agent directly if you only need the raw, multi-faceted assessment without a final weighted score.
First, create an instance of the agent using the factory function. This comes pre-configured with a detailed system prompt for high-quality literature assessment.
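As a rough picture of what the agent hands back, the sketch below models the result with plain dataclasses instead of calling akd. The field names come from this description; the `BinaryLabel` values, the `AssessmentSketch` class, and the sample reasoning strings are assumptions made for illustration, and the real `MultiRubricRelevancyOutputSchema` may well differ.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class BinaryLabel(str, Enum):
    """Hypothetical two-way label; the real agent defines Enum labels per rubric."""

    POSITIVE = "positive"
    NEGATIVE = "negative"


@dataclass
class AssessmentSketch:
    """Stand-in for the agent output; field names follow the PR description."""

    topic_alignment: BinaryLabel
    content_depth: BinaryLabel
    recency_relevance: BinaryLabel
    methodological_relevance: BinaryLabel
    evidence_quality: BinaryLabel
    scope_relevance: BinaryLabel
    overall_relevance: BinaryLabel
    reasoning_steps: List[str] = field(default_factory=list)


# Made-up assessment of a single document against a single query.
example = AssessmentSketch(
    topic_alignment=BinaryLabel.POSITIVE,
    content_depth=BinaryLabel.POSITIVE,
    recency_relevance=BinaryLabel.NEGATIVE,
    methodological_relevance=BinaryLabel.POSITIVE,
    evidence_quality=BinaryLabel.POSITIVE,
    scope_relevance=BinaryLabel.POSITIVE,
    overall_relevance=BinaryLabel.POSITIVE,
    reasoning_steps=[
        "The document addresses the query topic directly.",
        "The cited data predates the period the query asks about.",
    ],
)

# str-based Enum members serialize as their string values.
payload = json.dumps(asdict(example), indent=2)
print(payload)
```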
The output will be a JSON object containing assessments for `topic_alignment`, `content_depth`, `recency_relevance`, `methodological_relevance`, `evidence_quality`, and `scope_relevance`, along with an `overall_relevance` label and detailed `reasoning_steps`.

2. Using the `EnhancedRelevancyChecker`

The `EnhancedRelevancyChecker` is the recommended way to get a final, quantifiable relevancy score. It uses the `MultiRubricRelevancyAgent` internally and adds features like configurable weighting, multiple iterations, and confidence scoring.

a. Basic Usage with Pre-configured Setups
For convenience, factory functions are available to create configurations for common use cases. Here, we'll use the setup for a strict literature search.
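To make the scoring behind such a configuration concrete, here is a minimal self-contained sketch that does not call akd. It averages binary rubric scores over several iterations, applies per-rubric weights, compares the result to a threshold, and derives a confidence value. The weight values, the 0.7 threshold, the `combine_iterations` helper, and the confidence formula (one minus twice the standard deviation of the per-iteration scores) are all assumptions for illustration, not the actual behaviour of `EnhancedRelevancyChecker` or `create_strict_literature_config_for_relevancy`.

```python
from statistics import mean, pstdev

RUBRICS = (
    "topic_alignment",
    "content_depth",
    "recency_relevance",
    "methodological_relevance",
    "evidence_quality",
    "scope_relevance",
)

# Hypothetical "strict literature" weights: methodology and evidence dominate.
STRICT_WEIGHTS = {
    "topic_alignment": 0.20,
    "content_depth": 0.15,
    "recency_relevance": 0.10,
    "methodological_relevance": 0.25,
    "evidence_quality": 0.25,
    "scope_relevance": 0.05,
}


def weighted_score(rubric_scores, weights):
    """Weighted average of rubric scores, each expected to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in RUBRICS)
    return sum(weights[name] * rubric_scores[name] for name in RUBRICS) / total_weight


def combine_iterations(iterations, weights, threshold):
    """Average rubric scores across iterations and summarize the result."""
    rubric_scores = {name: mean(it[name] for it in iterations) for name in RUBRICS}
    per_iteration = [weighted_score(it, weights) for it in iterations]
    score = weighted_score(rubric_scores, weights)
    # Assumed confidence: scores in [0, 1] have a standard deviation of at
    # most 0.5, so this expression stays within [0, 1].
    confidence = max(0.0, 1.0 - 2.0 * pstdev(per_iteration))
    return {
        "score": score,
        "is_relevant": score >= threshold,
        "rubric_scores": rubric_scores,
        "confidence": confidence,
    }


# Two made-up iterations that disagree on content depth and scope.
iterations = [
    {"topic_alignment": 1.0, "content_depth": 1.0, "recency_relevance": 0.0,
     "methodological_relevance": 1.0, "evidence_quality": 1.0, "scope_relevance": 1.0},
    {"topic_alignment": 1.0, "content_depth": 0.0, "recency_relevance": 0.0,
     "methodological_relevance": 1.0, "evidence_quality": 1.0, "scope_relevance": 1.0},
]

result = combine_iterations(iterations, STRICT_WEIGHTS, threshold=0.7)
print(result)
```

In this sketch a disagreement between iterations lowers the confidence but leaves the averaged rubric scores intact, so the final score and the consistency signal stay separate.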
The output from the checker includes:
- `score`: A final weighted score between 0.0 and 1.0.
- `is_relevant`: A boolean indicating if the score met the configured threshold.
- `rubric_scores`: A dictionary of the averaged numerical scores for each rubric.
- `confidence`: A score from 0.0 to 1.0 indicating consistency across iterations.
- `reasoning_steps`: A combined list of all reasoning steps from all iterations.

b. Custom Configuration
You can create a custom configuration for domain-specific needs. For example, you might create a set of weights for a specific NASA SMD (Science Mission Directorate) field like "Earth Science," which prioritizes recency.
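The effect of such a recency-heavy weighting can be sketched without akd. In the example below, `HypotheticalRubricWeights` is a stand-in written for this illustration (it is not the real `RubricWeights` class, whose fields and constructor are not shown here), and every weight value is made up. The sketch scores the same outdated but otherwise strong document under an Earth Science style weighting and under a uniform one.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class HypotheticalRubricWeights:
    """Stand-in for a rubric weight configuration; not the akd class."""

    topic_alignment: float
    content_depth: float
    recency_relevance: float
    methodological_relevance: float
    evidence_quality: float
    scope_relevance: float

    def normalized(self):
        """Return the weights as a dict rescaled to sum to 1."""
        raw = asdict(self)
        total = sum(raw.values())
        if total <= 0:
            raise ValueError("at least one weight must be positive")
        return {name: value / total for name, value in raw.items()}


def weighted_score(rubric_scores, weights):
    """Weighted sum of rubric scores using already-normalized weights."""
    return sum(weights[name] * rubric_scores[name] for name in weights)


# Assumed Earth Science weighting: recency carries the largest share.
earth_science = HypotheticalRubricWeights(
    topic_alignment=0.25,
    content_depth=0.10,
    recency_relevance=0.30,
    methodological_relevance=0.15,
    evidence_quality=0.15,
    scope_relevance=0.05,
)

# Uniform weighting for comparison.
balanced = HypotheticalRubricWeights(1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

# A document that passes every rubric except recency.
outdated_document = {
    "topic_alignment": 1.0,
    "content_depth": 1.0,
    "recency_relevance": 0.0,
    "methodological_relevance": 1.0,
    "evidence_quality": 1.0,
    "scope_relevance": 1.0,
}

earth_score = weighted_score(outdated_document, earth_science.normalized())
balanced_score = weighted_score(outdated_document, balanced.normalized())
print(f"earth science: {earth_score:.3f}, balanced: {balanced_score:.3f}")
```

Under these assumed numbers the outdated document loses more of its score with the recency-heavy weights than with uniform ones, which is the behaviour a recency-focused configuration is meant to produce.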
This is in alignment with #11