Add an enhanced multi-rubric relevancy agent and checker #49

Merged
NISH1001 merged 1 commit into develop from feature/enhanced-relevancy on Jun 20, 2025

@NISH1001
Collaborator

  • Add akd.agents.relevancy.MultiRubricRelevancyAgent which performs a series of binary classifications across multiple evaluation criteria.
  • Add akd.tools.relevancy.EnhancedRelevancyChecker as a tool that utilizes the MultiRubricRelevancyAgent. This tool orchestrates multiple iterations and applies a weighted scoring model to produce a final, nuanced relevancy score.

Summary 📝

This pull request introduces a multi-rubric relevancy assessment framework. It replaces the previous simple relevancy check with a new MultiRubricRelevancyAgent that evaluates content against six distinct rubrics: Topic Alignment, Content Depth, Recency, Methodological Relevance, Evidence Quality, and Scope.

To use this new agent, an EnhancedRelevancyChecker tool has been developed. The tool determines content relevancy by applying customizable weights to each rubric's score. It supports multiple iterations and ensembling to improve accuracy, and it reports a confidence score for its assessments. Factory functions have been added to create pre-configured settings for specific use cases such as "strict literature search" and "general content," so common setups need no manual weight tuning.

Details

  1. New MultiRubricRelevancyAgent:

    • Located in akd/agents/relevancy.py, this agent is designed to assess content against six specific dimensions.
    • Introduces new Pydantic schemas (MultiRubricRelevancyInputSchema, MultiRubricRelevancyOutputSchema) and Enum labels for each assessment rubric.
    • A factory function, create_multi_rubric_relevancy_agent in akd/agents/factory.py, configures the agent with a detailed system prompt that guides the LLM to act as an expert literature assessor.
  2. New EnhancedRelevancyChecker Tool:

    • Located in akd/tools/relevancy.py, this tool orchestrates the relevancy checking process.
    • It uses new configuration models (RubricWeights, RubricScoringConfig) that allow for fine-tuning the importance of each rubric and the numerical scores assigned to their labels.
    • The tool runs the agent for a configurable number of iterations (n_iter), optionally swapping the query and content for a more robust check.
    • It ensembles the results to calculate a final weighted score, a confidence level based on the consistency of the agent's responses, and individual scores for each rubric.
  3. Configuration Factories:

    • Two new factory functions have been added in akd/tools/factory.py:
      • create_strict_literature_config_for_relevancy: Provides a configuration with weights optimized for academic and scientific literature (e.g., high importance for methodology and evidence quality).
      • create_general_content_config_for_relevancy: Provides a configuration tailored for general-purpose content assessment (e.g., higher importance for topic and scope alignment).
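
The scoring flow described above (binary rubric labels, per-rubric weights, several iterations, a consistency-based confidence) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in akd/tools/relevancy.py: the weight values, the `True -> 1.0` label mapping, the majority-agreement confidence, and the helper names `weighted_score` and `ensemble` are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical rubric weights for illustration; not the defaults shipped in akd.
WEIGHTS = {
    "topic_alignment": 0.30,
    "content_depth": 0.20,
    "recency_relevance": 0.20,
    "methodological_relevance": 0.10,
    "evidence_quality": 0.15,
    "scope_relevance": 0.05,
}


def weighted_score(labels: dict[str, bool], weights: dict[str, float]) -> float:
    """Weighted average of binary rubric outcomes (True -> 1.0, False -> 0.0)."""
    total = sum(weights.values())
    return sum(weights[name] * float(labels[name]) for name in weights) / total


def ensemble(runs: list[dict[str, bool]], weights: dict[str, float], threshold: float) -> dict:
    """Combine several agent runs into a score, per-rubric averages, and a confidence."""
    # Average each rubric's numeric score across iterations.
    rubric_scores = {name: mean(float(run[name]) for run in runs) for name in weights}
    # Final score: mean of the per-iteration weighted scores.
    score = mean(weighted_score(run, weights) for run in runs)
    # Assumed confidence measure: for each rubric, the share of runs that agree
    # with the majority label, averaged over all rubrics.
    agreement = [
        Counter(run[name] for run in runs).most_common(1)[0][1] / len(runs)
        for name in weights
    ]
    return {
        "score": score,
        "is_relevant": score >= threshold,
        "rubric_scores": rubric_scores,
        "confidence": mean(agreement),
    }


# Three simulated iterations: recency always fails, content depth fails once.
base = {name: True for name in WEIGHTS} | {"recency_relevance": False}
runs = [base, dict(base), base | {"content_depth": False}]
result = ensemble(runs, WEIGHTS, threshold=0.7)
print(result)
```

Dividing by the weight total keeps the score in the 0.0 to 1.0 range even if the weights do not sum exactly to one. The real tool may define confidence differently.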

Usage

The framework has two main components: the MultiRubricRelevancyAgent, which performs the core evaluation, and the EnhancedRelevancyChecker, a tool that orchestrates the process and calculates a final score.

1. Standalone MultiRubricRelevancyAgent

You can use the agent directly if you only need the raw, multi-faceted assessment without a final weighted score.

First, create an instance of the agent using the factory function. This comes pre-configured with a detailed system prompt for high-quality literature assessment.

from akd.agents.factory import create_multi_rubric_relevancy_agent
from akd.agents.relevancy import MultiRubricRelevancyInputSchema

# Create the agent
relevancy_agent = create_multi_rubric_relevancy_agent()

# Define the query and content to check
query = "landslide nepal 2025"
content = """A study on the Gorkha, Nepal (2015) earthquake and its relation to landslides, using deep learning models for detection."""

# Run the agent
result = await relevancy_agent.arun(
    MultiRubricRelevancyInputSchema(query=query, content=content)
)

# The output contains separate judgments for each rubric
print(result.model_dump_json(indent=2))

The output will be a JSON object containing assessments for topic_alignment, content_depth, recency_relevance, methodological_relevance, evidence_quality, and scope_relevance, along with an overall_relevance label and detailed reasoning_steps.
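
The usage snippets in this description call `await` at the top level, which works in a notebook or an async REPL but not in a plain script. In a script, the call can be wrapped in a coroutine and driven with `asyncio.run`. The sketch below shows only that wrapping pattern: `StubAgent` and `StubInput` are hypothetical stand-ins for the real agent and MultiRubricRelevancyInputSchema, so it runs without akd or an LLM backend.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubInput:
    """Hypothetical stand-in for MultiRubricRelevancyInputSchema."""

    query: str
    content: str


class StubAgent:
    """Hypothetical stand-in for the real agent; it makes no LLM call."""

    async def arun(self, params: StubInput) -> dict:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {"query": params.query, "overall_relevance": "placeholder"}


async def main() -> dict:
    agent = StubAgent()
    # With akd installed, this is where relevancy_agent.arun(...) would go.
    return await agent.arun(StubInput(query="landslide nepal 2025", content="..."))


if __name__ == "__main__":
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    print(asyncio.run(main()))
```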

2. Using the EnhancedRelevancyChecker

The EnhancedRelevancyChecker is the recommended way to get a final, quantifiable relevancy score. It uses the MultiRubricRelevancyAgent internally and adds features like configurable weighting, multiple iterations, and confidence scoring.

a. Basic Usage with Pre-configured Setups

For convenience, factory functions are available to create configurations for common use cases. Here, we'll use the setup for a strict literature search.

from akd.tools.factory import create_strict_literature_config_for_relevancy
from akd.tools.relevancy import EnhancedRelevancyChecker
from akd.agents.relevancy import MultiRubricRelevancyInputSchema

# Create a configuration optimized for strict literature search.
# This sets higher weights for methodology and evidence, a high threshold (0.7),
# and runs for 3 iterations to ensure confidence.
strict_config = create_strict_literature_config_for_relevancy(n_iter=3, debug=True)

# Instantiate the checker with the configuration
checker = EnhancedRelevancyChecker(config=strict_config)

# Run the check
result = await checker.arun(
    MultiRubricRelevancyInputSchema(
        query="landslide nepal 2025",
        content="""A study on the Gorkha, Nepal (2015) earthquake and its relation to landslides, using deep learning models for detection.""",
        domain_context="Academic literature search for systematic review"
    )
)

# The output includes a final score, rubric breakdown, and confidence
print(result.model_dump_json(indent=2))

The output from the checker includes:

  • score: A final weighted score between 0.0 and 1.0.
  • is_relevant: A boolean indicating if the score met the configured threshold.
  • rubric_scores: A dictionary of the averaged numerical scores for each rubric.
  • confidence: A score from 0.0 to 1.0 indicating consistency across iterations.
  • reasoning_steps: A combined list of all reasoning steps from all iterations.

b. Custom Configuration

You can create a custom configuration for domain-specific needs. For example, you might create a set of weights for a specific NASA SMD (Science Mission Directorate) field like "Earth Science," which prioritizes recency.

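from akd.tools.relevancy import EnhancedRelevancyChecker, EnhancedRelevancyCheckerConfig, RubricWeights
from akd.agents.factory import create_multi_rubric_relevancy_agent
from akd.agents.relevancy import MultiRubricRelevancyInputSchema  # needed for the arun() input below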

# Define custom weights for Earth Science
earth_science_weights = RubricWeights(
    topic_alignment=0.30,
    content_depth=0.20,
    recency_relevance=0.20,  # Increased weight
    methodological_relevance=0.10,
    evidence_quality=0.15,
    scope_relevance=0.05,
)

# Create a custom configuration
custom_config = EnhancedRelevancyCheckerConfig(
    rubric_weights=earth_science_weights,
    relevance_threshold=0.6,
    n_iter=2,
    agent=create_multi_rubric_relevancy_agent() # Important to pass the agent
)

# Instantiate and run the checker
custom_checker = EnhancedRelevancyChecker(config=custom_config)
result = await custom_checker.arun(...) # as above
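
The custom weights above sum to 1.0. This description does not say whether RubricWeights validates or normalizes its values, so a small pre-check can catch mistakes before a config is built. The helper below is a hypothetical sketch that works on a plain dict and is not part of akd.

```python
import math


def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Return a copy of the weights rescaled so that they sum to 1.0."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("rubric weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one rubric weight must be positive")
    return {name: w / total for name, w in weights.items()}


# The Earth Science weights from the example above, as a plain dict.
earth_science = {
    "topic_alignment": 0.30,
    "content_depth": 0.20,
    "recency_relevance": 0.20,
    "methodological_relevance": 0.10,
    "evidence_quality": 0.15,
    "scope_relevance": 0.05,
}

# isclose tolerates floating-point rounding in the sum.
print(math.isclose(sum(earth_science.values()), 1.0))
print(normalize_weights({"topic_alignment": 2, "scope_relevance": 6}))
```

The normalized values could then be passed as keyword arguments to RubricWeights, assuming its field names match the dictionary keys.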

This is in alignment with #11

- Add `akd.agents.relevancy.MultiRubricRelevancyAgent` which performs a
  series of binary classifications, one per rubric
- Add `akd.tools.relevancy.EnhancedRelevancyChecker` as a tool that uses
  `MultiRubricRelevancyAgent` under the hood and runs multiple
  iterations with weighted scoring to produce the final scores
@NISH1001 NISH1001 merged commit 7c4979d into develop Jun 20, 2025
@NISH1001 NISH1001 deleted the feature/enhanced-relevancy branch June 20, 2025 14:59