Add an enhanced multi-rubric relevancy agent and checker by NISH1001 · Pull Request #49 · NASA-IMPACT/akd-core

NISH1001 · 2025-06-20T14:57:02Z

Add akd.agents.relevancy.MultiRubricRelevancyAgent which performs a series of binary classifications across multiple evaluation criteria.
Add akd.tools.relevancy.EnhancedRelevancyChecker as a tool that utilizes the MultiRubricRelevancyAgent. This tool orchestrates multiple iterations and applies a weighted scoring model to produce a final, nuanced relevancy score.

Summary 📝

This pull request introduces a sophisticated, multi-dimensional relevancy assessment framework. It replaces the previous simple relevancy check with a new MultiRubricRelevancyAgent that evaluates content against six distinct rubrics: Topic Alignment, Content Depth, Recency, Methodological Relevance, Evidence Quality, and Scope.

To leverage this new agent, an EnhancedRelevancyChecker tool has been developed. This tool provides a configurable and robust method for determining content relevancy by applying customizable weights to each rubric's score. It supports multiple iterations and ensembling to improve accuracy and provides a confidence score for its assessments. Factory functions have been added to create pre-configured settings for specific use cases like "strict literature search" and "general content," making the new system both powerful and easy to use.

Details

- New `MultiRubricRelevancyAgent`:
  - Located in `akd/agents/relevancy.py`, this agent is designed to assess content against six specific dimensions.
  - Introduces new Pydantic schemas (`MultiRubricRelevancyInputSchema`, `MultiRubricRelevancyOutputSchema`) and Enum labels for each assessment rubric.
  - A factory function, `create_multi_rubric_relevancy_agent` in `akd/agents/factory.py`, configures the agent with a detailed system prompt that guides the LLM to act as an expert literature assessor.
- New `EnhancedRelevancyChecker` tool:
  - Located in `akd/tools/relevancy.py`, this tool orchestrates the relevancy checking process.
  - It uses new configuration models (`RubricWeights`, `RubricScoringConfig`) that allow for fine-tuning the importance of each rubric and the numerical scores assigned to their labels.
  - The tool runs the agent for a configurable number of iterations (`n_iter`), optionally swapping the query and content for a more robust check.
  - It ensembles the results to calculate a final weighted score, a confidence level based on the consistency of the agent's responses, and individual scores for each rubric.
- Configuration factories:
  - Two new factory functions have been added in `akd/tools/factory.py`:
    - `create_strict_literature_config_for_relevancy`: Provides a configuration with weights optimized for academic and scientific literature (e.g., high importance for methodology and evidence quality).
    - `create_general_content_config_for_relevancy`: Provides a configuration tailored for general-purpose content assessment (e.g., higher importance for topic and scope alignment).
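The weighted scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `akd/tools/relevancy.py`: the function name `weighted_relevancy_score`, the plain-dict representation, the normalisation by the weight sum, and the example numbers are all assumptions made for illustration. Only the six rubric names come from this PR.

```python
# Hypothetical sketch: names and numbers are illustrative, not akd's actual API.
RUBRICS = (
    "topic_alignment",
    "content_depth",
    "recency_relevance",
    "methodological_relevance",
    "evidence_quality",
    "scope_relevance",
)


def weighted_relevancy_score(
    rubric_scores: dict[str, float],
    weights: dict[str, float],
) -> float:
    """Combine per-rubric scores in [0, 1] into a single score in [0, 1].

    Weights are normalised by their sum, so they need not add up to 1.
    """
    total_weight = sum(weights[name] for name in RUBRICS)
    if total_weight <= 0:
        raise ValueError("at least one rubric weight must be positive")
    weighted_sum = sum(rubric_scores[name] * weights[name] for name in RUBRICS)
    return weighted_sum / total_weight


# Assumed mapping: each binary rubric label becomes 1.0 (pass) or 0.0 (fail).
example_scores = {
    "topic_alignment": 1.0,
    "content_depth": 1.0,
    "recency_relevance": 0.0,
    "methodological_relevance": 1.0,
    "evidence_quality": 1.0,
    "scope_relevance": 0.0,
}
example_weights = {
    "topic_alignment": 0.30,
    "content_depth": 0.20,
    "recency_relevance": 0.20,
    "methodological_relevance": 0.10,
    "evidence_quality": 0.15,
    "scope_relevance": 0.05,
}
example_score = weighted_relevancy_score(example_scores, example_weights)
print(round(example_score, 2))  # 0.75
```

With these example numbers, the two failed rubrics (recency and scope) carry a combined weight of 0.25, so the score lands at 0.75. Shifting weight toward a rubric makes a failure on that rubric cost more.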

Usage

This new framework introduces a powerful way to assess the relevancy of content. The two main components are the MultiRubricRelevancyAgent, which performs the core evaluation, and the EnhancedRelevancyChecker, a tool that orchestrates the process and calculates a final score.

1. Standalone `MultiRubricRelevancyAgent`

You can use the agent directly if you only need the raw, multi-faceted assessment without a final weighted score.

First, create an instance of the agent using the factory function. This comes pre-configured with a detailed system prompt for high-quality literature assessment.

```python
from akd.agents.factory import create_multi_rubric_relevancy_agent
from akd.agents.relevancy import MultiRubricRelevancyInputSchema

# Create the agent
relevancy_agent = create_multi_rubric_relevancy_agent()

# Define the query and content to check
query = "landslide nepal 2025"
content = """A study on the Gorkha, Nepal (2015) earthquake and its relation to landslides, using deep learning models for detection."""

# Run the agent.
# Note: top-level `await` works in a notebook or async REPL;
# in a plain script, make this call from inside an async function.
result = await relevancy_agent.arun(
    MultiRubricRelevancyInputSchema(query=query, content=content)
)

# The output contains separate judgments for each rubric
print(result.model_dump_json(indent=2))
```

The output will be a JSON object containing assessments for topic_alignment, content_depth, recency_relevance, methodological_relevance, evidence_quality, and scope_relevance, along with an overall_relevance label and detailed reasoning_steps.
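The usage snippets in this description call `await` at the top level, which works in a notebook or an async REPL but not in a plain script. A minimal sketch of the usual standard-library pattern is shown here. The `assess` coroutine is a hypothetical stand-in for an async call such as `relevancy_agent.arun(...)`, so the example runs without `akd` installed; its return value is invented for illustration and is not the agent's output format.

```python
import asyncio


async def assess(query: str, content: str) -> dict:
    """Hypothetical stand-in for an async call such as `relevancy_agent.arun(...)`."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return {"query": query, "content_length": len(content)}


async def main() -> dict:
    # Inside a coroutine, `await` is valid syntax.
    return await assess(
        "landslide nepal 2025",
        "A study on the Gorkha, Nepal (2015) earthquake.",
    )


# asyncio.run() creates an event loop, runs main() to completion, and closes the loop.
outcome = asyncio.run(main())
print(outcome)
```

In a script that uses the real agent, the body of `main` would hold the agent creation and the `arun` call instead of the stand-in.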

2. Using the `EnhancedRelevancyChecker`

The EnhancedRelevancyChecker is the recommended way to get a final, quantifiable relevancy score. It uses the MultiRubricRelevancyAgent internally and adds features like configurable weighting, multiple iterations, and confidence scoring.

a. Basic Usage with Pre-configured Setups

For convenience, factory functions are available to create configurations for common use cases. Here, we'll use the setup for a strict literature search.

```python
from akd.tools.factory import create_strict_literature_config_for_relevancy
from akd.tools.relevancy import EnhancedRelevancyChecker
from akd.agents.relevancy import MultiRubricRelevancyInputSchema

# Create a configuration optimized for strict literature search.
# This sets higher weights for methodology and evidence, a high threshold (0.7),
# and runs for 3 iterations to ensure confidence.
strict_config = create_strict_literature_config_for_relevancy(n_iter=3, debug=True)

# Instantiate the checker with the configuration
checker = EnhancedRelevancyChecker(config=strict_config)

# Run the check
result = await checker.arun(
    MultiRubricRelevancyInputSchema(
        query="landslide nepal 2025",
        content="""A study on the Gorkha, Nepal (2015) earthquake and its relation to landslides, using deep learning models for detection.""",
        domain_context="Academic literature search for systematic review",
    )
)

# The output includes a final score, rubric breakdown, and confidence
print(result.model_dump_json(indent=2))
```

The output from the checker includes:

- `score`: A final weighted score between 0.0 and 1.0.
- `is_relevant`: A boolean indicating if the score met the configured threshold.
- `rubric_scores`: A dictionary of the averaged numerical scores for each rubric.
- `confidence`: A score from 0.0 to 1.0 indicating consistency across iterations.
- `reasoning_steps`: A combined list of all reasoning steps from all iterations.
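To make the relationship between `score`, `is_relevant`, and `confidence` concrete, here is one way the ensembling across iterations could work. This is a sketch under stated assumptions, not the checker's actual implementation: the function name `ensemble_iterations` is hypothetical, and defining confidence as the share of iterations that agree with the majority verdict is an assumption. The PR only states that confidence reflects consistency across iterations.

```python
from statistics import mean


def ensemble_iterations(iteration_scores: list[float], threshold: float) -> dict:
    """Aggregate per-iteration weighted scores into a single result.

    Assumed definition: confidence is the fraction of iterations whose
    relevant / not-relevant verdict matches the majority verdict.
    """
    if not iteration_scores:
        raise ValueError("need at least one iteration")
    score = mean(iteration_scores)
    # Per-iteration verdicts against the same threshold.
    verdicts = [s >= threshold for s in iteration_scores]
    # Ties count as a "relevant" majority in this sketch.
    majority = sum(verdicts) * 2 >= len(verdicts)
    agreement = sum(v == majority for v in verdicts) / len(verdicts)
    return {
        "score": score,
        "is_relevant": score >= threshold,
        "confidence": agreement,
    }


# Three hypothetical iterations checked against a 0.7 threshold.
summary = ensemble_iterations([0.8, 0.75, 0.6], threshold=0.7)
print(summary)
```

Here two of the three iterations clear the threshold, so the averaged score passes while the confidence is two thirds. A low confidence next to a passing score is a signal that the verdict is borderline.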

b. Custom Configuration

You can create a custom configuration for domain-specific needs. For example, you might create a set of weights for a specific NASA SMD (Science Mission Directorate) field like "Earth Science," which prioritizes recency.

```python
from akd.tools.relevancy import (
    EnhancedRelevancyChecker,
    EnhancedRelevancyCheckerConfig,
    RubricWeights,
)
from akd.agents.factory import create_multi_rubric_relevancy_agent

# Define custom weights for Earth Science
earth_science_weights = RubricWeights(
    topic_alignment=0.30,
    content_depth=0.20,
    recency_relevance=0.20,  # Increased weight
    methodological_relevance=0.10,
    evidence_quality=0.15,
    scope_relevance=0.05,
)

# Create a custom configuration
custom_config = EnhancedRelevancyCheckerConfig(
    rubric_weights=earth_science_weights,
    relevance_threshold=0.6,
    n_iter=2,
    agent=create_multi_rubric_relevancy_agent(),  # Important to pass the agent
)

# Instantiate and run the checker
custom_checker = EnhancedRelevancyChecker(config=custom_config)
result = await custom_checker.arun(...)  # as above
```

This is in alignment with #11

- Add `akd.agents.relevancy.MultiRubricRelevancyAgent`, which performs a series of binary classifications.
- Add `akd.tools.relevancy.EnhancedRelevancyChecker`, a tool that uses `MultiRubricRelevancyAgent` under the hood and runs multiple scoring steps and iterations to produce final scores.

NISH1001 merged commit 7c4979d into develop Jun 20, 2025

NISH1001 deleted the feature/enhanced-relevancy branch June 20, 2025 14:59
