@rhlbhatnagar (Contributor) commented Nov 5, 2025

No description provided.

@dosubot (bot) added the size:XL label (this PR changes 500-999 lines, ignoring generated files) Nov 5, 2025

@anistark (Member) left a comment

The ascore implementation currently runs the two verification steps sequentially:

  • Verify response claims against reference
  • Verify reference claims against response

Neither step appears to depend on the other's result, so we can probably run them in parallel. Thoughts?
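
A minimal sketch of what running the two checks concurrently could look like, using asyncio.gather. The names verify_claims and verify_both_directions are hypothetical stand-ins, not the PR's actual methods, and the substring check only simulates an LLM-backed verification call so the example runs on its own:

```python
import asyncio
from typing import List, Tuple


async def verify_claims(premise: str, claims: List[str]) -> List[bool]:
    """Hypothetical stand-in for the metric's LLM-backed verification call."""
    await asyncio.sleep(0.01)  # simulates the latency of an LLM request
    premise_lower = premise.lower()
    # Placeholder logic: a claim counts as supported if it appears verbatim.
    return [claim.lower() in premise_lower for claim in claims]


async def verify_both_directions(
    response: str,
    reference: str,
    response_claims: List[str],
    reference_claims: List[str],
) -> Tuple[List[bool], List[bool]]:
    # The two checks share no state, so they can be awaited together
    # instead of one after the other.
    response_verdicts, reference_verdicts = await asyncio.gather(
        verify_claims(reference, response_claims),  # response claims vs. reference
        verify_claims(response, reference_claims),  # reference claims vs. response
    )
    return response_verdicts, reference_verdicts
```

With two independent LLM calls of similar latency, this should bring the wall-clock time of the verification stage close to that of a single call, assuming the underlying client does not serialize requests.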

statements: List[StatementFaithfulnessAnswer]


def claim_decomposition_prompt(
Member

Do we need this function here, or can we move it somewhere common, e.g. ragas.prompt.metrics.common?


# Ensure implementations give reasonably similar scores
# Factual correctness may have more variation due to claim decomposition and different LLM behavior
assert score_diff < 0.35, (
Member

A 35% tolerance seems too high, no? Is that intended? Could it be lowered to 10-15%?


return MetricResult(value=float(np.round(score, 2)))

async def _decompose_claims(self, response: str) -> List[str]:
Member

Callbacks are missing in this method and in _verify_claims.

Is that intended? Callbacks would help with analysis and tracing.

@rhlbhatnagar marked this pull request as draft November 5, 2025 07:27