Member

@rgonalo rgonalo commented Nov 11, 2025

This pull request adds support for using custom accuracy data sets in Behave scenarios via new @accuracy_data_<suffix> tags, allowing more flexible and data-driven AI scenario testing. It also updates the documentation and tests to reflect these new capabilities, and improves how accuracy and retry values are determined when accuracy data is present.

New Behave accuracy data tag and usage:

  • Added support for @accuracy_data_<suffix> tags in Behave scenarios, enabling scenarios to use different sets of accuracy data per retry. The suffix allows referencing specific data sets stored in the context.
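
As a rough illustration of the tag format described above, the suffix extraction could look like the sketch below. This is not the code from this pull request: the function name extract_accuracy_data_suffix, the regular expression, and the treatment of a bare accuracy_data tag as an empty suffix are all assumptions made for the example. It also assumes tags arrive without the leading "@", as Behave exposes them in scenario.tags.

```python
import re

# Hypothetical pattern: "accuracy_data" followed by an optional "_<suffix>".
ACCURACY_DATA_TAG = re.compile(r"^accuracy_data(?:_(?P<suffix>\w+))?$")


def extract_accuracy_data_suffix(tags):
    """Return the suffix of the first accuracy data tag found in `tags`.

    Returns "" when the tag has no suffix and None when no accuracy
    data tag is present at all (both conventions are assumptions).
    """
    for tag in tags:
        match = ACCURACY_DATA_TAG.match(tag)
        if match:
            return match.group("suffix") or ""
    return None


if __name__ == "__main__":
    print(extract_accuracy_data_suffix(["smoke", "accuracy_data_greetings"]))
```

Returning None for "no tag" and "" for "tag without suffix" keeps the two cases distinguishable, which matters if a default data set is looked up when the suffix is omitted.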

Documentation updates:

  • Expanded docs/ai_utils.rst to document the new @accuracy_data tag, its format, usage examples, and how to store and access accuracy data in Behave step definitions.

Enhancements to accuracy utilities:

  • Updated get_accuracy_and_retries_from_tags to use the length of the accuracy data (if present) as the default number of retries, so that by default the number of retries matches the number of data sets. Added helper functions to extract accuracy data suffixes and to fetch and store retry data.
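
The default-retries rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual body of get_accuracy_and_retries_from_tags: the helper names resolve_retries and data_for_retry, the fallback value of 1, and the assumption that an explicitly requested retry count takes precedence over the data length are all hypothetical.

```python
def resolve_retries(explicit_retries, accuracy_data, fallback=1):
    """Pick the number of retries for a scenario.

    Assumed precedence: an explicit value wins, then the length of the
    accuracy data, then a fallback default.
    """
    if explicit_retries is not None:
        return explicit_retries
    if accuracy_data:
        return len(accuracy_data)
    return fallback


def data_for_retry(accuracy_data, retry_index):
    """Return the data set for a zero-based retry index, or None if out of range."""
    if 0 <= retry_index < len(accuracy_data):
        return accuracy_data[retry_index]
    return None


if __name__ == "__main__":
    data = [{"question": "a"}, {"question": "b"}, {"question": "c"}]
    retries = resolve_retries(None, data)
    for i in range(retries):
        print(i, data_for_retry(data, i))
```

With three data sets and no explicit retry count, the scenario would run three times, once per data set.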

Test improvements:

  • Added comprehensive tests for the new accuracy data tag extraction, data retrieval, and retry data storage logic, ensuring correct behavior with various tag and data combinations.

Other minor changes:

Contributor

@robertomier robertomier left a comment

lgtm

@rgonalo rgonalo merged commit bc5ae5c into master Nov 17, 2025
17 checks passed
@rgonalo rgonalo deleted the feat/accuracy_using_data branch November 17, 2025 08:00
7 participants