feat: accuracy tag uses stored data in retries #437
Merged
This pull request adds support for using custom accuracy data sets in Behave scenarios via new `@accuracy_data_<suffix>` tags, allowing more flexible and data-driven AI scenario testing. It also updates the documentation and tests to reflect these new capabilities, and improves how accuracy and retry values are determined when accuracy data is present.

New Behave accuracy data tag and usage:
- Adds `@accuracy_data_<suffix>` tags in Behave scenarios, so a scenario can use a different set of accuracy data on each retry. The suffix references a specific data set stored in the context. [1], [2], [3], [4]

Documentation updates:
- Updates `docs/ai_utils.rst` to document the new `@accuracy_data` tag, its format, usage examples, and how to store and access accuracy data in Behave step definitions. [1], [2]

Enhancements to accuracy utilities:
- Updates `get_accuracy_and_retries_from_tags` to use the length of the accuracy data (if present) as the default number of retries, keeping the retry count aligned with the stored data. Adds helper functions to extract accuracy data suffixes and to fetch and store retry data. [1], [2], [3], [4]

Test improvements:
Other minor changes:
- Version set to 3.6.0.dev2. (`VERSION` L1-R1)
- `environment.py` updated for consistency. (`toolium/behave/environment.py` L90-R96)
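
The retry behavior described above can be illustrated with a small standalone sketch. It is not the code from this pull request: the function names, the plain dict standing in for data stored in the Behave context, and the fallback of 10 retries are all assumptions made for illustration. It only shows the idea of reading the suffix from an `accuracy_data_<suffix>` tag and using the length of the matching data set as the default number of retries.

```python
import re

# Assumption: tags are matched without the leading "@", which is how Behave
# exposes scenario tags. The pattern itself is hypothetical.
ACCURACY_DATA_TAG = re.compile(r"^accuracy_data_(\w+)$")


def find_accuracy_data_suffix(tags):
    """Return the suffix of the first accuracy_data_<suffix> tag, or None."""
    for tag in tags:
        match = ACCURACY_DATA_TAG.match(tag)
        if match:
            return match.group(1)
    return None


def resolve_retries(tags, stored_data, default_retries=10):
    """Use the data set length as the retry count when a data set is present.

    stored_data is a plain dict (suffix -> list of entries) standing in for
    whatever the real step definitions store in the context.
    default_retries=10 is an illustrative value, not a documented default.
    """
    suffix = find_accuracy_data_suffix(tags)
    data = stored_data.get(suffix) if suffix is not None else None
    return len(data) if data else default_retries


def data_for_retry(stored_data, suffix, retry_index):
    """Pick the data entry for a given retry (0-based index)."""
    return stored_data[suffix][retry_index]


# Usage with made-up data: three entries give three retries.
stored = {"greetings": [{"question": "hi"}, {"question": "hello"}, {"question": "hey"}]}
print(resolve_retries(["accuracy_data_greetings"], stored))  # 3
print(resolve_retries(["smoke"], stored))                    # 10 (no data tag)
print(data_for_retry(stored, "greetings", 1))                # {'question': 'hello'}
```

Deriving the retry count from the data means each retry has exactly one entry to consume, which is the alignment the description refers to. The actual lookup rules and defaults are documented in `docs/ai_utils.rst`.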