Description
What would you like to happen?
Issue Type: Improvement / Feature Request
Priority: Major
Summary: Currently, the Golden Data Set validation in Pipeline Unit Tests performs a strict comparison: the output rows must exactly match the values in the dataset. If a single character differs (or if the sort order changes), the test fails.
This rigidity makes it difficult to test pipelines that generate dynamic data, such as:
Timestamps/Dates (e.g., sysdate or execution_date).
Generated IDs (e.g., UUIDs or sequence numbers).
Variable outputs (e.g., live API responses).
Proposed Solution: Enhance the Golden Data Set configuration to allow Validation Rules (Assertions) per field, rather than just static values.
Instead of only checking Value == Expected, allow the user to define rules such as:
Not Null: Pass if the field has any value.
Data Type Check: Pass if the value is a specific type (e.g., String, Integer).
Numeric Ranges: Pass if Value > X, Value < Y, or both (X < Value < Y).
String Length: Pass if Length > X.
Regex Matching: Pass if the string matches a pattern (e.g., ^[0-9]{3}-[0-9]{2}$).
Contains: Pass if the string contains a specific substring.
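To illustrate the rules listed above, here is a minimal Python sketch of how per-field assertions might be evaluated against an output row. Everything in it is an assumption for illustration: the rule names (`not_null`, `is_type`, `in_range`, `longer_than`, `matches`, `contains`), the `validate_row` helper, and the sample field names are hypothetical and do not reflect the project's actual API or configuration format.

```python
import re
from typing import Any, Callable, Dict, List, Optional

# Hypothetical model: a rule is a predicate that returns True when the value passes.
Rule = Callable[[Any], bool]


def not_null() -> Rule:
    """Pass if the field has any value."""
    return lambda v: v is not None


def is_type(expected: type) -> Rule:
    """Pass if the value is an instance of the expected type."""
    return lambda v: isinstance(v, expected)


def in_range(low: Optional[float] = None, high: Optional[float] = None) -> Rule:
    """Pass if the value is numeric and low < value < high (each bound optional)."""
    def check(v: Any) -> bool:
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            return False
        if low is not None and not v > low:
            return False
        if high is not None and not v < high:
            return False
        return True
    return check


def longer_than(n: int) -> Rule:
    """Pass if the value is a string with length > n."""
    return lambda v: isinstance(v, str) and len(v) > n


def matches(pattern: str) -> Rule:
    """Pass if the whole string matches the regular expression."""
    compiled = re.compile(pattern)
    return lambda v: isinstance(v, str) and compiled.fullmatch(v) is not None


def contains(substring: str) -> Rule:
    """Pass if the value is a string containing the substring."""
    return lambda v: isinstance(v, str) and substring in v


def validate_row(row: Dict[str, Any], rules: Dict[str, List[Rule]]) -> List[str]:
    """Apply every rule to its field and return a list of failure messages."""
    failures = []
    for field, field_rules in rules.items():
        value = row.get(field)  # a missing field is treated as None
        for index, rule in enumerate(field_rules):
            if not rule(value):
                failures.append(f"{field}: rule #{index} failed for value {value!r}")
    return failures


# Hypothetical golden-data rules: one list of assertions per field.
rules = {
    "order_id": [not_null(), is_type(str), matches(r"^[0-9]{3}-[0-9]{2}$")],
    "amount": [is_type(int), in_range(low=0, high=1000)],
    "status": [contains("OK"), longer_than(1)],
}

row = {"order_id": "123-45", "amount": 250, "status": "OK: shipped"}
failures = validate_row(row, rules)  # an empty list means the row passes
```

Modelling each rule as a plain predicate keeps the strict `Value == Expected` check as just one more rule, so static and dynamic fields could coexist in the same data set.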
Benefit: This would make Unit Tests usable for real-world scenarios where data is dynamic. Developers could validate the structure and integrity of the data without needing to know the exact runtime values beforehand.
Issue Priority
Priority: 3
Issue Component
Component: Pipelines