Inspired by #331 (comment)
Here's an example with a rule that relies on dynamically selecting all boolean columns.
If there are no group rules, this works as expected.
Adding a group rule can make validation fail unexpectedly. I have not debugged this in detail, but I think the reason is that we chain with_columns here in a way that makes the rule expression "see" the result of the group rule evaluation.
MWE:
import polars as pl
import dataframely as dy


class MySchema(dy.Schema):
    x = dy.Bool()

    @dy.rule()
    def myrule(cls):
        return ~pl.any_horizontal(pl.col(pl.Boolean))


class MySchemaWithGroupRule(MySchema):
    @dy.rule(group_by="x")
    def my_group_rule(cls):
        return pl.lit(True)


df = pl.DataFrame({"x": [False]})

# Succeeds
MySchema.validate(df)

# Fails
MySchemaWithGroupRule.validate(df)
Result:
  File "/Users/x/repos/dataframely/test.py", line 26, in <module>
    MySchemaWithGroupRule.validate(df)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^
  File "/Users/x/repos/dataframely/dataframely/schema.py", line 573, in validate
    raise ValidationError(
        format_rule_failures(list(failure.counts().items()))
    )
dataframely.exc.ValidationError: 1 rules failed validation:
 - 'myrule' failed for 1 rows
Expectation: validation should succeed in both cases.