Allow checks based on data types #1169

NeerajMalhotra-QB · 2023-04-27T19:26:52Z

Is your feature request related to a problem? Please describe.
Imagine you have this schema:

import pandera as pa

schema = pa.DataFrameSchema({
    "a": pa.Column(int, checks=pa.Check.le(10)),
    "b": pa.Column(float, checks=pa.Check.lt(-1.2)),
    # numeric comparison on a string column: this is the problem case
    "c": pa.Column(str, checks=pa.Check.le(20)),
})

In the schema above, column c has the wrong check: le(20) is a numeric comparison, but the column is declared as str. The schema is still accepted and flows through the entire process. It may eventually fail during data validation, but in other situations the check may not fail at all, and the mistake falls through the cracks.

Checks like this shouldn't be allowed for a data type they don't apply to.

Describe the solution you'd like
There are two ways to solve this:

1. Write a decorator that matches each check with the data types it is allowed for. We are building this for pyspark.sql in a forked branch here: https://github.com/NeerajMalhotra-QB/pandera.
2. Another (much better) solution would be to enhance register_checks and register_dtype to validate whether a check should be allowed for a given field type.

We haven't adopted the second option yet, as it would require changes in the common area (used by other frameworks), but we will look into it in a future release unless someone wants to take a stab at it first.
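
To make the decorator idea concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not the code in the fork: `allow_dtypes`, the `le` factory, and its `(dtype, max_value)` signature are hypothetical names and are not part of pandera's API. The point is that the mismatch is rejected when the check is attached to a type, before any data is validated.

```python
from functools import wraps


def allow_dtypes(*allowed):
    """Hypothetical decorator: restrict a check factory to the given Python types."""
    def decorator(check_factory):
        @wraps(check_factory)
        def wrapper(dtype, *args, **kwargs):
            # Reject the check up front if the column type is not in the allowed set.
            if not issubclass(dtype, allowed):
                names = ", ".join(t.__name__ for t in allowed)
                raise TypeError(
                    f"check '{check_factory.__name__}' is not allowed for "
                    f"dtype '{dtype.__name__}' (allowed: {names})"
                )
            return check_factory(dtype, *args, **kwargs)
        return wrapper
    return decorator


@allow_dtypes(int, float)
def le(dtype, max_value):
    """Hypothetical check factory: build a 'less than or equal' predicate."""
    return lambda value: value <= max_value


check_a = le(int, 10)  # accepted: numeric column

try:
    le(str, 20)  # rejected at definition time, before any data is seen
except TypeError as exc:
    error_message = str(exc)
```

Here `le(int, 10)` returns a usable predicate, while `le(str, 20)` raises a `TypeError` naming the check and the offending type, which is the behavior wanted for column c above.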

cc: @cosmicBboy

NeerajMalhotra-QB · 2023-06-12T16:55:50Z

This is implemented for native pyspark.sql in #1213.

We can leverage it and extend it to other frameworks.

NeerajMalhotra-QB added the enhancement New feature or request label Apr 27, 2023
