Rules like uniqueness and referential integrity come up constantly when analyzing a schema. As part of the testing process, dbt should accept a schema configuration that tells it how to test whether a schema follows these rules, and then run those tests automatically.
There are three specific schema constraints that we should test for:
Not null
Uniqueness
Referential integrity
Below is the most standard SQL I can provide for testing each constraint. A test passes when its query returns a count of 0.
not null
Could be declared like "table.field is not null".
with t as (
select [field] as f
from [table]
)
select count(*)
from t
where f is null
uniqueness
Could be declared like "table.field is unique".
with t as (
select [field] as f
from [table]
)
select count(*) from (
select f
from t
group by f
having count(*) > 1
) as duplicates
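To make the return value concrete, here is a minimal sketch that runs the uniqueness template from Python against an in-memory SQLite database. The `users` table, the `email` column, and the helper name are hypothetical, and SQLite only stands in for whatever warehouse dbt targets. Two things are worth noticing: the query counts distinct duplicated values rather than duplicated rows, and `group by` puts all NULLs into one group, so repeated NULLs register as a duplicate.

```python
import sqlite3

# Uniqueness template, with [table] and [field] turned into format slots.
UNIQUE_SQL = """
with t as (
    select {field} as f
    from {table}
)
select count(*) from (
    select f
    from t
    group by f
    having count(*) > 1
) as duplicates
"""

def count_duplicated_values(conn, table, field):
    """Return the number of distinct values that occur more than once."""
    sql = UNIQUE_SQL.format(table=table, field=field)
    return conn.execute(sql).fetchone()[0]

# Hypothetical sample data: one address repeated three times, two NULLs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (email text);
    insert into users values
        ('a@example.com'), ('a@example.com'), ('a@example.com'),
        ('b@example.com'), (null), (null);
""")

duplicated = count_duplicated_values(conn, "users", "email")
print(duplicated)  # 2: the repeated address and the repeated NULL
```

If NULLs should not count as duplicates, adding `where f is not null` to the inner query would be one way to exclude them.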
referential integrity
Could be declared like "parent table one joins to child table many on one.id = many.one_id".
with one as (
select [pk field name] as id
from [table name of parent table]
), many as (
select [fk field name] as id
from [table name of child table]
)
select count(*)
from many
where id is not null
and id not in (select id from one where id is not null)
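One caveat with `not in` in general: if the subquery yields even a single NULL, the predicate never evaluates to true, so orphaned keys are silently missed and the test passes. The sketch below, assuming hypothetical `accounts` and `orders` tables in an in-memory SQLite database, compares a `not in` query with no NULL guard on the parent side against a `not exists` variant. The `not exists` form is a swapped-in alternative for illustration, not part of the proposal.

```python
import sqlite3

# `not in` with no NULL guard on the parent key.
NOT_IN_SQL = """
with one as (select {pk} as id from {parent}),
     many as (select {fk} as id from {child})
select count(*)
from many
where id not in (select id from one)
and id is not null
"""

# Alternative: `not exists` is unaffected by NULLs in the parent key.
NOT_EXISTS_SQL = """
with one as (select {pk} as id from {parent}),
     many as (select {fk} as id from {child})
select count(*)
from many
where many.id is not null
and not exists (select 1 from one where one.id = many.id)
"""

def count_orphans(conn, template, parent, pk, child, fk):
    """Run one of the templates and return its count of orphaned child keys."""
    sql = template.format(parent=parent, pk=pk, child=child, fk=fk)
    return conn.execute(sql).fetchone()[0]

# Hypothetical data: the parent has a NULL key, the child has one orphan (3).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table accounts (id integer);
    create table orders (account_id integer);
    insert into accounts values (1), (2), (null);
    insert into orders values (1), (2), (3), (null);
""")

args = ("accounts", "id", "orders", "account_id")
missed = count_orphans(conn, NOT_IN_SQL, *args)
found = count_orphans(conn, NOT_EXISTS_SQL, *args)
print(missed, found)  # 0 1
```

Filtering NULLs out of the parent subquery, or testing the parent key for not null first, would avoid the problem without leaving the `not in` form.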
I wanted to think this through ahead of time so that you have a roadmap when you're ready to implement it. I think generating and running a set of tests from these templates would cover what we need.
I don't feel too strongly about the specific syntax I used for constructing the schema declarations, but I did try to make it "sql-like", which I think is a good thing.
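As a rough illustration of how the declarations could drive test generation, here is a minimal sketch that parses the three "sql-like" forms with regular expressions, renders a count query for each, and runs them against an in-memory SQLite database. Everything here is an assumption for illustration: the grammar, the function names `render` and `run_tests`, and the sample tables are hypothetical and not part of dbt. The queries are condensed equivalents of the templates (no CTEs), with a NULL guard added to the parent side of `not in`.

```python
import re
import sqlite3

# Hypothetical grammar for the three declaration forms; not dbt syntax.
IDENT = r"[A-Za-z_]\w*"
NOT_NULL = re.compile(rf"({IDENT})\.({IDENT}) is not null")
UNIQUE = re.compile(rf"({IDENT})\.({IDENT}) is unique")
JOINS = re.compile(
    rf"parent table ({IDENT}) joins to child table ({IDENT}) "
    rf"on ({IDENT})\.({IDENT}) = ({IDENT})\.({IDENT})"
)

def render(declaration):
    """Translate one declaration into a count query that returns 0 on success."""
    text = declaration.strip()
    if m := NOT_NULL.fullmatch(text):
        table, field = m.groups()
        return f"select count(*) from {table} where {field} is null"
    if m := UNIQUE.fullmatch(text):
        table, field = m.groups()
        return (
            f"select count(*) from (select {field} from {table} "
            f"group by {field} having count(*) > 1) as duplicates"
        )
    if m := JOINS.fullmatch(text):
        parent, child, left, pk, right, fk = m.groups()
        if (left, right) != (parent, child):
            raise ValueError(f"join columns do not match tables: {declaration!r}")
        return (
            f"select count(*) from {child} "
            f"where {fk} is not null "
            f"and {fk} not in (select {pk} from {parent} where {pk} is not null)"
        )
    raise ValueError(f"unrecognized declaration: {declaration!r}")

def run_tests(conn, declarations):
    """Map each declaration to True if its query returned 0, else False."""
    return {d: conn.execute(render(d)).fetchone()[0] == 0 for d in declarations}

# Hypothetical data: `many` repeats the key 2 and references a missing key 9.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table one (id integer);
    create table many (one_id integer);
    insert into one values (1), (2);
    insert into many values (1), (2), (2), (9);
""")

declarations = [
    "one.id is not null",
    "one.id is unique",
    "many.one_id is unique",
    "parent table one joins to child table many on one.id = many.one_id",
]
results = run_tests(conn, declarations)
for declaration, passed in results.items():
    print("PASS" if passed else "FAIL", declaration)
```

Restricting identifiers to word characters keeps the string interpolation reasonably safe in this sketch; a real implementation would presumably quote identifiers according to the target database.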