Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a test to check for recent rows at each grouping level #47

Closed
clausherther opened this issue Apr 22, 2021 · 3 comments · Fixed by #49
Closed

Add a test to check for recent rows at each grouping level #47

clausherther opened this issue Apr 22, 2021 · 3 comments · Fixed by #49

Comments

@clausherther
Copy link
Contributor

clausherther commented Apr 22, 2021

We should add the ability to check for recent data at a grouping level via a group_by parameter.
For example, a user might expect recent rows for every product in a fact table. So, a recency check should look at grouped rows for each product_id that does not meet the recency requirement.

e.g.

with latest_grouped_timestamps as (
    
    select
        product_id,
        max(timestamp_column) as latest_timestamp_column
    from
        fact_table
    group by
        1
),
validation_errors as (
    
    select *
    from
        latest_grouped_timestamps
    where
        latest_timestamp_column < { threshold_timestamp }

)
select count(*) from validation_errors

Name: expect_grouped_row_values_to_have_recent_data?

Note: this would only cover grouped values that already exists in the model to be tested. So, for example, products that have never sold would not be checked. This would require a left join to a dimension table and is currently out of scope for this test.

@jeffwhite-indeed
Copy link

This looks great. This should be able to work with SCD2 Dims as well, correct?

@clausherther
Copy link
Contributor Author

@jeffwhite-indeed yes, this works with any single table, as long as you have column you want to group by and a timestamp or date column to check for recent values for each grouped row. So fact tables or type 2 dimensions in a dimension model, or other tables like that.

@jeffwhite-indeed
Copy link

jeffwhite-indeed commented Apr 22, 2021

Perfect. Thanks for confirming that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants