16 changes: 9 additions & 7 deletions _posts/2024-12-08-dbt-expectations.md
@@ -3,17 +3,13 @@ layout: post
 title: Using dbt expectations as part of a dbt build.
 ---

-<i> The objective of the blog post is to give a practical overview of the data transformation testing tool Great Expectations/dbt expectations. </i>
+<i> The objective of the blog post is to give a practical overview of the data transformation testing tool Great Expectations (specifically the open source version, dbt expectations). </i>

 ### Why data testing?

-Having been involved in data transformations in the past (e.g. moving data from on prem to the Azure cloud) I'm aware of the potential complexity of ensuring the quality of data from source to target, verifying the transformations at each stage and maintaining data integrity.
+Having been involved in data transformations in the past (e.g. moving data from on prem to the Azure cloud) I'm aware of the potential complexity of ensuring the quality of data from multiple sources to target, verifying the transformations at each stage and maintaining data integrity.

-Given
-
-### Great Expectations
-
-[Great Expectations.io](https://greatexpectations.io/) and its open source version [dbt expectations](https://github.com/calogica/dbt-expectations) are frameworks that enable automated tests to be embedded in ingestion/transformation pipelines.
+In the context of these data testing challenges, [Great Expectations.io](https://greatexpectations.io/) and its open source version [dbt expectations](https://github.com/calogica/dbt-expectations) are frameworks that enable automated tests to be embedded in ingestion/transformation pipelines.

 <GE Image>
 ![Great Expectations logo', December 2024](/images/gx_logo_horiz_color.png)
@@ -72,6 +68,12 @@ In a specific example, the failing sql code is run directly against the table (i

 ### Lineage Graph (Data Flow DAG)

+In the sections above we've looked at practical tests in dbt expectations that can be embedded in the data transformation pipeline. They can also be featured in the 'lineage graph' alongside the source tables, dimension tables, fact tables etc., showing where and when each test runs and which table it relates to.
+
+Provided the test in question is included in the schema.yml and has a description value, we can see it included on the lineage graph generated by dbt:
+
+![dbt lineage graph](/images/dbt-dag-3.png)
+
 Source data in green -> dependencies

 Select what types of elements to include in the graph, refresh to only show selection
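
For illustration, a schema.yml entry of the kind the added paragraph describes might look like the sketch below. It is not taken from the post: the model name `fct_orders`, the column `order_total`, and the bounds are hypothetical, and the `description` property on a test assumes a dbt version that supports it (dbt 1.9 or later, to the best of my knowledge). Older projects may use the `tests:` key instead of `data_tests:`.

```yaml
# schema.yml (hypothetical model and column names, for illustration only)
version: 2

models:
  - name: fct_orders
    description: "Order fact table built from the staged source data."
    columns:
      - name: order_total
        description: "Order value in the reporting currency."
        data_tests:
          # Test provided by the dbt-expectations package.
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 100000
              # Assumption: test-level descriptions require a recent dbt version.
              description: "Order totals should be non-negative and below the sanity cap."
```

With an entry along these lines, the test runs as part of `dbt build` and should appear as a node attached to `fct_orders` in the generated lineage graph.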
Binary file added images/dbt-dag-3.png