Merged
Binary file modified docs/assets/images/features/cll-1.png
Binary file modified docs/assets/images/features/cll-2.png
Binary file added docs/assets/images/features/cll-3.png
Binary file added docs/assets/images/features/cll-example.png
73 changes: 70 additions & 3 deletions docs/features/column-level-lineage.md
@@ -13,14 +13,18 @@ Common use-cases for column-level lineage are

## Usage

1. Select a node in the lineage DAG, then click the **eye** icon next to the column you want to view.
1. Select a node in the lineage DAG, then click the column you want to view.

![alt text](../assets/images/features/cll-1.png){: .shadow}

1. The column-level lineage for the selected column will be displayed.

![alt text](../assets/images/features/cll-2.png){: .shadow}

1. To exit column-level lineage view, click the close button in the upper-left corner.

![alt text](../assets/images/features/cll-3.png){: .shadow}

## Transformation Types

The transformation type is also displayed for each column, which will help you understand how the column was generated or modified.
@@ -34,6 +38,69 @@ The transformation type is also displayed for each column, which will help you u
| Unknown | We have no information about the transformation type. This could be due to a parse error, or other unknown reason. |


## Limitation
## Impact Radius of a Column

The **right side of the Column-Level Lineage (CLL)** graph represents the **impact radius** of a selected column.
This view helps you quickly understand what will be affected if that column changes.

### What does the impact radius include?

- **Downstream columns** that directly reference the selected column
- **Downstream models** that directly depend on the selected column
- **All indirect downstream columns and models** that transitively depend on it

This helps you evaluate both the direct and downstream effects of a column change, making it easier to understand its overall impact.
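The direct and transitive parts of the impact radius can be pictured as a graph traversal. The following is a minimal sketch, not the tool's actual implementation: the `COLUMN_EDGES` mapping, the `order_report` model, and the helper names `impact_radius` and `impacted_models` are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical column-level edges: each key is an upstream column and each
# value lists the columns that reference it directly. `order_report` is an
# invented model used only to show a transitive dependency.
COLUMN_EDGES = {
    "stg_orders.status": ["orders.status"],
    "orders.status": ["order_report.status"],
    "stg_orders.order_id": ["orders.order_id"],
}


def impact_radius(column, edges):
    """Return (direct, all_downstream) columns for `column`.

    `direct` holds the columns that reference `column` directly.
    `all_downstream` holds every column reachable from it, in
    breadth-first order, so it includes the direct ones as well.
    """
    direct = list(edges.get(column, []))
    seen = set()
    all_downstream = []
    queue = deque(direct)
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        all_downstream.append(current)
        queue.extend(edges.get(current, []))
    return direct, all_downstream


def impacted_models(columns):
    """Collapse 'model.column' names into a de-duplicated list of models."""
    return list(dict.fromkeys(name.split(".", 1)[0] for name in columns))


direct, all_downstream = impact_radius("stg_orders.status", COLUMN_EDGES)
print(direct)                           # directly referencing columns
print(all_downstream)                   # direct plus transitive columns
print(impacted_models(all_downstream))  # models touched by the change
```

Breadth-first order is used here so that the closest dependents are listed first; any traversal that visits each reachable column once would give the same set.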


### Example: Simplified Model Chain

Contributor

Some textual explanation of the image would help, in addition to the raw SQL from the models.
It would help the reader understand how CLL lets them visualize the relations. Of course, they can figure it out from the SQL themselves, but we can speed that up with a brief, plain-language description.

Contributor Author

I have updated the example segment. PTAL

Given the following models, here's how changes to `stg_orders.status` would impact downstream models:

```sql
-- stg_orders.sql
select
    order_id,
    customer_id,
    status,
    ...
from {{ ref("raw_orders") }}


-- orders.sql
select
    order_id,
    customer_id,
    status,
    ...
from {{ ref("stg_orders") }}


-- customers.sql
select
    c.customer_id,
    ...
from {{ ref("stg_customers") }} as c
join {{ ref("stg_orders") }} as o
    on c.customer_id = o.customer_id
where o.status = 'completed'
group by c.customer_id


-- customer_segments.sql
select
    customer_id,
    ...
from {{ ref("customers") }}
```

![alt text](../assets/images/features/cll-example.png){: .shadow}

The following impact is detected:

- **orders**: This model is partially impacted, as it selects the `status` column directly from `stg_orders` but does not apply any transformation or filtering logic to it. The impact is limited to the `status` column.

- **customers**: This model is fully impacted, because it uses `status` in a WHERE clause (`where o.status = 'completed'`). Any change to the logic in `stg_orders.status` can affect the entire output of the model.

- **customer_segments**: This model is indirectly impacted, as it depends on the `customers` model, which itself is fully impacted. Even though `customer_segments` does not directly reference `status`, changes can still propagate downstream via its upstream dependency.
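The three outcomes above can be reproduced with a small classification routine. This is a sketch under stated assumptions rather than the tool's real logic: the `MODEL_CHILDREN` and `STATUS_USAGE` mappings are written by hand from the example SQL, and the function name `classify_impact` and the labels `"partial"`, `"full"`, and `"indirect"` are hypothetical names chosen to match the wording of the list.

```python
from collections import deque

# Model-level DAG read off the ref() calls in the example: parent -> children.
MODEL_CHILDREN = {
    "stg_orders": ["orders", "customers"],
    "customers": ["customer_segments"],
}

# How each direct child uses stg_orders.status (hand-written assumption):
#   "select" = the column is only projected
#   "filter" = the column appears in a WHERE clause
STATUS_USAGE = {"orders": "select", "customers": "filter"}


def classify_impact(source_model, usage, children):
    """Label each downstream model as 'partial', 'full', or 'indirect'."""
    result = {}
    queue = deque()
    for child in children.get(source_model, []):
        kind = usage.get(child)
        if kind == "filter":
            # A filter can change which rows exist, so the whole model is hit.
            result[child] = "full"
            queue.append(child)
        elif kind == "select":
            # Only the selected column carries the change.
            result[child] = "partial"
    # Every model downstream of a fully impacted model is indirectly impacted.
    while queue:
        model = queue.popleft()
        for child in children.get(model, []):
            if child not in result:
                result[child] = "indirect"
                queue.append(child)
    return result


for model, level in classify_impact("stg_orders", STATUS_USAGE, MODEL_CHILDREN).items():
    print(f"{model}: {level}")
```

For brevity the sketch only follows models below a fully impacted one; a fuller version would also trace the `status` column itself through partially impacted models such as `orders`.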


Column-level lineage only displays column selection operations. It does not indicate if a column has been used in filters (WHERE clauses), with grouping (GROUP BY), joins, or other transformations.