Skip to content

Dependent Child Keys

tkirschke edited this page Dec 13, 2022 · 1 revision

Dependent Child Keys

Some Links require a dependent child key to create a unique combination of columns, such as a line-item number. Let's think of an example object representing invoices. An Invoice is connecting multiple products. Therefor the combination of an invoice number and a product number is not unique. The uniquess is only achieved, when the line-item number is added.

To model a dependent child key with datavault4dbt, you have to consider both the staging model and the link model. The dependent child key would be added to the input of the link hashkey calculation, to achieve uniqueness inside the link. Additionally, the unhashed dependent child key is added as a column to the link model.

Staging Model

To extend the link hashkey with the dependent child key, the staging model for invoices must have a section for "hashed_columns" where the link hashkey is defined properly. Let's have a look at an example staging model:

stg_invoices.sql

{{ config(materialized='view') }}

{%- set yaml_metadata -%}
source_model: 'source_invoices'
ldts: 'edwLoadDate'
rsrc: 'edwRecordSource'
hashed_columns: 
    hk_invoices_products_l:
        - invoice_number
        - product_no
        - line_item_no
    hk_product_h:
        - product_no 
    hk_invoices_h:
        - invoice_number 
    hd_invoices_s:
        is_hashdiff: true
        columns: 
            - invoice_date
            - total_sum
            - payment_method
            - shipping_method
    hd_invoices_products_s:
        is_hashdiff: true
        columns:
            - quantity
            - discount 
{%- endset -%}

{%- set metadata_dict = fromyaml(yaml_metadata) -%}

{%- set source_model = metadata_dict['source_model'] -%}
{%- set ldts = metadata_dict['ldts'] -%}
{%- set rsrc = metadata_dict['rsrc'] -%}
{%- set hashed_columns = metadata_dict['hashed_columns'] -%}

{{ datavault4dbt.stage(source_model=source_model,
                    ldts=ldts,
                    rsrc=rsrc,
                    hashed_columns=hashed_columns,
                    derived_columns=none,
                    prejoined_columns=none,
                    missing_columns=none) }}

In this example, the hashkey for an invoices_products_l would be calculated out of the two business keys 'invoice_number' and 'product_no' plus the dependent child key 'line_item_no'.

Link Model

invoices_products_l.sql

{{ config(materialized='incremental') }}

{%- set yaml_metadata -%}
link_hashkey: 'hk_invoices_products_l'
foreign_hashkeys: 
    - 'hk_product_h'
    - 'hk_invoices_h'
    - 'line_item_no'
source_models: stg_invoices
{%- endset -%}    

{%- set metadata_dict = fromyaml(yaml_metadata) -%}

{%- set link_hashkey = metadata_dict['link_hashkey'] -%}
{%- set foreign_hashkeys = metadata_dict['foreign_hashkeys'] -%}
{%- set source_models = metadata_dict['source_models'] -%}


{{ datavault4dbt.link(link_hashkey=link_hashkey,
        foreign_hashkeys=foreign_hashkeys,
        source_models=source_models) }}

On top of the previously created stage model, a link between invoices and products is created, as shown in the example model above. Compared to a regular link without a dependent child key, we have hijacked the 'foreign_hashkeys' parameter to additionally include the dependent child key in the link table structure.

Conclusion

Modeling a dependent child key with datavault4dbt is very straight forward. Users just need to add the dependent child key to the link hashkey calculation inside the staging model. Afterwards they add the depdendent child key to the foreign hashkeys inside the link model.

Clone this wiki locally