Skip to content
Tim Kirschke edited this page Mar 17, 2023 · 11 revisions

This macro creates a PIT table to gather snapshot based information of one hub and its surrounding satellites. For this macro to work, a snapshot table is required, that has a trigger collumn to identify which snapshots to include in the PIT table. The easiest way to create such a snapshot table is to use the control_snap macros provided by this package.

Features:

  • Tracks the active satellite entries for each entry in a Hub for each snapshot
  • Strongly improves performance if upstream queries requires many JOIN operations
  • Creates a unique dimension key to optimize loading performance of incremental loads
  • Allows to insert a static string as record source column, matching business vault definition of a record source
Parameters Data Type Explanation
pit_type string String to insert into the 'pit_type' column. Allows for future implementations of other PIT variants, like T-PITs etc. Can be set freely, something like 'PIT' could be the default. Optional.
tracked_entity string Name of the tracked Hub entity. Must be available as a model inside the dbt project.
hashkey string The name of the hashkey column inside the previously refered Hub entity.
sat_names list of strings A list of all the satellites that should be included in this PIT table. Can only be satellites that are attached to the tracked Hub, and should typically include all those satellites. You should always refer here to the version 1 satellites, since those hold the load-end-date. The macro currently supports regular satellites and nh-satellites.
snapshot_relation string The name of the snapshot relation. It needs to be available as a model inside this dbt project.
snapshot_trigger_column string The name of the column inside the previously mentioned snapshot relation, that is boolean and identifies the snapshots that should be included in the PIT table.
dimension_key string The desired name of the dimension key inside the PIT table. Should follow some naming conventions. Recommended is the name of the hashkey with a '_d' suffix.
ldts string Name of the ldts column inside all source models. Is optional, will use the global variable 'datavault4dbt.ldts_alias'. Needs to use the same column name as defined as alias inside the staging model.
custom_rsrc string A custom string that should be inserted into the 'rsrc' column inside the PIT table. Since a PIT table is a business vault entity, the technical record source is no longer used here. Optional.
ledts string Name of the load-end-date column inside the satellites. Is optional, will use the global variable 'datavault4dbt.ledts_alias' if not set here.

Example 1

{{ config(materialized='incremental',
        post_hook="{{ datavault4dbt.clean_up_pit('control_snap_v1') }}") }}

{%- set yaml_metadata -%}
pit_type: 'Regular PIT'
tracked_entity: 'account_h'
hashkey: 'hk_account_h'
sat_names:
    - account_lroc_p_s
    - account_lroc_n_s
    - account_hroc_p_s
    - account_hroc_n_s
snapshot_relation: 'control_snap_v1'
snapshot_trigger_column: 'is_active'
dimension_key: 'hk_account_d'
custom_rsrc: 'PIT table for SAP/Accounts. For more information see our Wiki!'
{%- endset -%}    

{%- set metadata_dict = fromyaml(yaml_metadata) -%}

{%- set pit_type = metadata_dict['pit_type'] -%}
{%- set tracked_entity = metadata_dict['tracked_entity'] -%}
{%- set hashkey = metadata_dict['hashkey'] -%}
{%- set sat_names = metadata_dict['sat_names'] -%}
{%- set snapshot_relation = metadata_dict['snapshot_relation'] -%}
{%- set snapshot_trigger_column = metadata_dict['snapshot_trigger_column'] -%}
{%- set dimension_key = metadata_dict['dimension_key'] -%}
{%- set custom_rsrc = metadata_dict['custom_rsrc'] -%}

{{ datavault4dbt.pit(pit_type=pit_type,
                                tracked_entity=tracked_entity,
                                hashkey=hashkey,
                                sat_names=sat_names,
                                snapshot_relation=snapshot_relation,
                                snapshot_trigger_column=snapshot_trigger_column,
                                dimension_key=dimension_key,
                                custom_rsrc=custom_rsrc) }}

Description

With this example, a PIT Table is created. In line three of this example, the post hook "clean_up_pit" is used. for further information about the hook, click on the following link: [Hook Clean Up PITs](https://github.com/ScalefreeCOM/datavault4dbt/wiki/Hook-Cleanup-Pits)
  • pit_type:
    • Regular PIT: PIT type is set to 'Regular PIT'. Optional.
  • tracked_entity:
    • account_h: This PIT table tracks the Hub Account.
  • hashkey:
    • hk_account_h: The name of the hashkey column ('hk_account_h') inside the previously refered Hub entity ('account_h').
  • sat_names:
    • ['account_lroc_p_s','account_lroc_n_s','account_hroc_p_s','account_hroc_n_s']: This four satellites are inclueded in the PIT table.
  • snapshot_relation:
    • control_snap_v1: The name of the snapshot relation.
  • snapshot_trigger_column:
    • is_active: The name of the column inside the previously mentioned snapshot relation that is boolean and identifies the snapshots that should be included in the PIT table.
  • dimension_key:
    • hk_account_d: The desired name of the dimension key inside the PIT table.
  • custom_rsrc:
    • PIT table for SAP/Accounts.: A custom string that should be inserted into the 'rsrc' column inside the PIT table. Optional.