Skip to content

mashiike/dbt-materialization-load-with

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

dbt-materialization-load-with

A dbt package for materialization with data loading Note: This is a PoC-like DBT package for Redshift.

This DBT package mainly provides materialization for Redshift. There are two materializations in the package, load_table and load_incremental. These are based on the concept of loading data into a temporary table using Redshift's COPY command, and then executing the table or incremental materialization based on the loaded temporary table.

Installation

Add to your packages.yml

packages:
  - git: "https://github.com/mashiike/dbt-materialization-load-with"
    revision: v0.0.0

It is also useful to prepare the following macro.

{% macro is_incremental() %}
  {{ return(materialization_load_with.is_incremental()) }}
{% endmacro %}

{% macro load_temporary_table() %}
  {{ return(materialization_load_with.load_temporary_table()) }}
{% endmacro %}

QuickStart

Assuming you have a CSV like the following, when you load it, it will look like this

csv:

id,name,age
1,hoge,18
2,fuga,28
3,piyo,38

simple_load_table.sql:

{{config(
    materialized='load_table',
    load_columns=[
      {'name':'id',   'data_type':'integer'},
      {'name':'name', 'data_type':'varchar'},
      {'name':'age', 'data_type':'integer'},
    ],
    from='s3://example-com/path/to/csv',
    iam_role=env_var('IAM_ROLE_ARN'),
    copy_option="REGION '"~env_var('AWS_DEFAULT_REGION','us-east-1')~"' FORMAT csv IGNOREHEADER 1",
)}}

select *
from {{load_temporary_table()}}

If you want to update it with an addendum, it will look like this simple_load_incremental.sql:

{{config(
    materialized='load_table',
    load_columns=[
      {'name':'id',   'data_type':'integer'},
      {'name':'name', 'data_type':'varchar'},
      {'name':'age', 'data_type':'integer'},
    ],
    from='s3://example-com/path/to/csv',
    iam_role=env_var('IAM_ROLE_ARN'),
    copy_option="REGION '"~env_var('AWS_DEFAULT_REGION','us-east-1')~"' FORMAT csv IGNOREHEADER 1",
)}}

select *
from {{load_temporary_table()}} as l
{%- if is_incremental() %}
where not exists (
  select 1 
  from {{ this }} as t
  where l.id = t.id
)
{%- endif %}

Note: In the case of load_incremental, it is difficult to switch the config dynamically; it is very useful if the from is fixed or can be specified by an environment variable.

LICENSE

MIT

However, some of the code has been modified from https://github.com/dbt-labs/dbt-core, the original license of which is here

About

A dbt package for materialization with data loading

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published