Motivation
`dlt` automates extracting and loading data into a warehouse. It also infers the table schema from the data and allows adding various hints such as uniqueness and nullability. Loading is "atomic" thanks to the `load_id` mechanism: every new piece of data is marked with a `load_id`, which is flagged as valid only after all of its data has been loaded correctly (see https://github.com/dlt-hub/rasa_semantic_schema/blob/master/models/staging/load_ids.sql and https://github.com/dlt-hub/rasa_semantic_schema/blob/master/models/views/_loads.sql). The same mechanism allows chaining transformations for every `load_id`.
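Purely as an illustration of this mechanism (not the actual `dlt` implementation, which lives in the SQL models linked above), the lifecycle of a `load_id` looks roughly like the sketch below; all names in it are made up:

```python
from dataclasses import dataclass, field


@dataclass
class LoadStatus:
    # one row per load package, mirroring the idea behind the _dlt_loads table
    load_id: str
    loaded: bool = False       # set only after *all* data for the load_id landed
    transformed: bool = False  # set by the dbt package once it processed the load_id


@dataclass
class Loads:
    rows: dict = field(default_factory=dict)

    def start(self, load_id: str) -> None:
        self.rows[load_id] = LoadStatus(load_id)

    def finish(self, load_id: str) -> None:
        # "atomic" loading: consumers only ever see load_ids flagged here
        self.rows[load_id].loaded = True

    def pending(self) -> list:
        # chaining: the next transformation step picks up every load_id that
        # was fully loaded but not yet transformed
        return [r.load_id for r in self.rows.values() if r.loaded and not r.transformed]
```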
`dbt` is a popular way of defining transformations, and `dlt` implements an extremely easy-to-use runner (see #28) that executes any `dbt` package in an isolated Python environment. It also manages the `dbt` profiles and credentials, so the same `pipeline` definition can be used both for loading and transforming.

We want to make this even easier by generating an initial package that contains all the models and supports the chaining mechanism; the user just needs to add transformations for their tables. Here is an example of a `dbt` package that was started from such a template: https://github.com/dlt-hub/rasa_semantic_schema
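For orientation only, the runner described above boils down to roughly the following sketch (this is not the actual implementation from #28; duckdb as the target, the profile name, and all helper names are assumptions):

```python
import subprocess
import venv
from pathlib import Path

import yaml  # pyyaml


def write_profiles(profile_dir: Path, credentials: dict) -> None:
    # dbt reads connection settings from profiles.yml; generate it from the same
    # credentials the load step used so the user never edits dbt config by hand
    profiles = {
        "dlt_generated": {
            "target": "load",
            "outputs": {"load": {"type": "duckdb", "path": credentials["path"]}},
        }
    }
    profile_dir.mkdir(parents=True, exist_ok=True)
    (profile_dir / "profiles.yml").write_text(yaml.safe_dump(profiles))


def run_package(package_dir: Path, profile_dir: Path, venv_dir: Path) -> None:
    # isolated python environment: dbt and its adapter live in a dedicated venv
    venv.create(venv_dir, with_pip=True)
    pip, dbt = venv_dir / "bin" / "pip", venv_dir / "bin" / "dbt"
    subprocess.run([str(pip), "install", "dbt-duckdb"], check=True)
    subprocess.run(
        [str(dbt), "run",
         "--project-dir", str(package_dir),
         "--profiles-dir", str(profile_dir),
         "--profile", "dlt_generated"],
        check=True,
    )
```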
Requirements
- should work with the `dbt` runner as implemented in #28 ([dbt] allow executing dbt packages via dbt runner)
- should generate the `dbt` model yml from our schemas (our schemas are similar to the dbt model yml, so it should be quite easy) - see https://github.com/dlt-hub/rasa_semantic_schema/blob/master/models/sources.yml and the sketch after this list
- should implement the `load_id` status update as in https://github.com/dlt-hub/rasa_semantic_schema/blob/master/models/views/_loads.sql
- should work with `bigquery`, `redshift`, `postgres`, and `duckdb` (in fact the materializations for the `_dlt_loads` table are so simple that they should work everywhere)
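A rough sketch of the yml generation point (not the actual generator; the shape of the `tables` mapping is an assumption about what the `Pipeline` schema provides):

```python
import yaml  # pyyaml


def make_sources_yml(schema_name: str, dataset: str, tables: dict) -> str:
    # tables: {"event": {"sender_id": {"nullable": False, "unique": False}, ...}}
    # hypothetical shape; the real Pipeline schema carries the same hints
    source_tables = []
    for table_name, columns in tables.items():
        cols = []
        for col_name, hints in columns.items():
            tests = []
            if not hints.get("nullable", True):
                tests.append("not_null")
            if hints.get("unique"):
                tests.append("unique")
            col = {"name": col_name}
            if tests:
                col["tests"] = tests
            cols.append(col)
        source_tables.append({"name": table_name, "columns": cols})

    doc = {
        "version": 2,
        "sources": [{"name": schema_name, "schema": dataset, "tables": source_tables}],
    }
    return yaml.safe_dump(doc, sort_keys=False)
```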
Implementation
Let's start with a standalone script that generates a `dbt` package from the `Pipeline` instance passed to it. The instance contains all relevant schemas with table definitions.
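A minimal sketch of what such a script could emit, assuming the `Pipeline` schemas are available as table-name to column mappings (the mapping shape, the `_loads` model name, the `_dlt_load_id` column, and the file layout below are assumptions, not the final design):

```python
from pathlib import Path

import yaml  # pyyaml

DBT_PROJECT = {
    "name": "dlt_generated_package",
    "config-version": 2,
    "version": "0.1.0",
    "profile": "dlt_generated",
    "model-paths": ["models"],
}

# one stub model per table: the user adds transformations on top of it, while the
# join against the loads view keeps the chaining mechanism (only valid load_ids) intact
MODEL_TEMPLATE = """\
select t.*
from {{{{ source('{schema_name}', '{table_name}') }}}} as t
join {{{{ ref('_loads') }}}} as l on t._dlt_load_id = l.load_id
"""


def generate_package(package_dir: Path, schema_name: str, tables: dict) -> None:
    models_dir = package_dir / "models"
    models_dir.mkdir(parents=True, exist_ok=True)
    (package_dir / "dbt_project.yml").write_text(yaml.safe_dump(DBT_PROJECT, sort_keys=False))
    for table_name in tables:
        model = MODEL_TEMPLATE.format(schema_name=schema_name, table_name=table_name)
        (models_dir / f"stg_{table_name}.sql").write_text(model)
```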