Small application to learn the basics of dbt.
References:
- Python
- Docker and docker compose
By default dbt
will look for warehouse connections in the file ~/.dbt/profiles.yml
. The DBT_PROFILES_DIR
environment variable tells dbt to look for the profiles.yml
file in the current working directory. You need to define all the connections credentials in you local .env
:
# Set to recognize profiles.yml
DBT_PROFILES_DIR=$(pwd)
# DEV database parameters
DBT_POSTGRES_HOST_DEV=localhost
DBT_POSTGRES_PORT_DEV=5432
DBT_POSTGRES_USER_DEV=dbt_dev
DBT_POSTGRES_PWD_DEV=password
DBT_POSTGRES_DBNAME_DEV=jaffle_shop_dev
DBT_POSTGRES_SCHEMA_DEV=dbt_dev
set -a; source .env; set +a
cd dbt
dbt run/build/test/...
The repo is configured to run 2 separate GitHub Actions workflows:
- On any push to "main" branch, deploy to
dbt_final
schema in the remote Postgres db - On any Pull Request, deploy to
dbt_pr{pull_request_number}_{short_commit_sha}
schema
For these pipelines to work, you need to define these roles beforehand on the Postgres db:
-- Role for dbt to deploy in PROD
CREATE ROLE dbt_prod WITH LOGIN PASSWORD 'password_prod';
GRANT ALL PRIVILEGES ON SCHEMA raw TO dbt_prod;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA "raw" TO dbt_prod;
-- Role for dbt to deploy in STAG
CREATE ROLE dbt_stag WITH CREATEDB LOGIN PASSWORD 'password_stag';
GRANT CREATE ON DATABASE jaffle_shop_prod TO dbt_stag;
GRANT USAGE ON SCHEMA raw TO dbt_stag;
GRANT SELECT ON ALL TABLES IN SCHEMA raw TO dbt_stag;
- The STAG schemas related to the Pull Requests should be cleanup automatically after a given time to avoid bloating the db instance