docs: guide for existing dbt projects #6769

Merged
merged 4 commits into from Sep 20, 2022

1 change: 1 addition & 0 deletions docs/src/_getting-started/index.md
@@ -908,6 +908,7 @@ schedules:
Once your raw data has arrived in your data warehouse, its schema will likely need to be transformed to be more appropriate for analysis.

To help you achieve this, Meltano supports transformation using [`dbt`](https://www.getdbt.com/).
If you already have an existing dbt project that you'd like to migrate to Meltano, check out the [existing dbt project guide](https://docs.meltano.com/guide/existing-dbt-project) for more details.

To learn about data transformation, refer to the [Data Transformation (T) guide](/guide/transformation).
`dbt` plugins are adapter-specific, so you should install the plugin that matches your warehouse (e.g. Postgres = `dbt-postgres`, Snowflake = `dbt-snowflake`, etc.).
70 changes: 70 additions & 0 deletions docs/src/_guide/migrate-an-existing-dbt-project.md
@@ -0,0 +1,70 @@
---
title: Migrate an Existing dbt Project
description: Learn how to import an existing dbt project into your Meltano project.
layout: doc
weight: 25
---

This guide describes how to bring existing dbt code into your Meltano project.
Meltano suggests some patterns for organizing your dbt project so that it integrates well with core Meltano features like [environments](https://docs.meltano.com/concepts/environments).
You can organize your project however you choose, but this guide describes how to import it so that it matches the default transformer installation.

### Pre-requisites

As always, we highly recommend git versioning your Meltano project prior to following this guide so you have the ability to roll back and not affect your existing Meltano project configurations.


### Add dbt Transformer

Add the adapter-specific dbt variant for your warehouse (e.g. `dbt-postgres`). The available variants are listed on [MeltanoHub](https://hub.meltano.com/transformers/).


```
meltano add transformer dbt-<adapter_name>

# For example
meltano add transformer dbt-postgres
```

Next, configure your transformer with database names, connection credentials, etc.
See the [transform data guide](https://docs.meltano.com/guide/transformation#install-dbt) for more details, or use the [interactive config flag](/reference/command-line-interface#how-to-use-interactive-config) to follow prompts.

```
meltano config dbt-postgres set --interactive
```

Once you've configured your transformer you should be able to run the following command to test your connection and credentials.

```
meltano invoke dbt-postgres debug
```

### Migrating dbt Code Into Meltano

Note that adding a transformer creates scaffolding within your Meltano `/transform` directory, including a `dbt_project.yml` and `/profile/profiles.yml`.
If you have an existing dbt project, you will already have your own versions of these files in your other repo. The sections below describe how to merge what you have with what Meltano provides and expects.

#### Meltano's Default Structure For dbt

Meltano expects dbt project files to exist in the default directories listed below.
You can either place your files in the appropriate directories or update the given `dbt_project.yml` to follow the directory structure of your existing project, if that's preferred.

- data - this is where seed files are stored. See [seeds dbt docs](https://docs.getdbt.com/docs/building-a-dbt-project/seeds)
- models - this is where models are stored. See [models dbt docs](https://docs.getdbt.com/docs/building-a-dbt-project/building-models)
- analysis - this is where analysis SQL that shouldn't be materialized is stored. See [analyses dbt docs](https://docs.getdbt.com/docs/building-a-dbt-project/analyses)
- tests - this is where singular dbt tests are stored. See [tests dbt docs](https://docs.getdbt.com/docs/building-a-dbt-project/tests)
- macros - this is where Jinja macros are stored. See [macros dbt docs](https://docs.getdbt.com/docs/building-a-dbt-project/jinja-macros)
- snapshots - this is where snapshot models are stored. See [snapshots dbt docs](https://docs.getdbt.com/docs/building-a-dbt-project/snapshots)
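
The directory list above can be turned into a small copy step. The sketch below is one way to do it, assuming your existing project uses the same directory names. `copy_dbt_dirs` and the example paths are hypothetical and not part of Meltano or dbt.

```shell
# Sketch: copy the standard dbt directories from an existing project into
# Meltano's transform/ directory. Directories that do not exist are skipped.
copy_dbt_dirs() {
  src="$1"   # root of the existing dbt project
  dest="$2"  # Meltano's transform directory
  for dir in data models analysis tests macros snapshots; do
    if [ -d "$src/$dir" ]; then
      mkdir -p "$dest/$dir"
      # Copy the directory contents, including nested folders.
      cp -R "$src/$dir/." "$dest/$dir/"
    fi
  done
}

# Example (hypothetical paths):
# copy_dbt_dirs ../my-dbt-project ./transform
```

Only the listed subdirectories are copied, so the `dbt_project.yml` and profiles that Meltano generated are left untouched.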

#### dbt Profiles

Meltano's default dbt project scaffolding comes with a `profiles.yml` (see [dbt profiles docs](https://docs.getdbt.com/dbt-cli/configure-your-profile) for details) that is configured to take advantage of the [environments](https://docs.meltano.com/concepts/environments) feature.
You configure dbt through the provided Meltano settings, and Meltano passes them to dbt automatically based on which environment is active.
Meltano's dbt installation comes with pre-configured [dbt targets](https://docs.getdbt.com/dbt-cli/configure-your-profile#understanding-targets-in-profiles) mapped to the default environment names (i.e. dev, staging, prod). This avoids the need to toggle credentials manually and allows settings and credentials to be shared across plugins.
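
One way to confirm the environment-to-target mapping is to run the `debug` connection check from earlier once per environment. The sketch below assumes the `dbt-postgres` variant and the default environment names. `check_environments` is a hypothetical helper name.

```shell
# Sketch: run `dbt debug` through Meltano once per environment.
# Assumes the dbt-postgres transformer; swap in your own adapter.
check_environments() {
  for env in "$@"; do
    echo "Checking environment: $env"
    # Stop at the first environment whose connection check fails.
    meltano --environment="$env" invoke dbt-postgres debug || return 1
  done
}

# Example:
# check_environments dev staging prod
```

If a check fails for one environment only, the settings stored for that environment are the likely place to look.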

#### Custom `dbt_project.yml` Configurations

If your existing `dbt_project.yml` contains configurations such as how models are materialized, target databases, schemas, etc., you can copy them directly into your new Meltano `dbt_project.yml` file.
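
Before copying settings across, it can help to see exactly how the two files differ. The sketch below uses plain `diff` for that. `compare_dbt_projects` and the example paths are hypothetical.

```shell
# Sketch: show how an existing dbt_project.yml differs from the one Meltano
# generated, so you can see which settings still need to be carried over.
compare_dbt_projects() {
  existing="$1"   # dbt_project.yml from the existing project
  generated="$2"  # dbt_project.yml scaffolded by Meltano
  # diff exits with status 1 when the files differ; treat that as success
  # and only report a failure for real errors such as a missing file.
  diff -u "$existing" "$generated" || [ $? -eq 1 ]
}

# Example (hypothetical paths):
# compare_dbt_projects ../my-dbt-project/dbt_project.yml ./transform/dbt_project.yml
```

Lines prefixed with `-` exist only in your existing file and are candidates to copy into the Meltano version.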

Again, Meltano doesn't require this structure. Any valid dbt project will work, but this is the default recommended structure, with some base configurations for a simple integration between Meltano and dbt.
12 changes: 8 additions & 4 deletions docs/src/_guide/transformation.md
@@ -7,6 +7,10 @@ weight: 5

Transformations in Meltano are implemented using dbt. All Meltano-generated projects have a `transform/` directory, which is populated with the configuration, models, packages, etc. required to run the transformations. A transform in Meltano is simply a set of dbt models that can be installed as a package. See the [transform plugin](/concepts/plugins#transforms) docs for more details.

<div class="notification is-info">
<p>If you already have an existing dbt project that you'd like to migrate to Meltano, check out the <a href="/guide/existing-dbt-project">existing dbt project guide</a> for more details.</p>
</div>

## Adapter-Specific dbt Transformation

In alignment with the [dbt documentation](https://docs.getdbt.com/docs/available-adapters), we support adapter-specific installations of `dbt`.
@@ -33,11 +37,11 @@ After dbt is installed you can configure it using `config` CLI commands, [Meltan
# list available settings
meltano config dbt-snowflake list

# set the Snowflake user in the `dev` environment
meltano --environment=dev config dbt-snowflake set user DEV_USER
# configure the `dev` environment interactively
meltano --environment=dev config dbt-snowflake set --interactive

# set the Snowflake user in the `prod` environment
meltano --environment=prod config dbt-snowflake set user PROD_USER
# configure the `prod` environment interactively
meltano --environment=prod config dbt-snowflake set --interactive
```

More details on [configuring plugins](/guide/configuration), including with [environment variables](/guide/configuration#environment-variables).