# The DBT Workflow

### Introduction

Now that we have connected our DBT account to our data warehouse and our GitHub repository, it's time to see how we can use DBT with our data warehouse.  Remember that there are a few main benefits of using DBT:


1. It provides an opinionated workflow for writing our queries
2. It provides an opinionated *file structure* for organizing our SQL queries
3. It allows us to quickly turn SQL SELECT statements into new SQL tables populated with the selected data


In this lesson, we'll focus on two of these benefits: we'll see how DBT can allow us to quickly create and populate new tables, and we'll see how it enforces a proper development workflow.

### Moving to DBT

As we know, we can always directly query our Northwinds database in *Snowflake*.  In Snowflake, this looks something like the following:

> <img src="./snowflake-query.png" width="100%">

Ok, now let's try to run this same query from DBT.  First, remember that we should have already connected DBT to our Snowflake database.  Let's confirm this.  Once you're logged into DBT, click on the sidebar, and then click on Account Settings.

> <img src="./account_settings.png" width="30%">

We can then see the connection to Snowflake from our Northwinds project.

> <img src="./dbt-snowflake.png" width="100%">

So click on `Connection: Snowflake`, and we can see the details of that connection.  

> <img src="./snowflake-dbt-conn.png" width="80%">

### DBT with Git

Ok, so now it's time to query our database from DBT.  So click on the `Northwinds` dropdown at the top, followed by the `Northwinds` tab in the dropdown.

> <img src="./northwinds-tab.png" width="60%">

And from there return to the codebase by clicking on the green `Start Developing` button.  

Now, when we return to the codebase, we start off on the master branch in git.  But DBT *will not allow us* to make any changes or add new code directly on the master branch.  

> <img src="./northwinds-new-branch.png" width="60%">

We can see that by looking underneath the green button.  It tells us the current branch is master, and specifies that this branch is read-only: `branch: master (read-only)`.

> **Thanks DBT!** Here, DBT is enforcing a proper Git workflow.  Developers should never make changes directly on master, because that code will affect the codebase that the rest of the team is working on.  If we made a mistake or broke some tests and it made it to master, the rest of the development team would suffer.  So instead the workflow is to (1) create a new branch, (2) make changes on that branch, and then (3) eventually merge that code onto master by opening a pull request.

So instead, we need to first click on the `Create a new branch` button.  And then, we can name the branch something related to the work we are about to complete.

> <img src="./dim_customers_model.png" width="60%"> 

Once we click submit, DBT will automatically create the `build_dim_customers_model` branch, and also check out that branch for us.   Notice that now the green button has changed to `open pull request`, and that underneath it tells us our current branch: `branch: build_dim_customers_model`.

> <img src="./resulting_branch.png" width="60%">

### Performing the query

Once we have created this new branch, we just need to create a new folder called `models` (not in the `example` folder), and then under that folder a file called `dim_customers.sql`.  Once this new file is created, we can write the following query.

> <img src="./select-preview.png" width="80%">

Notice that if our table is under a schema, then we need to specify that schema as a prefix.

```sql
SELECT contact_name, address, phone FROM postgres_northwinds_rds_public.customers
```

> Notice that there is no semicolon at the end of our SQL statement. DBT wraps our query in its own SQL before running it, so we leave the semicolon off.
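
If we were working in a local checkout of the project rather than in the DBT editor, creating this file could look like the sketch below. The `project_dir` variable is a hypothetical stand-in for the project root (here just a scratch directory), and the quoted heredoc writes the query exactly as typed, without a trailing semicolon.

```bash
# Hypothetical stand-in for the root of the DBT project.
project_dir=$(mktemp -d)

# Models live under the models folder, one .sql file per model.
mkdir -p "$project_dir/models"

# Write the SELECT statement into the model file (no trailing semicolon).
cat > "$project_dir/models/dim_customers.sql" <<'SQL'
SELECT contact_name, address, phone FROM postgres_northwinds_rds_public.customers
SQL

# Show what was written.
cat "$project_dir/models/dim_customers.sql"
```

The file name matters: it is what we refer to later when we ask DBT to run this one model.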

And from there, we can query the database by clicking on the `preview` button towards the bottom.

### Creating a new table

Ok, so far DBT has just queried our data and displayed the results to us.  But remember that one purpose of DBT is to turn those results into a newly populated table.  If you look at the *very* bottom of the screen you'll see a field for `Runs`.  

> This is essentially the equivalent of entering a command into the terminal.

Then if we call the `dbt run` command, DBT will use our query to create a new table and insert the selected data into it.

>  <img src="./runs.png" width="100%">

```bash
dbt run --models dim_customers
```

> Above we specify to only run the code in `models/dim_customers.sql`, and to not run *all* of the code in our codebase. Newer versions of DBT also accept `--select` in place of `--models`.

And then from there, we will be taken to the Runs dashboard.

> <img src="./run_customers.png" width="100%">

Now just from our `SELECT` statement, DBT created a new table and inserted some data.  We can see this if we click on `dim_customers`, and then move the green radio button from `Summary` over to `Details`.

> <img src="./customers_details.png" width="100%">

So here, we can see that DBT turned our `SELECT` statement into some SQL that creates a new table `dev.dbt_jigsawlabsstudent.dim_customers_dbt_tmp`, and inserts the data selected by our query.  

> Unpacking the name of our table, `FIVETRAN_DATABASE` is the name of the database, `dbt_jigsawlabsstudent` is the newly created schema, and `dim_customers` is the name of the newly created table -- generated from the file name of our model, `dim_customers.sql`. 
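
To make the idea concrete, the sketch below wraps a model's `SELECT` in a `CREATE TABLE ... AS` statement, which is the general pattern behind what the Details tab shows. The helper `wrap_select` is hypothetical and the statement is simplified; the SQL that DBT actually ran is the one displayed in the Details tab.

```bash
# wrap_select TABLE_NAME SELECT_SQL
# Hypothetical helper: prints a simplified CREATE TABLE ... AS statement
# built around the given SELECT. This is an illustration, not DBT's own code.
wrap_select() {
  printf 'CREATE TABLE %s AS (\n  %s\n);\n' "$1" "$2"
}

# The query from models/dim_customers.sql.
model_sql='SELECT contact_name, address, phone FROM postgres_northwinds_rds_public.customers'

# The table name comes from the schema and the model's file name.
statement=$(wrap_select 'dbt_jigsawlabsstudent.dim_customers' "$model_sql")
echo "$statement"
```

Because our query ends up nested inside a larger statement like this one, a semicolon at the end of the model file would land in the middle of the generated SQL.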

### Returning to Snowflake

So at this point we've used DBT to quickly create a new table.  Let's also confirm that it has in fact changed our data warehouse by returning to Snowflake.  If we return to the query editor in Snowflake, we can see that we have a new schema, `dbt_jigsawlabsstudent` (yours will be different).

And then if we take a look at the tables in that schema, we can see that there is a new table in that schema.

> <img src="./dim_customers_schema.png" width="30%">

> If we want, we can view some of that data from Snowflake.

> <img src="./snowflake_view.png" width="80%">

### Updating the codebase

Alright, so now that we've confirmed that our Snowflake database was updated properly with our DBT code, the next step is to commit our changes to the codebase, and then merge our changes to master.

So click on the green commit button.

> <img src="./dim_customers_commit.png" width="80%">

And add a commit message like `added dim_customers model`.

> <img src="./dim_customers_model_commit.png" width="80%">

Then, click on `open pull request`.

<img src="./pull_request.png" width="80%"> 

<img src="./create_pull_request.png" width="80%">

Then click `Create pull request`, add a commit message, and click on `Confirm merge`.

<img src="./pull_commit.png" width="60%">

Finally, click on `Merge pull request` to merge our updated code with the master branch.

<img src="./merge-request.png" width="60%">

If we go to the master branch, we will find our `models/dim_customers.sql` file with the code from DBT.

<img src="./customers_model.png" width="60%">
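
For reference, the buttons we clicked correspond roughly to the Git commands below. This is a sketch in a throwaway local repository, not what DBT or GitHub runs internally: the scratch directory, the placeholder identity, and the local `git merge` (standing in for the pull request) are all assumptions made for illustration.

```bash
set -e

# Work in a scratch directory so no real repository is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the initial branch master
git config user.email "student@example.com"  # placeholder identity
git config user.name "Student"

# An initial commit so that master exists.
echo "# northwinds" > README.md
git add README.md
git commit -q -m "initial commit"

# (1) Create a new branch and check it out.
git checkout -q -b build_dim_customers_model

# (2) Make changes on that branch and commit them.
mkdir -p models
echo "SELECT contact_name, address, phone FROM postgres_northwinds_rds_public.customers" > models/dim_customers.sql
git add models/dim_customers.sql
git commit -q -m "added dim_customers model"

# (3) Merge the branch onto master (on GitHub, via a pull request).
git checkout -q master
git merge -q --no-ff -m "merge build_dim_customers_model" build_dim_customers_model

current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "on $current_branch with model: $(ls models)"
```

Note that master only changes in step (3), which is why a mistake made on the branch never reaches the rest of the team until it has been reviewed.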

And our work is complete :)

### Summary

In this lesson, we saw the workflow of DBT.  DBT connects to both our data warehouse and our GitHub account.  Once these two components are connected, we can use DBT to turn our SELECT query into a table in our data warehouse.  We did this by first previewing the query to see what it selected, and then creating a new table in our data warehouse with the command:

`dbt run --models dim_customers`

We then confirmed the change was made by querying Snowflake directly.

At that point, we still had not added our changes to the master branch of the DBT codebase.  So to do this, we used DBT to create a new commit on our branch, and then merged those changes onto master.