
Local development with dbt-databricks

This page describes how to develop a dbt project on your local machine using dbt-databricks. We will scaffold an empty dbt project, configure it to connect to Databricks, and then run our first dbt models.

Prerequisites

  • Access to a Databricks workspace
  • Ability to create a Personal Access Token (PAT)
  • Python 3.8+
  • dbt-core v1.1.0+
  • dbt-databricks v1.1.0+
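
If you have not installed the adapter yet, a typical setup looks like the following. This is a minimal sketch assuming you use pip inside a virtual environment; adjust the commands to your own environment. Installing dbt-databricks should also pull in a compatible version of dbt-core.

python -m venv .venv
source .venv/bin/activate
pip install dbt-databricks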

Prepare to connect

Collect connection information

Before you scaffold a new dbt project, you need to collect some information that dbt will use to connect to Databricks. Where you find this information depends on whether you are using Databricks Clusters or Databricks SQL endpoints. We recommend developing dbt models against Databricks SQL endpoints, as they provide the latest SQL features and optimizations.

Databricks SQL endpoints

  1. Log in to your Databricks workspace
  2. Click the SQL persona in the left navigation bar to switch to Databricks SQL
  3. Click SQL Endpoints
  4. Choose the SQL endpoint you want to connect to
  5. Click Connection details
  6. Copy the value of Server hostname. This will be the value of host when you scaffold a dbt project.
  7. Copy the value of HTTP path. This will be the value of http_path when you scaffold a dbt project.
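
For illustration, the copied values usually look something like the following. These are placeholder values, not real connection details; your endpoint ID will differ.

Server hostname: dbc-a1b2c3d4-e5f6.cloud.databricks.com
HTTP path: /sql/1.0/endpoints/1234567890abcdef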


Databricks Clusters

  1. Log in to your Databricks workspace
  2. Click the Data Science & Engineering persona in the left navigation bar
  3. Click Compute
  4. Click on the cluster you want to connect to
  5. Near the bottom of the page, click Advanced options
  6. Scroll down further and click the JDBC/ODBC tab
  7. Copy the value of Server Hostname. This will be the value of host when you scaffold a dbt project.
  8. Copy the value of HTTP Path. This will be the value of http_path when you scaffold a dbt project.
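
For clusters, the HTTP path has a different shape than for SQL endpoints. A placeholder example (not a real value) looks like this:

Server Hostname: dbc-a1b2c3d4-e5f6.cloud.databricks.com
HTTP Path: sql/protocolv1/o/1234567890123456/0123-456789-abc123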


Scaffold a new dbt project

Now, we are ready to scaffold a new dbt project. Switch to your terminal and type:

dbt init databricks_demo

At the prompt that follows, type 1 to instruct dbt to use the dbt-databricks adapter:

Which database would you like to use?
[1] databricks
[2] spark

Next, provide the full hostname of your Databricks workspace. This is the Server hostname value you copied earlier, for example myworkspace.cloud.databricks.com.

In the http_path field, enter the HTTP path you noted above.

In the token field, enter the PAT you created earlier.

In the catalog field, enter the name of your Unity Catalog catalog, or null if you are not using Unity Catalog. This prompt only appears if you are using dbt-databricks>=1.1.1.

In the schema field, enter databricks_demo, which is the schema you created earlier.

Leave threads at 1 for now.
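
After you answer the prompts, dbt init writes a profile to ~/.dbt/profiles.yml. For this walkthrough, the result looks roughly like the following sketch; the host, http_path, and token values are placeholders that you replace with the values you collected above.

databricks_demo:
  target: dev
  outputs:
    dev:
      type: databricks
      catalog: null
      schema: databricks_demo
      host: myworkspace.cloud.databricks.com
      http_path: /sql/1.0/endpoints/1234567890abcdef
      token: dapiXXXXXXXXXXXXXXXXXXXXXXXX
      threads: 1

If you ever need to change your connection details later, you can edit this file directly.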

Test connection

You are now ready to test the connection to Databricks. In the terminal, enter the following command:

dbt debug

If all goes well, you will see that the connection check passed. If you cannot connect to Databricks, double-check the PAT, host, and http_path values and update them in ~/.dbt/profiles.yml.

Run your first models

At this point, you can simply run the example models that dbt init created in the models/example directory. In your terminal, type:

dbt run
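
Each model in models/example is just a SQL select statement that dbt materializes as a table or view in your schema. As a trimmed-down illustration of the idea (not the exact contents of the starter project), a model file could be as simple as:

-- models/example/my_first_dbt_model.sql (illustrative only)
select 1 as id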

Once the dbt run completes, switch to Databricks, click Data in the left navigation bar, and find the relations you just created. In the databricks_demo schema, you will find two models:

  • databricks_demo.my_first_dbt_model
  • databricks_demo.my_second_dbt_model
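
You can also verify the results from a SQL endpoint by querying one of the new relations, assuming the databricks_demo schema used above:

SELECT * FROM databricks_demo.my_first_dbt_model;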