dbt-labs/dbt-databricks-demo

dbt + Databricks Demo!

This is a modified version of our public tutorial intended for users of dbt on Databricks.

Any questions? jeremy@fishtownanalytics.com

Sample data

Create Databricks tables jaffle_shop.orders, jaffle_shop.customers, and stripe.payments from these CSV files, which are located in a public S3 bucket (docs):

s3://dbt-tutorial-public/jaffle_shop_orders.csv
s3://dbt-tutorial-public/jaffle_shop_customers.csv
s3://dbt-tutorial-public/stripe_payments.csv
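
One possible way to create these tables is with Spark SQL. The sketch below only builds the statements as strings, so it runs anywhere; in a Databricks notebook you could pass each one to spark.sql(...) instead of printing it. It assumes the CSV files have a header row, that column types can be inferred, and that your cluster can read the public bucket with the s3:// scheme. The helper name build_statements is hypothetical and not part of this repo.

```python
# Source CSVs listed above, keyed by the table each one should populate.
SOURCES = {
    "jaffle_shop.orders": "s3://dbt-tutorial-public/jaffle_shop_orders.csv",
    "jaffle_shop.customers": "s3://dbt-tutorial-public/jaffle_shop_customers.csv",
    "stripe.payments": "s3://dbt-tutorial-public/stripe_payments.csv",
}


def build_statements(sources):
    """Return Spark SQL statements that create each schema and CSV-backed table."""
    statements = []
    # Create each schema once, in sorted order so the output is stable.
    for schema in sorted({table.split(".")[0] for table in sources}):
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {schema}")
    for table, path in sources.items():
        # Assumption: the files have a header row and types can be inferred.
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {table} USING csv "
            f"OPTIONS (path '{path}', header 'true', inferSchema 'true')"
        )
    return statements


if __name__ == "__main__":
    for statement in build_statements(SOURCES):
        # In a Databricks notebook, replace print with spark.sql(statement).
        print(statement + ";")
```

If type inference picks the wrong type for a column, declaring the columns explicitly in the CREATE TABLE statement is the usual alternative.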

Getting started

The instructions below assume you are running dbt on macOS. Linux and Windows users should adjust the bash commands accordingly.

  1. Clone this GitHub repo
  2. Install dbt-spark: pip install dbt-spark
  3. Copy the example profile to your ~/.dbt folder (created when installing dbt):
$ cp ./sample.profiles.yml ~/.dbt/profiles.yml
  4. Populate ~/.dbt/profiles.yml with your Databricks host, API token, cluster ID, and schema name:
$ open ~/.dbt
  5. Verify that you can connect to Databricks:
$ dbt debug
  6. Verify that you can run dbt:
$ dbt run
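
For reference, here is a rough sketch of what a populated profile might look like, rendered from Python so the four values from the steps above are easy to spot. The layout assumes dbt-spark's HTTP connection method (type: spark, method: http); the sample.profiles.yml in this repo is the authoritative template and may differ. The profile name, the helper render_profile, and all values shown are hypothetical placeholders, and the profile name must match the one in your dbt_project.yml.

```python
from string import Template

# Assumed layout for dbt-spark's HTTP connection method; the repository's
# sample.profiles.yml is the authoritative template.
PROFILE_TEMPLATE = Template("""\
$profile_name:
  target: dev
  outputs:
    dev:
      type: spark
      method: http
      host: $host
      port: 443
      token: $token
      cluster: $cluster
      schema: $schema
""")


def render_profile(profile_name, host, token, cluster, schema):
    """Fill in the host, API token, cluster ID, and schema name."""
    return PROFILE_TEMPLATE.substitute(
        profile_name=profile_name,
        host=host,
        token=token,
        cluster=cluster,
        schema=schema,
    )


if __name__ == "__main__":
    # Placeholder values only; substitute your own before saving the output.
    print(render_profile(
        profile_name="my_profile",
        host="example.cloud.databricks.com",
        token="dapiXXXXXXXX",
        cluster="0123-456789-abcde",
        schema="dbt_demo",
    ))
```

Keeping the token out of version control matters more than how the file is produced; editing ~/.dbt/profiles.yml by hand works equally well.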

Resources:

  • Learn more about dbt in the docs
  • Check out Discourse for commonly asked questions and answers
  • Join the chat on Slack for live discussions and support
  • Find dbt events near you
  • Check out the blog for the latest news on dbt's development and best practices
  • Watch our Office Hours on dbt + Spark
