Add an experimental dbt-sql template #1059
Merged
29 commits
4fee665
Add a dbt template
lennartkats-db b3a5ef8
Use a template for VS Code settings
lennartkats-db c81d139
Tweak message
lennartkats-db c900cff
Update
lennartkats-db 16f26a3
Add tests
lennartkats-db cd52c83
Merge remote-tracking branch 'databricks/main' into dbt-template
lennartkats-db d85d4c4
Fix test
lennartkats-db 419bd27
Merge remote-tracking branch 'databricks/main' into dbt-template
lennartkats-db 9030f56
Add template
lennartkats-db 45ea8db
Improve catalog handling
lennartkats-db 0268c88
Minor tweaks
lennartkats-db 94ebd9a
Update template to use materialized views & streaming tables
lennartkats-db 14bc1fa
Add conditional
lennartkats-db 1501298
Improve template
lennartkats-db 6fc5ed4
Offer an option to use personal schemas
lennartkats-db 220a1ea
Merge remote-tracking branch 'databricks/main' into dbt-template
lennartkats-db 99f920e
Fix ANSI mode
lennartkats-db af0dd6d
Merge remote-tracking branch 'databricks/main' into dbt-template
lennartkats-db 1099eed
Don't ask for a "production" schema, just assume "default"
lennartkats-db 33c5e91
Explain mode: development
lennartkats-db 7275310
Change project layout based on OSS team feedback
lennartkats-db de7bd78
Improve DX with default_catalog helper
lennartkats-db 8e7c6a1
Remove from list of templates for now
lennartkats-db 18c6b70
Update README.md
lennartkats-db a660efa
Merge remote-tracking branch 'databricks/main' into dbt-template
lennartkats-db 2f52ff1
Mark as experimental
lennartkats-db e041148
Restore sql-dbt template in hidden form
lennartkats-db 00bf2fe
Merge remote-tracking branch 'databricks/main' into dbt-template
lennartkats-db e5fb708
Copy-editing
lennartkats-db
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# dbt template | ||
|
||
This folder provides a template for using dbt-core with Databricks Asset Bundles. | ||
It follows the standard dbt project structure and has an additional `resources` | ||
directory to define Databricks resources such as jobs that run dbt models. | ||
|
||
* Learn more about dbt and its standard project structure here: https://docs.getdbt.com/docs/build/projects. | ||
* Learn more about Databricks Asset Bundles here: https://docs.databricks.com/en/dev-tools/bundles/index.html |
46 changes: 46 additions & 0 deletions
libs/template/templates/dbt-sql/databricks_template_schema.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
{ | ||
"welcome_message": "\nWelcome to the dbt template for Databricks Asset Bundles!", | ||
"properties": { | ||
"project_name": { | ||
"type": "string", | ||
"pattern": "^[A-Za-z_][A-Za-z0-9_]+$", | ||
"pattern_match_failure_message": "Name must consist of letters, numbers, and underscores.", | ||
"default": "my_dbt_project", | ||
"description": "\nPlease provide a unique name for this project.\nproject_name", | ||
"order": 1 | ||
}, | ||
"workspace_host_override": { | ||
"comment": "We explicitly ask users for the workspace_host since we ask for a http_path below. A downside of doing this is that {{user_name}} may not be correct if they pick a different workspace than the one from the current profile.", | ||
"type": "string", | ||
"pattern": "^https:\\/\\/[^/]+$", | ||
"pattern_match_failure_message": "URL must be of the form https://my.databricks.host", | ||
"description": "\nPlease provide the workspace URL to use.\nworkspace_url", | ||
"default": "{{workspace_host}}", | ||
"order": 2 | ||
}, | ||
"http_path": { | ||
"type": "string", | ||
"pattern": "^/sql/.\\../warehouses/[a-z0-9]+$", | ||
"pattern_match_failure_message": "Path must be of the form /sql/1.0/warehouses/abcdef1234567890", | ||
"description": "\nPlease provide the HTTP Path of the SQL warehouse you would like to use with dbt during development\nYou can find this path by clicking on \"Connection Details\" for your SQL warehouse.\nhttp_path [example: /sql/1.0/warehouses/abcdef1234567890]", | ||
"order": 3 | ||
}, | ||
|
||
"catalog": { | ||
"type": "string", | ||
"default": "", | ||
"pattern": "^\\w*$", | ||
"pattern_match_failure_message": "Invalid catalog name.", | ||
"description": "\nPlease provide an initial catalog (leave blank if you do not want to use an initial catalog).\ncatalog", | ||
"order": 4 | ||
}, | ||
"schema": { | ||
"type": "string", | ||
"default": "default", | ||
"pattern": "^\\w+$", | ||
"pattern_match_failure_message": "Invalid schema name.", | ||
"description": "\nPlease provide a default schema for this project.\nNote that you can pick a different schema for local development when you first use the 'dbt init' command.\nschema", | ||
"order": 5 | ||
} | ||
}, | ||
"success_message": "\n📊 Your new project has been created in the '{{.project_name}}' directory!\nPlease refer to the README.md file for \"getting started\" instructions." | ||
} |
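The prompt patterns in this schema can be tried outside the CLI. The following is a minimal Python sketch, assuming that `re.fullmatch` with the `re.ASCII` flag approximates how the template engine's Go regular expressions treat these particular patterns. The `PATTERNS` table and the `validate` helper are hypothetical names used for illustration; they are not part of the Databricks CLI.

```python
import re

# Patterns copied from databricks_template_schema.json, with JSON escaping removed.
PATTERNS = {
    "project_name": r"^[A-Za-z_][A-Za-z0-9_]+$",
    "workspace_host_override": r"^https:\/\/[^/]+$",
    "http_path": r"^/sql/.\../warehouses/[a-z0-9]+$",
    "catalog": r"^\w*$",
    "schema": r"^\w+$",
}


def validate(field: str, value: str) -> bool:
    """Return True if `value` satisfies the prompt pattern for `field`.

    re.ASCII restricts \\w to [A-Za-z0-9_], and fullmatch rejects a
    trailing newline; both are assumptions made to mirror Go's behavior.
    """
    return re.fullmatch(PATTERNS[field], value, flags=re.ASCII) is not None


if __name__ == "__main__":
    print(validate("project_name", "my_dbt_project"))
    print(validate("http_path", "/sql/1.0/warehouses/abcdef1234567890"))
    print(validate("catalog", ""))   # blank catalog is allowed by ^\w*$
    print(validate("schema", ""))    # blank schema is rejected by ^\w+$
```

Under these assumptions, a one-character project name is rejected, because `[A-Za-z_][A-Za-z0-9_]+` needs at least two characters, and a workspace URL with a trailing slash is rejected by `[^/]+$`.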
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
{{define "latest_lts_dbr_version" -}} | ||
13.3.x-scala2.12 | ||
{{- end}} | ||
|
||
{{define "latest_lts_db_connect_version_spec" -}} | ||
>=13.3,<13.4 | ||
{{- end}} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Preamble | ||
|
||
This file only contains template directives; it is skipped for the actual output. | ||
|
||
{{skip "__preamble"}} | ||
|
||
{{if eq .project_name "dbt"}} | ||
{{fail "Project name 'dbt' is not supported"}} | ||
{{end}} |
3 changes: 3 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/.vscode/__builtins__.pyi
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Typings for Pylance in Visual Studio Code | ||
# see https://github.com/microsoft/pyright/blob/main/docs/builtins.md | ||
from databricks.sdk.runtime import * |
8 changes: 8 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/.vscode/extensions.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"recommendations": [ | ||
"databricks.databricks", | ||
"ms-python.vscode-pylance", | ||
"redhat.vscode-yaml", | ||
"databricks.sqltools-databricks-driver", | ||
] | ||
} |
30 changes: 30 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/.vscode/settings.json.tmpl
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
{ | ||
"python.analysis.stubPath": ".vscode", | ||
"databricks.python.envFile": "${workspaceFolder}/.env", | ||
"jupyter.interactiveWindow.cellMarker.codeRegex": "^# COMMAND ----------|^# Databricks notebook source|^(#\\s*%%|#\\s*\\<codecell\\>|#\\s*In\\[\\d*?\\]|#\\s*In\\[ \\])", | ||
"jupyter.interactiveWindow.cellMarker.default": "# COMMAND ----------", | ||
"python.testing.pytestArgs": [ | ||
"." | ||
], | ||
"python.testing.unittestEnabled": false, | ||
"python.testing.pytestEnabled": true, | ||
"python.analysis.extraPaths": ["src"], | ||
"files.exclude": { | ||
"**/*.egg-info": true, | ||
"**/__pycache__": true, | ||
".pytest_cache": true, | ||
}, | ||
"python.envFile": "${workspaceFolder}/.databricks/.databricks.env", | ||
"python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python", | ||
"sqltools.connections": [ | ||
{ | ||
"connectionMethod": "VS Code Extension (beta)", | ||
"catalog": "hive_metastore", | ||
"previewLimit": 50, | ||
"driver": "Databricks", | ||
"name": "databricks", | ||
"path": "/sql/1.0/warehouses/ec7fa4bd0f0afc8f" | ||
} | ||
], | ||
"sqltools.autoConnectTo": "", | ||
} |
119 changes: 119 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
# {{.project_name}} | ||
|
||
The '{{.project_name}}' project was generated by using the dbt template for | ||
Databricks Asset Bundles. It follows the standard dbt project structure | ||
and has an additional `resources` directory to define Databricks resources such as jobs | ||
that run dbt models. | ||
|
||
* Learn more about dbt and its standard project structure here: https://docs.getdbt.com/docs/build/projects. | ||
* Learn more about Databricks Asset Bundles here: https://docs.databricks.com/en/dev-tools/bundles/index.html | ||
|
||
## Development setup | ||
|
||
1. Install the Databricks CLI from https://docs.databricks.com/dev-tools/cli/databricks-cli.html | ||
|
||
2. Authenticate to your Databricks workspace: | ||
``` | ||
$ databricks configure | ||
``` | ||
|
||
3. Install dbt | ||
|
||
To install dbt, you need a recent version of Python. For the instructions below, | ||
we assume `python3` refers to the Python version you want to use. On some systems, | ||
you may need to refer to a different Python version, e.g. `python` or `/usr/bin/python`. | ||
|
||
Run these instructions from the `{{.project_name}}` directory. We recommend making | ||
use of a Python virtual environment and installing dbt as follows: | ||
|
||
``` | ||
$ python3 -m venv .venv | ||
$ . .venv/bin/activate | ||
$ pip install -r requirements-dev.txt | ||
``` | ||
|
||
4. Initialize your dbt profile | ||
|
||
Use `dbt init` to initialize your profile. | ||
|
||
``` | ||
$ dbt init | ||
``` | ||
|
||
Note that dbt authentication uses personal access tokens by default | ||
(see https://docs.databricks.com/dev-tools/auth/pat.html). | ||
You can use OAuth as an alternative, but this currently requires manual configuration. | ||
See https://github.com/databricks/dbt-databricks/blob/main/docs/oauth.md | ||
for general instructions, or https://community.databricks.com/t5/technical-blog/using-dbt-core-with-oauth-on-azure-databricks/ba-p/46605 | ||
for advice on setting up OAuth for Azure Databricks. | ||
|
||
To set up additional profiles, such as a 'prod' profile, | ||
see https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles. | ||
|
||
5. Activate dbt so it can be used from the terminal | ||
|
||
``` | ||
$ . .venv/bin/activate | ||
``` | ||
|
||
## Local development with dbt | ||
|
||
Use `dbt` to [run this project locally using a SQL warehouse](https://docs.databricks.com/partners/prep/dbt.html): | ||
|
||
``` | ||
$ dbt seed | ||
$ dbt run | ||
``` | ||
|
||
(Did you get an error that the dbt command could not be found? You may need | ||
to try the last step from the development setup above to re-activate | ||
your Python virtual environment!) | ||
|
||
Use `dbt test` to run tests generated from yml files such as `models/schema.yml` | ||
and any SQL tests from `tests/`. | ||
|
||
``` | ||
$ dbt test | ||
``` | ||
|
||
## Deploying to Databricks with Databricks Asset Bundles | ||
|
||
Databricks Asset Bundles can be used to deploy to Databricks and to execute | ||
dbt commands as a job using Databricks Workflows. See | ||
https://docs.databricks.com/dev-tools/bundles/index.html to learn more. | ||
|
||
Use the Databricks CLI to deploy a development copy of this project to a workspace: | ||
|
||
``` | ||
$ databricks bundle deploy --target dev | ||
``` | ||
|
||
(Note that "dev" is the default target, so the `--target` parameter | ||
is optional here.) | ||
|
||
This deploys everything that's defined for this project. | ||
For example, the default template would deploy a job called | ||
`[dev yourname] {{.project_name}}_job` to your workspace. | ||
You can find that job by opening your workspace and clicking on **Workflows**. | ||
|
||
To run the deployed job, use the "run" command: | ||
``` | ||
$ databricks bundle run --target dev | ||
``` | ||
|
||
To deploy a production copy, type: | ||
|
||
``` | ||
$ databricks bundle deploy --target prod | ||
``` | ||
|
||
## IDE support | ||
|
||
Optionally, install developer tools such as the Databricks extension for Visual Studio Code from | ||
https://docs.databricks.com/dev-tools/vscode-ext.html. Third-party extensions | ||
related to dbt may further enhance your dbt development experience! | ||
|
||
## CI/CD | ||
|
||
See https://docs.databricks.com/dev-tools/bundles/ci-cd.html for documentation | ||
on CI/CD setup. |
Empty file.
43 changes: 43 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/databricks.yml.tmpl
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# This is a Databricks asset bundle definition for {{.project_name}}. | ||
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation. | ||
bundle: | ||
name: {{.project_name}} | ||
|
||
include: | ||
- resources/*.yml | ||
|
||
# Variable declarations. These variables are assigned in the dev/prod targets below. | ||
variables: | ||
warehouse_id: | ||
description: The warehouse to use | ||
catalog: | ||
description: The catalog to use | ||
schema: | ||
description: The schema to use | ||
|
||
# Deployment targets. | ||
targets: | ||
dev: | ||
default: true | ||
mode: development | ||
workspace: | ||
host: {{.workspace_host_override}} | ||
variables: | ||
warehouse_id: {{index ((regexp "[^/]+$").FindStringSubmatch .http_path) 0}} | ||
catalog: {{.catalog}} | ||
schema: {{.schema}} # tip: use ${workspace.current_user.short_name} if you want your own schema | ||
|
||
prod: | ||
mode: production | ||
workspace: | ||
host: {{.workspace_host_override}} | ||
variables: | ||
warehouse_id: {{index ((regexp "[^/]+$").FindStringSubmatch .http_path) 0}} | ||
catalog: {{.catalog}} | ||
schema: {{.schema}} | ||
{{- if not is_service_principal}} | ||
run_as: | ||
# This runs as {{user_name}} in production. We could also use a service principal here | ||
# using service_principal_name (see the Databricks documentation). | ||
user_name: {{user_name}} | ||
{{- end}} |
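The `warehouse_id` lines in this template take the last path segment of the `http_path` answer by matching `[^/]+$`. A rough Python equivalent of that extraction is sketched below; the function name `warehouse_id_from_http_path` is hypothetical, and the error handling is an addition for illustration rather than something the template does.

```python
import re


def warehouse_id_from_http_path(http_path: str) -> str:
    """Return the trailing run of non-slash characters in `http_path`."""
    match = re.search(r"[^/]+$", http_path)
    if match is None:
        # Happens for an empty string or a path that ends in "/".
        raise ValueError(f"no warehouse ID found in {http_path!r}")
    return match.group(0)


if __name__ == "__main__":
    print(warehouse_id_from_http_path("/sql/1.0/warehouses/abcdef1234567890"))
```

Because the schema already validates `http_path` against `^/sql/.\../warehouses/[a-z0-9]+$`, a value that reaches this step should always end in a non-empty warehouse ID.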
36 changes: 36 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/dbt_project.yml.tmpl
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
|
||
# Name your project! Project names should contain only lowercase characters | ||
# and underscores. A good package name should reflect your organization's | ||
# name or the intended use of these models | ||
name: '{{.project_name}}' | ||
version: '1.0.0' | ||
config-version: 2 | ||
|
||
# This setting configures which "profile" dbt uses for this project. | ||
profile: '{{.project_name}}' | ||
|
||
# These configurations specify where dbt should look for different types of files. | ||
# The `model-paths` config, for example, states that models in this project can be | ||
# found in the "models/" directory. You probably won't need to change these! | ||
model-paths: ["models"] | ||
analysis-paths: ["analyses"] | ||
test-paths: ["tests"] | ||
seed-paths: ["seeds"] | ||
macro-paths: ["macros"] | ||
snapshot-paths: ["snapshots"] | ||
|
||
clean-targets: # directories to be removed by `dbt clean` | ||
- "target" | ||
- "dbt_packages" | ||
|
||
# Configuring models | ||
# Full documentation: https://docs.getdbt.com/docs/configuring-models | ||
|
||
# In this example config, we tell dbt to build all models in the example/ | ||
# directory as views. These settings can be overridden in the individual model | ||
# files using the `{{"{{"}} config(...) {{"}}"}}` macro. | ||
models: | ||
{{.project_name}}: | ||
# Config indicated by + and applies to all files under models/example/ | ||
example: | ||
+materialized: view |
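The comments in this file say that project-level settings can be overridden in individual model files. As a simplified illustration of that precedence (not dbt's actual configuration resolution, which handles more cases), the idea can be modeled in Python as a dictionary merge in which the model's own settings win. The `effective_config` helper is a hypothetical name.

```python
def effective_config(project_defaults: dict, model_config: dict) -> dict:
    """Merge settings so that model-level keys override project-level keys."""
    return {**project_defaults, **model_config}


# From `+materialized: view` under models/example/ in dbt_project.yml.
project_defaults = {"materialized": "view"}

# my_first_dbt_model.sql calls config(materialized='table') in the file itself.
first_model = {"materialized": "table"}

# my_second_dbt_model.sql has no in-file config, so it inherits the default.
second_model = {}

if __name__ == "__main__":
    print(effective_config(project_defaults, first_model))
    print(effective_config(project_defaults, second_model))
```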
Empty file.
27 changes: 27 additions & 0 deletions
...mplate/templates/dbt-sql/template/{{.project_name}}/models/example/my_first_dbt_model.sql
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
|
||
/* | ||
Welcome to your first dbt model! | ||
Did you know that you can also configure models directly within SQL files? | ||
This will override configurations stated in dbt_project.yml | ||
Try changing "table" to "view" below | ||
*/ | ||
|
||
{{ config(materialized='table') }} | ||
|
||
with source_data as ( | ||
|
||
select 1 as id | ||
union all | ||
select null as id | ||
|
||
) | ||
|
||
select * | ||
from source_data | ||
|
||
/* | ||
Uncomment the line below to remove records with null `id` values | ||
*/ | ||
|
||
-- where id is not null |
6 changes: 6 additions & 0 deletions
...plate/templates/dbt-sql/template/{{.project_name}}/models/example/my_second_dbt_model.sql
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
|
||
-- Use the `ref` function to select from other models | ||
|
||
select * | ||
from {{ ref('my_first_dbt_model') }} | ||
where id = 1 |
21 changes: 21 additions & 0 deletions
libs/template/templates/dbt-sql/template/{{.project_name}}/models/example/schema.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
|
||
version: 2 | ||
|
||
models: | ||
- name: my_first_dbt_model | ||
description: "A starter dbt model" | ||
columns: | ||
- name: id | ||
description: "The primary key for this table" | ||
tests: | ||
- unique | ||
- not_null | ||
|
||
- name: my_second_dbt_model | ||
description: "A starter dbt model" | ||
columns: | ||
- name: id | ||
description: "The primary key for this table" | ||
tests: | ||
- unique | ||
- not_null |
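To see what the `unique` and `not_null` tests would report for the two example models, the checks can be imitated in plain Python on the rows the models produce. This is a sketch only: it assumes that a `unique` check ignores null values and that `not_null` flags each null row, and the helper names are hypothetical rather than dbt APIs.

```python
from collections import Counter

# Rows of my_first_dbt_model as written: select 1 as id union all select null as id.
first_model_ids = [1, None]

# my_second_dbt_model selects from the first model where id = 1.
second_model_ids = [i for i in first_model_ids if i == 1]


def not_null_failures(ids):
    """Rows a not_null-style check would flag: those whose id is null."""
    return [i for i in ids if i is None]


def unique_failures(ids):
    """Values a unique-style check would flag: non-null ids that repeat."""
    counts = Counter(i for i in ids if i is not None)
    return [value for value, n in counts.items() if n > 1]


if __name__ == "__main__":
    for name, ids in [("my_first_dbt_model", first_model_ids),
                      ("my_second_dbt_model", second_model_ids)]:
        print(name, "not_null failures:", len(not_null_failures(ids)),
              "unique failures:", len(unique_failures(ids)))
```

Under these assumptions, the first model has one `not_null` failure until the `where id is not null` line in `my_first_dbt_model.sql` is uncommented, while the second model passes both checks.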
How does this pattern work without `+` or `*`?

I'm not sure I understand this question. There is a `+` in there?

Or maybe you're referring to the `\.\\..` part? That matches a version, like `1.0`.
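The version segment discussed in this thread can be checked in isolation. In the schema, the `http_path` pattern is `^/sql/.\\../warehouses/[a-z0-9]+$`, so after JSON unescaping the version segment is `.\..`: any character, a literal dot, any character. The Python sketch below compares it with a stricter alternative; `STRICT_VERSION_PART` is a hypothetical suggestion for comparison and is not what the PR uses.

```python
import re

# Version segment of the http_path pattern after JSON unescaping:
# any single character, a literal dot, any single character.
VERSION_PART = re.compile(r".\..")

# Hypothetical stricter alternative: digits, a literal dot, digits.
STRICT_VERSION_PART = re.compile(r"[0-9]+\.[0-9]+")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["1.0", "x.y", "10.0", "1x0"]:
        print(candidate,
              matches(VERSION_PART, candidate),
              matches(STRICT_VERSION_PART, candidate))
```

The segment needs no `+` or `*` because it matches exactly three characters, which covers `1.0`. It would also accept non-numeric text such as `x.y` and would reject a multi-digit version such as `10.0`.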