
dbt-databricks-factory


Creates dbt-based Databricks workflows.

Installation

Use the package manager pip to install dbt-databricks-factory for dp (data-pipelines-cli):

pip install dbt-databricks-factory

Usage

To generate a Databricks job definition (JSON) for a dbt workflow, run:

python -m dbt_databricks_factory.cli create-job \
    --job-name '<job name>' \
    --project-dir '<dbt project directory>' \
    --profiles-dir '<path to profiles directory>' \
    --git-provider '<git provider>' \
    --git-url 'https://url.to/repo.git' \
    --git-branch 'main' \
    --job-cluster my-cluster-name @path/to/cluster_config.json \
    --default-task-cluster my-cluster-name \
    --library 'dbt-databricks>=1.0.0,<2.0.0' \
    --library 'dbt-bigquery==1.3.0' \
    --pretty \
    path/to/dbt/manifest.json > workflow.json
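The generated file follows the Databricks Jobs API 2.1 `jobs/create` payload shape: a job name plus a list of tasks (one per dbt node) wired together with `depends_on`. A minimal sketch of inspecting such a file in Python — the field values below are illustrative assumptions, not the tool's verbatim output:

```python
import json

# Hypothetical sample of a Jobs API 2.1 create payload; the real
# workflow.json produced by create-job will contain your job name,
# clusters, and one task per dbt model.
sample_workflow = {
    "name": "my-dbt-job",
    "job_clusters": [{"job_cluster_key": "my-cluster-name"}],
    "tasks": [
        {"task_key": "model-a", "job_cluster_key": "my-cluster-name"},
        {"task_key": "model-b", "depends_on": [{"task_key": "model-a"}]},
    ],
}

def summarize_job(job: dict) -> str:
    """Return a one-line summary: job name plus task count."""
    return f"{job['name']}: {len(job.get('tasks', []))} task(s)"

# After running create-job, load the real file instead:
# with open("workflow.json") as f:
#     sample_workflow = json.load(f)
print(summarize_job(sample_workflow))  # → my-dbt-job: 2 task(s)
```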

This command writes a JSON file containing the Databricks workflow (job) definition. You can then use it to create a new job in Databricks, for example with POST requests to the Jobs API:

curl --fail-with-body -X POST "${DATABRICKS_HOST}api/2.1/jobs/create" \
-H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
-H "Content-Type: application/json" \
-d "@workflow.json" >job_id.json

echo "Job ID:"
cat job_id.json
curl --fail-with-body -X POST "${DATABRICKS_HOST}api/2.1/jobs/run-now" \
-H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
-H "Content-Type: application/json" \
-d @job_id.json >run_id.json

echo "Run ID:"
cat run_id.json
curl --fail-with-body -X GET -G "${DATABRICKS_HOST}api/2.1/jobs/runs/get" \
-H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
-d "run_id=$(jq -r '.run_id' < run_id.json)" >run_status.json

jq < run_status.json
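The same create → run-now → runs/get sequence can be scripted in Python with only the standard library. This is a sketch under the same assumptions as the curl example (DATABRICKS_HOST ends with a slash, a bearer token is available); the `opener` parameter is an illustrative injection point, not part of any Databricks SDK:

```python
import json
import urllib.parse
import urllib.request

def _request(url, token, body=None, opener=urllib.request.urlopen):
    # Shared helper: bearer-token auth, JSON in (POST when body is set,
    # GET otherwise), JSON out.
    req = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with opener(req) as resp:
        return json.loads(resp.read())

def create_and_trigger(host, token, workflow, opener=urllib.request.urlopen):
    """Mirror the curl steps: jobs/create -> jobs/run-now -> jobs/runs/get."""
    job = _request(f"{host}api/2.1/jobs/create", token,
                   json.dumps(workflow).encode(), opener)
    run = _request(f"{host}api/2.1/jobs/run-now", token,
                   json.dumps({"job_id": job["job_id"]}).encode(), opener)
    query = urllib.parse.urlencode({"run_id": run["run_id"]})
    return _request(f"{host}api/2.1/jobs/runs/get?{query}", token, None, opener)

# Example (requires a live workspace):
# with open("workflow.json") as f:
#     status = create_and_trigger(host, token, json.load(f))
# print(status["state"])
```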

To get more information about the command, run:

python -m dbt_databricks_factory.cli create-job --help