dbtmock
is an utility for end to end testing DBT pipelines.
dbtmock
lets you:
- pick a model to test
- mock model dependencies
- mock a final result
- assert output of a model against a final result dataset
dbtmock
does not mess with your dbt project, it simply generates SQL queries based on your project manifests.json
. These SQL queries can be used as regular dbt tests.
- Select a model you want to test
- Picked any arbitrarly dependencies to Mock. Any intermediate models or sources are Ok. The mocked data is provided via CSV files.
- Provide expected output result using a CSV file
- Use
dbtmock
cli to generate test queries - Generated queries are run via
dbt test
command
- Check Github releases, Download your needed binary
go install github.com/dav009/dbtmock/cmd/dbtmock@latest
dbtmock --manifest target/manifest.json --tests unit_tests --output tests
--manifest
is the path to your dbt's projectmanifest.json
--tests
is a path to a folder contaningjson
files. Each json file in that folder is a test definition (see examples below)--output
is the path folder where the generated tests will be stored. By default it points totests
folder wheredbt
looks for custom tests to run
- Go to your dbt project:
cd my_project
- Create a folder to store your tests definitions:
mkdir unit_tests
- Create a json file per test definition. For example the Json file below is a single test in which there are two mocks
seed.jaffle_shop.raw_customers
andseed.jaffle_shop.raw_orders
. The test will run the model"model.jaffle_shop.customers"
. The content ofoutput
has the data which will be used for assertions.
{
"name": "dummy_test",
"model": "model.jaffle_shop.customers",
"mocks": {
"seed.jaffle_shop.raw_customers": {
"filepath": "seeds/raw_customers.csv",
"types": {
"id": "INT64"
}
},
"seed.jaffle_shop.raw_orders": {
"filepath": "seeds/raw_orders.csv",
"types": {
"id": "INT64",
"user_id": "INT64",
"order_date":"DATE"
}
},
},
"output": {
"filepath": "output.csv",
"types": {
"customer_id": "INT64",
"most_recent_order": "DATE",
"number_of_orders": "INT64",
"customer_lifetime_value": "FLOAT64",
"first_order": "DATE"
}
}
}
- Go to your dbt project, make sure you generated a
manifest.json
(e.g: rundbt compile
) - run
dbtmock --tests unit_tests --output tests
- Check the files in the output folder :
ls tests
- Run
dbt test
- You should see your tests in the list of test being ran by dbt
Take a look at dbtmock_example
- This a fork of the vanilla
jaffle_store
project, which is thehello world
of dbt - This fork has a few changes:
unit_tests
folder with a sampletest.json
- some mock data as csv:
mocked_data/output.csv
as a dataset to assert the final output of the given test - it uses already given
csv
files as mocked data
You can use this fork to run dbtmock
- Clone the project
git clone ..
- Create a
profile.yml
with your BigQuery settings - Run
dbt seed
- Run
dbt compile
- Run
dbtmock
(no arguments needed) - See test generated in :
tests/dummy_test.sql
- Run
dbt test
see in the output the result:START test dummy_test.. PASS dummy_test
go build cmd/dbtmock/dbtmock.go
- This project uses
manifest.json
to figure out what are the dependencies of the model that you are testing. - Given a model to test
dbtmock
recursively rebuilds the sql queries accross dependencies - While rebuilding a model SQL
dbtmock
replaces table for given mocked data dbtmock
creates a giant SQL query with mocks and model SQL logic. The output of the query is then compared to the provided expected data via aMINUS
sql operation.- Only tested with BigQuery