MaterializeInc · morsapaes · Jun 18, 2022 · May 3, 2022 · May 4, 2022 · May 4, 2022
diff --git a/.github/tests/dbt-get-started.sh b/.github/tests/dbt-get-started.sh
@@ -13,5 +13,5 @@ docker-compose exec -T dbt dbt run
 sleep 5
 
 # Check that there's data making its way to the avg_bid materialized view
-record_count=$(docker-compose run -T mzcli -Atc 'SELECT COUNT(*) FROM avg_bid')
+record_count=$(docker-compose run -T cli -Atc 'SELECT COUNT(*) FROM avg_bid')
 [[ "$record_count" -gt 0 ]]
diff --git a/dbt-get-started/Dockerfile b/dbt-get-started/Dockerfile
@@ -0,0 +1,8 @@
+FROM python:3.9.9-bullseye
+
+WORKDIR /usr/app/dbt
+
+RUN set -ex; \
+    pip install --no-cache-dir dbt-materialize==1.1.2
+
+ENTRYPOINT ["/bin/bash"]
diff --git a/dbt-get-started/README.md b/dbt-get-started/README.md
@@ -2,7 +2,7 @@
 
 [dbt](https://docs.getdbt.com/docs/introduction) has become the standard for data transformation (“the T in ELT”). It combines the accessibility of SQL with software engineering best practices, allowing you to not only build reliable data pipelines, but also document, test and version-control them.
 
-While dbt is a great fit for **batch** transformations, it can only **approximate** transforming streaming data. This demo recreates the Materialize [getting started guide](https://materialize.com/docs/get-started/) using dbt as the transformation layer.
+This demo recreates the Materialize [getting started guide](https://materialize.com/docs/get-started/) using dbt as the transformation layer.
 
 ## Docker
 
@@ -38,15 +38,15 @@ dbt --version
 
 We've created a few core models that take care of defining the building blocks of a dbt+Materialize project, including a streaming [source](https://materialize.com/docs/overview/api-components/#sources):
 
-- `market_orders_raw.sql`
+- `sources/market_orders_raw.sql`
 
 , as well as a staging [view](https://materialize.com/docs/overview/api-components/#non-materialized-views) to transform the source data:
 
-- `market_orders.sql`
+- `staging/stg_market_orders.sql`
 
-and a [materialized view](https://materialize.com/docs/overview/api-components/#materialized-views) that continuously updates as the underlying data changes:
+, and a [materialized view](https://materialize.com/docs/overview/api-components/#materialized-views) that continuously updates as the underlying data changes:
 
-- `avg_bid.sql`
+- `marts/avg_bid.sql`
 
 To run the models:
 
@@ -56,12 +56,50 @@ dbt run
 
 > :crab: As an exercise, you can add models for the queries demonstrating [joins](https://materialize.com/docs/get-started/#joins) and [temporal filters](https://materialize.com/docs/get-started/#temporal-filters).
 
+### Test the project
+
+To help demonstrate how `dbt test` works with Materialize for **continuous testing**, we've added some [generic tests](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-tests) to the [`avg_bid` model](dbt/models/marts/avg_bid.sql):
+
+```yaml
+models:
+  - name: avg_bid
+    description: 'Computes the average bid price'
+    columns:
+      - name: symbol
+        description: 'The stock ticker'
+        tests:
+          - not_null
+          - unique
+```
+
+, and configured testing in the [project file](dbt/dbt_project.yml):
+
+```yaml
+tests:
+  mz_get_started:
+    marts:
+      +store_failures: true
+      +schema: 'etl_failure'
+```
+
+Note that tests are configured to [`store_failures`](https://docs.getdbt.com/reference/resource-configs/store_failures), which instructs dbt to create a materialized view for each test using the respective `SELECT` statements.
+
+To run the tests:
+
+```bash
+dbt test
+```
+
+This creates two materialized views in a dedicated schema (`public_etl_failure`): `not_null_avg_bid_symbol` and `unique_avg_bid_symbol`. dbt names each view after the type of test (`not_null`, `unique`), the model (`avg_bid`) and the column being tested (`symbol`).
+
+These views are continuously updated as new data streams in, and allow you to monitor failing rows **as soon as** an assertion fails. You can use this feature for unit testing during the development of your dbt models, and later in production to trigger real-time alerts downstream.
+
 ## Materialize
 
-To connect to the running Materialize service, you can use `mzcli`, which is included in the setup:
+To connect to the running Materialize service, you can use a PostgreSQL-compatible client like `psql`, which is bundled in the `materialize/cli` image:
 
 ```bash
-docker-compose run mzcli
+docker-compose run cli
 ```
 
 and run a few commands to check the objects created through dbt:
@@ -99,6 +137,30 @@ SHOW MATERIALIZED VIEWS;
 
 You'll notice that you're only able to `SELECT` from `avg_bid` — this is because it is the only materialized view! This view is incrementally updated as new data streams in, so you get fresh and correct results with low latency. Behind the scenes, Materialize is indexing the results of the embedded query in memory.
 
+### Continuous testing
+
+To validate that the schema storing the tests was created:
+
+```sql
+SHOW SCHEMAS;
+
+        name
+--------------------
+ public
+ public_etl_failure
+```
+
+, and that the materialized views that continuously test the `avg_bid` view for failures are up and running:
+
+```sql
+SHOW VIEWS FROM public_etl_failure;
+
+          name
+-------------------------
+ not_null_avg_bid_symbol
+ unique_avg_bid_symbol
+```
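+
+To check whether any rows currently violate an assertion, you can query the test views directly. The following is a minimal sketch, assuming the schema and view names listed above; it uses `COUNT(*)` so that it doesn't depend on the columns dbt generates for each test:
+
+```sql
+-- Each query should return 0 for as long as the corresponding assertion holds
+SELECT COUNT(*) FROM public_etl_failure.not_null_avg_bid_symbol;
+SELECT COUNT(*) FROM public_etl_failure.unique_avg_bid_symbol;
+```
+
+A non-zero count means the view currently holds rows that fail the test.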
+
 ## Local installation
 
 To set up dbt and Materialize in your local environment instead of using Docker, follow the instructions in the [documentation](https://materialize.com/docs/guides/dbt/).

diff --git a/dbt-get-started/compose.yaml b/dbt-get-started/compose.yaml
@@ -6,11 +6,11 @@ services:
     ports:
       - 6875:6875
     healthcheck: {test: curl -f localhost:6875, interval: 1s, start_period: 30s}
-  mzcli:
+  cli:
     image: materialize/cli:v0.26.0
-    container_name: mzcli
+    container_name: cli
   dbt:
-    image: materialize/dbt-materialize:v0.26.0
+    build: ./
     container_name: dbt
     init: true
     entrypoint: /bin/bash

diff --git a/dbt-get-started/dbt/dbt_project.yml b/dbt-get-started/dbt/dbt_project.yml
@@ -14,3 +14,9 @@ target-path: 'target' # directory which will store compiled SQL files
 clean-targets: # directories to be removed by `dbt clean`
   - 'target'
   - 'dbt_modules'
+
+tests:
+  mz_get_started:
+    marts:
+      +store_failures: true
+      +schema: 'etl_failure'
diff --git a/dbt-get-started/dbt/models/avg_bid.sql → dbt-get-started/dbt/models/marts/avg_bid.sql b/dbt-get-started/dbt/models/avg_bid.sql → dbt-get-started/dbt/models/marts/avg_bid.sql
@@ -1,6 +1,6 @@
 {{ config(materialized='materializedview') }}
 
 SELECT symbol,
-       AVG(bid_price) AS avg
-FROM {{ ref('market_orders') }}
+       AVG(bid_price) AS avg_bid
+FROM {{ ref('stg_market_orders') }}
 GROUP BY symbol
diff --git a/dbt-get-started/dbt/models/marts/models.yml b/dbt-get-started/dbt/models/marts/models.yml
@@ -0,0 +1,11 @@
+version: 2
+
+models:
+  - name: avg_bid
+    description: 'Computes the average bid price'
+    columns:
+      - name: symbol
+        description: 'The stock ticker'
+        tests:
+          - not_null
+          - unique
diff --git a/dbt-get-started/dbt/models/schema.yml b/dbt-get-started/dbt/models/schema.yml
diff --git a/...-started/dbt/models/market_orders_raw.sql → .../dbt/models/sources/market_orders_raw.sql b/...-started/dbt/models/market_orders_raw.sql → .../dbt/models/sources/market_orders_raw.sql
diff --git a/dbt-get-started/dbt/models/sources/sources.yml b/dbt-get-started/dbt/models/sources/sources.yml
@@ -0,0 +1,7 @@
+version: 2
+
+sources:
+  - name: market_orders
+    schema: public
+    tables:
+      - name: market_orders_raw
diff --git a/dbt-get-started/dbt/models/staging/models.yml b/dbt-get-started/dbt/models/staging/models.yml
@@ -0,0 +1,5 @@
+version: 2
+
+models:
+  - name: stg_market_orders
+    description: 'Converts market order data to proper data types'
diff --git a/dbt-get-started/dbt/models/market_orders.sql → .../dbt/models/staging/stg_market_orders.sql b/dbt-get-started/dbt/models/market_orders.sql → .../dbt/models/staging/stg_market_orders.sql
@@ -6,4 +6,4 @@ SELECT
     (text::jsonb)->>'symbol' AS symbol,
     (text::jsonb)->>'trade_type' AS trade_type,
     to_timestamp(((text::jsonb)->'timestamp')::bigint) AS ts
-FROM {{ ref('market_orders_raw') }}
+FROM {{ source('market_orders', 'market_orders_raw') }}