
Materialize dbt Project Example


This project demonstrates how to properly configure and organize a dbt project for Materialize, showcasing best practices for building real-time data objects.

Overview

This example teaches you how to:

  • Configure dbt models as views, indexed views, and materialized views in Materialize
  • Set up sources using Materialize's load generators
  • Create sinks to export data to Kafka
  • Organize your project with proper layer separation (staging, intermediate, marts)
  • Use dbt selectors for controlled deployment
  • Implement comprehensive testing strategies
  • Deploy with CI/CD and blue/green deployments using GitHub Actions

The project uses an auction house scenario to illustrate these concepts, implementing flipper detection, health monitoring, and suspicious activity alerts as practical examples of real-time data transformations.

CI/CD & Deployment

This project includes complete CI/CD workflows for automated testing and zero-downtime deployments:

Workflows

  • dbt CI - Validates SQL compilation on every pull request
  • Deploy to QA - Auto-deploys to QA environment when code merges to main
  • Blue/Green Production - Zero-downtime production deployments using native dbt-materialize macros
  • Scheduled Cleanup - Automatically removes old deployment environments

Quick Start

Deploy to QA:

git push origin main  # Automatic deployment

Deploy to Production:

git tag v1.0.0
git push origin v1.0.0
gh release create v1.0.0  # Triggers blue/green deployment

Documentation

Materialize Concepts Demonstrated

This project showcases key Materialize features with practical examples:

  • Views - Staging and intermediate transformations without storage overhead
  • Indexes - Query optimization on intermediate views
  • Materialized Views - Real-time maintained analytics in the marts layer
  • Sources - Load generator creating streaming auction data
  • Sinks - Kafka exports with DEBEZIUM envelope format

📚 See MATERIALIZE_CONCEPTS.md for detailed explanations and examples of each concept.
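For orientation, the adapter maps these concepts onto ordinary Materialize DDL. The statements below are a minimal sketch with invented object names and logic, not the project's actual models:

-- Illustrative only: the Materialize objects behind each dbt materialization.
CREATE VIEW stg_bids AS
    SELECT id, buyer, amount FROM auction_house_bids;   -- view: no storage, computed on read

CREATE INDEX stg_bids_buyer_idx ON stg_bids (buyer);    -- index: keeps the view's results maintained in memory

CREATE MATERIALIZED VIEW fct_bid_totals AS
    SELECT buyer, sum(amount) AS total_amount
    FROM stg_bids
    GROUP BY buyer;                                      -- materialized view: incrementally maintained

Sources and sinks have their own DDL (CREATE SOURCE, CREATE SINK); examples appear later in this README.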

Project Structure

This project follows dbt best practices for organizing transformations into distinct layers. Understanding this structure is key to building maintainable real-time data pipelines.

Directory Layout

materialize_auction_house/
├── dbt_project.yml          # Project configuration - defines project name, version, and model defaults
├── profiles.yml             # Connection profiles for dev/staging/production environments
├── packages.yml             # External package dependencies (e.g., dbt-utils)
├── selectors.yml            # Custom selectors for deployment patterns
├── models/                  # All data transformations
│   ├── sources/            # Source definitions (Materialize-specific)
│   │   └── *.sql          # CREATE SOURCE statements
│   ├── staging/            # Raw data preparation layer
│   │   ├── _sources.yml   # Source configurations and tests
│   │   ├── schema.yml     # Model documentation and tests
│   │   └── stg_*.sql      # Staging transformations (views)
│   ├── intermediate/       # Business logic layer
│   │   ├── schema.yml     # Model documentation and tests
│   │   └── int_*.sql      # Intermediate transformations (views + indexes)
│   ├── marts/             # Analytics layer
│   │   ├── schema.yml     # Model documentation and tests
│   │   └── *.sql          # Final models (materialized views)
│   └── sinks/             # Data export layer (Materialize-specific)
│       └── sink_*.sql     # Kafka sink definitions
├── seeds/                  # Static reference data
│   └── *.csv              # CSV files loaded into tables
├── scripts/               # Utility scripts
│   ├── setup.sh          # Environment setup
│   └── *.sql             # Database initialization
└── docker-compose.yml     # Local development environment

Key Configuration Files

dbt_project.yml

The main project configuration file (dbt docs) defines:

  • Project metadata (name, version)
  • Model materialization defaults by directory
  • Variable definitions for environment-specific settings
  • Cluster assignments for Materialize workloads

Example from this project:

models:
  materialize_auction_house:
    staging:
      +materialized: view        # Simple views for data prep
      +schema: staging
      +cluster: compute
    intermediate:
      +materialized: view        # Views that can be indexed
      +schema: intermediate
      +cluster: compute
    marts:
      +schema: marts
      +cluster: compute          # Materialized views defined per-model
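The "Materialized views defined per-model" comment means each mart sets its own materialization in the model file rather than at the directory level. A minimal sketch of what such an override might look like (the SQL body and upstream model are illustrative, not the project's actual code):

-- models/marts/fct_auction_flippers.sql (body is illustrative)
{{ config(
    materialized='materialized_view',
    cluster='compute'
) }}

select
    buyer_id,
    count(*) as total_flips
from {{ ref('int_flips_detected') }}   -- hypothetical upstream model
group by buyer_id

Because staging and intermediate default to plain views, only the marts that need incremental maintenance pay for it.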

profiles.yml

Defines connection parameters (dbt docs) for different environments:

  • Development (local Materialize)
  • Staging (test environment)
  • Production (Materialize Cloud)

selectors.yml

Custom selector definitions (dbt docs) for deployment patterns:

  • Run only sources, only transformations, or only sinks
  • Exclude certain model types by default
  • Group related models for testing

Model Organization Best Practices

  1. Staging Layer (models/staging/)

    • One-to-one with source tables
    • Minimal transformations (renaming, casting)
    • Always views (no persistence needed)
    • Naming convention: stg_<source>_<table> (see the sketch after this list)
  2. Intermediate Layer (models/intermediate/)

    • Business logic and complex joins
    • Can be indexed for performance
    • Reusable transformations
    • Naming convention: int_<entity>_<verb>
  3. Marts Layer (models/marts/)

    • Final consumption-ready models
    • Often materialized views for real-time results
    • Optimized for specific use cases
    • Naming convention: fct_ (facts) or dim_ (dimensions)
  4. Sources (models/sources/)

    • Materialize-specific source definitions
    • Load generators, Kafka, PostgreSQL, etc.
    • Managed by dbt for better lineage
  5. Sinks (models/sinks/)

    • Materialize-specific data exports
    • Stream results to Kafka topics
    • Real-time data pipelines
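To make the naming and materialization conventions above concrete, here is a minimal sketch of a staging model and an intermediate model. The source, table, and column names are invented for illustration; the indexes setting follows the dbt-materialize adapter's config:

-- models/staging/stg_auction_house_bids.sql (illustrative)
-- One-to-one with a source table; only renames and casts.
select
    id                as bid_id,
    buyer             as buyer_id,
    auction_id,
    amount::numeric   as bid_amount,
    bid_time
from {{ source('auction_house', 'bids') }}

-- models/intermediate/int_bids_filtered.sql (illustrative)
-- Business logic on top of staging, indexed for fast lookups and joins.
{{ config(
    indexes=[{'columns': ['auction_id']}]
) }}

select
    bid_id,
    buyer_id,
    auction_id,
    bid_amount
from {{ ref('stg_auction_house_bids') }}
where bid_amount > 0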

Additional Resources

Quick Start with Docker

Prerequisites

  • Docker and Docker Compose
  • dbt-materialize adapter: pip install dbt-materialize
  • psql (PostgreSQL client) for running setup scripts

Docker Setup

  1. Start the services:

    docker compose up -d

    This starts:

    • Materialize (port 6875)
    • Redpanda/Kafka (port 9092)
    • Redpanda Console (port 8080 - web UI for Kafka)
  2. Run the setup script:

    psql "postgresql://materialize@localhost:6875/materialize" -f scripts/setup_materialize.sql

    Or use the automated setup:

    ./scripts/setup.sh
  3. Install dbt dependencies:

    pip install dbt-materialize
    dbt deps
  4. Test the connection:

    dbt debug --profiles-dir . --profile materialize_auction_house
  5. Run the project:

    dbt seed --profiles-dir . --profile materialize_auction_house                        # Load known flippers
    dbt run --selector sources_only --profiles-dir . --profile materialize_auction_house     # Create sources (first time only)
    dbt run --profiles-dir . --profile materialize_auction_house                            # Create all models except sources
    dbt test --profiles-dir . --profile materialize_auction_house                           # Run tests

    Or use selectors:

    dbt run --selector sources_only --profiles-dir . --profile materialize_auction_house      # Only create sources
    dbt run --profiles-dir . --profile materialize_auction_house                             # Default: all models except sources/sinks
    dbt run --selector transformations --profiles-dir . --profile materialize_auction_house  # Only staging, intermediate, marts
    dbt run --selector sinks_only --profiles-dir . --profile materialize_auction_house       # Only create sinks

Monitoring

  • Materialize Console: psql "postgresql://materialize@localhost:6875/materialize"
  • Redpanda Console: http://localhost:8080 (view topics and messages)
  • View streaming data:
    -- Connect to Materialize
    psql "postgresql://materialize@localhost:6875/materialize"
    
    -- Watch flippers in real-time
    COPY (SUBSCRIBE (SELECT * FROM public_marts.fct_auction_flippers)) TO STDOUT;

Testing

This project includes comprehensive tests for workflows and deployments:

# Quick test using Make
make test                 # Run all tests (workflow + deployment)
make test-workflows       # Validate GitHub Actions workflows only
make test-deployment      # Test blue/green deployment macros
make test-dbt            # Run dbt data quality tests

# Manual test execution
./scripts/test_workflows.sh              # Workflow validation
./scripts/test_blue_green_deployment.sh  # Deployment integration tests

What's tested:

  • ✅ GitHub Actions workflow structure and syntax
  • ✅ Blue/green deployment macros (deploy_init, deploy_promote, etc.)
  • ✅ Atomic swap and rollback capabilities
  • ✅ Cluster hydration checks
  • ✅ Data quality via dbt tests

See TESTING.md for complete testing documentation.

Cleanup

make clean               # Clean all artifacts
docker compose down -v  # Stop and remove containers/volumes

Cloud/Production Setup

For Materialize Cloud or production deployments:

  1. Update profiles.yml with your Materialize Cloud credentials:

    materialize_auction_house:
      outputs:
        production:
          type: materialize
          host: <your-host>.aws.materialize.cloud
          port: 6875
          user: <your-email>
          pass: <your-app-password>
          database: materialize
          schema: public
          cluster: production_cluster
          sslmode: require
  2. Initialize Materialize resources:

    -- Create clusters
    CREATE CLUSTER sources REPLICAS (r1 (SIZE = '25cc'));
    CREATE CLUSTER compute REPLICAS (r1 (SIZE = '25cc'));
    CREATE CLUSTER sinks REPLICAS (r1 (SIZE = '25cc'));
    
    -- Create source
    CREATE SOURCE auction_house
      IN CLUSTER sources
      FROM LOAD GENERATOR AUCTION
      (TICK INTERVAL '1s', AS OF 100000)
      FOR ALL TABLES;
    
    -- Configure Kafka for production
    CREATE SECRET kafka_password AS 'your-password';
    CREATE CONNECTION kafka_connection TO KAFKA (
      BROKER 'your-kafka-broker:9092',
      SASL MECHANISMS = 'PLAIN',
      SASL USERNAME = 'user',
      SASL PASSWORD = SECRET kafka_password
    );
  3. Run dbt against production:

    dbt run --target production --profiles-dir . --profile materialize_auction_house
    dbt test --target production --profiles-dir . --profile materialize_auction_house

Usage

Run all models

dbt run --profiles-dir . --profile materialize_auction_house

Run specific model layers

dbt run --select staging --profiles-dir . --profile materialize_auction_house     # Raw data preparation
dbt run --select intermediate --profiles-dir . --profile materialize_auction_house # Business logic
dbt run --select marts --profiles-dir . --profile materialize_auction_house       # Final analytics

Run tests

dbt test --profiles-dir . --profile materialize_auction_house

Load seed data

dbt seed --profiles-dir . --profile materialize_auction_house

Generate and view documentation

# Generate documentation
dbt docs generate --profiles-dir . --profile materialize_auction_house

# Serve documentation locally
dbt docs serve --profiles-dir . --profile materialize_auction_house --port 8080

The documentation includes:

  • Model Lineage: Visual DAG showing data flow from sources to sinks
  • Model Documentation: Descriptions and column-level details for all models
  • Test Results: Data quality test results integrated into the docs
  • Schema Information: Complete catalog of tables, views, and materialized views

Selectors

The project includes pre-configured selectors for common deployment patterns:

  • default: Runs all transformations (excludes sources and sinks)
  • sources_only: Only create/update sources
  • sinks_only: Only create/update sinks
  • transformations: Only staging, intermediate, and marts
  • staging_only: Only staging models
  • intermediate_only: Only intermediate models
  • marts_only: Only mart models
  • flippers: Models related to flipper detection
  • metrics: Models related to metrics and monitoring
  • no_sinks: Everything except sinks

Example usage:

# Initial setup - create sources first
dbt run --selector sources_only --profiles-dir . --profile materialize_auction_house

# Regular deployment - transformations only
dbt run --selector transformations --profiles-dir . --profile materialize_auction_house

# Update only metrics models
dbt run --selector metrics --profiles-dir . --profile materialize_auction_house

Key Features Demonstrated

Materialize-Specific Features

  • Views: Efficient transformations without persistence (staging/, intermediate/)
  • Materialized Views: Incrementally maintained results (marts/)
  • Indexes: Multi-column indexes for join optimization (int_winning_bids)
  • Sources: Load generator for streaming data (auction_load_generator)
  • Sinks: Real-time Kafka exports with DEBEZIUM envelope (sinks/); see the sketch after this list
  • Clusters: Workload isolation (sources, compute, sinks)
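A rough sketch of the kind of sink the models in models/sinks/ define. It reuses names that appear elsewhere in this README (public_marts.fct_auction_flippers, kafka_connection, the sinks cluster); the topic name and output format are assumptions:

-- Illustrative CREATE SINK: stream a mart to Kafka with a Debezium-style change envelope.
CREATE SINK auction_flippers_sink
  IN CLUSTER sinks
  FROM public_marts.fct_auction_flippers
  INTO KAFKA CONNECTION kafka_connection (TOPIC 'auction-flippers')
  FORMAT JSON
  ENVELOPE DEBEZIUM;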

Real-time Analytics

  • Live auction status tracking with window functions
  • Continuous flipper detection using self-joins
  • Streaming health metrics with hourly aggregations
  • Real-time suspicious activity alerts with pattern matching
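As a rough sketch only, flipper detection via a self-join might look something like the query below. The column names, flip window, and profit margin are assumptions; the project's actual logic lives in its intermediate and marts models:

-- Illustrative only: a user who wins an item and resells it within the flip
-- window at a sufficiently higher price is flagged as a flipper.
select
    buy.buyer_id                          as flipper_id,
    buy.item_id,
    sell.amount - buy.amount              as flip_profit
from {{ ref('int_winning_bids') }} as buy
join {{ ref('int_winning_bids') }} as sell
  on sell.seller_id = buy.buyer_id                 -- same user on both sides
 and sell.item_id   = buy.item_id                  -- the same item, resold
 and sell.auction_end between buy.auction_end
                          and buy.auction_end + interval '8 days'
where sell.amount >= buy.amount * 1.20             -- at least 20% profit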

dbt Best Practices

  • Modular model architecture with clear layer separation
  • 60+ data quality tests including custom tests for real-time data
  • Full documentation with column descriptions
  • Environment-specific configurations (dev/staging/production)

Custom dbt Tests

This project includes custom test macros designed for real-time data validation:

  • value_in_range: Validates that numeric values fall within expected bounds
  • outlier_detection: Uses z-score method to identify statistical outliers
  • data_recency: Ensures data has been updated within a specified time window
  • monotonic_increase: Verifies that values only increase over time (useful for counters)
  • referential_integrity_cascade: Tests relationships across multiple table levels

Example usage in models/marts/schema.yml:

- name: total_flips
  tests:
    - value_in_range:
        min_value: 1
        max_value: 1000
    - outlier_detection:
        z_threshold: 3.5
        config:
          severity: warn  # Warning instead of error for expected outliers
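These are generic test macros, so they can be attached to any column in a schema.yml file as shown above. A minimal sketch of how value_in_range might be implemented (the project's actual macro may differ):

-- macros/test_value_in_range.sql (illustrative)
-- A dbt generic test passes when the query returns zero rows,
-- so select every row that falls outside the allowed range.
{% test value_in_range(model, column_name, min_value, max_value) %}

select *
from {{ model }}
where {{ column_name }} < {{ min_value }}
   or {{ column_name }} > {{ max_value }}

{% endtest %}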

Customization

Key variables in dbt_project.yml:

  • flipper_threshold_days: Days to consider for flip detection (default: 8)
  • min_profit_margin: Minimum profit margin for flippers (default: 0.20)
  • Cluster names for sources, compute, and sinks
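Models read these values with the var() function, so thresholds can be tuned in dbt_project.yml (or overridden with --vars) without editing SQL. A small illustrative filter, with assumed column names:

-- Illustrative only: consume project variables inside a model.
select *
from {{ ref('int_winning_bids') }}
where days_between_buy_and_sell <= {{ var('flipper_threshold_days') }}
  and profit_margin             >= {{ var('min_profit_margin') }}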

Resources

Materialize Documentation

dbt Resources

Community

License

This example project is provided as-is for demonstration purposes.
