streambased-io/breakstream

BreakStream

Docker-based demo and testing environment for Streambased components.

What is Streambased?

Streambased unifies Apache Kafka and Apache Iceberg, enabling real-time analytics across streaming and historical data without traditional batch pipelines. See Streambased Overview for details.

Running in GitHub Codespaces

The easiest way to try out BreakStream and Streambased is via GitHub Codespaces.

  1. Click the green Code button above and select Open with Codespaces > New codespace.
  2. Wait for the Codespace to initialize (this may take a few minutes).
  3. Open a terminal in the Codespace and run:
    ./bin/start.sh

Demo Quickstart

Run the interactive demo to see Streambased in action:

./bin/start.sh

This starts the full stack and walks through:

  1. Unified data views - Query streaming (hotset) and historical (coldset) data with SQL
  2. Data tiering - Move data from Kafka to Iceberg
  3. Extended retention - Kafka consumers reading historical data via KSI

See Demo Quickstart Guide for the full walkthrough.

Documentation

Requirements

  • Docker and Docker Compose
  • jq
  • Java 11+
  • ~8GB RAM available for Docker
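
The requirements above can be checked before starting the stack. The following is a minimal preflight sketch, not part of BreakStream itself: the function names `missing_tools`, `java_major`, and `preflight` are illustrative, and the script assumes `java -version` prints its version string in double quotes (as OpenJDK builds do). It does not verify Docker Compose or the memory available to Docker.

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helper, not shipped with BreakStream).

# Print each command from the argument list that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Extract the major version from a Java version string.
# Handles the legacy "1.8.0_292" scheme and the modern "11.0.2" scheme.
java_major() {
    local version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;
    esac
    # Drop everything from the first non-digit character onwards.
    echo "${version%%[!0-9]*}"
}

# Return 0 when docker, jq, and Java 11+ are available, 1 otherwise.
preflight() {
    local missing version major
    missing="$(missing_tools docker jq java)"
    if [ -n "$missing" ]; then
        # Unquoted on purpose: joins the newline-separated names with spaces.
        echo "Missing required tools:" $missing >&2
        return 1
    fi
    # `java -version` writes to stderr; the version is the first quoted token.
    version="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
    major="$(java_major "$version")"
    if [ -z "$major" ] || [ "$major" -lt 11 ]; then
        echo "Java 11+ required, found: ${version:-unknown}" >&2
        return 1
    fi
}
```

With these functions sourced, `preflight && ./bin/start.sh` would start the demo only when the basic tools are present.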

Commands

# Run interactive demo
./bin/start.sh

# Run the core_functions test spec
./bin/start.sh core_functions

# Run all test specs
./bin/run_all.sh

# Stop environment
./bin/stop.sh

# Setup mode (start environment, skip tests)
SETUP_MODE=true ./bin/start.sh core_functions

Test Framework

BreakStream also serves as a black-box testing framework for Streambased components.

Test Structure

A test consists of three components:

  1. Spec (specs/<name>/spec.json) - Defines Docker components, datasets, and tests to run:

    {
        "components": ["kafka", "iceberg", "datagen-setup", "datagen-background", "streambased"],
        "setup_datasets": ["core"],
        "background_dataset": "core",
        "tests": ["core_functions"]
    }
  2. Datasets (datasets/<name>/) - ShadowTraffic configs for data generation:

    • setup.json - One-shot data prepopulation
    • background.json - Continuous background traffic
    • post_setup.sh - Post-load actions (e.g., copy to Iceberg)
  3. Test Cases (tests/<name>/run.sh) - Shell script that exits with status 0 on success and non-zero on failure
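
To illustrate the exit-status contract for test cases, here is a hedged skeleton of what a `tests/<name>/run.sh` could look like. The `check` and `result` helpers are hypothetical and are not taken from the BreakStream repository; actual test scripts may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical skeleton for tests/<name>/run.sh; helper names are illustrative.

failures=0

# Run a command quietly and record whether it succeeded.
# Usage: check "<description>" <command> [args...]
check() {
    local description="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $description"
    else
        echo "FAIL: $description"
        failures=$((failures + 1))
    fi
}

# Succeed (status 0) only when no check has failed so far.
result() {
    [ "$failures" -eq 0 ]
}
```

A script built on this would call `check` once per assertion, for example `check "spec is valid JSON" jq empty specs/core_functions/spec.json`, and end with `result` as its last command, so that the script's exit status is 0 only when every check passed.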

Demo Specs

Specs prefixed with demo_ are interactive demonstrations that:

  • Expect user interaction (press a key to continue)
  • Leave the environment running after completion
  • Don't verify correctness, just demonstrate features
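
The `demo_` naming convention makes it easy to separate demos from test specs in scripts. The sketch below assumes each spec lives at `<root>/<name>/spec.json`, matching the layout described under Test Structure; the function names `is_demo_spec` and `list_specs` are illustrative and do not come from the repository.

```shell
#!/usr/bin/env bash
# Sketch only: assumes each spec lives at <root>/<name>/spec.json.

# Succeed when the spec name carries the demo_ prefix.
is_demo_spec() {
    case "$1" in
        demo_*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print "<kind> <name>" for every spec directory under the given root,
# where <kind> is "demo" or "test".
list_specs() {
    local root="$1" spec_file name
    for spec_file in "$root"/*/spec.json; do
        [ -e "$spec_file" ] || continue   # the glob matched nothing
        name="$(basename "$(dirname "$spec_file")")"
        if is_demo_spec "$name"; then
            echo "demo $name"
        else
            echo "test $name"
        fi
    done
}
```

Running `list_specs specs` from the repository root would then print one line per spec, which can be filtered to skip interactive demos in unattended runs.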

About

Black-box testing
