ByteFreezer Connector

Data export tool for ByteFreezer parquet files. Query with DuckDB, export to Elasticsearch, Splunk, webhooks, or any custom destination.

Overview

ByteFreezer stores all ingested data as Parquet files in S3/MinIO. The Connector reads those files using DuckDB and exports filtered subsets to external systems. Instead of sending everything to your SIEM, export only the small fraction (for example, 5%) that you need for active investigation.

packer --> parquet (S3/MinIO) --> [CONNECTOR] --> Elasticsearch / Splunk / webhook

Web UI: http://localhost:8090 — explore datasets, write queries, preview results, configure export destinations.

This is not a black-box product: it is a working codebase that you own and modify. The connector ships with three destinations (stdout, Elasticsearch, webhook) and a simple Destination interface. If you need Splunk HEC, Snowflake, Kafka, or a custom internal API, point Claude Code at this repo with the ByteFreezer MCP connected. It then has what it needs: the destination interface pattern, MCP tools to discover your datasets and schema, and CLAUDE.md with step-by-step instructions.

Query vs Connector

ByteFreezer includes two tools for querying data. Use either one or both:

	Query Service (port 8000)	Connector (port 8090)
Purpose	Interactive analysis	Data export to external systems
AI/NL queries	Yes (Anthropic, OpenAI, Ollama)	No
Export to SIEM	No	Yes (Elasticsearch, webhook; Splunk and others via a custom destination)
Batch/watch modes	No	Yes — scheduled, cursor-tracked export
Best for	Ad-hoc investigation	Continuous SIEM feed, alerting pipelines

Modes

Mode	Command	Description
interactive	`--mode interactive` (default)	Web UI at `:8090` for exploring datasets and testing queries
batch	`--mode batch`	Run configured query once, send to destination, exit
watch	`--mode watch`	Poll for new data on a timer, continuously export

Quick Start

On-Prem (Docker Compose)

The connector is included in the ByteFreezer on-prem Docker Compose stack. After deploying the full stack, the connector web UI is available at:

http://<your-host>:8090

To run a one-shot export (the configured query, sent to the configured destination) from the command line:

docker exec -w /app bytefreezer-connector ./bytefreezer-connector --mode batch

Binary

go build -o bytefreezer-connector .

Standalone Docker

docker pull ghcr.io/bytefreezer/bytefreezer-connector:latest
docker run -p 8090:8090 -v ./config.yaml:/app/config.yaml:ro ghcr.io/bytefreezer/bytefreezer-connector:latest

Configure

Edit config.yaml with your control API credentials:

control:
  url: "https://api.bytefreezer.com"
  api_key: "your-service-key"
  account_id: "your-account-id"

For batch/watch modes, also set:

query:
  tenant_id: "your-tenant-id"
  dataset_id: "your-dataset-id"
  sql: >
    SELECT timestamp, source_ip, message
    FROM read_parquet('PARQUET_PATH', hive_partitioning=true, union_by_name=true)
    WHERE severity >= 4

destination:
  type: elasticsearch
  config:
    url: "http://localhost:9200"
    index: "security-alerts"

Run

# Interactive mode — open http://localhost:8090
./bytefreezer-connector --config config.yaml

# Batch export: run the configured query once, send to the configured destination, exit
./bytefreezer-connector --config config.yaml --mode batch

# Continuous watch mode
./bytefreezer-connector --config config.yaml --mode watch

# Re-export from the beginning (reset cursor)
./bytefreezer-connector --config config.yaml --mode batch --reset-cursor

SQL Queries

Use PARQUET_PATH as a placeholder for the dataset location. The connector replaces it with the S3 glob path for your dataset.

-- All records
SELECT * FROM read_parquet('PARQUET_PATH', hive_partitioning=true, union_by_name=true)
LIMIT 100

-- Filter by time partition
SELECT * FROM read_parquet('PARQUET_PATH', hive_partitioning=true, union_by_name=true)
WHERE year = 2026 AND month = 3

-- Specific fields only
SELECT timestamp, source_ip, message
FROM read_parquet('PARQUET_PATH', hive_partitioning=true, union_by_name=true)
WHERE severity >= 4
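
To make the placeholder substitution concrete, here is a minimal sketch of how a query could be expanded before it is handed to DuckDB. The `expandQuery` helper and the `s3://bucket/tenant/dataset/**/*.parquet` layout are assumptions for illustration only; the README does not specify the real object layout.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholder is the token the README says the connector replaces.
const placeholder = "PARQUET_PATH"

// expandQuery swaps every PARQUET_PATH token for a dataset glob.
// The bucket/tenant/dataset path layout is hypothetical.
func expandQuery(sql, bucket, tenantID, datasetID string) (string, error) {
	if !strings.Contains(sql, placeholder) {
		return "", fmt.Errorf("query does not contain %s", placeholder)
	}
	// "**" asks DuckDB to match files in nested partition directories.
	glob := fmt.Sprintf("s3://%s/%s/%s/**/*.parquet", bucket, tenantID, datasetID)
	return strings.ReplaceAll(sql, placeholder, glob), nil
}
```

Rejecting queries without the placeholder catches a common mistake early: a hard-coded path that would bypass the configured dataset.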

Built-in Destinations

Destination	Config Keys	Description
`stdout`	—	JSON lines to stdout
`elasticsearch`	`url`, `index`, `username`, `password`	Elasticsearch bulk API
`webhook`	`url`, `method`, `headers`	HTTP POST to any endpoint
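
For readers unfamiliar with the Elasticsearch bulk API, the sketch below builds the newline-delimited JSON body that a bulk index request expects: an action line naming the index, then the document itself, repeated per row. The `buildBulkBody` helper is a hypothetical name, and this is not necessarily how destinations/elasticsearch.go is written.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// buildBulkBody renders rows as an Elasticsearch _bulk request body.
// Each row becomes two lines: an "index" action, then the document.
func buildBulkBody(index string, rows []map[string]interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends a newline after each value

	for _, row := range rows {
		action := map[string]map[string]string{"index": {"_index": index}}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(row); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}
```

The resulting body is sent with `POST <url>/_bulk` and the header `Content-Type: application/x-ndjson`; the final line must end with a newline, which the encoder already guarantees.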

Adding a Destination

Create destinations/your_dest.go:

package destinations

import (
    "context"
    "github.com/bytefreezer/connector/connector"
)

func init() {
    connector.RegisterDestination("your_dest", func() connector.Destination {
        return &YourDest{}
    })
}

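// YourDest is a skeleton destination; add fields for clients, URLs, and so on.
type YourDest struct{}

// Name returns the identifier this destination was registered under.
func (d *YourDest) Name() string { return "your_dest" }

// Init validates and stores settings from the destination's config map.
func (d *YourDest) Init(config map[string]interface{}) error { return nil }

// Send delivers one batch; return an error to report a failed delivery.
func (d *YourDest) Send(ctx context.Context, batch connector.Batch) error { return nil }

// Close releases any resources held by the destination.
func (d *YourDest) Close() error { return nil }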

The init() function registers the destination automatically when the package is loaded. To use it, set destination.type: "your_dest" in config.yaml.

Project Structure

├── main.go                    # Entry point, HTTP routes, mode switching
├── ui.go                      # Embedded interactive web UI
├── config/config.go           # Config struct + koanf loader
├── connector/
│   ├── connector.go           # DuckDB engine, S3 config, query execution
│   ├── control_client.go      # Control API client (S3 creds, health reporting)
│   ├── cursor.go              # Cursor persistence (JSON file)
│   └── destination.go         # Destination interface + registry
├── destinations/
│   ├── stdout.go              # JSON lines to stdout
│   ├── elasticsearch.go       # Elasticsearch bulk API
│   └── webhook.go             # Generic HTTP POST
├── config.yaml                # Example configuration
├── Dockerfile                 # Docker image (debian:bookworm-slim)
└── CLAUDE.md                  # Claude Code instructions

Health Reporting

In watch and interactive modes, the connector registers with the ByteFreezer control plane as bytefreezer-connector and reports health every 30 seconds. It appears on the Health page in the UI alongside proxy, receiver, piper, packer, and query.

Documentation

Connector docs — full documentation
CLAUDE.md — instructions for Claude Code + MCP tools reference
