ArkFlow

English | 中文

ArkFlow is a high-performance stream processing engine written in Rust. It provides powerful data stream processing capabilities and supports multiple input/output sources and processors.

Features

  • High Performance: Built on Rust and Tokio async runtime, offering excellent performance and low latency
  • Multiple Data Sources: Support for Kafka, MQTT, HTTP, files, and other input/output sources
  • Powerful Processing Capabilities: Built-in SQL queries, JSON processing, Protobuf encoding/decoding, batch processing, and other processors
  • Extensible: Modular design, easy to extend with new input, output, and processor components

Installation

Building from Source

# Clone the repository
git clone https://github.com/chenquan/arkflow.git
cd arkflow

# Build the project
cargo build --release

# Run tests
cargo test

Quick Start

  1. Create a configuration file config.yaml:
logging:
  level: info
streams:
  - input:
      type: "generate"
      context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
      interval: 1s
      batch_size: 10

    pipeline:
      thread_num: 4
      processors:
        - type: "json_to_arrow"
        - type: "sql"
          query: "SELECT * FROM flow WHERE value >= 10"

    output:
      type: "stdout"
  2. Run ArkFlow:
./target/release/arkflow --config config.yaml

Configuration Guide

ArkFlow uses YAML format configuration files, supporting the following main configuration items:

Top-level Configuration

logging:
  level: info  # Log level: debug, info, warn, error

streams: # Stream definition list
  - input:      # Input configuration
      # ...
    pipeline:   # Processing pipeline configuration
      # ...
    output:     # Output configuration
      # ...
    buffer:     # Buffer configuration
      # ...

Input Components

ArkFlow supports multiple input sources:

  • Kafka: Read data from Kafka topics
  • MQTT: Subscribe to messages from MQTT topics
  • HTTP: Receive data via HTTP
  • File: Read data from files (CSV, JSON, Parquet, Avro, Arrow) using SQL
  • Generator: Generate test data
  • Database: Query data from databases (MySQL, PostgreSQL, SQLite, DuckDB)

Example:

input:
  type: kafka
  brokers:
    - localhost:9092
  topics:
    - test-topic
  consumer_group: test-group
  client_id: arkflow
  start_from_latest: true
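
The other input types follow the same shape: a type field plus type-specific options. As an illustration only, an MQTT input might look like the sketch below; the field names (host, port, qos) are assumptions for illustration, not confirmed by this document — consult the ArkFlow documentation for the actual keys:

```yaml
input:
  type: mqtt
  host: localhost        # broker address (field name assumed)
  port: 1883
  topics:
    - sensors/temperature
  client_id: arkflow
  qos: 1                 # assumed MQTT QoS level: 0, 1, or 2
```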

Processors

ArkFlow provides multiple data processors:

  • JSON: JSON data processing and transformation
  • SQL: Process data using SQL queries
  • Protobuf: Protobuf encoding/decoding
  • Batch Processing: Process messages in batches

Example:

pipeline:
  thread_num: 4
  processors:
    - type: json_to_arrow
    - type: sql
      query: "SELECT * FROM flow WHERE value >= 10"

Output Components

ArkFlow supports multiple output targets:

  • Kafka: Write data to Kafka topics
  • MQTT: Publish messages to MQTT topics
  • HTTP: Send data via HTTP
  • Standard Output: Output data to the console
  • Drop: Discard data

Example:

output:
  type: kafka
  brokers:
    - localhost:9092
  topic: output-topic
  client_id: arkflow-producer

Buffer Components

ArkFlow provides buffering to handle backpressure and to store messages temporarily:

  • Memory Buffer: In-memory buffering, for high-throughput scenarios and window aggregation

Example:

buffer:
  type: memory
  capacity: 10000  # Maximum number of messages to buffer
  timeout: 10s  # Maximum time to buffer messages
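
A buffer is declared alongside input, pipeline, and output inside a stream definition. A minimal end-to-end sketch assembled only from fields already shown in this document:

```yaml
streams:
  - input:
      type: "generate"
      context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
      interval: 1ms
      batch_size: 1000

    buffer:
      type: memory
      capacity: 10000   # maximum number of messages to buffer
      timeout: 10s      # maximum time to buffer messages

    pipeline:
      thread_num: 4
      processors:
        - type: "json_to_arrow"
        - type: "sql"
          query: "SELECT * FROM flow WHERE value >= 10"

    output:
      type: "stdout"
```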

Examples

Kafka to Kafka Data Processing

streams:
  - input:
      type: kafka
      brokers:
        - localhost:9092
      topics:
        - test-topic
      consumer_group: test-group

    pipeline:
      thread_num: 4
      processors:
        - type: json_to_arrow
        - type: sql
          query: "SELECT * FROM flow WHERE value > 100"

    output:
      type: kafka
      brokers:
        - localhost:9092
      topic: processed-topic

Generate Test Data and Process

streams:
  - input:
      type: "generate"
      context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
      interval: 1ms
      batch_size: 10000

    pipeline:
      thread_num: 4
      processors:
        - type: "json_to_arrow"
        - type: "sql"
          query: "SELECT count(*) FROM flow WHERE value >= 10 GROUP BY sensor"

    output:
      type: "stdout"

License

ArkFlow is licensed under the Apache License 2.0.

Community

Discord: https://discord.gg/CwKhzb8pux

If you like this project, or are using it to learn or to start your own solution, please give it a star ⭐. Thanks!
