
databridge

Migrate data between different databases with ease.

DataBridge is a lightweight, high-performance database migration tool that enables seamless data transfer between heterogeneous databases.

Features

  • Plugin architecture — Add new databases by implementing interfaces, no core changes needed
  • Dynamic plugin loading — Load .so plugins at runtime (Go plugin package), hot-reload with SIGHUP
  • Streaming migration — Batch-based processing, won't OOM on large tables
  • Checkpoint & resume — Interrupted migrations can continue from where they left off
  • Parallel table migration — Multiple tables migrated concurrently with configurable parallelism
  • Lifecycle hooks — Inject custom logic at pipeline/table/batch stages (e.g., TimescaleDB hypertable setup)
  • Schema mapping — Automatic type conversion between source and sink databases
  • Debug & diagnostics — Optional pprof profile collection for performance analysis

Supported Databases

Each database is supported as a source, a sink, or both:

  • MySQL
  • PostgreSQL
  • InfluxDB
  • ClickHouse
  • MongoDB
  • Redis
  • Kafka

Quick Start

Install

go install github.com/silves-xiang/data-bridge/cmd/databridge@latest

Usage

# Run a migration
databridge migrate -c config.yaml

# Validate config without running
databridge validate -c config.yaml

# List available connectors and hooks
databridge list

# Show version
databridge version

Configuration

Create a YAML config file (values like ${MYSQL_PASSWORD} reference environment variables):

task:
  name: "my-migration"
  mode: full

source:
  type: mysql
  connection:
    host: "127.0.0.1"
    port: 3306
    user: "root"
    password: "${MYSQL_PASSWORD}"
    database: "source_db"

sink:
  type: postgresql
  connection:
    host: "127.0.0.1"
    port: 5432
    user: "postgres"
    password: "${PG_PASSWORD}"
    database: "target_db"
    ssl_mode: "disable"

tables:
  - source: "users"
    target: "users"
    batch_size: 5000

parallelism: 4

checkpoint:
  enabled: true
  dir: "./.databridge/checkpoints"

InfluxDB Configuration

InfluxDB can be used as both source and sink. As a sink, you can configure which columns become tags and which column provides the timestamp:

source:
  type: influxdb
  connection:
    url: "http://localhost:8086"
    token: "${INFLUXDB_TOKEN}"
    org: "myorg"
    bucket: "mybucket"

sink:
  type: influxdb
  connection:
    url: "http://localhost:8086"
    token: "${INFLUXDB_TOKEN}"
    org: "myorg"
    bucket: "target_bucket"
  params:
    time_column: "created_at"     # source column to use as timestamp
    tag_columns: ["sensor_id"]    # source columns to store as tags
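For example, assuming the measurement name comes from the target table, a source row (sensor_id='s1', temperature=21.5, created_at=2026-01-01T12:00:00Z) would become a point with tag sensor_id=s1, field temperature=21.5, and the created_at value as its timestamp; in line protocol:

measurement,sensor_id=s1 temperature=21.5 1767268800000000000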

See examples/ for full configuration examples.

Architecture

Source (MySQL)  ──ReadBatch──>  Pipeline  ──WriteBatch──>  Sink (PostgreSQL)
                                    │
                              ┌─────┼─────┐
                              │     │     │
                         Checkpoint Hooks  Worker Pool

Core Interfaces

  • Source — Reads tables and row batches from a source database
  • Sink — Creates tables and writes row batches to a target database
  • Hook — Lifecycle callbacks: PipelineHook, TableHook, BatchHook
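A minimal sketch of what these interfaces might look like, including the streaming batch loop from the diagram above. The method names, signatures, and types here are illustrative assumptions, not the project's actual API:

package databridge

import "context"

// Row is one record, keyed by column name.
type Row map[string]any

// Table identifies a table to migrate.
type Table struct {
	Name string
}

// Source reads tables and row batches from a source database.
type Source interface {
	Tables(ctx context.Context) ([]Table, error)
	ReadBatch(ctx context.Context, t Table, offset int64, size int) ([]Row, error)
	Close() error
}

// Sink creates tables and writes row batches to a target database.
type Sink interface {
	CreateTable(ctx context.Context, t Table) error
	WriteBatch(ctx context.Context, t Table, rows []Row) error
	Close() error
}

// migrateTable shows the streaming loop conceptually: read one batch,
// write it, advance the offset until the source is drained. Memory use
// stays bounded by the batch size regardless of table size.
func migrateTable(ctx context.Context, src Source, dst Sink, t Table, batchSize int) error {
	if err := dst.CreateTable(ctx, t); err != nil {
		return err
	}
	for offset := int64(0); ; offset += int64(batchSize) {
		rows, err := src.ReadBatch(ctx, t, offset, batchSize)
		if err != nil {
			return err
		}
		if len(rows) == 0 {
			return nil // table fully migrated
		}
		if err := dst.WriteBatch(ctx, t, rows); err != nil {
			return err
		}
	}
}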

Adding a New Database

Compile-time (built-in):

  1. Implement source.Source and/or sink.Sink interfaces
  2. Implement schema mapping (SourceTypeMapper / TargetTypeMapper)
  3. Register in init() via source.Register("name", factory) / sink.Register("name", factory)
  4. Import the plugin package in cmd/databridge/main.go
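For illustration, registration for a hypothetical "mydb" connector could look like the following. The source.Register call mirrors step 3; the factory signature and the way source.Source is returned are assumptions:

package mydb

import "github.com/silves-xiang/data-bridge/source"

// init runs when the package is imported (step 4) and registers the
// connector under the name used as `type:` in config files.
func init() {
	source.Register("mydb", newSource)
}

// newSource is a placeholder factory; a real connector would build a
// source.Source from its connection settings here.
func newSource() source.Source {
	return nil
}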

Runtime (.so dynamic loading):

Plugins can be compiled as shared objects and loaded at runtime without recompiling the main binary. Each .so must export a Register function:

package main

// Blank-import the connector package so its init() runs and registers
// the connector with the source/sink registries.
import _ "github.com/silves-xiang/data-bridge/plugins/myplugin"

// Register is the symbol the plugin loader looks up; the actual
// registration already happened in the imported package's init().
func Register() {}

Build with:

make plugin-myplugin    # produces plugins/myplugin.so
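If you are not using make, the target presumably wraps Go's plugin build mode; the equivalent command (assuming the plugin source lives under plugins/myplugin) is:

go build -buildmode=plugin -o plugins/myplugin.so ./plugins/myplugin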

Set plugin_dir in your config, and plugins are loaded at startup:

plugin_dir: "./plugins"

To hot-reload plugins after adding or removing .so files:

kill -SIGHUP $(pgrep databridge)

Note: Go's plugin package requires the plugin and the main binary to be built with the same Go version. Plugins are only supported on Linux, FreeBSD, and macOS.

Hooks

Hooks allow custom logic at migration lifecycle points:

  • PipelineHook: OnPipelineStart / OnPipelineEnd
  • TableHook: OnTableStart / OnTableEnd (e.g., create a TimescaleDB hypertable)
  • BatchHook: OnBatchComplete (e.g., periodic aggregation)

Example hook configuration:

hooks:
  - name: "create-hypertables"
    type: "timescale"
    params:
      partition_column: "created_at"
      hypertable_interval: "7 days"
      enable_compression: true
      compression_after: "30 days"
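For a sense of what a custom hook involves, here is a rough TableHook sketch. The interface shape and TableInfo type are assumptions based on the callback names above; create_hypertable is the standard TimescaleDB call:

package myhooks

import "context"

// TableInfo carries the table being migrated (hypothetical type).
type TableInfo struct {
	Target string
}

// hypertableHook converts the target table into a TimescaleDB
// hypertable before any rows are copied into it.
type hypertableHook struct {
	exec func(ctx context.Context, sql string) error // hypothetical SQL executor on the sink
}

// OnTableStart runs before the first batch of a table is written.
func (h *hypertableHook) OnTableStart(ctx context.Context, t TableInfo) error {
	return h.exec(ctx, "SELECT create_hypertable('"+t.Target+"', 'created_at')")
}

// OnTableEnd runs after the last batch of a table is written.
func (h *hypertableHook) OnTableEnd(ctx context.Context, t TableInfo) error {
	return nil
}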

Debug & Diagnostics

Enable debug logging and periodic pprof capture in the config:

debug:
  enabled: true
  verbose_batch: true    # Log every batch timing and row count
  log_memory: true       # Log memory usage per batch

pprof:
  enabled: true
  dir: "./.databridge/pprof"
  interval: "5m"         # Capture interval
  profiles:
    - "heap"
    - "goroutine"
    - "allocs"
  cpu_duration: "30s"

Analyze profiles with:

go tool pprof -http=:8080 .databridge/pprof/heap_20260101_120000.prof

License

MIT
