fleet-router

A reverse proxy and load balancer for the Antelope SHiP (State History Plugin) WebSocket protocol.

fleet-router sits in front of a fleet of Antelope SHiP nodes. Clients open a single WebSocket connection to the router; the router picks a healthy upstream SHiP node, forwards its ABI, and proxies the WebSocket bidirectionally. If an upstream drops, the router transparently fails over to another suitable node, replays the in-flight get_blocks request from the next block, and de-duplicates already-delivered blocks — so the client connection persists without any manual reconnect.

Use it when you run more than one SHiP node and want clients (Hyperion, dfuse-style indexers, custom consumers) to see a single, resilient endpoint with load balancing and automatic failover, instead of pinning each consumer to one node.

It is written in Rust on tokio and tokio-tungstenite, and uses rs_abieos (pure-Rust backend) for ABI handling.

Features

Range-aware, least-connections load balancing — prefers an upstream whose trace range covers the requested block; otherwise routes to the least-loaded healthy upstream.
Automatic failover with de-duplication — on upstream loss, reconnects to another suitable node, resumes get_blocks at the next block, de-duplicates replayed blocks, and buffers/replays client frames sent during the swap.
Stale-upstream deprioritization — upstreams that stop advancing their chain state are flagged stale and deprioritized (not hard-excluded) when routing.
Graceful shutdown — handles SIGINT/SIGTERM with a bounded drain and WebSocket close frames to connected clients.
Structured logging — tracing-based, controlled with RUST_LOG.
Optional health/metrics endpoint — liveness, readiness, and Prometheus metrics over HTTP, enabled on demand.
Resource safety — connection cap, handshake/idle timeouts, and a bounded maximum WebSocket message size.
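
The routing preference described above can be sketched as a single ordering over candidate upstreams. This is a minimal illustration, not fleet-router's actual code: the `Upstream` struct, its field names, and `select_upstream` are hypothetical, and the relative priority of range coverage versus staleness is an assumption.

```rust
/// Hypothetical snapshot of one upstream's state; field names are illustrative.
#[derive(Debug, Clone)]
struct Upstream {
    name: &'static str,
    online: bool,
    stale: bool,
    active_connections: usize,
    /// Inclusive block range for which this upstream can serve traces.
    trace_begin: u32,
    trace_end: u32,
}

impl Upstream {
    /// True if the requested block falls inside this upstream's trace range.
    fn covers(&self, block: u32) -> bool {
        self.trace_begin <= block && block <= self.trace_end
    }
}

/// Pick an upstream for a request starting at `start_block`.
/// Offline upstreams are excluded. Among the rest, the key sorts by:
/// covers the block first, then not stale, then fewest active connections.
/// (`false < true`, so negating `covers` puts covering upstreams first.)
fn select_upstream(upstreams: &[Upstream], start_block: u32) -> Option<&Upstream> {
    upstreams
        .iter()
        .filter(|u| u.online)
        .min_by_key(|u| (!u.covers(start_block), u.stale, u.active_connections))
}
```

Because staleness is only one component of the sort key, a stale upstream is still chosen when it is the only one online, which matches "deprioritized, not hard-excluded".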

How it works

A client connects over WebSocket. The router selects an upstream (range-aware, then least-connections), forwards that upstream's ABI to the client, and then proxies frames in both directions. Background loops poll each upstream's status and block progress; an upstream that stops advancing is marked stale and deprioritized. If the active upstream drops, the router selects another suitable upstream, resends the in-flight get_blocks request resumed at the next block, de-duplicates blocks the client has already received, and replays any client frames buffered during the swap. The client connection stays open throughout.

                                    +-----------------------+
                                    |  upstream SHiP node A | (active)
                            +-----> |  ws://hostA:port      |
                            |       +-----------------------+
+--------+   WebSocket   +--+-----------+
| client | ============> | fleet-router |   range-aware / least-connections
+--------+               +--+-----------+   selection + health monitoring
                            |       +-----------------------+
                            +-----> |  upstream SHiP node B | (failover target:
       on upstream A failure,       |  ws://hostB:port      |  resume at next block,
       transparent swap to B  ----> +-----------------------+  de-duplicate blocks)

Requirements and supported platforms

fleet-router is pure Rust and builds on Linux, macOS, and Windows (x86_64 and arm64) — no C/C++ toolchain, clang, or libclang required. The only prerequisite is a Rust toolchain.

Requirement	Notes
Rust ≥ 1.95 (MSRV)	Required by `rs_abieos` (uses `if let` guards, stabilized in Rust 1.95)

If you do not already have Rust, install it via rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Installation

From crates.io

cargo install fleet-router

From source

git clone https://github.com/eosrio/fleet-router.git
cd fleet-router
cargo install --path .

Docker

A prebuilt image is published to the GitHub Container Registry on tagged releases:

docker pull ghcr.io/eosrio/fleet-router

The image runs as a non-root user and defines a HEALTHCHECK on the proxy port (17000). See Running in production for a full docker run example.

Quick start

Write a sample config file:
```
fleet-router config init ./config.json
```
Edit config.json to list your SHiP nodes (set each endpoint to host:port, no scheme) and adjust the bind address/port and intervals.
Validate the config and test upstream connectivity:
```
fleet-router config test ./config.json
```
Run the proxy:
```
fleet-router run --config ./config.json
```

On startup, run validates the configuration before binding. Invalid configs fail fast with an actionable error; on success the router begins listening and logs its upstream monitors:

configuration is valid.
INFO starting upstream monitor name="SHiP Node 1" upstream=127.0.0.1:18080
INFO listening for clients address=0.0.0.0 port=17000

Configuration

Configuration is a single JSON file (config.json by default). All *_ms fields are in milliseconds. Each upstream endpoint is host:port with no scheme; the router prepends ws:// itself.

Field	Type	Required	Default	Description
`listen_address`	string	yes	—	Bind address for client connections (e.g. `0.0.0.0`).
`listen_port`	u16	yes (non-zero)	`17000` (sample)	Port for client connections.
`upstream_reconnect_ms`	u64	yes (> 0)	—	Milliseconds between upstream reconnection attempts.
`upstream_monitoring_ms`	u64	yes (> 0)	—	Milliseconds between block-progress logging and staleness checks.
`upstream_status_ms`	u64	yes (> 0)	—	Milliseconds between status requests sent to each upstream.
`servers`	array	yes (≥ 1 enabled)	—	List of upstream SHiP nodes (see below).
`max_connections`	usize	no	`10000`	Max concurrent client connections; excess are rejected (backpressure).
`handshake_timeout_ms`	u64	no	`10000`	Client WebSocket handshake timeout; `0` disables it.
`idle_timeout_ms`	u64	no	`0` (disabled)	Close a connection that has been idle (no data in either direction) for this long.
`max_message_bytes`	usize	no	`268435456` (256 MiB)	Max WebSocket message size on both client and upstream links.
`shutdown_grace_ms`	u64	no	`5000`	How long to wait for in-flight connections to drain on shutdown.
`metrics_address`	string	no	falls back to `listen_address`	Bind address for the health/metrics HTTP endpoint.
`metrics_port`	u16	no	unset (endpoint disabled)	Port for the health/metrics HTTP endpoint. Setting it enables the endpoint.

Each entry in servers is an object:

Field	Type	Required	Description
`name`	string	yes	Human-readable name used in logs.
`endpoint`	string	yes	Upstream as `host:port` (no scheme; `ws://` is prepended). Endpoints must be unique.
`enabled`	bool	yes	Whether the router may use this upstream. At least one enabled server is required.

Sample `config.json`

{
  "listen_address": "0.0.0.0",
  "listen_port": 17000,
  "upstream_reconnect_ms": 3000,
  "upstream_monitoring_ms": 5000,
  "upstream_status_ms": 5000,
  "servers": [
    {
      "name": "SHiP Node 1",
      "endpoint": "127.0.0.1:18080",
      "enabled": true
    },
    {
      "name": "SHiP Node 2",
      "endpoint": "127.0.0.1:28080",
      "enabled": true
    }
  ]
}
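
To make the rules for servers concrete (host:port with no scheme, unique endpoints, at least one enabled entry), here is a dependency-free sketch of equivalent checks. It is an illustration only: the `Server` struct mirrors the table above, but `check_endpoint`, `check_servers`, and the error messages are hypothetical, and the checks performed by fleet-router config test may be stricter or worded differently.

```rust
use std::collections::HashSet;

/// Mirrors one entry of `servers` in config.json.
struct Server {
    name: String,
    endpoint: String,
    enabled: bool,
}

/// Checks that an endpoint is `host:port` with no scheme and a numeric port.
fn check_endpoint(endpoint: &str) -> Result<(), String> {
    if endpoint.contains("://") {
        return Err(format!("endpoint '{endpoint}' must not include a scheme"));
    }
    // Split on the last ':' so bracketed IPv6 hosts keep their inner colons.
    let (host, port) = endpoint
        .rsplit_once(':')
        .ok_or_else(|| format!("endpoint '{endpoint}' must be host:port"))?;
    if host.is_empty() {
        return Err(format!("endpoint '{endpoint}' has an empty host"));
    }
    port.parse::<u16>()
        .map(|_| ())
        .map_err(|_| format!("endpoint '{endpoint}' has an invalid port"))
}

/// Applies the per-list rules: valid endpoints, no duplicates,
/// and at least one enabled server.
fn check_servers(servers: &[Server]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for s in servers {
        check_endpoint(&s.endpoint).map_err(|e| format!("{}: {e}", s.name))?;
        if !seen.insert(s.endpoint.as_str()) {
            return Err(format!("duplicate endpoint '{}'", s.endpoint));
        }
    }
    if !servers.iter().any(|s| s.enabled) {
        return Err("at least one enabled server is required".to_string());
    }
    Ok(())
}
```

With the sample above, both entries pass; changing the second endpoint to 127.0.0.1:18080 would be rejected as a duplicate, and ws://127.0.0.1:18080 would be rejected for carrying a scheme.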

Usage

fleet-router config init <path>     Write a sample config file to <path>.
fleet-router config test <path>     Parse and validate a config, then test upstream connectivity.
fleet-router run [--config <path>]  Run the proxy.
fleet-router --version              Print the version.
fleet-router --help                 Print help.

The --config flag is global and defaults to ./config.json, so running fleet-router run from a directory that contains config.json is equivalent to passing --config ./config.json.

Observability

Logging

Logging uses tracing. Control verbosity with the RUST_LOG environment variable (default info):

# Everything at debug
RUST_LOG=debug fleet-router run --config config.json

# Debug for fleet-router, info for everything else
RUST_LOG=fleet_router=debug,info fleet-router run --config config.json

Health and metrics endpoint

Set metrics_port (and optionally metrics_address) to enable an HTTP endpoint. It serves GET requests on:

Route	Response
`/health`	`200` while the process is running.
`/ready`	`200` if at least one upstream is online, otherwise `503`.
`/metrics`	Prometheus text exposition of router and upstream state.

Exposed metrics include:

fleet_router_up
fleet_router_upstream_up{endpoint}
fleet_router_upstream_stale{endpoint}
fleet_router_active_connections{endpoint}
fleet_router_upstream_chain_state_end_block{endpoint}
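
For orientation, the sketch below renders the metric names listed above as Prometheus text exposition lines and derives the /ready status code. It is a rough illustration under stated assumptions: `UpstreamMetrics`, `render_metrics`, and `ready_status` are hypothetical names, the real endpoint may also emit HELP/TYPE comment lines or further metrics, and label values are assumed to be plain host:port strings that need no escaping.

```rust
use std::fmt::Write;

/// Hypothetical per-upstream values behind the metrics listed above.
struct UpstreamMetrics {
    endpoint: String,
    up: bool,
    stale: bool,
    active_connections: u64,
    chain_state_end_block: u64,
}

/// Render one sample line per metric and upstream in Prometheus text format.
fn render_metrics(upstreams: &[UpstreamMetrics]) -> String {
    let mut out = String::from("fleet_router_up 1\n");
    for u in upstreams {
        let e = &u.endpoint;
        // Writing to a String cannot fail, so the Results are ignored.
        // `{{` and `}}` are escaped braces around the label set.
        let _ = writeln!(out, "fleet_router_upstream_up{{endpoint=\"{e}\"}} {}", u.up as u8);
        let _ = writeln!(out, "fleet_router_upstream_stale{{endpoint=\"{e}\"}} {}", u.stale as u8);
        let _ = writeln!(out, "fleet_router_active_connections{{endpoint=\"{e}\"}} {}", u.active_connections);
        let _ = writeln!(
            out,
            "fleet_router_upstream_chain_state_end_block{{endpoint=\"{e}\"}} {}",
            u.chain_state_end_block
        );
    }
    out
}

/// Status code for /ready: 200 if at least one upstream is online, else 503.
fn ready_status(upstreams: &[UpstreamMetrics]) -> u16 {
    if upstreams.iter().any(|u| u.up) { 200 } else { 503 }
}
```

A sample line produced this way looks like `fleet_router_upstream_up{endpoint="127.0.0.1:18080"} 1`, which is the shape an alert on upstream availability would match against.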

Running in production

Do not expose fleet-router directly to the public internet. See Security and limitations.

systemd

[Unit]
Description=Fleet SHiP Router
After=network-online.target
Wants=network-online.target

[Service]
User=fleet
ExecStart=/usr/local/bin/fleet-router run --config /etc/fleet-router/config.json
Restart=on-failure
# SIGTERM triggers a graceful, bounded drain. Allow more than shutdown_grace_ms
# so systemd does not SIGKILL the process mid-drain.
TimeoutStopSec=10
Environment=RUST_LOG=info

[Install]
WantedBy=multi-user.target

SIGTERM (sent by systemctl stop) triggers graceful shutdown: the router stops accepting new connections, drains in-flight ones for up to shutdown_grace_ms, and sends WebSocket close frames to clients. Keep TimeoutStopSec comfortably larger than shutdown_grace_ms.
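
The "bounded drain" idea can be shown with the standard library alone. The sketch below is not the router's implementation (which runs on tokio); `drain` and its channel-based completion signal are hypothetical stand-ins that show why the wait never exceeds the grace period.

```rust
use std::sync::mpsc::Receiver;
use std::time::{Duration, Instant};

/// Waits until `expected` connections have signalled completion on `done`,
/// or until `grace` has elapsed, whichever comes first.
/// Returns the number of connections still open when the wait ended.
fn drain(done: &Receiver<()>, expected: usize, grace: Duration) -> usize {
    let deadline = Instant::now() + grace;
    let mut remaining = expected;
    while remaining > 0 {
        // Time left until the deadline; zero once the deadline has passed.
        let left = deadline.saturating_duration_since(Instant::now());
        match done.recv_timeout(left) {
            Ok(()) => remaining -= 1,
            // Timed out, or every sender was dropped: stop waiting.
            Err(_) => break,
        }
    }
    remaining
}
```

With the default shutdown_grace_ms of 5000, a wait like this ends after at most about five seconds, which is why TimeoutStopSec=10 in the unit above leaves headroom before systemd escalates to SIGKILL.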

Docker

docker run -p 17000:17000 \
  -v "$PWD/config.json:/etc/fleet-router.json:ro" \
  ghcr.io/eosrio/fleet-router run --config /etc/fleet-router.json

The image runs as a non-root user and defines a HEALTHCHECK against port 17000.

Tuning notes

Intervals (upstream_reconnect_ms, upstream_monitoring_ms, upstream_status_ms): lower values detect failures and staleness faster at the cost of more polling traffic to upstreams; raise them to reduce chatter.
Limits (max_connections, handshake_timeout_ms, idle_timeout_ms, max_message_bytes): size max_connections for your expected concurrency (excess connections are rejected, not queued); set idle_timeout_ms to reap dead clients; lower max_message_bytes only if you are sure your blocks fit, since oversized frames are rejected on both links.

Security and limitations

Transport is plaintext ws:// on both the client listener and the upstream connections. No TLS / wss:// is compiled in.
The client listener is unauthenticated — anyone who can reach the port can stream data.
rs_abieos (pure Rust) parses untrusted upstream data; treat your upstreams as part of the trust boundary.

Because of the above:

Deploy on a trusted or internal network, and/or behind a TLS-terminating reverse proxy (nginx, Caddy, Envoy) that adds access control.
Do not expose fleet-router directly to the public internet.

Contributing

Contributions are welcome. Please read CONTRIBUTING.md for the development setup, the CI checks, and the pre-PR checklist (cargo fmt --all, cargo clippy --workspace --all-targets --all-features -- -D warnings, cargo test --workspace, and a CHANGELOG.md entry). Tests run against an in-repo mock SHiP double and need no external services.

Security policy

To report a vulnerability, see SECURITY.md. Please do not open public issues for security reports.

Changelog

See CHANGELOG.md for release notes (Keep a Changelog format).

Code of conduct

This project follows the Code of Conduct in CODE_OF_CONDUCT.md.
