Gruf Relay is a lean and performant gRPC proxy server crafted to optimize resource usage within microservice architectures, especially those leveraging single-threaded languages like Ruby. In Kubernetes deployments, a common pattern involves sidecar containers, which inevitably consume resources alongside your primary application. By deploying Gruf Relay within a pod, you unlock the ability to run multiple Ruby workers (hosting your Gruf application) as sub-processes within a single container. Gruf Relay then intelligently load balances gRPC requests across these workers, maximizing CPU utilization within the pod, without the overhead of additional sidecars or pod scaling.
Key features include built-in load balancing, which ensures requests are evenly distributed across available workers, and robust health checks that contribute to high availability. Kubernetes-native readiness and liveness probes are present for seamless integration with orchestration platforms. Comprehensive metrics provide detailed insights into performance and resource consumption. A built-in request queue is also present, preventing request drops during bursts of traffic, enhancing overall system reliability. Ultimately, Gruf Relay amplifies pod capacity by enabling multiple Ruby instances to coexist within a single pod, sharing sidecars and substantially reducing overall resource demand.
- Features
- Benchmarks
- Installation
- Configuration
- Usage
- Architecture
- Example Usage
- Contributing
- License
- Random Load Balancing: Distribute requests across healthy backend instances.
- Health Checking: Regular gRPC health checks with configurable intervals.
- Dynamic Configuration: YAML config + environment variables support.
- Metrics Exposure: Prometheus metrics endpoint for monitoring.
- Worker Management: Automated worker lifecycle management.
- Horizontal Scaling: Easily scale backend worker instances.
- Structured Logging: JSON-formatted logs with configurable levels.
- Request Handling: Includes request queuing to help prevent request loss, offering enhanced reliability compared to a basic Gruf setup.
To ensure Gruf Relay provides exceptional performance and stability, a series of load tests were conducted using the k6 tool. Below are the results of comparison benchmarks between Gruf Relay, Gruf, and Gruf with Backlog Patch, each executed under identical conditions.
- System Configuration: MacOS 15.4.1, Apple M1 Pro CPU
- CPU and RAM Limits:
- CPU Limit: 500m for Gruf (default & backlog patch), 1500m for Gruf Relay.
- Memory Limit: 400Mi for Gruf (default & backlog patch), 1200Mi for Gruf Relay.
- Deployment Configuration:
- Gruf & Gruf with Patch: 6 pods with 5 threads each (30 total threads)
- Gruf Relay: 2 pods with 3 workers per pod, each worker using 5 threads (30 total threads)
- Testing Tool: k6
- Scenario:
- Ramp up to 60 users over 3 stages.
- Max 60 Virtual Users (VUs) during 1m50s run duration.
- Graceful cooldown and stop of 30 seconds each.
- Script:
k6.js
. - RPC Method Simulation: The benchmark simulates realistic workloads by calling an RPC method that includes both I/O bound operations and CPU-bound operations. This approach ensures a more accurate representation of real-world server load.
Test | Total Requests | Success Rate | Error Rate | p95 Latency |
---|---|---|---|---|
Gruf Relay | 6500 | 100% | 0% | 80.33ms |
Gruf | 6547 | 98.09% | 1.90% | 77.04ms |
Gruf with Backlog Patch | 6518 | 100% | 0% | 78.41ms |
-
Gruf Relay
-
Gruf (Default)
-
Gruf with Backlog Patch
- Successfully matched Gruf Relay's 100% success rate by mitigating dropped requests via custom request queuing.
- Introduced operational complexity due to the backlog patch, which requires compatibility checks for every minor version update of the
grpc
gem. - Slight increase in latency (~2ms) likely due to additional queuing logic.
Gruf Relay performed well in this benchmark, demonstrating:
- Zero errors and 100% success rate.
- Stable latency.
- Built-in queuing, eliminating the need for custom patches.
Gruf with Backlog Patch achieved comparable success, but adds operational complexity. Gruf (Default) exhibited errors under load due to insufficient request handling.
# Install with Go
go install github.com/bibendi/gruf-relay/cmd/gruf-relay@latest
# Or build from source
git clone https://github.com/bibendi/gruf-relay
cd gruf-relay
make build
Gruf Relay is also available as a Ruby gem containing the pre-compiled binary for different platforms:
# Install the gem
gem install gruf-relay
# Or add to your Gemfile
gem "gruf-relay"
After installation, you can run Gruf Relay using:
bundle exec gruf-relay
The gem packages binaries for multiple platforms including:
- Linux AMD64 (x86_64)
- Linux ARM64
- macOS AMD64 (x86_64)
- macOS ARM64 (Apple Silicon)
Gruf Relay prioritizes configuration in the following order:
- Environment Variables: Override configuration settings.
- Configuration File: If the
CONFIG_PATH
environment variable is set, Gruf Relay attempts to load the configuration from the specified YAML file. If not set, it defaults to searching for a file namedgruf-relay.yml
in the current directory. - Default Values: If a configuration value is not provided by either environment variables or the config file, the default value is used.
log:
level: "debug"
format: "json"
server:
host: "0.0.0.0"
port: 8080
proxy_timeout: "5s"
workers:
count: 2
start_port: 9000
metrics_path: "/metrics"
pool_size: 5
health_check:
interval: "5s"
timeout: "3s"
probes:
enabled: true
port: 5555
metrics:
enabled: true
port: 9394
path: "/metrics"
interval: "5s"
The following environment variables can be used to override settings in the config.yaml
file. Environment variables take precedence over the configuration file.
LOG_LEVEL
: Logging level (default:debug
). Possible values:debug
,info
,warn
,error
.LOG_FORMAT
: Logging format (default:json
). Possible values:json
,text
.SERVER_HOST
: Host address for the gRPC proxy (default:0.0.0.0
).SERVER_PORT
: Port for the gRPC proxy (default:8080
).SERVER_PROXY_TIMEOUT
: Timeout for proxy requests (default:5s
). Must be a valid duration string (e.g., "10s", "1m", "1m30s").HEALTH_CHECK_INTERVAL
: Interval for health checks (default:5s
). Must be a valid duration string (e.g., "10s", "1m", "1m30s").HEALTH_CHECK_TIMEOUT
: Timeout for health checks (default:3s
). Must be a valid duration string (e.g., "10s", "1m", "1m30s").WORKERS_COUNT
: Number of backend workers (default:2
).WORKERS_START_PORT
: Starting port for workers (default:9000
).WORKERS_METRICS_PATH
: Path for worker metrics endpoint (default:/metrics
).WORKERS_POOL_SIZE
: Size of the worker pool (default:5
).PROBES_ENABLED
: Enable/disable liveness/readiness probes (default:true
).PROBES_PORT
: Port for liveness/readiness probes (default:5555
).METRICS_ENABLED
: Enable/disable metrics exposure (default:true
).METRICS_PORT
: Port for Prometheus metrics (default:9394
).METRICS_PATH
: Path for Prometheus metrics (default:/metrics
).METRICS_INTERVAL
: Interval for metrics collection (default:5s
). Must be a valid duration string (e.g., "10s", "1m", "1m30s").
Example:
export SERVER_PORT=8081
export WORKERS_COUNT=3
export METRICS_ENABLED=true
export HEALTH_CHECK_INTERVAL=10s
# Start server with config
./gruf-relay
Endpoint | Port | Description |
---|---|---|
gRPC Proxy | 8080 | Main proxy endpoint |
Metrics | 9394 | Prometheus metrics |
Liveness Probe | 5555 | Kubernetes liveness check (/liveness ) |
Readiness Probe | 5555 | Kubernetes readiness check (/readiness ) |
Startup Probe | 5555 | Kubernetes startup check (/startup ) |
- Manager: Controls worker lifecycle
- Health Checker: Monitors worker availability
- Random Balancer: Distributes requests evenly
- Request Queue: Buffers incoming requests to prevent drops during high load.
- Metrics Server: Exposes Prometheus metrics
- Probes Server: Provides endpoints for liveness, readiness and startup probes.
To run with the example Gruf application:
# Run the Gruf Relay server with the example application
make run
This command will start Gruf Relay and the example Gruf application side-by-side, allowing you to test the proxy's functionality.
curl -v http://localhost:5555/startup
curl -v http://localhost:5555/readiness
curl -v http://localhost:5555/liveness
example/script/job
example/script/health_check
Contributions welcome!
MIT
Made with ❤️ by @bibendi