rdcp technical analysis

Doug Fennell edited this page Sep 23, 2025 · 2 revisions

RDCP: Technical Analysis and Architecture Overview

Executive Summary

RDCP (Runtime Debug Control Protocol) is a new infrastructure primitive for operational control of distributed applications. Traditional debugging changes usually require a code deployment; RDCP instead defines a standardized HTTP-based protocol for modifying runtime behavior in any compliant application.

Protocol Architecture

Core Discovery Pattern

RDCP implements a discovery-driven approach where applications self-describe their debugging capabilities:

# Universal discovery endpoint
GET /.well-known/rdcp
# Returns: protocol version, endpoints, capabilities, security levels

# Service-specific capabilities  
GET /rdcp/v1/discovery
# Returns: available debug categories, current states, descriptions

Standard Endpoints

All RDCP-compliant applications expose five required endpoints:

  • /.well-known/rdcp - Protocol discovery
  • /rdcp/v1/discovery - Component and category discovery
  • /rdcp/v1/control - Runtime behavior modification
  • /rdcp/v1/status - Operational status and metrics
  • /rdcp/v1/health - Health checking
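As a minimal sketch, a client could keep the five required paths in one table and derive full URLs from a service base URL. The names `RDCP_ENDPOINTS` and `rdcpUrl` are hypothetical and chosen for illustration; only the paths themselves come from the list above.

```typescript
// The five required endpoint paths from the list above. The short keys are
// illustrative names, not part of the protocol.
const RDCP_ENDPOINTS = {
  wellKnown: "/.well-known/rdcp",
  discovery: "/rdcp/v1/discovery",
  control: "/rdcp/v1/control",
  status: "/rdcp/v1/status",
  health: "/rdcp/v1/health",
} as const;

type RdcpEndpointName = keyof typeof RDCP_ENDPOINTS;

// Join a service base URL and an endpoint path, dropping any trailing
// slashes on the base so the result never contains "//" before the path.
function rdcpUrl(baseUrl: string, endpoint: RdcpEndpointName): string {
  return baseUrl.replace(/\/+$/, "") + RDCP_ENDPOINTS[endpoint];
}
```

Keeping the paths in a single table means a tool only needs the base URL of each service to reach every endpoint.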

Control Mechanism

Runtime control operates through standardized HTTP requests:

# Enable debugging categories ("duration" is an optional TTL)
POST /rdcp/v1/control
{
  "action": "enable",
  "categories": ["DATABASE", "AUTH"],
  "duration": "30m"
}

# Multi-tenant control
POST /rdcp/v1/tenants/{tenantId}/control
{
  "action": "enable", 
  "categories": ["PAYMENTS"],
  "duration": "1h"
}
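The request bodies above could be assembled and validated on the client before sending. The sketch below is one possible shape, not the SDK's API: `buildControlRequest`, `parseDurationMs`, and `controlPath` are hypothetical helpers, and the duration grammar (an integer followed by `s`, `m`, or `h`) is an assumption inferred from the `"30m"` and `"1h"` examples.

```typescript
interface ControlRequest {
  action: string;        // "enable" is the only action shown in this document
  categories: string[];
  duration?: string;     // optional TTL such as "30m" or "1h"
}

// Assumed duration units; only "m" and "h" appear in the examples above.
const UNIT_MS: Record<string, number> = { s: 1000, m: 60000, h: 3600000 };

// Convert a duration string such as "30m" into milliseconds.
function parseDurationMs(duration: string): number {
  const match = /^(\d+)([smh])$/.exec(duration);
  if (!match) throw new Error(`Unsupported duration: ${duration}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Build an "enable" request body, rejecting empty category lists and
// malformed durations before anything is sent over the network.
function buildControlRequest(categories: string[], duration?: string): ControlRequest {
  if (categories.length === 0) throw new Error("At least one category is required");
  if (duration === undefined) return { action: "enable", categories };
  parseDurationMs(duration); // throws if the duration is malformed
  return { action: "enable", categories, duration };
}

// Choose the global or tenant-scoped control path. The tenant id is
// percent-encoded so it cannot introduce extra path segments.
function controlPath(tenantId?: string): string {
  return tenantId === undefined
    ? "/rdcp/v1/control"
    : `/rdcp/v1/tenants/${encodeURIComponent(tenantId)}/control`;
}
```

For example, `buildControlRequest(["PAYMENTS"], "1h")` sent to `controlPath("acme")` corresponds to the multi-tenant request shown above, with `acme` standing in for a real tenant id.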

Infrastructure Control Plane

Classification within Infrastructure Stack

RDCP establishes a new control plane category:

  • Kubernetes = Container Control Plane
  • Istio = Service Mesh Control Plane
  • RDCP = Application Behavior Control Plane

Network Effect Architecture

The protocol's universality enables cross-system interoperability:

  • Any RDCP-aware admin tool works with any compliant service
  • Any RDCP-aware incident response system can control any compliant application
  • Any RDCP-aware compliance system can audit control changes on any compliant application
  • Any RDCP-aware monitoring tool can dynamically enable debugging

Standards Positioning

RDCP addresses a gap in infrastructure standards:

HTTP   → Web communication standard
gRPC   → RPC communication standard
OTEL   → Observability data standard
RDCP   → Operational control standard

Universal Service Discovery (environment-wide inventory)

Any RDCP-compliant service exposes a well-known discovery endpoint that describes its protocol support and capabilities. This is intended to be machine-readable so tools can enumerate, reason about, and (subject to policy) act upon what services can do at runtime.

# Any service can advertise what it supports at runtime
curl -sS https://payment-service/.well-known/rdcp
curl -sS https://user-service/.well-known/rdcp
curl -sS https://auth-service/.well-known/rdcp

# Category-level details are available via service discovery
curl -sS https://payment-service/rdcp/v1/discovery

What this enables in practice:

  • Environment-wide inventory: a collector can query a list of service base URLs (from DNS, service registry, Kubernetes, config management, etc.), call /.well-known/rdcp and /rdcp/v1/discovery, and assemble a current map of controllable categories and states per service.
  • Policy-aware visibility: combined with multi-tenancy and authentication, the same endpoint can present tenant-scoped capabilities. With no credentials you may see generic metadata; with appropriate tenant-scoped credentials you see the subset applicable to that tenant.
  • Operational safety: discovery is read-only and subject to the same rate limiting and auth as other endpoints. Control operations remain gated behind authentication/authorization and are audited.

Typical usage pattern for a control or inventory tool:

  1. Obtain a list of candidate services (by environment, cluster, namespace, or tag).
  2. For each service, GET /.well-known/rdcp to confirm protocol support and collect endpoints, security levels, and capabilities.
  3. Optionally, GET /rdcp/v1/discovery for category-level details (debug categories, current state, descriptions).
  4. Persist the results as an inventory for incident response, audits, or change planning.
  5. When a change is required, issue a control request to /rdcp/v1/control with tenant-scoped credentials and respect RateLimit headers and audit policies.
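Steps 1 through 4 above can be sketched as a small collector. This is an illustration under stated assumptions: `GetJson`, `InventoryEntry`, `inspectService`, and `collectInventory` are hypothetical names, the HTTP client is injected rather than chosen, and the discovery documents are stored as opaque JSON because their exact fields are not specified here.

```typescript
// Minimal JSON getter so the sketch does not depend on a specific HTTP client.
// It should reject on network errors and non-success responses.
type GetJson = (url: string) => Promise<unknown>;

interface InventoryEntry {
  baseUrl: string;
  supported: boolean;   // false when the /.well-known/rdcp request failed
  wellKnown?: unknown;  // raw /.well-known/rdcp document
  discovery?: unknown;  // raw /rdcp/v1/discovery document, when available
}

// Steps 2 and 3: confirm protocol support, then try category-level discovery.
async function inspectService(baseUrl: string, getJson: GetJson): Promise<InventoryEntry> {
  const root = baseUrl.replace(/\/+$/, "");
  let wellKnown: unknown;
  try {
    wellKnown = await getJson(`${root}/.well-known/rdcp`);
  } catch {
    return { baseUrl, supported: false };
  }
  try {
    const discovery = await getJson(`${root}/rdcp/v1/discovery`);
    return { baseUrl, supported: true, wellKnown, discovery };
  } catch {
    // Category-level discovery is optional; keep the well-known result.
    return { baseUrl, supported: true, wellKnown };
  }
}

// Steps 1 and 4: take a list of candidate base URLs and return one entry per
// service, ready to be persisted as an inventory.
async function collectInventory(baseUrls: string[], getJson: GetJson): Promise<InventoryEntry[]> {
  return Promise.all(baseUrls.map((url) => inspectService(url, getJson)));
}
```

Injecting `getJson` keeps authentication, timeouts, and rate-limit handling in one place, and a failed lookup marks a service as unsupported instead of aborting the whole scan.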

Scope and boundaries:

  • RDCP does not provide service location or discovery of hostnames; it standardizes capability discovery and control semantics once you know where a service is.
  • Responses reflect deployment policy: authentication, authorization, multi-tenancy, rate limits, and audit configuration are all enforced by each service.
  • This is transport- and framework-agnostic: any service that can expose HTTP endpoints can implement RDCP, regardless of language or runtime.

Why it matters:

  • Observability answers "what is happening"; RDCP surfaces "what is controllable now" and provides a standard way to change behavior safely.
  • This reduces bespoke admin interfaces and per-service scripts by providing a consistent protocol surface for discovery and control across services and environments.

Implementation Analysis

Test Coverage Verification

The implementation is covered by 34 test suites with 222 passing tests. The areas they verify are listed under the headings that follow.

Framework Universality

  • ✅ Express.js adapter with middleware integration
  • ✅ Fastify plugin and middleware patterns
  • ✅ Koa middleware integration
  • ✅ Next.js App Router implementation
  • ✅ Cross-adapter header validation and rate limiting

Multi-Tenancy Implementation

  • ✅ Tenant-scoped RBAC with JWT scopes
  • ✅ Tenant-specific control endpoints (/rdcp/v1/tenants/{id}/control)
  • ✅ Isolation level support (global, process, namespace, organization)
  • ✅ Tenant context in all responses

Enterprise Security

  • ✅ Basic security (API key with 32+ character requirement)
  • ✅ Standard security (JWT with scopes: discovery, status, control, admin)
  • ✅ Enterprise security (mTLS + JWT hybrid)
  • ✅ Certificate validation and subject matching
  • ✅ Fallback behavior for hybrid authentication
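The first two security levels reduce to simple checks that can be sketched as follows. This is not the SDK's implementation: the helper names are hypothetical, the space-delimited scope string is an assumption borrowed from the OAuth 2.0 `scope` convention, and signature verification of the JWT itself is out of scope for this sketch.

```typescript
// Scope names taken from the "Standard security" item above.
const RDCP_SCOPES = ["discovery", "status", "control", "admin"] as const;
type RdcpScope = (typeof RDCP_SCOPES)[number];

// Minimum key length from the "Basic security" item above.
const MIN_API_KEY_LENGTH = 32;

// Reject API keys shorter than the 32-character minimum.
function isAcceptableApiKey(key: string): boolean {
  return key.length >= MIN_API_KEY_LENGTH;
}

// Parse a space-delimited scope string (assumed format) and drop any
// value that is not one of the four known scopes.
function parseScopes(scopeClaim: string): RdcpScope[] {
  const known = new Set<string>(RDCP_SCOPES);
  return scopeClaim.split(/\s+/).filter((s): s is RdcpScope => known.has(s));
}

// Plain membership check. Whether "admin" implies the other scopes is a
// policy decision this sketch deliberately does not make.
function isAllowed(granted: RdcpScope[], required: RdcpScope): boolean {
  return granted.includes(required);
}
```

A control handler would call `isAllowed(parseScopes(claim), "control")` only after the token has been verified against the service's keys.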

Production Features

  • ✅ Temporary controls with automatic TTL expiration
  • ✅ Rate limiting with RFC-compliant headers
  • ✅ Audit trails with compliance metadata
  • ✅ JWKS client cache with ETag/304 optimization
  • ✅ Request deduplication and background refresh
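Temporary controls with automatic TTL expiration can be illustrated with a small in-memory store. This is a sketch of the general technique, not the project's code: `TemporaryControls` is a hypothetical class, expiry is checked lazily on read rather than by a timer, and the clock is injectable so the behavior can be tested without waiting.

```typescript
class TemporaryControls {
  // category name -> expiry time in epoch milliseconds
  private readonly expiresAt = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Enable a category for ttlMs milliseconds from the current time.
  enable(category: string, ttlMs: number): void {
    this.expiresAt.set(category, this.now() + ttlMs);
  }

  // A category is enabled only until its expiry time; expired entries are
  // removed the first time they are read.
  isEnabled(category: string): boolean {
    const expiry = this.expiresAt.get(category);
    if (expiry === undefined) return false;
    if (this.now() >= expiry) {
      this.expiresAt.delete(category);
      return false;
    }
    return true;
  }
}
```

Lazy expiry avoids background timers, at the cost of keeping stale entries in memory until they are next read.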

JWKS Infrastructure

Recent enhancements include enterprise-grade JWT infrastructure:

  • Performance: Inflight request deduplication prevents thundering herd
  • Caching: Pluggable cache stores (file, Redis, DynamoDB)
  • Optimization: ETag revalidation and background refresh
  • Multi-Instance: Shared cache support for distributed deployments
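Inflight request deduplication, as named in the Performance item above, means that concurrent callers asking for the same resource share one pending request. The sketch below shows the general pattern; `InflightDeduplicator` is a hypothetical name and this is not the JWKS client's actual code.

```typescript
class InflightDeduplicator<T> {
  // key -> promise for the request currently in progress
  private readonly inflight = new Map<string, Promise<T>>();

  // Return the pending promise for this key if one exists; otherwise start
  // the load and remember it until it settles (success or failure).
  run(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.inflight.get(key);
    if (existing) return existing;
    const pending = load().finally(() => {
      this.inflight.delete(key);
    });
    this.inflight.set(key, pending);
    return pending;
  }
}
```

With this in place, a burst of token validations that all need the same key set triggers a single fetch, and the entry is cleared once that fetch settles so later calls can refresh.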

Use Case Analysis

Incident Response Integration

// Production incident management
const services = await discoverRDCPServices()
const affectedServices = services.filter(s => 
  s.categories.includes("DATABASE")
)

await Promise.all(
  affectedServices.map(svc => 
    svc.enableDebug("DATABASE", "2h", {
      auditTrail: true,
      rateLimit: true
    })
  )
)

Multi-Cloud Operations

// Cross-cloud debugging coordination
const [awsServices, gcpServices, azureServices] = await Promise.all([
  discoverRDCP("*.aws.company.com"),
  discoverRDCP("*.gcp.company.com"), 
  discoverRDCP("*.azure.company.com")
])

// Enable payment debugging across all clouds
const paymentServices = [
  ...awsServices,
  ...gcpServices, 
  ...azureServices
].filter(s => s.categories.includes("PAYMENTS"))

await Promise.all(
  paymentServices.map(svc => 
    svc.enableDebug("PAYMENTS", "1h")
  )
)

Kubernetes Integration

# Potential operator integration
apiVersion: rdcp.dev/v1
kind: DebugSession
metadata:
  name: database-investigation
spec:
  selector:
    app: payment-service
  categories: ["DATABASE", "QUERIES"]
  duration: "1h"
  auditTrail: true

Protocol Compliance

RDCP v1.0 Implementation Status

  • ✅ All required endpoints with correct response formats
  • ✅ All three authentication security levels supported
  • ✅ Multi-tenancy with standard header support
  • ✅ Protocol-compliant error handling
  • ✅ Client & Server SDKs with TypeScript support
  • ✅ Temporary controls (TTL) in core implementation

Framework Support

Universal middleware integration:

  • Express 4.18+ with adapters.express.createRDCPMiddleware
  • Fastify 4.0+ with plugin patterns
  • Koa 2.0+ with middleware functions
  • Next.js with App Router route handlers

Package Distribution

  • Server SDK: @rdcp.dev/server
  • Client SDK: Full protocol-compliant client with convenience methods
  • TypeScript: Complete type definitions included
  • Zero Configuration: Works with sensible defaults

Technical Differentiators

Compared to Traditional Approaches

Traditional debugging requires:

  • Code modifications and deployments
  • 30+ minute CI/CD pipeline delays
  • Risk of introducing bugs during incidents
  • Manual cleanup of debug configurations

RDCP provides:

  • Zero-deployment runtime control
  • Sub-second response times
  • Automatic cleanup with TTL
  • Standardized audit trails

Compared to Configuration Management

Feature flags and configuration systems focus on business logic, while RDCP addresses operational control:

  • Real-time debugging category management
  • Incident-specific temporary controls
  • Cross-service debugging coordination
  • Production-safe automatic expiration

Future Integration Patterns

Service Mesh Integration

// Envoy/Istio integration potential
await serviceMesh.enableRDCPDebug({
  selector: "version=v2",
  categories: ["HTTP_REQUESTS"],
  duration: "30m"
})

Observability Platform Integration

// Datadog/New Relic integration potential
await observabilityPlatform.correlateWithTraces({
  rdcpCategories: ["DATABASE"],
  traceFilters: { service: "payments" }
})

Infrastructure as Code

# Terraform provider potential
resource "rdcp_debug_policy" "incident_response" {
  services   = ["payment-*", "user-*"]
  categories = ["DATABASE", "AUTH"]
  triggers   = ["alert:high_error_rate"]
  duration   = "1h"
}

Conclusion

RDCP establishes a new infrastructure primitive that addresses the operational control gap in modern distributed systems. Through standardized endpoints, universal discovery, and protocol compliance, it enables a new class of operational tooling that works across any compliant application regardless of implementation details.

The test coverage (34 suites, 222 passing tests) and multi-framework support point toward production readiness, while the extensible architecture leaves room for future integration with container orchestration, service mesh, and observability platforms.
