NeuralBudget v0.2.0 - Production-Ready SLO Platform

pristley released this 27 Jun 10:37

Welcome to NeuralBudget v0.2.0! This release delivers the full feature set for production SLO evaluation and moves the project to the Apache-2.0 license.

🎉 Major Features

Command-Line Interface (NEW)

  • neuralbudget binary with four subcommands:
    • eval - Evaluate SLOs against metrics with human-readable and JSON output
    • gen-rules - Generate Prometheus alerting rules in YAML and Kubernetes CRD formats
    • check - Validate SLO configurations with strict mode and detailed error reporting
    • serve - HTTP server mode (preview; full release planned for v0.3)
  • Multi-platform builds: Linux (x86_64, ARM64), macOS (Intel, Apple Silicon), Windows
  • Docker multi-stage build support for optimized distribution

GenAI Quality Features (Complete Suite)

  • LLM-as-Judge - Reference-free quality evaluation with cached embeddings
  • Hallucination Detection - Groundedness-based quality SLOs
  • Cost-Based SLOs - Token usage budgets and cost control
  • Agent Reliability - Track LLM agent steps, tools, loops, and success rates
  • TTFT SLO (NEW) - Time to First Token tracking for streaming responses with inter-token latency metrics
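
To make the TTFT item concrete, here is a minimal sketch of how time to first token and inter-token latency can be derived from the arrival timestamps of a streamed response, and how a threshold-style SLO could be checked against them. It does not use the NeuralBudget API: the function names `ttft_and_inter_token_latency` and `meets_ttft_slo`, and the threshold and target values, are hypothetical and chosen only for illustration.

```python
from statistics import mean


def ttft_and_inter_token_latency(request_sent_at, token_times):
    """Return (ttft, mean inter-token latency) in seconds.

    Hypothetical helper, not part of NeuralBudget.
    request_sent_at: timestamp at which the request was sent.
    token_times: arrival timestamps of each streamed token, ascending.
    """
    if not token_times:
        raise ValueError("no tokens received")
    # TTFT is the delay between sending the request and the first token.
    ttft = token_times[0] - request_sent_at
    # Inter-token latency is the gap between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    inter_token = mean(gaps) if gaps else 0.0
    return ttft, inter_token


def meets_ttft_slo(ttfts, threshold_s, target_ratio):
    """True if at least target_ratio of requests had TTFT <= threshold_s."""
    if not ttfts:
        raise ValueError("no requests observed")
    good = sum(1 for t in ttfts if t <= threshold_s)
    return good / len(ttfts) >= target_ratio


if __name__ == "__main__":
    ttft, itl = ttft_and_inter_token_latency(10.0, [10.5, 10.75, 11.0])
    print(f"ttft={ttft:.3f}s inter_token={itl:.3f}s")
    # Assumed SLO for illustration: 75% of requests see a first token within 0.5 s.
    print(meets_ttft_slo([0.2, 0.4, 0.9, 0.3], threshold_s=0.5, target_ratio=0.75))
```

Tracking the two numbers separately is useful because a stream can start quickly and still stall between tokens, or the reverse.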

Composite DAG Evaluation

  • Model relationships between services, not just individual SLOs
  • Automatic failure propagation through dependency graph
  • System-wide health scoring
  • Detect cascading failures before they impact users
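
The idea of propagating failures through a dependency graph can be sketched as a breadth-first walk over reversed dependency edges. This is an illustration only, not NeuralBudget's implementation: the graph representation, the function names `propagate_failures` and `health_score`, and the scoring formula (the fraction of services not impacted) are all assumptions.

```python
from collections import deque


def propagate_failures(depends_on, failing):
    """Return every service that is failing or transitively depends on a failing one.

    Hypothetical helper, not part of NeuralBudget.
    depends_on: dict mapping a service name to the list of services it calls.
    failing: iterable of services whose own SLO is currently violated.
    """
    # Reverse the edges so we can walk from a dependency to its dependents.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    impacted = set(failing)
    queue = deque(impacted)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted


def health_score(depends_on, failing):
    """Assumed scoring rule: fraction of known services that are not impacted."""
    services = set(depends_on) | {d for deps in depends_on.values() for d in deps}
    if not services:
        return 1.0
    impacted = propagate_failures(depends_on, failing) & services
    return 1.0 - len(impacted) / len(services)


if __name__ == "__main__":
    graph = {"web": ["api"], "api": ["db", "cache"], "batch": ["db"], "db": [], "cache": []}
    print(sorted(propagate_failures(graph, {"cache"})))
    print(round(health_score(graph, {"cache"}), 2))
```

In this example a failing `cache` marks `api` and `web` as impacted while `batch` and `db` stay healthy, which is the kind of cascade a per-service SLO view would miss.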

Streaming & Performance

  • High-frequency metric collection: 15,000+ samples per second
  • Adaptive windowing: automatic memory management
  • Zero-copy streaming aggregator
  • Sub-microsecond evaluation latency

Standards & Compatibility

  • Apache-2.0 Licensed - Fully open source, with commercial use permitted
  • OpenSLO Compatible - Bidirectional conversion (parse and generate)
  • Prometheus Native - OTLP ingestion and Prometheus exporter
  • 87%+ Test Coverage - Comprehensive unit and integration tests

📊 SLO Evaluation Modes

All modes in a single tool:

  • HTTP/gRPC - P50/P99 latency, availability, error rate
  • Stateful Services - Replication lag, queue depth, saturation
  • ML Serving - Latency, GPU utilization, model drift, accuracy
  • GenAI Workloads - TTFT, throughput, semantic quality, cost
  • Composite DAG - Cross-service dependencies, system-wide health
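
As a rough illustration of what the HTTP/gRPC mode measures, the sketch below computes nearest-rank latency percentiles and an availability ratio from raw samples, then compares them with targets. It is not the NeuralBudget API: `percentile`, `availability`, and the target values are hypothetical, and nearest-rank is just one of several common percentile definitions.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.

    Hypothetical helper, not part of NeuralBudget.
    """
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def availability(status_codes):
    """Fraction of requests that did not end in a 5xx server error."""
    if not status_codes:
        raise ValueError("no requests")
    good = sum(1 for code in status_codes if code < 500)
    return good / len(status_codes)


if __name__ == "__main__":
    latencies_ms = list(range(1, 101))  # synthetic samples: 1 ms .. 100 ms
    codes = [200] * 98 + [503, 500]     # synthetic responses: 2 failures in 100
    # Assumed targets for illustration: P99 <= 150 ms, availability >= 99%.
    print(percentile(latencies_ms, 50), percentile(latencies_ms, 99))
    print(percentile(latencies_ms, 99) <= 150, availability(codes) >= 0.99)
```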

📈 Multi-Burn-Rate Alerting

Google SRE-inspired alerting strategy:

  • Automatic recording rule generation
  • Multi-window burn rate calculation (1h, 6h, 24h, 3d)
  • Configurable alert severity levels (Info, Warning, Critical)
  • Kubernetes PrometheusRule CRD support
  • Validation of PromQL expressions
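
The burn rate behind these alerts is the observed error ratio divided by the error budget (1 minus the SLO target). The sketch below works through that arithmetic for the four windows listed above. It is not NeuralBudget's implementation: the function names are hypothetical, and the thresholds are assumptions. The values 14.4 (1h) and 6 (6h) follow the Google SRE Workbook's example for a 30-day SLO period, while the 24h and 3d thresholds and all severity assignments are illustrative choices.

```python
def burn_rate(error_ratio, slo_target):
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def budget_consumed(rate, window_hours, period_hours=720):
    """Fraction of the period's budget spent if `rate` holds for the whole window.

    period_hours defaults to 720 (a 30-day SLO period, an assumption).
    """
    return rate * window_hours / period_hours


def firing_alerts(error_ratios, slo_target, thresholds):
    """Return (window, severity, rate) for each window whose burn rate meets its threshold.

    Hypothetical helper, not part of NeuralBudget.
    error_ratios: window label -> observed error ratio over that window.
    thresholds: window label -> (burn-rate threshold, severity label).
    """
    alerts = []
    for window, (limit, severity) in thresholds.items():
        rate = burn_rate(error_ratios[window], slo_target)
        if rate >= limit:
            alerts.append((window, severity, rate))
    return alerts


if __name__ == "__main__":
    # Assumed thresholds; see the lead-in for which values are illustrative.
    thresholds = {
        "1h": (14.4, "Critical"),
        "6h": (6.0, "Critical"),
        "24h": (3.0, "Warning"),
        "3d": (1.0, "Info"),
    }
    observed = {"1h": 0.02, "6h": 0.004, "24h": 0.002, "3d": 0.0015}
    for window, severity, rate in firing_alerts(observed, 0.999, thresholds):
        print(f"{severity}: {window} burn rate {rate:.1f}")
    # A 14.4x burn sustained for one hour spends about 2% of a 30-day budget.
    print(f"{budget_consumed(14.4, 1):.3f}")
```

Pairing a short window with a high threshold and a long window with a low threshold lets fast, severe burns page quickly while slow burns still surface before the budget runs out.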

🚀 What's New vs v0.1.3

  • ✅ Production CLI tool with 4 subcommands
  • ✅ TTFT SLO for streaming GenAI responses
  • ✅ Multi-burn-rate alerting with thresholds
  • ✅ OpenSLO conversion and compatibility
  • ✅ Prometheus rule generation
  • ✅ Apache-2.0 relicensing
  • ✅ Enhanced documentation with deployment guides

📦 Downloads

Pre-built binaries available for:

  • Linux x86_64, ARM64
  • macOS Intel, Apple Silicon
  • Windows x86_64

Python wheels available on PyPI: pip install neuralbudget==0.2.0

📚 Documentation

🐛 Known Issues

None currently. If you run into a problem, please open a GitHub issue.

📄 Full Changelog

See CHANGELOG.md for complete details.


License: Apache-2.0 | Repository: https://github.com/pristley/NeuralBudget