NeuralBudget v0.2.0 - Production-Ready SLO Platform
Welcome to NeuralBudget v0.2.0! This release includes the complete feature set for production SLO evaluation with Apache-2.0 licensing.
🎉 Major Features
Command-Line Interface (NEW)
- neuralbudget binary with 4 powerful subcommands:
  - eval - Evaluate SLOs against metrics with human-readable and JSON output
  - gen-rules - Generate Prometheus alerting rules in YAML and Kubernetes CRD formats
  - check - Validate SLO configurations with strict mode and detailed error reporting
  - serve - HTTP server mode (preview, full release in v0.3)
- Multi-platform builds: Linux (x86_64, ARM64), macOS (Intel, Apple Silicon), Windows
- Docker multi-stage build support for optimized distribution
GenAI Quality Features (Complete Suite)
- LLM-as-Judge - Reference-free quality evaluation with cached embeddings
- Hallucination Detection - Groundedness-based quality SLOs
- Cost-Based SLOs - Token usage budgets and cost control
- Agent Reliability - Track LLM agent steps, tools, loops, and success rates
- TTFT SLO (NEW) - Time to First Token tracking for streaming responses with inter-token latency metrics
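To make the TTFT and inter-token latency metrics concrete, here is a minimal sketch of how they can be derived from token arrival timestamps. It is not the NeuralBudget API: the function names `streaming_latency` and `ttft_slo_met`, and the threshold and target parameters, are hypothetical and chosen only for illustration.

```python
from statistics import mean


def streaming_latency(request_sent: float, token_times: list[float]) -> dict:
    """Return TTFT and mean inter-token latency (seconds) for one streamed response.

    request_sent: timestamp at which the request was issued.
    token_times: arrival timestamps of each token, in order.
    """
    if not token_times:
        raise ValueError("no tokens received")
    # Time to First Token: delay between sending the request and the first token.
    ttft = token_times[0] - request_sent
    # Inter-token latency: gaps between consecutive token arrivals.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return {"ttft": ttft, "mean_inter_token": mean(gaps) if gaps else 0.0}


def ttft_slo_met(ttfts: list[float], threshold_s: float, target: float) -> bool:
    """True if at least `target` fraction of requests had TTFT <= threshold_s."""
    if not ttfts:
        raise ValueError("no requests observed")
    good = sum(1 for t in ttfts if t <= threshold_s)
    return good / len(ttfts) >= target
```

For example, an SLO of "75% of requests see a first token within 0.5 s" would be checked with `ttft_slo_met(ttfts, threshold_s=0.5, target=0.75)`.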
Composite DAG Evaluation
- Model relationships between services, not just individual SLOs
- Automatic failure propagation through dependency graph
- System-wide health scoring
- Detect cascading failures before they impact users
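The idea of failure propagation through a dependency graph can be sketched as follows. This is an illustrative simplification, not NeuralBudget's implementation: it assumes a service counts as healthy only if its own SLO is met and every dependency is healthy, and it uses the fraction of healthy services as a stand-in for a system-wide health score. The names `propagate_health` and `health_score` are hypothetical.

```python
from graphlib import TopologicalSorter


def propagate_health(deps: dict[str, set[str]], own_ok: dict[str, bool]) -> dict[str, bool]:
    """Compute effective health for each service in a dependency DAG.

    deps maps each service to the set of services it depends on.
    own_ok maps each service to whether its own SLO is currently met
    (assumed to contain an entry for every service in the graph).
    """
    effective: dict[str, bool] = {}
    # static_order() yields dependencies before the services that depend on them,
    # so every dependency's effective health is known when it is needed.
    for svc in TopologicalSorter(deps).static_order():
        effective[svc] = own_ok[svc] and all(effective[d] for d in deps.get(svc, ()))
    return effective


def health_score(effective: dict[str, bool]) -> float:
    """Fraction of services that are effectively healthy (0.0 to 1.0)."""
    return sum(effective.values()) / len(effective) if effective else 1.0


# Hypothetical topology: api depends on auth and db; auth depends on db.
deps = {"api": {"auth", "db"}, "auth": {"db"}, "db": set()}
status = propagate_health(deps, {"api": True, "auth": True, "db": False})
```

In this example a failing `db` marks `auth` and `api` as unhealthy too, even though their own SLOs are met, which is the kind of cascade a per-service view would miss.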
Streaming & Performance
- High-frequency metric collection: 15,000+ samples per second
- Adaptive windowing: automatic memory management
- Zero-copy streaming aggregator
- Sub-microsecond evaluation latency
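As a rough illustration of bounded-memory windowed aggregation, the sketch below keeps a running sum over a time window with a hard cap on retained samples. It is a simplification under stated assumptions: the cap here is fixed rather than adaptive, the class name `SlidingWindow` is hypothetical, and nothing in it reflects NeuralBudget's internal aggregator or its performance characteristics.

```python
from collections import deque


class SlidingWindow:
    """Running mean over the last `window_s` seconds, capped at `max_samples`."""

    def __init__(self, window_s: float, max_samples: int):
        self.window_s = window_s
        self.samples: deque[tuple[float, float]] = deque(maxlen=max_samples)
        self.total = 0.0

    def add(self, ts: float, value: float) -> None:
        # If the deque is full, append() will silently drop the oldest sample,
        # so subtract its value from the running sum first.
        if len(self.samples) == self.samples.maxlen:
            self.total -= self.samples[0][1]
        self.samples.append((ts, value))
        self.total += value
        self._expire(ts)

    def _expire(self, now: float) -> None:
        # Drop samples that have fallen out of the time window.
        while self.samples and self.samples[0][0] <= now - self.window_s:
            _, old = self.samples.popleft()
            self.total -= old

    def mean(self) -> float:
        return self.total / len(self.samples) if self.samples else 0.0
```

Maintaining the sum incrementally keeps each `add` call cheap regardless of window size; an adaptive variant would adjust `max_samples` or `window_s` based on the observed sample rate.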
Standards & Compatibility
- Apache-2.0 Licensed - Fully open source, with commercial use permitted
- OpenSLO Compatible - Bidirectional conversion (parse and generate)
- Prometheus Native - OTLP ingestion and Prometheus exporter
- 87%+ Test Coverage - Comprehensive unit and integration tests
📊 SLO Evaluation Modes
All modes in a single tool:
- HTTP/gRPC - P50/P99 latency, availability, error rate
- Stateful Services - Replication lag, queue depth, saturation
- ML Serving - Latency, GPU utilization, model drift, accuracy
- GenAI Workloads - TTFT, throughput, semantic quality, cost
- Composite DAG - Cross-service dependencies, system-wide health
📈 Multi-Burn-Rate Alerting
Google SRE-inspired alerting strategy:
- Automatic recording rule generation
- Multi-window burn rate calculation (1h, 6h, 24h, 3d)
- Configurable alert severity levels (Info, Warning, Critical)
- Kubernetes PrometheusRule CRD support
- Validation of PromQL expressions
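The burn-rate arithmetic behind this strategy can be sketched as below. Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target), so a burn rate of 1 spends the budget exactly over the SLO period. The windows match the list above, but the thresholds and the severity assigned to each window are assumptions for illustration (values commonly seen in SRE-style setups), not NeuralBudget's defaults; `burn_rate`, `firing_alerts`, and `THRESHOLDS` are hypothetical names.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than sustainable the error budget is being spent."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


# (window, burn-rate threshold, severity) -- illustrative values only.
THRESHOLDS = [
    ("1h", 14.4, "Critical"),
    ("6h", 6.0, "Critical"),
    ("24h", 3.0, "Warning"),
    ("3d", 1.0, "Info"),
]


def firing_alerts(error_ratios: dict[str, float], slo_target: float) -> list[tuple[str, str]]:
    """Return (window, severity) for each window whose burn rate meets its threshold.

    error_ratios maps a window name to the error ratio observed over that window.
    """
    return [
        (window, severity)
        for window, threshold, severity in THRESHOLDS
        if burn_rate(error_ratios[window], slo_target) >= threshold
    ]


# Example: 99.9% availability SLO, with a sharp error spike in the last hour.
observed = {"1h": 0.02, "6h": 0.004, "24h": 0.002, "3d": 0.0015}
alerts = firing_alerts(observed, slo_target=0.999)
```

Here the 1h window burns budget roughly 20 times faster than sustainable and fires, while the 6h and 24h windows stay under their thresholds. Production setups often also require a shorter companion window to exceed the same threshold before alerting, which this sketch omits.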
🚀 What's New vs v0.1.3
- ✅ Production CLI tool with 4 subcommands
- ✅ TTFT SLO for streaming GenAI responses
- ✅ Multi-burn-rate alerting with thresholds
- ✅ OpenSLO conversion and compatibility
- ✅ Prometheus rule generation
- ✅ Apache-2.0 relicensing
- ✅ Enhanced documentation with deployment guides
📦 Downloads
Pre-built binaries available for:
- Linux x86_64, ARM64
- macOS Intel, Apple Silicon
- Windows x86_64
Python wheels available on PyPI: pip install neuralbudget==0.2.0
📚 Documentation
- Getting Started - 10-minute quickstart
- User Guide - Complete feature guide
- CLI User Guide - Command-line tool documentation
- Production Deployment - Deploy to production
- GenAI Features - LLM quality SLOs
🐛 Known Issues
None currently. If you find an issue, please open a GitHub issue.
📄 Full Changelog
See CHANGELOG.md for complete details.
License: Apache-2.0 | Repository: https://github.com/pristley/NeuralBudget