NeuralBudget v0.2.0 - Production-Ready SLO Platform

pristley released this 27 Jun 10:37

8d42f3f

Welcome to NeuralBudget v0.2.0! This release delivers the complete feature set for production SLO evaluation and is licensed under Apache-2.0.

🎉 Major Features

Command-Line Interface (NEW)

The neuralbudget binary ships with four subcommands:
- eval - Evaluate SLOs against metrics with human-readable and JSON output
- gen-rules - Generate Prometheus alerting rules in YAML and Kubernetes CRD formats
- check - Validate SLO configurations with strict mode and detailed error reporting
- serve - HTTP server mode (preview; full release planned for v0.3)
Multi-platform builds: Linux (x86_64, ARM64), macOS (Intel, Apple Silicon), Windows (x86_64)
Docker multi-stage build support for optimized distribution

GenAI Quality Features (Complete Suite)

LLM-as-Judge - Reference-free quality evaluation with cached embeddings
Hallucination Detection - Groundedness-based quality SLOs
Cost-Based SLOs - Token usage budgets and cost control
Agent Reliability - Track LLM agent steps, tools, loops, and success rates
TTFT SLO (NEW) - Time to First Token tracking for streaming responses with inter-token latency metrics
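
To illustrate what a TTFT SLO measures, here is a minimal standalone sketch. It does not use the NeuralBudget API; the function names, the threshold, and the target are hypothetical and chosen only for illustration. It derives time to first token and inter-token latencies from token arrival timestamps, then checks what fraction of requests met a TTFT threshold.

```python
from typing import List, Tuple


def ttft_and_inter_token(request_start: float,
                         token_times: List[float]) -> Tuple[float, List[float]]:
    """Return (time to first token, gaps between consecutive tokens), in seconds.

    Hypothetical helper: timestamps are assumed to be monotonic clock readings.
    """
    if not token_times:
        raise ValueError("at least one token timestamp is required")
    ttft = token_times[0] - request_start
    # Inter-token latency: difference between each token and the one before it.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return ttft, gaps


def meets_ttft_slo(ttfts: List[float], threshold_s: float, target: float) -> bool:
    """True if the share of requests with TTFT <= threshold_s is at least target."""
    if not ttfts:
        raise ValueError("no requests observed")
    good = sum(1 for t in ttfts if t <= threshold_s)
    return good / len(ttfts) >= target


if __name__ == "__main__":
    ttft, gaps = ttft_and_inter_token(10.0, [10.25, 10.5, 11.0])
    print(ttft, gaps)
    # Assumed objective: 75% of requests see a first token within 0.5 s.
    print(meets_ttft_slo([0.2, 0.3, 0.9, 0.25], threshold_s=0.5, target=0.75))
```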

Composite DAG Evaluation

Model relationships between services, not just individual SLOs
Automatic failure propagation through dependency graph
System-wide health scoring
Detect cascading failures before they impact users
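
The idea of failure propagation through a dependency graph can be sketched as follows. This is an illustrative standalone example, not NeuralBudget's implementation: the graph layout, the function names, and the scoring formula (share of services not impacted) are all assumptions.

```python
from collections import deque
from typing import Dict, List, Set


def propagate_failures(depends_on: Dict[str, List[str]], failing: Set[str]) -> Set[str]:
    """Return every service that is failing or transitively depends on a failing one."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents: Dict[str, List[str]] = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    impacted = set(failing)
    queue = deque(failing)
    while queue:  # breadth-first walk over the inverted graph
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


def health_score(depends_on: Dict[str, List[str]], failing: Set[str]) -> float:
    """Assumed scoring rule: fraction of known services that are not impacted."""
    services = set(depends_on) | {d for deps in depends_on.values() for d in deps}
    if not services:
        return 1.0
    impacted = propagate_failures(depends_on, failing) & services
    return 1.0 - len(impacted) / len(services)


if __name__ == "__main__":
    # Hypothetical topology: frontend -> api -> {db, cache}; batch -> db.
    graph = {"frontend": ["api"], "api": ["db", "cache"],
             "db": [], "cache": [], "batch": ["db"]}
    print(sorted(propagate_failures(graph, {"cache"})))
    print(health_score(graph, {"cache"}))
```

A breadth-first walk over inverted edges visits each service at most once, so cycles in the dependency map do not cause infinite loops.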

Streaming & Performance

High-frequency metric collection: 15,000+ samples per second
Adaptive windowing: automatic memory management
Zero-copy streaming aggregator
Sub-microsecond evaluation latency

Standards & Compatibility

Apache-2.0 Licensed - Full open-source with commercial flexibility
OpenSLO Compatible - Bidirectional conversion (parse and generate)
Prometheus Native - OTLP ingestion and Prometheus exporter
87%+ Test Coverage - Comprehensive unit and integration tests

📊 SLO Evaluation Modes

All modes in a single tool:

HTTP/gRPC - P50/P99 latency, availability, error rate
Stateful Services - Replication lag, queue depth, saturation
ML Serving - Latency, GPU utilization, model drift, accuracy
GenAI Workloads - TTFT, throughput, semantic quality, cost
Composite DAG - Cross-service dependencies, system-wide health

📈 Multi-Burn-Rate Alerting

Google SRE-inspired alerting strategy:

Automatic recording rule generation
Multi-window burn rate calculation (1h, 6h, 24h, 3d)
Configurable alert severity levels (Info, Warning, Critical)
Kubernetes PrometheusRule CRD support
Validation of PromQL expressions
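
For readers new to burn-rate alerting, the calculation behind the windows above can be sketched in a few lines. This is a standalone illustration and does not call the NeuralBudget API. The burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). The per-window thresholds and their mapping to severities are assumptions for the example, not NeuralBudget defaults.

```python
from typing import Dict, Optional

# Illustrative thresholds only (assumed): (window, minimum burn rate, severity).
# Ordered from the fastest-burning condition to the slowest.
THRESHOLDS = [
    ("1h", 14.4, "Critical"),
    ("6h", 6.0, "Critical"),
    ("24h", 3.0, "Warning"),
    ("3d", 1.0, "Info"),
]


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Burn rate = observed error ratio / error budget, where budget = 1 - target."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def classify(error_ratios: Dict[str, float], slo_target: float) -> Optional[str]:
    """Return the severity of the first window whose burn rate crosses its threshold."""
    for window, threshold, severity in THRESHOLDS:
        if burn_rate(error_ratios[window], slo_target) >= threshold:
            return severity
    return None  # budget is being consumed slower than any alerting threshold


if __name__ == "__main__":
    # Hypothetical error ratios per window for a 99.9% availability SLO.
    observed = {"1h": 0.002, "6h": 0.002, "24h": 0.004, "3d": 0.002}
    print(classify(observed, slo_target=0.999))
```

A burn rate of 1 means the error budget would be exhausted exactly at the end of the SLO period; higher values exhaust it proportionally sooner, which is why short windows are paired with high thresholds.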

🚀 What's New vs v0.1.3

✅ Production CLI tool with 4 subcommands
✅ TTFT SLO for streaming GenAI responses
✅ Multi-burn-rate alerting with thresholds
✅ OpenSLO conversion and compatibility
✅ Prometheus rule generation
✅ Apache-2.0 relicensing
✅ Enhanced documentation with deployment guides

📦 Downloads

Pre-built binaries available for:

Linux x86_64, ARM64
macOS Intel, Apple Silicon
Windows x86_64

Python wheels available on PyPI: pip install neuralbudget==0.2.0

📚 Documentation

Getting Started - 10-minute quickstart
User Guide - Complete feature guide
CLI User Guide - Command-line tool documentation
Production Deployment - Deploy to production
GenAI Features - LLM quality SLOs

🐛 Known Issues

None known at this time. If you run into a problem, please open a GitHub issue.

📄 Full Changelog

See CHANGELOG.md for complete details.

License: Apache-2.0 | Repository: https://github.com/pristley/NeuralBudget
