Production feedback loop management for ML systems in the Crucible ecosystem.
This library provides telemetry ingestion, quality assessment, drift detection,
data curation, retraining triggers, and export capabilities for Elixir-based
machine learning pipelines.
Core Features:
Ingestion Pipeline
- Batch-buffered event ingestion with configurable flush intervals
- Automatic PII sanitization for email addresses, phone numbers, SSNs, and credit card numbers
- User ID hashing for privacy protection
- Ecto schema validation for incoming events
- Telemetry events for monitoring ingestion health
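To make the sanitization and hashing steps concrete, here is a minimal sketch in Python, used purely for illustration and not as the library's Elixir API. The regular expressions, placeholder tokens, and the `sanitize` and `hash_user_id` names are assumptions made for this example; production-grade PII detection needs far broader patterns.

```python
import hashlib
import re

# Hypothetical patterns for illustration only; order matters because
# card numbers are matched before the shorter SSN and phone patterns.
PII_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[CARD]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
]


def sanitize(text: str) -> str:
    """Replace each PII match with a fixed placeholder token."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def hash_user_id(user_id: str, salt: str) -> str:
    """Return a salted SHA-256 hex digest so raw user IDs are never stored."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
```

Hashing with a deployment-specific salt keeps events from the same user linkable for curation while avoiding storage of the raw identifier.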
User Signal Capture
- Support for thumbs up/down, regenerate, edit, copy, share, report signals
- User edit tracking with edited response storage
- Signal scoring utilities for downstream curation
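One way signal scoring could work is to map each signal type to a weight and average the weights per event. The sketch below is in Python for illustration; the weight values and the `score_signals` name are assumptions, not values taken from the library.

```python
# Hypothetical weights: positive signals push toward 1.0, negative toward -1.0.
SIGNAL_WEIGHTS = {
    "thumbs_up": 1.0,
    "thumbs_down": -1.0,
    "regenerate": -0.5,
    "edit": -0.3,
    "copy": 0.5,
    "share": 0.8,
    "report": -1.0,
}


def score_signals(signals: list[str]) -> float:
    """Average the weights of the given signals, clamped to [-1.0, 1.0].

    Unknown signal names contribute 0.0; an empty list scores 0.0.
    """
    if not signals:
        return 0.0
    total = sum(SIGNAL_WEIGHTS.get(name, 0.0) for name in signals)
    return max(-1.0, min(1.0, total / len(signals)))
```

A curation step could then treat events scoring above some cutoff as positive examples and those below a negative cutoff as hard examples.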
Quality Assessment
- Format detection for JSON, markdown, and plain text responses
- Length validation with configurable bounds
- Refusal pattern detection for unwanted model behaviors
- Repetition and stuttering detection via trigram analysis
- Optional LLM-as-judge integration point
- Rolling average quality score computation
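The repetition check and the rolling average can be sketched as follows. This is an illustrative Python version under simple assumptions (whitespace tokenization, a fixed-size window); the function and class names are hypothetical and do not reflect the library's API.

```python
from collections import Counter, deque


def trigram_repetition_ratio(text: str) -> float:
    """Fraction of word trigrams that repeat an earlier trigram (0.0 = no repeats)."""
    tokens = text.lower().split()
    trigrams = [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
    if not trigrams:
        return 0.0
    counts = Counter(trigrams)
    # Every occurrence beyond the first counts as a repetition.
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(trigrams)


class RollingAverage:
    """Mean of the most recent `window` quality scores."""

    def __init__(self, window: int):
        self.values = deque(maxlen=window)

    def add(self, score: float) -> float:
        """Record a score and return the updated rolling mean."""
        self.values.append(score)
        return sum(self.values) / len(self.values)
```

A response whose ratio exceeds a chosen cutoff could be flagged as stuttering, and the rolling mean is the kind of value a quality-drop trigger would compare against its threshold.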
Drift Detection
- Statistical drift via the Kolmogorov-Smirnov test and the Population Stability Index (PSI)
- Embedding-based drift using centroid distance comparison
- Output drift tracking for quality and response length shifts
- Pluggable embedding client behaviour for provider flexibility
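For readers unfamiliar with the two statistical measures, the sketch below computes the two-sample Kolmogorov-Smirnov statistic and PSI in plain Python. It is an illustration of the standard definitions, not the library's implementation; the equal-width binning over the baseline range and the `eps` smoothing for empty bins are assumptions of this example.

```python
import bisect
import math


def ks_statistic(baseline: list[float], current: list[float]) -> float:
    """Largest gap between the two empirical CDFs (0.0 = identical, 1.0 = disjoint)."""
    a, b = sorted(baseline), sorted(current)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def psi(baseline: list[float], current: list[float], bins: int = 10,
        eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant baselines

    def fractions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # eps keeps the logarithm defined when a bin is empty.
        return [max(c / len(values), eps) for c in counts]

    expected, actual = fractions(baseline), fractions(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, though the right cutoff depends on the feature and the bin count.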
Data Curation
- High-quality example selection based on quality scores and signals
- Hard example mining for low-quality or negatively signaled events
- Diversity selection via embedding space distance from centroid
- User edit example prioritization for preference learning
- Automatic deduplication across curation strategies
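Diversity selection and cross-strategy deduplication can be sketched as below. This Python example assumes each item is an `(id, embedding)` pair and that diversity means "farthest from the centroid"; the function names are hypothetical.

```python
import math


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_diverse(items: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the ids of the k items whose embeddings lie farthest from the centroid."""
    center = centroid([embedding for _, embedding in items])
    ranked = sorted(items, key=lambda item: math.dist(item[1], center), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


def merge_deduplicated(*strategies: list[str]) -> list[str]:
    """Concatenate id lists from several strategies, keeping first occurrences only."""
    seen: set[str] = set()
    merged: list[str] = []
    for ids in strategies:
        for item_id in ids:
            if item_id not in seen:
                seen.add(item_id)
                merged.append(item_id)
    return merged
```

Keeping the first occurrence means the order in which strategies are passed decides which strategy "owns" an example that several of them selected.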
Retraining Triggers
- Drift threshold trigger when distribution shift exceeds limits
- Quality drop trigger when rolling average falls below threshold
- Data count trigger when sufficient examples are collected
- Schedule trigger for time-based retraining intervals
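The four trigger types amount to simple comparisons against configured limits. The Python sketch below illustrates that logic; the configuration fields, default values, and trigger names are assumptions for this example, not the library's configuration schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TriggerConfig:
    # Hypothetical defaults chosen only for illustration.
    drift_threshold: float = 0.2
    quality_threshold: float = 0.7
    min_examples: int = 1000
    interval: timedelta = timedelta(days=7)


def fired_triggers(config: TriggerConfig, *, drift_score: float,
                   rolling_quality: float, example_count: int,
                   last_trained: datetime, now: datetime) -> list[str]:
    """Return the names of all triggers whose condition currently holds."""
    fired = []
    if drift_score > config.drift_threshold:
        fired.append("drift_threshold")
    if rolling_quality < config.quality_threshold:
        fired.append("quality_drop")
    if example_count >= config.min_examples:
        fired.append("data_count")
    if now - last_trained >= config.interval:
        fired.append("schedule")
    return fired
```

Returning every fired trigger, instead of stopping at the first, lets the caller log why a retraining run was requested.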
Export Capabilities
- JSONL export for standard training pipelines
- HuggingFace dataset directory format with metadata
- Parquet placeholder for future columnar export
- Preference pair export from user edits for DPO training
- Export batch tracking with idempotent marking
Storage Backends
- Ecto/Postgres backend for production deployments
- In-memory backend for testing and local development
- ClickHouse placeholder for high-volume telemetry
- Storage behaviour for custom backend implementation
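The idea of a storage contract with swappable backends can be sketched with an abstract base class, which plays a role loosely comparable to an Elixir behaviour. The Python example below is an illustration only; the method names and the in-memory layout are assumptions and do not mirror the library's callbacks.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Contract that every backend (Postgres, in-memory, custom) would satisfy."""

    @abstractmethod
    def insert_events(self, events: list[dict]) -> None: ...

    @abstractmethod
    def unexported_events(self) -> list[dict]: ...

    @abstractmethod
    def mark_exported(self, event_ids: list[str], batch_id: str) -> None: ...


class InMemoryStorage(Storage):
    """Dictionary-backed backend suitable for tests and local development."""

    def __init__(self) -> None:
        self._events: dict[str, dict] = {}
        self._exported: dict[str, str] = {}  # event id -> first export batch id

    def insert_events(self, events: list[dict]) -> None:
        for event in events:
            self._events[event["id"]] = event

    def unexported_events(self) -> list[dict]:
        return [e for event_id, e in self._events.items()
                if event_id not in self._exported]

    def mark_exported(self, event_ids: list[str], batch_id: str) -> None:
        # Idempotent: marking an already exported event keeps its original batch.
        for event_id in event_ids:
            self._exported.setdefault(event_id, batch_id)
```

Because marking is idempotent, retrying a failed export batch cannot reassign events that an earlier batch already claimed.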
Crucible Integration
- ExportFeedback stage for pipeline-based data export
- CheckTriggers stage for automated retraining decisions
- Context artifact storage for inter-stage communication
Infrastructure
- Comprehensive test suite with Mox-based mocking
- ExMachina factories for test data generation
- Telemetry instrumentation throughout the pipeline
- MIT license for open source distribution