# Contextual Compiler

A cascade classification engine for product signals. Routes signals (metrics, errors, deploys, audits) into categories using a multi-stage pipeline: heuristic keywords, Bayesian gating, optional LLM classification, and self-improving keyword learning.
## Architecture

```
Signal ──► Heuristic Classifier ──► Bayesian Gate ──► LLM Classifier
                 │                       │                  │
                 │                       │                  ▼
                 │                       │          Keyword Learning
                 │                       │                  │
                 ▼                       ▼                  ▼
         Category + Score ◄── Gate Skip (fast) ◄── LLM Category
                 │
                 ▼
         Vector Search (entity resolution)
                 │
                 ▼
           ClassifyResult
```
## Quick Start

```sh
go install github.com/MostViableProduct/contextual-compiler/cmd/compiler@latest
compiler
# Listening on :8200
```

Or with Docker:

```sh
docker build -t contextual-compiler .
docker run -p 8200:8200 contextual-compiler
```

Or with docker-compose:

```sh
docker-compose up -d
# compiler on :8200, compiler-with-postgres on :8201
```

Classify a signal:

```sh
curl -s http://localhost:8200/v1/classify \
  -H 'Content-Type: application/json' \
  -d '{"type":"metric","content":"High p99 latency detected in API gateway","source":"prometheus"}' | jq
```

```json
{
  "category": "performance",
  "relevance_score": 0.6,
  "classification_source": "heuristic",
  "confidence": 0,
  "signal_class": "SEMANTIC"
}
```

## Configuration

Configuration is loaded from `CONFIG_PATH` (default: `config.yaml`). Supports JSON and YAML.
```yaml
classifier:
  categories:
    - name: performance
      keywords: [latency, throughput, p99, slow, timeout]
      weights: {p99: 2.0, timeout: 2.0}
    - name: reliability
      keywords: [error, failure, crash, outage]
      weights: {outage: 2.0}
  source_priors:
    sentry: {reliability: 3.0}
    prometheus: {performance: 3.0}
  type_to_category:
    metric: performance
    error: reliability
gate:
  agreement_threshold: 0.75
  uncertainty_max: 0.10
  shadow_mode: false
health:
  severities:
    - name: critical
      weight: 0.40
      direction: negative
  default_alpha: 5.0
  default_beta: 1.0
keywords:
  min_confidence: 0.7
  min_observations: 10
```

## Environment Variables

| Variable | Purpose | Default |
|---|---|---|
| `CONFIG_PATH` | Path to YAML/JSON config | `config.yaml` |
| `PORT` | HTTP server port | `8200` |
| `DATABASE_URL` | PostgreSQL DSN | (in-memory) |
| `SQLITE_PATH` | SQLite file path | (in-memory) |
| `ANTHROPIC_API_KEY` | Anthropic API key | (heuristic only) |
| `OPENAI_API_KEY` | OpenAI API key (LLM + embeddings) | (heuristic only) |
| `GEMINI_API_KEY` | Google Gemini API key | (heuristic only) |
| `LOG_EVENTS` | Enable stdout event logging | (disabled) |
## API

- Liveness probe with dependency status.
- Prometheus-format counters (requires `WithMetrics` in handler setup).
- `POST /v1/classify` — classify signal content directly:
  `{"type": "metric", "content": "p99 latency spike", "source": "prometheus", "tenant_id": "t1"}`
- Classify by flattening a JSON payload into searchable text:
  `{"type": "error", "payload": {"message": "connection refused"}, "source": "sentry"}`
- Bayesian health score for an entity.
- Record a health-affecting event:
  `{"severity": "critical", "category": "reliability", "confidence": 0.9}`
- Promote high-confidence learned keywords to the heuristic classifier.
- Persist state (gate entries, health priors) to storage adapters.
## Adapters

| Category | Adapter | Package |
|---|---|---|
| LLM | Anthropic (Claude) | adapters/llm/anthropic |
| LLM | OpenAI (GPT) | adapters/llm/openai |
| LLM | Google Gemini | adapters/llm/gemini |
| Embeddings | OpenAI | adapters/embeddings/openai |
| Vector | In-Memory (cosine) | adapters/vector/memory |
| Storage | PostgreSQL | adapters/storage/postgres |
| Storage | SQLite | adapters/storage/sqlite |
| Events | Log Writer (stdout) | adapters/events/logwriter |
All adapters are optional. The compiler gracefully degrades to in-memory, heuristic-only mode when adapters are nil.
## Library Usage

```go
import (
	"context"
	"fmt"

	"github.com/MostViableProduct/contextual-compiler/pkg/classifier"
	"github.com/MostViableProduct/contextual-compiler/pkg/compiler"
)

cfg := compiler.DefaultConfig()
cfg.Classifier = classifier.Config{
	Categories: []classifier.CategoryConfig{
		{Name: "performance", Keywords: []string{"latency", "p99", "slow"}},
		{Name: "reliability", Keywords: []string{"error", "crash", "outage"}},
	},
}

// Empty Deps/Callbacks: in-memory, heuristic-only mode.
c := compiler.New(cfg, compiler.Deps{}, compiler.Callbacks{})

result, err := c.Classify(context.Background(), compiler.Signal{
	Source:  "prometheus",
	Type:    "metric",
	Content: "High p99 latency detected",
})
if err != nil {
	// handle error
}
fmt.Println(result)
```

## Deployment

```sh
docker build -t contextual-compiler .
docker run -p 8200:8200 \
  -e ANTHROPIC_API_KEY=sk-... \
  -e DATABASE_URL=postgres://... \
  contextual-compiler
```

Or with docker-compose:

```sh
docker-compose up -d
```

Services:

- `compiler` — in-memory mode on port 8200
- `postgres` — PostgreSQL 17 on port 5432
- `compiler-with-postgres` — persistent mode on port 8201
## Development

```sh
# All tests with race detector
make test

# Short (unit) tests only
make test-unit

# Benchmarks
make bench

# Lint
make lint
```

## Packages

| Package | Purpose |
|---|---|
| `pkg/compiler` | Top-level orchestrator — ties all stages together |
| `pkg/classifier` | Heuristic cascade classifier with keyword matching |
| `pkg/gate` | Bayesian Beta-Binomial gate for LLM skip decisions |
| `pkg/health` | Per-entity Bayesian health model |
| `pkg/keywords` | Self-improving keyword extraction and promotion |
| `pkg/belief` | Dempster-Shafer evidence theory for confidence fusion |
## Pipeline Stages

- Heuristic — keyword matching with configurable weights and source priors
- Bayesian Gate — learns P(heuristic agrees with LLM) per (tenant, category, source); skips LLM when confident
- LLM — deep classification when heuristics are unreliable
- Keyword Learning — extracts novel keywords from heuristic/LLM disagreements
- Vector Search — optional entity resolution via cosine similarity