v1.0 — Production Hardening

timyl released this 21 May 06:00

What's New in v1.0

This release focuses on production hardening across four areas: safety, reliability, resilience, and configurability.

Safety

  • Defense-in-depth PLMN whitelist: the auto_fix node performs a hard, code-level check before writing to PCF, independent of LLM confidence. It rejects any PLMN not in the config whitelist, guarding against prompt injection and LLM hallucination.
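
A minimal sketch of what such a code-level gate could look like. The names FixProposal and is_plmn_allowed and the sample PLMN values are hypothetical and are not taken from agent/graph.py; the point is only that the check never consults the LLM's confidence.

```python
from dataclasses import dataclass

# Hypothetical sample values; in this release the real whitelist is read
# from config/agent_config.yaml.
ALLOWED_PLMNS = frozenset({"00101", "310260"})


@dataclass(frozen=True)
class FixProposal:
    """Hypothetical shape of a fix proposed by the LLM."""
    plmn: str
    confidence: float


def is_plmn_allowed(proposal: FixProposal, allowed=ALLOWED_PLMNS) -> bool:
    """Hard gate: exact membership test; confidence is deliberately ignored."""
    return proposal.plmn in allowed


# A high-confidence proposal is still rejected if its PLMN is not whitelisted.
if not is_plmn_allowed(FixProposal(plmn="99999", confidence=0.99)):
    print("rejected: PLMN not in whitelist, nothing written to PCF")
```

Keeping the gate as a plain set lookup means a prompt-injected or hallucinated PLMN cannot pass by inflating the model's reported confidence.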

Configurability

  • Config-driven domain knowledge: ALLOWED_PLMNS, _FIXABLE_TYPOS, and _VENDOR_FIELDS are now extracted from source code into config/agent_config.yaml. Operators can update allowed PLMNs or fixable typo lists without touching Python code or rebuilding the image. The AGENT_CONFIG_PATH env var is supported for K8s ConfigMap volume mounts.
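
The path lookup could be sketched roughly as follows. resolve_config_path is a hypothetical helper, not the loader shipped in agent/graph.py, and it assumes the default location is config/agent_config.yaml under the project root; the resolved file would then be parsed with a YAML library such as pyyaml.

```python
import os
from pathlib import Path

# Assumed default location, relative to the project root.
DEFAULT_RELATIVE_PATH = Path("config") / "agent_config.yaml"


def resolve_config_path(project_root, env=None) -> Path:
    """Return the AGENT_CONFIG_PATH override if set, else the default path."""
    env = os.environ if env is None else env
    override = env.get("AGENT_CONFIG_PATH")
    if override:
        return Path(override)
    return Path(project_root) / DEFAULT_RELATIVE_PATH


# With a ConfigMap mounted at a hypothetical /etc/agent, the override wins.
print(resolve_config_path("/app", env={"AGENT_CONFIG_PATH": "/etc/agent/agent_config.yaml"}))
```

Passing env explicitly keeps the helper easy to test without mutating the process environment.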

Reliability

  • Kafka manual offset commit: enable_auto_commit=False; the offset is committed only after a successful graph.invoke() and audit log write. This prevents silent message loss if the worker crashes mid-processing.
  • LangGraph version pin: langgraph<1.0.0, to avoid breaking API changes in 1.x (KeyError: '__end__' in conditional edge routing). Added END: END to the analyze path_map for correct termination.
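
The commit-after-processing order for the Kafka worker can be illustrated with a small sketch. process_messages, handle, and write_audit are hypothetical names, not the code in agent/worker.py. The consumer is duck-typed: it only needs to be iterable and expose commit(), which is assumed here to match a consumer created with enable_auto_commit=False.

```python
def process_messages(consumer, handle, write_audit):
    """Commit each offset only after processing and the audit write both succeed.

    `handle` stands in for graph.invoke() and `write_audit` for the audit log
    write; both are hypothetical callables supplied by the caller.
    """
    processed = 0
    for message in consumer:
        result = handle(message)      # may raise; nothing is committed then
        write_audit(message, result)  # may raise; nothing is committed then
        consumer.commit()             # reached only if both steps succeeded
        processed += 1
    return processed
```

If either step raises, the exception propagates before commit() runs, so the message is redelivered after a restart. The trade-off is at-least-once delivery: a crash between the audit write and the commit means the same message is processed again.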

Resilience

  • PCF REST API retry: all PCF API calls now use tenacity with 3 attempts, exponential backoff (1–8s), and a 5s timeout. Retries on network errors and 5xx responses only; 4xx errors fail immediately.
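
The retry policy can be illustrated with a hand-rolled, standard-library sketch. This is a stand-in for the tenacity configuration in tools/pcf_tool.py, not a copy of it: HTTPStatusError, is_retryable, and call_with_retry are hypothetical names, and the 5s request timeout is not modeled here.

```python
import time


class HTTPStatusError(Exception):
    """Hypothetical error carrying the HTTP status code of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc):
    """Retry network errors and 5xx responses; everything else fails fast."""
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    return isinstance(exc, HTTPStatusError) and 500 <= exc.status < 600


def call_with_retry(fn, attempts=3, min_wait=1.0, max_wait=8.0, sleep=time.sleep):
    """Call fn() up to `attempts` times with capped exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Re-raise on the last attempt or on non-retryable errors (e.g. 4xx).
            if attempt == attempts or not is_retryable(exc):
                raise
            sleep(min(max_wait, min_wait * 2 ** (attempt - 1)))
```

With the defaults above, the waits between the three attempts are 1s and then 2s. Failing fast on 4xx avoids repeating a request that the PCF has already rejected as invalid.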

Files Changed

File                      Change
config/agent_config.yaml  New: operator config for PLMNs and field whitelists
agent/graph.py            Config loader, auto_fix safety gate, LangGraph END fix
agent/worker.py           Kafka manual offset commit
tools/pcf_tool.py         Tenacity retry + 5s timeout
requirements.txt          Pin langgraph<1.0.0, add pyyaml, tenacity
Dockerfile                Include config/ directory
k8s/aiops/webhook.yaml    Service type: LoadBalancer

Upgrade Notes

  • config/agent_config.yaml is required at runtime. Default path: <project_root>/config/agent_config.yaml. Override with AGENT_CONFIG_PATH env var.
  • For K8s: mount agent_config.yaml as a ConfigMap volume and set AGENT_CONFIG_PATH to the mount path for config updates without image rebuilds.

Next: v1.1 will introduce LLM function calling for dynamic tool selection, enabling extensibility to new NF types and REST API operations without graph restructuring.