v1.0 — Production Hardening
What's New in v1.0
This release focuses on production hardening across four areas: safety, reliability, resilience, and configurability.
Safety
- Defense-in-depth PLMN whitelist: the `auto_fix` node performs a hard code-level check before writing to PCF, independent of LLM confidence. It rejects any PLMN not in the config whitelist, guarding against prompt injection and LLM hallucination.
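A minimal sketch of what such a gate might look like, assuming the whitelist has already been loaded from config. The names `PlmnNotAllowedError`, `assert_plmn_allowed`, the `proposed_plmn` state key, and the `write_to_pcf` callable are hypothetical and are not taken from `agent/graph.py`.

```python
from typing import Callable, Iterable


class PlmnNotAllowedError(Exception):
    """Raised when a proposed PLMN is not in the operator whitelist (hypothetical name)."""


def assert_plmn_allowed(plmn: str, allowed_plmns: Iterable[str]) -> None:
    # Exact-match membership test: no normalization, no fuzzy matching,
    # and no dependence on anything the LLM reported about its confidence.
    if plmn not in set(allowed_plmns):
        raise PlmnNotAllowedError(f"PLMN {plmn!r} is not in the config whitelist")


def auto_fix(
    state: dict,
    allowed_plmns: Iterable[str],
    write_to_pcf: Callable[[str], None],
) -> dict:
    # "proposed_plmn" is an assumed state key used only for illustration.
    plmn = state["proposed_plmn"]
    # The check runs before any side effect, so a rejected value never reaches PCF.
    assert_plmn_allowed(plmn, allowed_plmns)
    write_to_pcf(plmn)
    return {**state, "status": "fixed"}
```

Because the check sits in code rather than in the prompt, a hallucinated or injected PLMN fails the same way regardless of how the model arrived at it.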
Configurability
- Config-driven domain knowledge: `ALLOWED_PLMNS`, `_FIXABLE_TYPOS`, and `_VENDOR_FIELDS` are extracted from source code into `config/agent_config.yaml`. Operators can update allowed PLMNs or fixable typo lists without touching Python code or rebuilding the image. Supports the `AGENT_CONFIG_PATH` env var for a K8s ConfigMap volume mount.
Reliability
- Kafka manual offset commit: `enable_auto_commit=False`; the offset is committed only after a successful `graph.invoke()` and audit log write. This prevents silent message loss if the worker crashes mid-processing.
- LangGraph version pin: `langgraph<1.0.0` to avoid breaking API changes in 1.x (`KeyError: '__end__'` in conditional edge routing). Added `END: END` to the `analyze` path_map for correct termination.
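The commit-after-processing order described for the Kafka worker can be sketched as below. This is an illustration rather than the code in `agent/worker.py`: the function name `consume` and the `invoke_graph` and `write_audit` callables are hypothetical, and the consumer is assumed to be a kafka-python `KafkaConsumer` created with `enable_auto_commit=False` (any iterable object with a `commit()` method works for the sketch).

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("worker")


def consume(
    consumer: Any,
    invoke_graph: Callable[[Any], Any],
    write_audit: Callable[[Any, Any], None],
) -> int:
    """Process messages one at a time, committing only after full success."""
    processed = 0
    for message in consumer:
        try:
            result = invoke_graph(message.value)
            write_audit(message, result)
        except Exception:
            # Nothing is committed for this message. Re-raising stops the loop,
            # so a restarted worker resumes from the last committed offset.
            logger.exception("processing failed at offset %s", message.offset)
            raise
        # Reached only when both the graph run and the audit write succeeded.
        consumer.commit()
        processed += 1
    return processed
```

Stopping on failure instead of skipping ahead matters here: committing a later message would also move the offset past the failed one. The trade-off is at-least-once delivery, so the processing step should tolerate seeing the same message twice.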
Resilience
- PCF REST API retry: all PCF API calls now use `tenacity` with 3 attempts, exponential backoff (1–8s), and a 5s timeout. Retries apply to network errors and 5xx responses only; 4xx errors fail immediately.
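To make the retry policy concrete, here is a dependency-free sketch of the same rules. It deliberately does not use `tenacity`, which is what the release actually relies on; it only mirrors the stated policy (3 attempts, exponential backoff clamped to 1–8s, retry on network errors and 5xx only). `PcfHttpError`, `is_retryable`, and `call_with_retry` are hypothetical names, and the 5s timeout is assumed to be applied by the HTTP client inside `call`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class PcfHttpError(Exception):
    """Hypothetical error carrying the HTTP status of a failed PCF call."""

    def __init__(self, status: int) -> None:
        super().__init__(f"PCF returned HTTP {status}")
        self.status = status


def is_retryable(exc: Exception) -> bool:
    # Network-level failures and 5xx responses are retried; 4xx is not.
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    return isinstance(exc, PcfHttpError) and 500 <= exc.status < 600


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    min_wait: float = 1.0,
    max_wait: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on any non-retryable error.
            if attempt == attempts or not is_retryable(exc):
                raise
            # Wait doubles per attempt, starting at min_wait and capped at max_wait.
            sleep(min(max_wait, min_wait * 2 ** (attempt - 1)))
    raise AssertionError("unreachable")
```

Failing fast on 4xx keeps a malformed request from being replayed against PCF, where a retry cannot change the outcome.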
Files Changed
| File | Change |
|---|---|
| `config/agent_config.yaml` | New: operator config for PLMNs and field whitelists |
| `agent/graph.py` | Config loader, `auto_fix` safety gate, LangGraph END fix |
| `agent/worker.py` | Kafka manual offset commit |
| `tools/pcf_tool.py` | Tenacity retry + 5s timeout |
| `requirements.txt` | Pin `langgraph<1.0.0`, add `pyyaml`, `tenacity` |
| `Dockerfile` | Include `config/` directory |
| `k8s/aiops/webhook.yaml` | Service type: LoadBalancer |
Upgrade Notes
- `config/agent_config.yaml` is required at runtime. Default path: `<project_root>/config/agent_config.yaml`. Override with the `AGENT_CONFIG_PATH` env var.
- For K8s: mount `agent_config.yaml` as a ConfigMap volume and set `AGENT_CONFIG_PATH` to the mount path for config updates without image rebuilds.
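The path lookup described in these notes could be sketched as follows. This is an assumption about how the loader behaves, not a copy of it: `resolve_config_path` and `require_config` are hypothetical names, and YAML parsing is left out so the sketch needs only the standard library.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONFIG_RELATIVE = Path("config") / "agent_config.yaml"


def resolve_config_path(
    project_root: Path,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    # AGENT_CONFIG_PATH wins when set and non-empty; otherwise fall back to
    # <project_root>/config/agent_config.yaml.
    env = os.environ if env is None else env
    override = env.get("AGENT_CONFIG_PATH")
    return Path(override) if override else project_root / DEFAULT_CONFIG_RELATIVE


def require_config(path: Path) -> Path:
    # The file is required at runtime, so fail loudly at startup
    # rather than continuing with empty whitelists.
    if not path.is_file():
        raise FileNotFoundError(f"agent config not found: {path}")
    return path
```

In a K8s deployment, the env var would point at the file inside the ConfigMap mount, so editing the ConfigMap changes the config without an image rebuild.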
Next: v1.1 will introduce LLM function calling for dynamic tool selection, enabling extensibility to new NF types and REST API operations without graph restructuring.