fix: add retry and fallback for KG extraction failures (#6557) #6705
AasheeshLikePanner wants to merge 2 commits into BasedHardware:main
Fixes issue BasedHardware#6557 - Knowledge graph extraction: no retry/backoff on LLM calls

Changes:
- clients.py: Add max_retries=3 to llm_mini for transient API failures
- knowledge_graph.py: Add try/except fallback in both extract_knowledge_from_memory and process_memory to handle malformed JSON from LLM

This addresses both failure modes:
1. OpenAI API transient failure → retry up to 3 times
2. LLM returns malformed JSON → return empty KG instead of crashing

Impact: Prevents silent data loss for knowledge graph extraction.
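The two failure modes above can be sketched without any LLM dependency. This is a minimal illustration, not the PR's code: `call_with_retries`, `parse_kg`, and the `flaky_llm` stand-in are hypothetical names, and the PR itself delegates retrying to the client's `max_retries=3` setting rather than a hand-written loop.

```python
import json
import time


def call_with_retries(fn, max_retries=3, base_delay=0.0):
    """Call fn(), retrying up to max_retries times after the first failure.

    Hypothetical helper: the PR relies on the LLM client's own max_retries
    option instead of a loop like this one.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the transient error
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
            attempt += 1


def parse_kg(raw):
    """Parse LLM output into a KG dict, falling back to an empty graph."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return {"nodes": [], "edges": []}
    if not isinstance(data, dict):
        return {"nodes": [], "edges": []}
    return {"nodes": data.get("nodes", []), "edges": data.get("edges", [])}


# Stand-in for a flaky API call: fails twice, then succeeds.
calls = {"n": 0}


def flaky_llm():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return '{"nodes": [{"id": "a"}], "edges": []}'


result = parse_kg(call_with_retries(flaky_llm))
malformed = parse_kg("not json at all")
```

Here the third attempt succeeds and is parsed normally, while the malformed input yields an empty graph instead of an exception.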
Greptile Summary
This PR adds ...
Confidence Score: 4/5
Not safe to merge as-is: the core fix in knowledge_graph.py is broken due to an undefined ...
The backend/utils/llm/knowledge_graph.py ... missing ...
| Filename | Overview |
|---|---|
| backend/utils/llm/clients.py | Adds max_retries=3 to the module-level llm_mini singleton — a safe, broadly beneficial change that will retry transient API failures globally. |
| backend/utils/llm/knowledge_graph.py | Adds inner try/except around parser.parse() in both extract_knowledge_from_memory and process_memory, but logger is never defined in the module — causing the fallback to silently fail with NameError and leaving the intended graceful degradation unreachable. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[llm_mini.invoke prompt] --> B[parser.parse response.content]
B -->|success| C[Build nodes & edges]
B -->|parse error| D{inner except}
D --> E["logger.error(...)"]
E -->|logger undefined| F["NameError raised"]
F --> G{outer except}
G --> H["logging.exception(NameError)"]
H --> I["return None / empty dict"]
C --> J["return nodes & edges"]
style E fill:#f99,stroke:#c00
style F fill:#f99,stroke:#c00
style I fill:#fdd,stroke:#c66
style J fill:#dfd,stroke:#6a6
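The path highlighted in the flowchart can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the module under review: `extract_broken` and `undefined_logger` are hypothetical names, and `json.loads` replaces the real parser. It shows how a name that is never defined turns the inner fallback into a `NameError` that only the outer handler sees.

```python
import json
import logging

seen = []  # records which exception type the outer handler actually receives


def extract_broken(raw):
    """Mimics the charted flow: the inner handler uses an undefined name."""
    try:
        try:
            data = json.loads(raw)
        except Exception as e:
            # `undefined_logger` stands in for the missing module-level logger.
            undefined_logger.error("parse failed: %s", type(e).__name__)  # noqa: F821
            return {"nodes": [], "edges": []}  # intended fallback, never reached
        return data
    except Exception as outer:
        # The outer handler sees the NameError, not the original parse error.
        seen.append(type(outer).__name__)
        logging.getLogger(__name__).debug("outer caught %s", seen[-1])
        return None


outcome = extract_broken("{malformed")
```

The intended empty-graph return is skipped, so the caller gets `None` and the log points at a `NameError` rather than the malformed JSON.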
Reviews (1): Last reviewed commit: "fix: add retry and fallback for KG extra..."
Fixes:
- Add logger = logging.getLogger(__name__) at module level
- Change except (ValueError, Exception) to just except Exception
- Change {e} to {type(e).__name__} to prevent PII leak in logs
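Taken together, the three fixes listed above might look like the following sketch. It is not the PR's diff: `strict_parse`, `safe_parse`, and `ListHandler` are hypothetical names, and `strict_parse` merely imitates a parser whose error message echoes the raw LLM output, which is the situation where logging `{e}` could leak user content.

```python
import json
import logging

logger = logging.getLogger(__name__)  # module-level logger, defined before use


class ListHandler(logging.Handler):
    """Collects formatted log messages so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.propagate = False  # keep the example's output out of the root logger


def strict_parse(raw):
    """Hypothetical parser whose error message echoes the raw input."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError(f"Could not parse output: {raw}") from None


def safe_parse(raw):
    try:
        return strict_parse(raw)
    except Exception as e:  # Exception already covers ValueError
        # Log only the exception type so the raw text never reaches the logs.
        logger.error("KG parse failed: %s", type(e).__name__)
        return {"nodes": [], "edges": []}


fallback = safe_parse('{"name": "Alice", oops')
```

Logging `type(e).__name__` trades diagnostic detail for privacy: the log shows that a `ValueError` occurred but not the text that caused it.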
Summary
Fixes issue #6557 - Knowledge graph extraction: no retry/backoff on LLM calls, no fallback on parse failures

Changes
- clients.py: Add max_retries=3 to llm_mini for transient API failures
- knowledge_graph.py: Add try/except fallback in both extract_knowledge_from_memory and process_memory to handle malformed JSON from LLM

Impact
Addresses both failure modes:
1. OpenAI API transient failure → retry up to 3 times
2. LLM returns malformed JSON → return empty KG instead of crashing

This prevents silent data loss where memories get kg_extracted=False and are permanently lost due to one temporary failure.

Testing