Skip to content

Bug: silent JSON-parse failures drop triple extractions permanently #28

@raphasouthall

Description

@raphasouthall

Summary

Triple extraction silently fails on malformed LLM JSON output with no retry. Once a note fails, it stays at zero triples forever unless the user manually runs backfill. Audit of the vault shows 25 notes (6%) with no triples — 5 of these are substantive content (>5 KB, one 14 KB).

Root cause

Two layers of try/except that swallow the failure, plus a write-skip on empty triples with no sentinel row:

src/neurostack/triples.py:86-90:
```python
try: triples = json.loads(raw)
except json.JSONDecodeError as e:
log.warning(...)
return []
```

src/neurostack/watcher.py:475-488 wraps the call in a bare try/except that also swallows. Then watcher.py:578 only writes triples if triples: — empty list = nothing inserted, no sentinel row, no retry scheduled.

Result: indistinguishable in the DB from notes that legitimately produce no triples (e.g., 42-byte stubs).

Proposed fix

  1. JSON mode + one-shot retry. In triples.py:65-77, add "response_format": {"type": "json_object"} (supported by Ollama and OpenAI-compatible endpoints). On parse failure, retry once with a follow-up turn.
  2. Sentinel / retry queue. In watcher.py:_index_triples_for_note (line 369) and _write_note_results (line 578), when extract_triples returns [] on content >200 chars, write a triples_failed row (or a boolean column on notes) with next_retry_at. neurostack backfill picks these up with exponential backoff.

Expected effect

Triple coverage climbs from 94% → 99%+ without user intervention. Silent failures become observable via a retry queue.

Key files

  • src/neurostack/triples.py:65-90
  • src/neurostack/watcher.py:369, 475-488, 578, 848-905

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions