Release v1.0.13 - Cypher Graph-RAG pipeline · orneryd/NornicDB

This release introduces a full Cypher Graph-RAG pipeline with new procedures that let you retrieve, inspect, rerank, and infer in one query flow, while keeping control over each step.

Highlights

- New Cypher Graph-RAG procedures:
  - CALL db.retrieve({...})
  - CALL db.rretrieve({...})
  - CALL db.rerank({...})
  - CALL db.infer({...})
  - CALL db.index.vector.embed(...)
- Cache policy support for new primitives:
  - db.infer is not cached by default
  - explicit opt-in: cache: true
- Refactor and stability work in core Cypher/storage execution paths for maintainability and safer evolution.

Why this matters

Most RAG stacks hide retrieval internals behind one API call. This release makes those stages explicit in Cypher so you can:

- Inspect candidates before reranking
- Apply workload-specific filters/transformations between stages
- Compare retrieval vs reranked outputs
- Tune quality/latency tradeoffs in-query
- Keep everything auditable and composable in one graph-native pipeline

New Procedures

`db.retrieve`

First-stage retrieval (hybrid/vector/text depending on config and inputs).

`db.rretrieve`

Convenience path for retrieve + rerank when reranking is enabled.

`db.rerank`

Pure reranker over explicit candidate lists (great for A/B checks and custom candidate assembly).

`db.infer`

LLM inference procedure for final synthesis/classification/extraction.

- Default: non-cacheable
- Optional caching: pass cache: true
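
To illustrate the opt-in policy only (this is not NornicDB's implementation), the Python sketch below reuses a model result only when the caller asks for caching. The `make_infer` wrapper, the `cache_result` flag, the prompt-text cache key, and `fake_model` are all assumptions made for the example; the release notes do not say how the cache is keyed or evicted.

```python
from typing import Callable, Dict, List


def make_infer(model: Callable[[str], str]) -> Callable[..., str]:
    """Wrap a model call so results are reused only on explicit opt-in."""
    cache: Dict[str, str] = {}  # hypothetical key: the raw prompt text

    def infer(prompt: str, cache_result: bool = False) -> str:
        # Serve from the cache only when the caller opted in.
        if cache_result and prompt in cache:
            return cache[prompt]
        text = model(prompt)
        # Store the result only when the caller opted in.
        if cache_result:
            cache[prompt] = text
        return text

    return infer


calls: List[str] = []


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call; records every invocation."""
    calls.append(prompt)
    return f"answer #{len(calls)}"


infer = make_infer(fake_model)
```

With this wrapper, repeated default calls always reach the model, while repeated opt-in calls with the same prompt reach it once.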

`db.index.vector.embed`

Returns embedding vectors directly from your configured embedding service, enabling custom retrieval pipelines in Cypher.

Cypher Pipeline Patterns

1) Compare baseline retrieval vs reranked output

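// Stage 1: first-stage retrieval, collected into a baseline candidate list.
// The collected score prefers rrf_score when present, else falls back to score.
CALL db.retrieve({query: $q, limit: 20}) YIELD node, score, rrf_score
WITH collect({
  id: node.id,
  content: coalesce(node.content, node.title, ''),
  score: coalesce(rrf_score, score)
}) AS baseline

// Stage 2: rerank the explicit baseline list and return the new ordering.
CALL db.rerank({
  query: $q,
  candidates: baseline,
  rerankTopK: 10
}) YIELD id, final_score, new_rank
RETURN id, final_score, new_rank
ORDER BY new_rank ASC;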
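
Once both result sets are on the client, the comparison can be reduced to a per-id rank shift. The Python sketch below is one way to do that. It assumes the baseline ids arrive ordered best-first and that `new_rank` is 1-based, neither of which the release notes state; `rank_shifts` and the sample ids are hypothetical.

```python
from typing import Dict, List


def rank_shifts(baseline_ids: List[str], reranked: List[dict]) -> Dict[str, int]:
    """Return baseline_rank - new_rank per id.

    A positive value means the reranker moved the candidate up.
    Ids missing from the baseline are skipped.
    """
    # Assumption: baseline_ids is ordered best-first, ranks start at 1.
    baseline_rank = {doc_id: i + 1 for i, doc_id in enumerate(baseline_ids)}
    shifts: Dict[str, int] = {}
    for row in reranked:
        doc_id = row["id"]
        if doc_id in baseline_rank:
            shifts[doc_id] = baseline_rank[doc_id] - row["new_rank"]
    return shifts


# Made-up sample data for illustration.
baseline_ids = ["a", "b", "c", "d"]
reranked = [
    {"id": "c", "new_rank": 1},
    {"id": "a", "new_rank": 2},
    {"id": "b", "new_rank": 3},
]
shifts = rank_shifts(baseline_ids, reranked)
```

Candidates that appear in the baseline but not in the reranked rows (here `d`) were cut by the reranker and are absent from the result.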

2) Analyze intermediate candidates before inference

// Stage 1: retrieve, then keep only runbook candidates above a score threshold.
CALL db.retrieve({query: $q, limit: 40}) YIELD node, score
WITH node, score
WHERE node.source = 'runbook' AND score > 0.12
WITH collect({
  id: node.id,
  content: node.content,
  score: score
}) AS filtered

// Stage 2: rerank the filtered candidates.
CALL db.rerank({query: $q, candidates: filtered, rerankTopK: 12})
YIELD id, content, final_score
WITH collect({id:id, content:content, score:final_score}) AS topk

// Stage 3: infer over the reranked passages; cache: true opts this call into caching.
CALL db.infer({
  prompt: 'Answer using only these passages: ' + toString(topk),
  cache: true
}) YIELD text, usage, latencyMs
RETURN text, usage, latencyMs;

3) Explicit embedding + vector query control

// Embed the query text explicitly, then query the 'doc_embeddings' vector index with it.
CALL db.index.vector.embed('how do we rotate credentials in prod?') YIELD embedding
CALL db.index.vector.queryNodes('doc_embeddings', 25, embedding) YIELD node, score
RETURN node.id, node.title, score
ORDER BY score DESC;
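
For intuition about what a vector query ranks by, the Python sketch below scores candidate embeddings against a query embedding with cosine similarity and keeps the top k. This is an assumption for illustration: the release notes do not state which similarity metric `db.index.vector.queryNodes` uses, and the tiny two-dimensional vectors are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], docs: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Score every document against the query and return the k best, highest first."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings for illustration.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.0], docs, 2)
```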

Real-world Scenarios

- Customer support copilots
  - Retrieve broad KB snippets, filter by product/version in Cypher, rerank, then infer final response.
  - Benefit: fewer hallucinations from stale docs.
- Security/SRE runbook retrieval
  - Retrieve from incidents + runbooks, enforce environment/team constraints in Cypher before inference.
  - Benefit: higher operational precision, lower risk of cross-environment leakage.
- Regulated document QA (legal/finance/health)
  - Use Cypher to enforce source/label constraints and confidence gates before model synthesis.
  - Benefit: traceable provenance and policy-aware answers.
- Developer knowledge assistants
  - Retrieve code/doc chunks, compare reranked vs non-reranked relevance, tune thresholds per repo/workload.
  - Benefit: faster iterative quality tuning without external orchestration code.

Security Considerations (important)

When chaining db.infer output into downstream actions:

- Treat model output as untrusted data
  - Never directly execute model-generated Cypher as code.
  - Keep model output as values, not query text.
- Use parameterized Cypher
  - Bind inferred fields into fixed query templates ($param) rather than string concatenation.
- Apply allowlists/validation
  - Validate labels, relationship types, and property keys against approved sets.
  - Enforce schema/type checks before writes.
- Split privileges by role
  - Retrieval/inference principals should be read-scoped when possible.
  - Use separate, minimal-privilege writers for mutations.
- Gate high-impact operations
  - Add human approval or policy checks for destructive writes.
  - Log inference inputs/outputs and resulting mutations for auditability.
- Defend against prompt/data injection
  - Assume retrieved text can contain adversarial instructions.
  - Anchor prompts with strict system constraints and provenance-aware post-validation.
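
The allowlist and parameterization points above can be sketched on the client side in Python. Labels generally cannot be bound as ordinary Cypher parameters, so the sketch checks the label against an approved set before placing it in a fixed template, and passes every inferred value as a parameter. `build_write`, the allowlists, and the `MERGE ... SET n += $props` template are assumptions for illustration, not an API shipped in this release.

```python
from typing import Any, Dict, Tuple

# Hypothetical approved sets; replace with your own schema.
ALLOWED_LABELS = {"Ticket", "Incident"}
ALLOWED_PROPERTIES = {"summary": str, "severity": int}


def build_write(label: str, node_id: str, props: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Validate inferred fields, then return a fixed query template plus parameters."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"label not allowed: {label!r}")
    for key, value in props.items():
        expected = ALLOWED_PROPERTIES.get(key)
        if expected is None:
            raise ValueError(f"property not allowed: {key!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for {key!r}: {type(value).__name__}")
    # Only the allowlisted label is interpolated; all values travel as parameters.
    query = f"MERGE (n:{label} {{id: $id}}) SET n += $props"
    return query, {"id": node_id, "props": props}
```

A caller would pass the returned query and parameter map to its driver unchanged; anything the model produced that fails validation raises before a write is attempted.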

Upgrade Notes

- Existing workloads continue to function.
- If enabling inference caching, do so intentionally via cache: true in db.infer.
- Review your write pipelines to ensure inferred content is validated and parameterized before persistence.
