Conversation
CoordiNode v0.3.1+ supports wildcard `[r]` in MATCH patterns, `type(r)`, `labels(n)`, and `MERGE` for relationship patterns. Update adapters to use these instead of the old workarounds.

LangChain adapter:
- refresh_schema(): wildcard MATCH (a)-[r]->(b) with labels()/type()
- _create_edge(): CREATE → MERGE (idempotent edge upsert)
- _link_document_to_entities(): CREATE → MERGE for MENTIONS edges

LlamaIndex adapter:
- get_triplets(): wildcard [r] pattern; type(r) instead of r.__type__
- get_rel_map(): wildcard [r]; ignore_rels pushed into Cypher WHERE so LIMIT applies only to non-ignored edges
- upsert_relations(): CREATE → MERGE (idempotent)
- Remove _parse_edge_types_from_schema() (no longer needed)

Tests:
- count(*) → count(r) for relationship counting
- cnt >= 1 → cnt == 1 for idempotent edge assertions
- get_triplets() test uses wildcard (no relation_names)

Closes #24
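As a rough illustration of the CREATE → MERGE change described above, the idempotent edge upsert can be sketched as a small statement builder. This is a hypothetical sketch: the helper name `build_edge_upsert` and the `{id: $src_id}` node-matching convention are assumptions, not the adapter's actual code.

```python
def build_edge_upsert(rel_type: str) -> str:
    """Build a MERGE-based edge statement so repeated calls create the edge once.

    Hypothetical helper; the adapter's real _create_edge() may differ.
    """
    # Relationship types cannot be parameterized in Cypher, so validate
    # the interpolated name to avoid injection.
    if not rel_type.replace("_", "").isalnum():
        raise ValueError(f"unsafe relationship type: {rel_type!r}")
    return (
        "MATCH (a {id: $src_id}), (b {id: $dst_id}) "
        f"MERGE (a)-[r:{rel_type}]->(b)"
    )
```

Because `MERGE` matches an existing `(a)-[r:TYPE]->(b)` before creating one, running the same statement twice leaves exactly one edge, which is what the `cnt == 1` test assertions rely on.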
Force-pushed from 522fb71 to 6d009a7 (Compare).
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough — Summary by CodeRabbit

Walkthrough: Adapters and schema refresh now use untyped relationship matches (`[r]`), `labels()`/`type(r)` projections, and `MERGE`-based relationship upserts.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Adapter as Adapter
    participant DB as CoordinodeDB
    participant VS as VectorService
    Client->>Adapter: refresh_schema() / get_triplets() / upsert_relations()
    Adapter->>DB: MATCH (a)-[r]->(b) RETURN labels(a), type(r), labels(b)
    DB-->>Adapter: rows with labels()/type(r)
    Adapter->>Adapter: _first_label(labels) -> choose stable label
    Client->>Adapter: vector_query(query_embedding)
    Adapter->>VS: vector_search(embedding, top_k)
    VS-->>Adapter: top-k node ids + scores
    Adapter->>DB: MATCH nodes by id / fetch node data
    DB-->>Adapter: node data
    Adapter-->>Client: aggregated results (triplets / upsert confirmation / vector results)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
llama-index-coordinode/llama_index/graph_stores/coordinode/base.py (1)
120-155: ⚠️ Potential issue | 🔴 Critical: Remove use of untyped `[r]` Cypher pattern; CoordiNode does not support it.

The change removes the `NotImplementedError` guard and uses the untyped `[r]` pattern when `relation_names` is `None`. CoordiNode does not support untyped relationship patterns in Cypher; queries using `[r]` will silently return empty results instead of raising an error, causing `get_triplets()` to fail silently without returning any relationships.

Restore the `NotImplementedError` for the `relation_names=None` case to maintain fail-fast behavior, or require callers to always provide explicit relation names.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@llama-index-coordinode/llama_index/graph_stores/coordinode/base.py` around lines 120 - 155, get_triplets currently falls back to an untyped rel pattern (`rel_pattern = "[r]"`) when relation_names is None which CoordiNode doesn't support; revert to fail-fast behavior: in get_triplets raise NotImplementedError when relation_names is None (or otherwise require callers to pass explicit relation_names) instead of using the untyped pattern. Update the logic around relation_names/rel_pattern (the variables and branch that build rel_filter/rel_pattern in get_triplets) to throw NotImplementedError with a clear message when relation_names is not provided so callers must supply explicit relation names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@llama-index-coordinode/llama_index/graph_stores/coordinode/base.py`:
- Around line 120-155: get_triplets currently falls back to an untyped rel
pattern (`rel_pattern = "[r]"`) when relation_names is None which CoordiNode
doesn't support; revert to fail-fast behavior: in get_triplets raise
NotImplementedError when relation_names is None (or otherwise require callers to
pass explicit relation_names) instead of using the untyped pattern. Update the
logic around relation_names/rel_pattern (the variables and branch that build
rel_filter/rel_pattern in get_triplets) to throw NotImplementedError with a
clear message when relation_names is not provided so callers must supply
explicit relation names.
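The `relation_names`/`rel_pattern` branch described in the prompt above can be sketched as a small pure function. This is a hypothetical sketch; the function name mirrors the prompt's description, not the adapter's actual code, and the `[r]` fallback is the very behavior the review flags (it requires CoordiNode v0.3.1+ wildcard support).

```python
from typing import Optional, Sequence

def build_rel_pattern(relation_names: Optional[Sequence[str]] = None) -> str:
    """Return the Cypher relationship pattern for a get_triplets-style query.

    With no names given, fall back to the untyped wildcard [r]; otherwise
    build a typed alternation like [r:KNOWS|WORKS_AT].
    """
    if not relation_names:
        return "[r]"
    return "[r:" + "|".join(relation_names) + "]"
```

The fail-fast alternative the review asks for would replace the `return "[r]"` branch with `raise NotImplementedError(...)`.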
ℹ️ Review info — ⚙️ Run configuration: Organization UI · Review profile: ASSERTIVE · Plan: Pro · Run ID: f767206c-3d48-49ea-9bd6-56f5e94d70c4
📒 Files selected for processing (5)
- .gitignore
- langchain-coordinode/langchain_coordinode/graph.py
- llama-index-coordinode/llama_index/graph_stores/coordinode/base.py
- tests/integration/adapters/test_langchain.py
- tests/integration/adapters/test_llama_index.py
Pull request overview
Updates the LangChain and LlamaIndex CoordiNode adapters to rely on CoordiNode v0.3.1+ Cypher capabilities (wildcard relationship patterns, type(r), labels(n), and relationship MERGE) and removes prior schema/workaround logic.
Changes:
- LangChain adapter: use wildcard relationship matching + `labels()`/`type()` for schema refresh; switch relationship creation to `MERGE` (including `MENTIONS`).
- LlamaIndex adapter: allow `get_triplets()` without `relation_names`, use wildcard patterns + `type(r)` across queries, push ignore filters into Cypher, and `MERGE` relationships on upsert.
- Integration tests updated to validate idempotent edges and relationship counting with `count(r)`.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| langchain-coordinode/langchain_coordinode/graph.py | Switches schema refresh to wildcard `[r]` + `labels()`/`type()`, and makes edge creation idempotent via `MERGE`. |
| llama-index-coordinode/llama_index/graph_stores/coordinode/base.py | Drops schema edge-type parsing/workarounds; uses wildcard patterns + `type(r)`; makes relationship upsert idempotent with `MERGE`. |
| tests/integration/adapters/test_langchain.py | Adjusts assertions to validate `MERGE` idempotency and `count(r)` semantics. |
| tests/integration/adapters/test_llama_index.py | Removes `relation_names` requirement for `get_triplets()` when wildcard `[r]` is supported. |
| .gitignore | Updates ignored version file path and adds CLAUDE.md ignore. |
@coderabbitai The wildcard
✏️ Learnings added · ⛔ Learnings removed · 🧠 Learnings used
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 230-236: The current _first_label(labels) helper collapses
multi-label nodes to a single arbitrary label, causing incomplete and
nondeterministic relationship triples in refresh_schema(); change the logic so
refresh_schema() does not call _first_label() to drop labels but instead
iterates all labels on each node and emits one relationship triple per label
combination (or, if you prefer a single deterministic choice, replace
_first_label() with a deterministic selector that sorts labels and returns the
lexicographically smallest); update the code paths in refresh_schema() and any
callers of _first_label() (referring to the _first_label function and
refresh_schema method) to either expand labels into multiple triples or use the
new deterministic selector so schema generation becomes complete and stable.
In `@tests/integration/adapters/test_llama_index.py`:
- Around line 159-180: The test test_vector_query_returns_results currently only
checks types and counts but may pass due to existing :Chunk nodes; modify it to
ensure the specific seeded node (id f"vec-{tag}") is returned by
VectorStoreQuery by either constraining the query to that id or adding an
assertion that the returned nodes from store.vector_query(query) contain a node
with id "vec-{tag}" (use the same id string built when seeding via
store._client.cypher). Locate the seeding and the call to VectorStoreQuery/query
and add an assertion that scans nodes for the matching id before asserting
scores.
ℹ️ Review info — ⚙️ Run configuration: Organization UI · Review profile: ASSERTIVE · Plan: Pro · Run ID: be9b29d0-0fda-4ad0-a2af-f714c55a4d32
📒 Files selected for processing (2)
- langchain-coordinode/langchain_coordinode/graph.py
- tests/integration/adapters/test_llama_index.py
…ection

openCypher does not guarantee a stable ordering for labels(), so using labels[0] produced nondeterministic schema entries across refresh_schema() calls. Replace with min(labels) to always select the lexicographically smallest label consistently.

Also strengthen the vector_query() integration test: capture the seeded node's internal CoordiNode ID from CREATE RETURN and assert it appears in the returned ChunkNode list, proving the specific seeded node was found rather than any pre-existing Chunk.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/adapters/test_llama_index.py`:
- Around line 170-188: The seed creation and extraction of seeded_internal_id
should be inside the try so the finally cleanup always runs; move the call to
store._client.cypher that sets seed_rows and the seeded_internal_id =
str(seed_rows[0]["nid"]) assignment into the try block that surrounds the
VectorStoreQuery/store.vector_query assertions (or put the try before seeding),
ensuring the finally still calls store._client.cypher("MATCH (n:Chunk {id: $id})
DELETE n", params={"id": f"vec-{tag}"}) to always delete the created node.
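The seeding-inside-try pattern this prompt asks for looks roughly like the sketch below. The Cypher strings and the `cypher()`/`seeded_internal_id` names come from the prompt; `FakeClient` is invented here purely so the sketch is self-contained and is not part of the test suite.

```python
class FakeClient:
    """Stand-in for the adapter's Cypher client, for illustration only."""
    def __init__(self):
        self.statements = []

    def cypher(self, stmt, params=None):
        self.statements.append(stmt)
        if stmt.startswith("CREATE"):
            return [{"nid": 42}]  # internal node id, per the PR discussion
        return []

def seed_and_query(client, tag):
    seeded_internal_id = None
    try:
        # Seeding now lives inside the try, so the finally cleanup runs
        # even if extracting the internal id (or any later step) fails.
        rows = client.cypher(
            "CREATE (n:Chunk {id: $id}) RETURN id(n) AS nid",
            params={"id": f"vec-{tag}"},
        )
        seeded_internal_id = str(rows[0]["nid"])
        # ... vector_query() and its assertions would go here ...
        return seeded_internal_id
    finally:
        client.cypher(
            "MATCH (n:Chunk {id: $id}) DELETE n",
            params={"id": f"vec-{tag}"},
        )
```

If `rows[0]["nid"]` raises after a successful CREATE, the `finally` block still issues the DELETE, so the seeded node never leaks into the shared database.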
ℹ️ Review info — ⚙️ Run configuration: Organization UI · Review profile: ASSERTIVE · Plan: Pro · Run ID: 191d0b13-8235-4df6-ab9f-501fd8a7ff96
📒 Files selected for processing (2)
- langchain-coordinode/langchain_coordinode/graph.py
- tests/integration/adapters/test_llama_index.py
…anup

Ensures the finally DELETE runs even if seeded_internal_id extraction fails after a successful CREATE. Addresses CodeRabbit nitpick on test robustness.
@coderabbitai re-review
🧠 Learnings used · ✅ Actions performed: Review triggered.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
…viour

Add detailed comments explaining:
- In CoordiNode, CREATE ... RETURN n yields the internal integer node ID, not a property map — verified empirically (seed_rows[0]["nid"] == int)
- vector_search returns Node(properties={}), so node.properties.get("id") is always None and cannot be used for node identification
- ChunkNode.id_ == str(r.node.id) is the correct comparison target
@coderabbitai re-review |
🧠 Learnings used · ✅ Actions performed: Full review triggered.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated no new comments.
@coderabbitai re-review |
🧠 Learnings used · ✅ Actions performed: Full review triggered.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 80-92: The Cypher RETURN DISTINCT is on raw label lists but
_first_label() can collapse different lists into the same (start, type, end)
triple causing duplicates; modify the relationship construction that builds
structured["relationships"] from rows to deduplicate after normalization:
iterate rows, compute start=_first_label(row.get("src_labels")), rel=row["rel"],
end=_first_label(row.get("dst_labels")), skip any row where any of those is
falsy, and use a seen set keyed by (start, rel, end) to only append unique
triples to structured["relationships"] (look for the block using rows,
_first_label, and structured["relationships"]).
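The normalize-then-dedupe step this prompt describes can be sketched as below. The row keys `src_labels`/`rel`/`dst_labels` follow the prompt's wording; the function name and the inlined label picker are hypothetical.

```python
def dedupe_relationships(rows):
    """Collapse each label list to one deterministic label, then keep each
    (start, rel, end) triple only once.

    RETURN DISTINCT dedupes raw label lists, but after normalization two
    different lists (e.g. ['Employee','Person'] vs ['Person','Employee'])
    can map to the same triple, so dedupe again with a seen set.
    """
    def pick(labels):
        return min(labels) if labels else None

    seen, out = set(), []
    for row in rows:
        start = pick(row.get("src_labels"))
        rel = row.get("rel")
        end = pick(row.get("dst_labels"))
        if not (start and rel and end):
            continue  # skip rows missing any component
        key = (start, rel, end)
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out
```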
In `@tests/integration/adapters/test_llama_index.py`:
- Around line 165-197: The seeded embedding vec is not unique across shared DBs,
causing flaky top-1 hits; make the seed deterministic and unique per test tag by
deriving a small, deterministic perturbation from the test tag (e.g.,
hash/tag-based offset) when constructing vec (where vec is created and used to
seed via store._client.cypher and later passed to VectorStoreQuery with
similarity_top_k=1), so the created node's embedding is distinct and the
subsequent check against seeded_internal_id (from seed_rows and used in the
any(str(getattr(node, "id_", "")) == seeded_internal_id ...) assertion) reliably
finds the seeded node; keep the try/finally cleanup and rest of the
vector_query/assertions unchanged.
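A hash/tag-based offset of the kind this prompt suggests might look like the following. The function name, dimension, and perturbation scheme are all assumptions for illustration; only the idea (a deterministic, per-tag-unique seed embedding) comes from the prompt.

```python
import hashlib

def seed_embedding(tag: str, dim: int = 8, base: float = 0.1) -> list:
    """Deterministic per-tag embedding for integration-test seeding.

    Starts from a constant vector and perturbs one component, with both
    the index and the offset derived from the tag's SHA-256 digest, so
    different test tags seed (almost certainly) distinct vectors while
    the same tag always reproduces the same vector.
    """
    digest = hashlib.sha256(tag.encode("utf-8")).digest()
    vec = [base] * dim
    idx = digest[0] % dim
    vec[idx] += 0.001 + (digest[1] / 255.0) * 0.01
    return vec
```

With a unique seed vector per tag, the seeded node is far less likely to tie with a pre-existing `:Chunk` in a shared database, which is what made the top-1 assertion flaky.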
ℹ️ Review info — ⚙️ Run configuration: Organization UI · Review profile: ASSERTIVE · Plan: Pro · Run ID: 73909a98-5a1d-423f-a4a4-76d848bf002c
📒 Files selected for processing (5)
- .gitignore
- langchain-coordinode/langchain_coordinode/graph.py
- llama-index-coordinode/llama_index/graph_stores/coordinode/base.py
- tests/integration/adapters/test_langchain.py
- tests/integration/adapters/test_llama_index.py
…ormalization

RETURN DISTINCT operates on raw label lists, but min()-based _first_label() can collapse different multi-label combos (e.g. ['Employee','Person'] and ['Person','Employee']) into the same (start, type, end) triple. Use a set for deduplication after normalization so each relationship triple appears once.

Also make the vector_query() integration test embedding unique per test tag (derived from tag bytes) and increase similarity_top_k to 5 to prevent flaky results in shared integration DBs where another :Chunk may share the same vector.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated no new comments.
Summary
- Use wildcard `[r]` patterns, `type(r)`, `labels(n)`, and `MERGE` for relationships
- Remove `__type__`/`__label__` field access that was used as a fallback

Changes

LangChain adapter (`langchain_coordinode/graph.py`)
- `refresh_schema()`: use `MATCH (a)-[r]->(b) RETURN DISTINCT labels(a), type(r), labels(b)` instead of typed pattern workaround
- `_create_edge()`: `CREATE` → `MERGE` (idempotent upsert)
- `_link_document_to_entities()`: `CREATE` → `MERGE` for `MENTIONS` edges

LlamaIndex adapter (`llama_index/graph_stores/coordinode/base.py`)
- `get_triplets()`: wildcard `[r]` pattern + `type(r)` instead of `r.__type__`; removed `NotImplementedError` for missing `relation_names`
- `get_rel_map()`: wildcard `[r]` + `type(r)`; `ignore_rels` filter pushed into Cypher `WHERE` clause so `LIMIT` applies only to non-ignored edges
- `upsert_relations()`: `CREATE` → `MERGE` (idempotent)
- Removed `_parse_edge_types_from_schema()` helper (no longer needed)

Tests
- `count(*)` → `count(r)` for relationship counting
- `cnt >= 1` → `cnt == 1` for idempotent edge assertions
- `get_triplets()` uses wildcard (no `relation_names`)

Closes #24
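The `cnt == 1` idempotency assertion in the Tests section can be illustrated with a toy in-memory stand-in for a MERGE-capable store. `FakeGraph` is pure illustration and not part of the codebase; it only mimics the two semantics the tests rely on, MERGE-as-upsert and `count(r)`-style relationship counting.

```python
class FakeGraph:
    """Toy stand-in for a graph store with MERGE semantics."""
    def __init__(self):
        self.edges = set()

    def merge_edge(self, src, rel, dst):
        # MERGE semantics: a second identical call is a no-op,
        # because the set already contains the triple.
        self.edges.add((src, rel, dst))

    def count_edges(self, rel):
        # Mirrors count(r): counts matching relationships.
        return sum(1 for (_, r, _) in self.edges if r == rel)

g = FakeGraph()
g.merge_edge("doc-1", "MENTIONS", "ent-1")
g.merge_edge("doc-1", "MENTIONS", "ent-1")  # repeated upsert
assert g.count_edges("MENTIONS") == 1  # cnt == 1, not merely >= 1
```

Under the old `CREATE` behavior the second call would have produced a duplicate edge, which is why the assertions could only check `cnt >= 1`; with `MERGE` the stricter `cnt == 1` holds.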