feat: use MERGE for edges, wildcard patterns, type()/labels() functions #25

Merged
polaz merged 7 commits into main from feat/#24-adapter-updates-merge-wildcard-vector on Apr 12, 2026

Conversation

polaz (Member) commented Apr 12, 2026

Summary

  • Switch LangChain and LlamaIndex adapters to use Cypher features now supported in CoordiNode v0.3.1+: wildcard [r] patterns, type(r), labels(n), and MERGE for relationships
  • Remove schema-lookup workarounds and __type__/__label__ field access that were used as fallbacks

Changes

LangChain adapter (langchain_coordinode/graph.py)

  • refresh_schema(): use MATCH (a)-[r]->(b) RETURN DISTINCT labels(a), type(r), labels(b) instead of typed pattern workaround
  • _create_edge(): CREATE → MERGE (idempotent upsert)
  • _link_document_to_entities(): CREATE → MERGE for MENTIONS edges
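
To illustrate why the CREATE → MERGE switch makes edge writes idempotent, here is a minimal in-memory sketch. `ToyGraph`, `create_edge`, and `merge_edge` are hypothetical names invented for this example; they model the semantics only and are not the adapter's or CoordiNode's implementation.

```python
from collections import Counter


class ToyGraph:
    """Hypothetical in-memory edge store; NOT CoordiNode or either adapter."""

    def __init__(self):
        # Maps (source, relationship type, target) -> number of stored copies.
        self.edges = Counter()

    def create_edge(self, src, rel, dst):
        """CREATE-like write: every call stores one more copy of the edge."""
        self.edges[(src, rel, dst)] += 1

    def merge_edge(self, src, rel, dst):
        """MERGE-like write: store the edge only if no identical edge exists."""
        self.edges.setdefault((src, rel, dst), 1)

    def count(self, src, rel, dst):
        """Number of stored copies of the edge (0 if it was never written)."""
        return self.edges[(src, rel, dst)]


if __name__ == "__main__":
    created = ToyGraph()
    merged = ToyGraph()
    for _ in range(2):  # simulate ingesting the same document twice
        created.create_edge("doc-1", "MENTIONS", "alice")
        merged.merge_edge("doc-1", "MENTIONS", "alice")
    print(created.count("doc-1", "MENTIONS", "alice"))
    print(merged.count("doc-1", "MENTIONS", "alice"))
```

Under this model a repeated ingest leaves exactly one edge on the MERGE path, which is the property the `cnt == 1` test assertions in this PR check.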

LlamaIndex adapter (llama_index/graph_stores/coordinode/base.py)

  • get_triplets(): wildcard [r] pattern + type(r) instead of r.__type__; removed NotImplementedError for missing relation_names
  • get_rel_map(): wildcard [r] + type(r); ignore_rels filter pushed into Cypher WHERE clause so LIMIT applies only to non-ignored edges
  • upsert_relations(): CREATE → MERGE (idempotent)
  • Removed _parse_edge_types_from_schema() helper (no longer needed)
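
The point about pushing `ignore_rels` into the WHERE clause can be sketched in plain Python. The two functions below are hypothetical and operate on a list of tuples rather than on a database; the sketch assumes the earlier behaviour dropped ignored types only after the limit had been applied, which the PR text implies but does not show.

```python
def limit_then_filter(edges, ignore_rels, limit):
    """Assumed old shape: take the first `limit` edges, then drop ignored types."""
    return [edge for edge in edges[:limit] if edge[1] not in ignore_rels]


def filter_then_limit(edges, ignore_rels, limit):
    """New shape: analogous to `WHERE type(r) NOT IN $ignored ... LIMIT n`."""
    kept = [edge for edge in edges if edge[1] not in ignore_rels]
    return kept[:limit]


if __name__ == "__main__":
    # Each edge is (source, relationship type, target); the data is made up.
    edges = [
        ("a", "MENTIONS", "b"),
        ("a", "MENTIONS", "c"),
        ("a", "KNOWS", "d"),
        ("a", "WORKS_AT", "e"),
    ]
    print(limit_then_filter(edges, {"MENTIONS"}, limit=2))
    print(filter_then_limit(edges, {"MENTIONS"}, limit=2))
```

With the filter applied first, ignored edges no longer consume slots in the limit, so callers get up to `limit` useful rows instead of a possibly empty result.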

Tests

  • count(*) → count(r) for relationship counting
  • cnt >= 1 → cnt == 1 for idempotent edge assertions
  • get_triplets() uses wildcard (no relation_names)

Closes #24

CoordiNode v0.3.1+ supports wildcard [r] in MATCH patterns, type(r),
labels(n), and MERGE for relationship patterns. Update adapters to use
these instead of the old workarounds.

LangChain adapter:
- refresh_schema(): wildcard MATCH (a)-[r]->(b) with labels()/type()
- _create_edge(): CREATE → MERGE (idempotent edge upsert)
- _link_document_to_entities(): CREATE → MERGE for MENTIONS edges

LlamaIndex adapter:
- get_triplets(): wildcard [r] pattern; type(r) instead of r.__type__
- get_rel_map(): wildcard [r]; ignore_rels pushed into Cypher WHERE
  so LIMIT applies only to non-ignored edges
- upsert_relations(): CREATE → MERGE (idempotent)
- Remove _parse_edge_types_from_schema() (no longer needed)

Tests:
- count(*) → count(r) for relationship counting
- cnt >= 1 → cnt == 1 for idempotent edge assertions
- get_triplets() test uses wildcard (no relation_names)

Closes #24
@structured-world deleted 13 bot comments on Apr 12, 2026 (5 from Copilot AI, 7 from coderabbitai bot, 1 from sonarqubecloud bot)
@polaz polaz force-pushed the feat/#24-adapter-updates-merge-wildcard-vector branch from 522fb71 to 6d009a7 Compare April 12, 2026 09:27
coderabbitai bot commented Apr 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1163e3c2-ddcb-45dd-a8a5-5718d7d2ed84

📥 Commits

Reviewing files that changed from the base of the PR and between 543816f and f0e1ff3.

📒 Files selected for processing (2)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_llama_index.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added vector query functionality with similarity-based filtering.
  • Bug Fixes

    • Graph operations are idempotent—repeated ingests no longer create duplicate relationships.
    • Relationship queries work without requiring explicit type specifications and now return deterministic, de-duplicated results.
  • Tests

    • Integration tests updated and added to verify idempotent graph writes and vector query behavior.
  • Chores

    • Updated .gitignore entries.

Walkthrough

Adapters and schema refresh now use untyped relationship matches (MATCH (a)-[r]->(b)) with type(r) and a new _first_label() helper; relationship writes use MERGE for idempotency; relation_names is optional for triplet queries; vector-query tests and vector search flow added.

Changes

| Cohort / File(s) | Summary |
|---|---|
| LangChain adapter (langchain-coordinode/langchain_coordinode/graph.py) | refresh_schema() now queries MATCH (a)-[r]->(b) and derives relationships from labels(a), type(r), labels(b); added _first_label(labels) helper; relationship upserts and MENTIONS linking changed from CREATE → MERGE; docstrings updated for idempotency; dedupe/sort added for deterministic schema. |
| Llama‑Index adapter (llama-index-coordinode/llama_index/graph_stores/coordinode/base.py) | get_triplets() no longer requires relation_names and uses untyped [r]; get_rel_map()/get_triplets() use MATCH (n)-[r]->(m) with type(r) and inline WHERE type(r) NOT IN $ignored; removed schema edge-type parser; upsert_relations() changed from CREATE → MERGE. |
| Integration tests (tests/integration/adapters/test_langchain.py, tests/integration/adapters/test_llama_index.py) | Updated Cypher counts to count(r) == 1 and tightened assertions for idempotent behavior; removed relation_names from triplet test; added vector-query tests for non-empty and empty embeddings. |
| Repo config (.gitignore) | Adjusted ignored path for generated version file and added CLAUDE.md to ignore list. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Adapter as Adapter
    participant DB as CoordinodeDB
    participant VS as VectorService

    Client->>Adapter: refresh_schema() / get_triplets() / upsert_relations()
    Adapter->>DB: MATCH (a)-[r]->(b) RETURN labels(a), type(r), labels(b)
    DB-->>Adapter: rows with labels()/type(r)
    Adapter->>Adapter: _first_label(labels) -> choose stable label
    Client->>Adapter: vector_query(query_embedding)
    Adapter->>VS: vector_search(embedding, top_k)
    VS-->>Adapter: top-k node ids + scores
    Adapter->>DB: MATCH nodes by id / fetch node data
    DB-->>Adapter: node data
    Adapter-->>Client: aggregated results (triplets / upsert confirmation / vector results)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

  • structured-world/coordinode-python#19 — Aligns with removing the relation_names requirement and switching to wildcard relationship matches with type(r).
  • structured-world/coordinode-python#21 — Matches the changes switching relationship writes from CREATE to MERGE and updating tests for idempotency.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the primary changes: switching to MERGE for edges, using wildcard patterns, and leveraging type()/labels() functions. |
| Description check | ✅ Passed | The description thoroughly explains the changes across both adapters (LangChain and LlamaIndex), lists specific method modifications, and references the closed issue. |
| Linked Issues check | ✅ Passed | All three coding objectives from issue #24 are met: wildcard patterns used in refresh_schema()/get_triplets()/get_rel_map(), MERGE implemented for edge upserts, and vector search integration tests added. |
| Out of Scope Changes check | ✅ Passed | All changes align with issue #24 objectives: adapter updates for MERGE/wildcard/type()/labels(), vector search tests, and .gitignore maintenance are in scope. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.00%, which is sufficient. The required threshold is 80.00%. |


@polaz polaz requested review from Copilot and removed request for Copilot April 12, 2026 09:29
coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
llama-index-coordinode/llama_index/graph_stores/coordinode/base.py (1)

120-155: ⚠️ Potential issue | 🔴 Critical

Remove use of untyped [r] Cypher pattern—CoordiNode does not support it.

The change removes the NotImplementedError guard and uses the untyped [r] pattern when relation_names is None. CoordiNode does not support untyped relationship patterns in Cypher; queries using [r] will silently return empty results instead of raising an error, causing get_triplets() to fail silently without returning any relationships.

Restore the NotImplementedError for the relation_names=None case to maintain fail-fast behavior, or require callers to always provide explicit relation names.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llama-index-coordinode/llama_index/graph_stores/coordinode/base.py` around
lines 120 - 155, get_triplets currently falls back to an untyped rel pattern
(`rel_pattern = "[r]"`) when relation_names is None which CoordiNode doesn't
support; revert to fail-fast behavior: in get_triplets raise NotImplementedError
when relation_names is None (or otherwise require callers to pass explicit
relation_names) instead of using the untyped pattern. Update the logic around
relation_names/rel_pattern (the variables and branch that build
rel_filter/rel_pattern in get_triplets) to throw NotImplementedError with a
clear message when relation_names is not provided so callers must supply
explicit relation names.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f767206c-3d48-49ea-9bd6-56f5e94d70c4

📥 Commits

Reviewing files that changed from the base of the PR and between 522fb71 and 6d009a7.

📒 Files selected for processing (5)
  • .gitignore
  • langchain-coordinode/langchain_coordinode/graph.py
  • llama-index-coordinode/llama_index/graph_stores/coordinode/base.py
  • tests/integration/adapters/test_langchain.py
  • tests/integration/adapters/test_llama_index.py

Copilot AI left a comment

Pull request overview

Updates the LangChain and LlamaIndex CoordiNode adapters to rely on CoordiNode v0.3.1+ Cypher capabilities (wildcard relationship patterns, type(r), labels(n), and relationship MERGE) and removes prior schema/workaround logic.

Changes:

  • LangChain adapter: use wildcard relationship matching + labels()/type() for schema refresh; switch relationship creation to MERGE (including MENTIONS).
  • LlamaIndex adapter: allow get_triplets() without relation_names, use wildcard patterns + type(r) across queries, push ignore filters into Cypher, and MERGE relationships on upsert.
  • Integration tests updated to validate idempotent edges and relationship counting with count(r).

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| langchain-coordinode/langchain_coordinode/graph.py | Switches schema refresh to wildcard [r] + labels()/type(), and makes edge creation idempotent via MERGE. |
| llama-index-coordinode/llama_index/graph_stores/coordinode/base.py | Drops schema edge-type parsing/workarounds; uses wildcard patterns + type(r); makes relationship upsert idempotent with MERGE. |
| tests/integration/adapters/test_langchain.py | Adjusts assertions to validate MERGE idempotency and count(r) semantics. |
| tests/integration/adapters/test_llama_index.py | Removes relation_names requirement for get_triplets() when wildcard [r] is supported. |
| .gitignore | Updates ignored version file path and adds CLAUDE.md ignore. |

polaz (Member, Author) commented Apr 12, 2026

@coderabbitai The wildcard [r] pattern is supported in CoordiNode v0.3.2 (the version this PR targets). The integration test test_upsert_and_get_triplets exercises exactly this code path — it calls get_triplets(entity_names=[...]) without relation_names, triggering the [r] branch — and passes against a live CoordiNode v0.3.2 instance. Restoring NotImplementedError would regress the feature this PR delivers.
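
As a rough sketch of the code path under discussion, the branch between a typed and an untyped relationship pattern might look like the following. `build_rel_pattern` and `build_triplet_query` are hypothetical helpers written for this illustration; the real `get_triplets()` also filters on entity names and may be structured differently. The typed form `[r:A|B]` is standard openCypher alternation, assumed here to be what the pre-existing typed branch produced.

```python
import re

# Relationship type names are interpolated into the query text, so this
# sketch only accepts plain identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_rel_pattern(relation_names=None):
    """Return "[r]" when no names are given, else a typed "[r:A|B]" pattern."""
    if not relation_names:
        return "[r]"  # wildcard: match every relationship type
    for name in relation_names:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe relationship type name: {name!r}")
    return "[r:" + "|".join(relation_names) + "]"


def build_triplet_query(relation_names=None):
    """Assemble a MATCH query that reads the type via type(r)."""
    pattern = build_rel_pattern(relation_names)
    return f"MATCH (n)-{pattern}->(m) RETURN n, type(r) AS rel, m"


if __name__ == "__main__":
    print(build_triplet_query())
    print(build_triplet_query(["KNOWS", "WORKS_AT"]))
```

Calling the builder without `relation_names` exercises the wildcard branch, which corresponds to what `test_upsert_and_get_triplets` covers.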

coderabbitai bot commented Apr 12, 2026

@polaz, thank you for the clarification. I accept your explanation — since CoordiNode v0.3.2 now natively supports the untyped [r] wildcard Cypher pattern, removing the NotImplementedError guard is the correct approach, and the integration test passing against a live v0.3.2 instance is strong confirmation. I'll update my long-term memory accordingly.


✏️ Learnings added
Learnt from: polaz
URL: https://github.com/structured-world/coordinode-python/pull/25

Timestamp: 2026-04-12T09:39:39.243Z
Learning: In the `llama-index-coordinode` adapter (`llama_index/graph_stores/coordinode/base.py`), as of CoordiNode v0.3.2, the untyped wildcard `[r]` Cypher pattern IS supported and returns results correctly. `get_triplets()` no longer raises `NotImplementedError` when `relation_names` is `None`; instead it uses `[r]` to match all relationship types. Callers may omit `relation_names` to retrieve all relationships. Relationship type is extracted via `type(r)` rather than `r.__type__`.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


⛔ Learnings removed
Learnt from: polaz
Repo: structured-world/coordinode-python PR: 0
File: :0-0
Timestamp: 2026-04-09T17:05:06.821Z
Learning: In the `llama-index-coordinode` adapter (`llama_index/graph_stores/coordinode/base.py`), `get_triplets()` raises `NotImplementedError` when `relation_names` is `None` because CoordiNode does not support the untyped wildcard `[r]` Cypher pattern — it returns empty results silently rather than raising an error. Callers must always supply explicit relation names.

🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-python PR: 0
File: :0-0
Timestamp: 2026-04-09T17:05:06.821Z
Learning: In the `llama-index-coordinode` adapter (`llama_index/graph_stores/coordinode/base.py`), `get_triplets()` raises `NotImplementedError` when `relation_names` is `None` because CoordiNode does not support the untyped wildcard `[r]` Cypher pattern — it returns empty results silently rather than raising an error. Callers must always supply explicit relation names.

coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 230-236: The current _first_label(labels) helper collapses
multi-label nodes to a single arbitrary label, causing incomplete and
nondeterministic relationship triples in refresh_schema(); change the logic so
refresh_schema() does not call _first_label() to drop labels but instead
iterates all labels on each node and emits one relationship triple per label
combination (or, if you prefer a single deterministic choice, replace
_first_label() with a deterministic selector that sorts labels and returns the
lexicographically smallest); update the code paths in refresh_schema() and any
callers of _first_label() (referring to the _first_label function and
refresh_schema method) to either expand labels into multiple triples or use the
new deterministic selector so schema generation becomes complete and stable.

In `@tests/integration/adapters/test_llama_index.py`:
- Around line 159-180: The test test_vector_query_returns_results currently only
checks types and counts but may pass due to existing :Chunk nodes; modify it to
ensure the specific seeded node (id f"vec-{tag}") is returned by
VectorStoreQuery by either constraining the query to that id or adding an
assertion that the returned nodes from store.vector_query(query) contain a node
with id "vec-{tag}" (use the same id string built when seeding via
store._client.cypher). Locate the seeding and the call to VectorStoreQuery/query
and add an assertion that scans nodes for the matching id before asserting
scores.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: be9b29d0-0fda-4ad0-a2af-f714c55a4d32

📥 Commits

Reviewing files that changed from the base of the PR and between 6d009a7 and 69fa991.

📒 Files selected for processing (2)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_llama_index.py

…ection

openCypher does not guarantee a stable ordering for labels(), so using
labels[0] produced nondeterministic schema entries across refresh_schema()
calls. Replace with min(labels) to always select the lexicographically
smallest label consistently.

Also strengthen the vector_query() integration test: capture the seeded
node's internal CoordiNode ID from CREATE RETURN and assert it appears
in the returned ChunkNode list, proving the specific seeded node was found
rather than any pre-existing Chunk.
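
A minimal sketch of the deterministic selector described in this commit message. The function name `first_label` mirrors the PR's `_first_label()` helper, but the body is an illustration; returning `None` for an empty or missing list is an assumption, not something the PR text states.

```python
def first_label(labels):
    """Pick the lexicographically smallest label so the result is order-independent.

    Returns None for an empty or missing list (assumed behaviour for this sketch).
    """
    return min(labels) if labels else None


if __name__ == "__main__":
    # The same label set in two different orders yields the same choice.
    print(first_label(["Person", "Employee"]))
    print(first_label(["Employee", "Person"]))
```

Indexing with `labels[0]` would return different values for the two orderings above, which is the nondeterminism the commit removes.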
coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/adapters/test_llama_index.py`:
- Around line 170-188: The seed creation and extraction of seeded_internal_id
should be inside the try so the finally cleanup always runs; move the call to
store._client.cypher that sets seed_rows and the seeded_internal_id =
str(seed_rows[0]["nid"]) assignment into the try block that surrounds the
VectorStoreQuery/store.vector_query assertions (or put the try before seeding),
ensuring the finally still calls store._client.cypher("MATCH (n:Chunk {id: $id})
DELETE n", params={"id": f"vec-{tag}"}) to always delete the created node.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 191d0b13-8235-4df6-ab9f-501fd8a7ff96

📥 Commits

Reviewing files that changed from the base of the PR and between 69fa991 and 778e8c3.

📒 Files selected for processing (2)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_llama_index.py

…anup

Ensures the finally DELETE runs even if seeded_internal_id extraction fails
after a successful CREATE. Addresses CodeRabbit nitpick on test robustness.
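
The cleanup pattern can be sketched with a stand-in store. `FakeStore` and `run_seeded_check` are hypothetical and exist only to make the example runnable; the real test seeds and deletes through `store._client.cypher(...)`.

```python
class FakeStore:
    """Hypothetical stand-in for a graph client; tracks seeded nodes in a dict."""

    def __init__(self):
        self.nodes = {}   # application-level id -> internal integer id
        self._next_id = 0

    def create(self, app_id):
        """Seed a node and return rows shaped like [{"nid": <internal id>}]."""
        self._next_id += 1
        self.nodes[app_id] = self._next_id
        return [{"nid": self._next_id}]

    def delete(self, app_id):
        """Remove the node if present; deleting a missing node is a no-op."""
        self.nodes.pop(app_id, None)


def run_seeded_check(store, app_id, check):
    """Seed inside the try block so the finally cleanup always runs."""
    try:
        seed_rows = store.create(app_id)
        seeded_internal_id = str(seed_rows[0]["nid"])
        check(seeded_internal_id)
    finally:
        # Runs whether seeding, ID extraction, or the check itself raised.
        store.delete(app_id)
```

Because the delete is a no-op for a missing node, it is safe for it to run even when the seeding step itself failed.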
polaz (Member, Author) commented Apr 12, 2026

@coderabbitai re-review

@polaz polaz requested a review from Copilot April 12, 2026 10:13
coderabbitai bot commented Apr 12, 2026

@polaz Sure, triggering a fresh review of the PR now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-python PR: 0
File: :0-0
Timestamp: 2026-04-12T09:39:39.243Z
Learning: In the `llama-index-coordinode` adapter (`llama_index/graph_stores/coordinode/base.py`), as of CoordiNode v0.3.2, the untyped wildcard `[r]` Cypher pattern IS supported and returns results correctly. `get_triplets()` no longer raises `NotImplementedError` when `relation_names` is `None`; instead it uses `[r]` to match all relationship types. Callers may omit `relation_names` to retrieve all relationships. Relationship type is extracted via `type(r)` rather than `r.__type__`.

Learnt from: polaz
Repo: structured-world/coordinode-python PR: 25
File: tests/integration/adapters/test_llama_index.py:159-180
Timestamp: 2026-04-12T09:55:45.660Z
Learning: In `tests/integration/adapters/test_llama_index.py`, the `vector_search` RPC in CoordiNode does not return node properties. Therefore, integration tests for `vector_query()` must assert against the internal CoordiNode node ID (captured from a `CREATE ... RETURN n` Cypher response) using `ChunkNode.id_`, rather than application-level property fields like `id`. This is the correct pattern for seeded-node identity assertions in vector query tests.

Learnt from: polaz
Repo: structured-world/coordinode-python PR: 15
File: tests/integration/adapters/test_langchain.py:22-25
Timestamp: 2026-04-09T17:03:25.582Z
Learning: In `tests/integration/adapters/test_langchain.py`, the `graph` fixture uses `scope="module"` intentionally to avoid per-test gRPC connection overhead with `CoordinodeGraph`. Test isolation is achieved via the `unique_tag` fixture, which generates a UUID-based suffix per test invocation to namespace all writes. This is a deliberate design decision by the maintainer (polaz).
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.



Copilot stopped work on behalf of polaz due to an error April 12, 2026 10:28
…viour

Add detailed comments explaining:
- In CoordiNode, CREATE ... RETURN n yields the internal integer node ID,
  not a property map — verified empirically (seed_rows[0]["nid"] == int)
- vector_search returns Node(properties={}) so node.properties.get("id")
  is always None and cannot be used for node identification
- ChunkNode.id_ == str(r.node.id) is the correct comparison target
@polaz polaz requested a review from Copilot April 12, 2026 10:38
polaz (Member, Author) commented Apr 12, 2026

@coderabbitai re-review

coderabbitai bot commented Apr 12, 2026

@polaz Sure, triggering a fresh full review of the PR now!


✅ Actions performed

Full review triggered.

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated no new comments.



polaz (Member, Author) commented Apr 12, 2026

@coderabbitai re-review

coderabbitai bot commented Apr 12, 2026

@polaz Sure, triggering a fresh full review of the PR now!


✅ Actions performed

Full review triggered.

coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 80-92: The Cypher RETURN DISTINCT is on raw label lists but
_first_label() can collapse different lists into the same (start, type, end)
triple causing duplicates; modify the relationship construction that builds
structured["relationships"] from rows to deduplicate after normalization:
iterate rows, compute start=_first_label(row.get("src_labels")), rel=row["rel"],
end=_first_label(row.get("dst_labels")), skip any row where any of those is
falsy, and use a seen set keyed by (start, rel, end) to only append unique
triples to structured["relationships"] (look for the block using rows,
_first_label, and structured["relationships"]).

In `@tests/integration/adapters/test_llama_index.py`:
- Around line 165-197: The seeded embedding vec is not unique across shared DBs,
causing flaky top-1 hits; make the seed deterministic and unique per test tag by
deriving a small, deterministic perturbation from the test tag (e.g.,
hash/tag-based offset) when constructing vec (where vec is created and used to
seed via store._client.cypher and later passed to VectorStoreQuery with
similarity_top_k=1), so the created node's embedding is distinct and the
subsequent check against seeded_internal_id (from seed_rows and used in the
any(str(getattr(node, "id_", "")) == seeded_internal_id ...) assertion) reliably
finds the seeded node; keep the try/finally cleanup and rest of the
vector_query/assertions unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 73909a98-5a1d-423f-a4a4-76d848bf002c

📥 Commits

Reviewing files that changed from the base of the PR and between 082e031 and 543816f.

📒 Files selected for processing (5)
  • .gitignore
  • langchain-coordinode/langchain_coordinode/graph.py
  • llama-index-coordinode/llama_index/graph_stores/coordinode/base.py
  • tests/integration/adapters/test_langchain.py
  • tests/integration/adapters/test_llama_index.py

…ormalization

RETURN DISTINCT operates on raw label lists, but min()-based _first_label()
can collapse different multi-label combos (e.g. ['Employee','Person'] and
['Person','Employee']) into the same (start, type, end) triple. Use a set
for deduplication after normalization so each relationship triple appears once.
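
A sketch of deduplication after normalization, under stated assumptions: the row keys `src_labels`, `rel`, and `dst_labels` follow the review comment, while the output keys `start`, `type`, `end` and the final sort are illustrative choices, not confirmed details of `refresh_schema()`.

```python
def first_label(labels):
    """Smallest label, or None for an empty/missing list (assumed behaviour)."""
    return min(labels) if labels else None


def normalize_relationships(rows):
    """Collapse raw label lists to one label each, then keep unique triples."""
    seen = set()
    for row in rows:
        start = first_label(row.get("src_labels"))
        rel = row.get("rel")
        end = first_label(row.get("dst_labels"))
        if not (start and rel and end):
            continue  # skip rows with a missing label or relationship type
        seen.add((start, rel, end))
    # Sorting makes the output order stable across calls.
    return [{"start": s, "type": t, "end": e} for s, t, e in sorted(seen)]


if __name__ == "__main__":
    rows = [
        {"src_labels": ["Employee", "Person"], "rel": "WORKS_AT", "dst_labels": ["Company"]},
        {"src_labels": ["Person", "Employee"], "rel": "WORKS_AT", "dst_labels": ["Company"]},
    ]
    print(normalize_relationships(rows))
```

Both input rows survive `RETURN DISTINCT` because their raw label lists differ, yet they normalize to the same triple and are emitted once.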

Also make the vector_query() integration test embedding unique per test tag
(derived from tag bytes) and increase similarity_top_k to 5 to prevent flaky
results in shared integration DBs where another :Chunk may share the same vector.
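
One way to derive a per-test embedding from the tag is sketched below. `embedding_from_tag` is a hypothetical helper, and hashing the tag bytes with SHA-256 is an assumption made for this example; the commit only says the vector is derived from the tag bytes, not how.

```python
import hashlib


def embedding_from_tag(tag, dim=4):
    """Deterministically map a test tag to a small float vector.

    Uses the first `dim` bytes of the SHA-256 digest of the tag, scaled to
    [0, 1]. The same tag always gives the same vector; distinct tags are very
    likely, though not guaranteed, to give distinct vectors.
    """
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 digest size)")
    digest = hashlib.sha256(tag.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


if __name__ == "__main__":
    print(embedding_from_tag("example-tag"))
```

Raising `similarity_top_k` above 1, as the commit does, adds a second safeguard in case another node in a shared database happens to sit closer to the query vector.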
@sonarqubecloud

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated no new comments.


Development

Successfully merging this pull request may close these issues.

feat: update adapters for MERGE edge support + wildcard patterns + vector search

2 participants