Multi granular #185

Merged
m1rl0k merged 2 commits into test from multi-granular on Jan 19, 2026

@m1rl0k m1rl0k commented Jan 18, 2026

No description provided.

Extends Cypher queries in Neo4j graph backend and async MCP implementation to support class-level queries, matching both exact symbol names and their methods (e.g., 'MyClass' and 'MyClass.method'). Updates benchmark indexer to support multi-granular (entity/relation) vector embeddings, including extraction, embedding, and upsert logic. Also ensures retriever outputs results in JSON format for consistency.
Introduces a compute_pagerank method to Neo4jGraphBackend, which calculates PageRank using an in-degree approximation and assigns a base rank to all nodes. The auto-backfill process now triggers PageRank computation after edge population. The fallback logic in Neo4jKnowledgeGraph is updated to ensure all nodes receive a base rank, not just those with incoming edges.
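For readers unfamiliar with the in-degree approximation mentioned above, here is a minimal in-memory sketch. It is not the code from this PR: the function name `approximate_pagerank`, the `base_rank` default of 0.15, and the normalization by the maximum in-degree are all assumptions chosen for illustration. The only properties taken from the description are that the score grows with incoming edges and that every node, including those with no incoming edges, receives a base rank.

```python
from collections import Counter


def approximate_pagerank(nodes, edges, base_rank=0.15):
    """Approximate PageRank from in-degree alone (illustrative only).

    nodes: iterable of node ids
    edges: iterable of (source, target) pairs, e.g. CALLS or IMPORTS edges
    base_rank: score given to every node, even with zero in-degree
               (0.15 is an assumed value, not taken from the PR)
    """
    nodes = list(nodes)
    known = set(nodes)
    # Count incoming edges per target, ignoring edges to unknown nodes.
    in_degree = Counter(dst for _src, dst in edges if dst in known)
    max_in = max(in_degree.values(), default=0)

    scores = {}
    for node in nodes:
        # Assumed normalization: scale the bonus by the largest in-degree.
        bonus = (1.0 - base_rank) * in_degree[node] / max_in if max_in else 0.0
        scores[node] = base_rank + bonus
    return scores


if __name__ == "__main__":
    demo = approximate_pagerank(
        ["a", "b", "c"], [("a", "b"), ("c", "b"), ("a", "c")]
    )
    for node, score in sorted(demo.items()):
        print(node, round(score, 3))
```

The point of the base rank is that a symbol nobody calls still ends up with a non-zero score, which matches the fallback change described for Neo4jKnowledgeGraph.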
@m1rl0k m1rl0k merged commit 9236bcc into test Jan 19, 2026
1 check passed

augmentcode bot commented Jan 19, 2026

🤖 Augment PR Summary

Summary: This PR adds multi-granular indexing support and improves Neo4j-based code graph querying/scoring.

Changes:

  • Extend Neo4j CALLS/IMPORTS queries to support class/module prefix matching (e.g., MyClass.*, mypkg.*) in both the backend and MCP async helpers.
  • Compute symbol PageRank after auto-backfill and add a backend compute_pagerank implementation (in-degree approximation).
  • Update KnowledgeGraph PageRank fallback to assign a base score to nodes with zero in-degree.
  • Enhance the benchmark indexer to optionally create/upsert multi-granular entity/relation vector fields and store extracted imports/calls in payload metadata.
  • Force COIR benchmark search output to JSON for consistent downstream parsing.
  • Return the actual skipped_count when resuming partial benchmark indexes.
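The class/module prefix matching in the first bullet can be sketched as a plain predicate. This is an illustration under assumptions, not the PR's code: `matches_symbol` is a hypothetical helper, and the Cypher fragment in `CLASS_LEVEL_WHERE` assumes the symbol name is stored in a `name` property on a node bound as `callee`. Appending the dot before the prefix test is what keeps `MyClass` from also matching `MyClassFactory`.

```python
# Hypothetical Cypher predicate mirroring the Python helper below.
# Assumes a node variable `callee` with a `name` property.
CLASS_LEVEL_WHERE = (
    "callee.name = $symbol OR callee.name STARTS WITH $symbol + '.'"
)


def matches_symbol(candidate: str, symbol: str) -> bool:
    """Return True for the symbol itself or any member below it.

    'MyClass' matches 'MyClass' and 'MyClass.method', but not
    'MyClassFactory', because the prefix must end at a dot.
    """
    return candidate == symbol or candidate.startswith(symbol + ".")


if __name__ == "__main__":
    for name in ("MyClass", "MyClass.method", "MyClassFactory", "mypkg.sub"):
        print(name, matches_symbol(name, "MyClass"), matches_symbol(name, "mypkg"))
```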

Technical Notes: Multi-vector indexing is gated by MULTI_GRANULAR_VECTORS and only used when the Qdrant collection supports the extra vector names.
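The gating described in the technical note might look roughly like the sketch below. Only the `MULTI_GRANULAR_VECTORS` name comes from the summary. The helper `multi_granular_enabled`, the vector names `entity` and `relation`, and the set of accepted truthy strings are assumptions. The collection's vector names are passed in as a plain collection of strings so the sketch does not depend on any particular Qdrant client call.

```python
import os

# Assumed names for the extra named vectors; the PR only says
# "entity/relation vector fields".
REQUIRED_VECTOR_NAMES = ("entity", "relation")

# Assumed set of values that switch the feature on.
TRUTHY = {"1", "true", "yes", "on"}


def multi_granular_enabled(collection_vector_names, env=None):
    """Decide whether to build and upsert the extra vectors.

    collection_vector_names: names of the vectors the target collection
        was created with (however the caller obtained them).
    env: mapping to read the flag from; defaults to os.environ.
    """
    env = os.environ if env is None else env
    flag = env.get("MULTI_GRANULAR_VECTORS", "").strip().lower()
    if flag not in TRUTHY:
        return False
    # Fall back to single-vector indexing if the collection lacks the names.
    return all(name in collection_vector_names for name in REQUIRED_VECTOR_NAMES)


if __name__ == "__main__":
    print(multi_granular_enabled({"code", "entity", "relation"},
                                 env={"MULTI_GRANULAR_VECTORS": "1"}))
```

Checking both conditions in one place means an older collection created without the extra vector names keeps working even when the flag is set.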

@augmentcode augmentcode bot left a comment


Review completed. 2 suggestions posted.

result = tx.run("""
MATCH (n:Symbol {collection: $collection})
WHERE n.repo = $repo
OPTIONAL MATCH (n)<-[r:CALLS|IMPORTS]-()

In compute_pagerank, in_degree is computed via OPTIONAL MATCH (n)<-[r:CALLS|IMPORTS]-() without constraining r.collection (and r.repo when repo is provided), so edges from other collections/repos could skew the ranking. Consider scoping the incoming relationships to the same collection/repo as n to keep PageRank isolated per graph.
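One possible shape for the suggested scoping is sketched below. It is an illustration, not the merged code: `build_in_degree_query` is a hypothetical helper, the `RETURN` clause and the `n.id` property are assumptions (the quoted snippet is cut off before its return), and it assumes CALLS/IMPORTS relationships carry `collection` and `repo` properties as the comment implies. A `WHERE` placed directly after `OPTIONAL MATCH` filters the optional pattern only, so nodes with no in-scope edges are still returned with a count of zero.

```python
def build_in_degree_query(collection, repo=None):
    """Build a Cypher query and parameters with edges scoped like the nodes.

    Returns (query, params). The repo filters are added only when a repo
    is given, mirroring the optional repo argument in the review comment.
    """
    node_filter = "WHERE n.repo = $repo" if repo else ""
    rel_filter = "WHERE r.collection = $collection"
    if repo:
        rel_filter += " AND r.repo = $repo"

    # Doubled braces are literal braces inside the f-string.
    query = f"""
    MATCH (n:Symbol {{collection: $collection}})
    {node_filter}
    OPTIONAL MATCH (n)<-[r:CALLS|IMPORTS]-()
    {rel_filter}
    RETURN n.id AS id, count(r) AS in_degree
    """
    params = {"collection": collection}
    if repo:
        params["repo"] = repo
    return query, params


if __name__ == "__main__":
    text, params = build_in_degree_query("bench", repo="org/app")
    print(text)
    print(params)
```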

MATCH (n)<-[r:CALLS|IMPORTS]-()
MATCH (n:Symbol)
WHERE n.repo = $repo
OPTIONAL MATCH (n)<-[r:CALLS|IMPORTS]-()

In the fallback PageRank path, the OPTIONAL MATCH (n)<-[r:CALLS|IMPORTS]-() doesn’t constrain relationships by repo, unlike the GDS rel_query branch (WHERE a.repo = $repo). This can make repo-scoped PageRank inconsistent with the GDS path when repo is provided.
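To make the inconsistency concrete, the following toy example counts in-degree over an in-memory edge list, once without and once with a repo check on the source node. Everything here is hypothetical (the `in_degree` helper, the node names, the repo labels). The scoped variant is meant to mirror the `WHERE a.repo = $repo` condition the comment attributes to the GDS branch.

```python
from collections import Counter


def in_degree(edges, node_repo, repo, scope_sources):
    """Count incoming edges for nodes in `repo`.

    edges: iterable of (source, target) pairs
    node_repo: mapping of node id to the repo it belongs to
    scope_sources: when True, only edges whose source is also in `repo`
        are counted (the behaviour the review asks for).
    """
    counts = Counter()
    for src, dst in edges:
        if node_repo.get(dst) != repo:
            continue  # target is outside the repo being ranked
        if scope_sources and node_repo.get(src) != repo:
            continue  # skip edges coming from another repo
        counts[dst] += 1
    return counts


if __name__ == "__main__":
    node_repo = {"a.f": "A", "a.g": "A", "b.h": "B"}
    edges = [("a.g", "a.f"), ("b.h", "a.f")]
    print("unscoped:", dict(in_degree(edges, node_repo, "A", scope_sources=False)))
    print("scoped:  ", dict(in_degree(edges, node_repo, "A", scope_sources=True)))
```

With the unscoped count, the call from repo B inflates the in-degree of `a.f`, which is the skew the comment describes.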
