Add Memory component #150
Conversation
Note: CodeRabbit detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in its review comments. This may lead to a less comprehensive review.

Walkthrough: Adds Graphiti-backed memory tooling and Azure OpenAI wiring, threads memory_context into AnalysisAgent and the graphs route flows, refactors the follow-up and relevancy agents, persists query/conversation memories, updates frontend streaming to store final responses, and introduces a VCS dependency on graphiti-core.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant U as User
    participant FE as Frontend
    participant API as API / graphs route
    participant Mem as MemoryTool
    participant Ana as AnalysisAgent
    participant DB as Database
    participant FU as FollowUpAgent
    U->>FE: Ask question
    FE->>API: Stream request (question)
    API->>Mem: create(user_id, graph_id)
    API->>Mem: search_memories(question)
    Mem-->>API: memory_context
    API->>Ana: get_analysis(question, tables, db_desc, instructions, memory_context)
    Ana-->>API: analysis_result (SQL / flags)
    alt On-topic
        API->>DB: Execute SQL
        DB-->>API: Results / Error
        API-->>FE: Stream query_result / ai_response (with final_response)
        par Persist memory
            API->>Mem: save_query_memory(query, sql, success/error)
            API->>Mem: add_new_memory(full_response)
            API->>Mem: clean_memory()
        end
    else Off-topic
        API->>FU: generate_follow_up_question(question, analysis_result, found_tables)
        FU-->>API: follow-up text
        API-->>FE: Stream followup_questions (final_response)
        par Persist off-topic attempt
            API->>Mem: save_query_memory(query, "", false, "off-topic")
            API->>Mem: add_new_memory(full_response)
            API->>Mem: clean_memory()
        end
    end
```

```mermaid
sequenceDiagram
    autonumber
    participant FE as Frontend
    participant API as API / destructive confirm
    participant Mem as MemoryTool
    participant DB as Database
    FE->>API: Confirm destructive operation
    API->>DB: Execute SQL (destructive)
    DB-->>API: Success / Failure
    API-->>FE: Stream destructive_confirmation + final_result (final_response)
    par Persist outcome
        API->>Mem: save_query_memory(query, sql, success, error?)
        API->>Mem: add_new_memory(full_response)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60–90 minutes
Dependency Review: the following issues were found:
- License issues: Pipfile.lock
Pull Request Overview
This PR introduces a comprehensive memory component to QueryWeaver that stores and retrieves conversation history and database interactions using Graphiti (a graph-based memory system) with Azure OpenAI integration. The memory system helps improve SQL generation by providing context from previous user interactions and database queries.
- Adds Graphiti-based memory storage for user conversations and database interactions
- Integrates memory context into the SQL analysis pipeline to improve query generation
- Implements follow-up question generation for failed or incomplete queries
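The search-then-save loop described above (retrieve memory context, generate SQL with it, persist the outcome) can be sketched end to end. This is a toy model under assumed names, not the PR's actual `MemoryTool` interface:

```python
import asyncio

class FakeMemory:
    """Hypothetical stand-in for the Graphiti-backed memory tool."""
    def __init__(self):
        self.saved = []

    async def search_memories(self, query: str) -> str:
        # Return prior context relevant to the query (empty on first use)
        return "user prefers results limited to 10 rows" if self.saved else ""

    async def save_query_memory(self, query: str, sql: str, success: bool) -> None:
        self.saved.append((query, sql, success))

async def answer(mem: FakeMemory, question: str) -> str:
    context = await mem.search_memories(question)
    # Memory context is threaded into SQL generation as extra prompt material;
    # here we fake that influence with a simple string check
    sql = "SELECT * FROM orders" + (" LIMIT 10" if "10 rows" in context else "")
    await mem.save_query_memory(question, sql, success=True)
    return sql

async def demo():
    mem = FakeMemory()
    first = await answer(mem, "show my orders")
    second = await answer(mem, "show my orders again")
    return first, second

first, second = asyncio.run(demo())
print(first)   # SELECT * FROM orders
print(second)  # SELECT * FROM orders LIMIT 10
```

The second query is shaped by context persisted during the first, which is the improvement the PR claims for repeat interactions.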
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| api/memory/graphiti_tool.py | New comprehensive memory management tool with Graphiti integration |
| api/memory/__init__.py | Memory module initialization |
| api/routes/graphs.py | Integrates memory tool into query processing pipeline and adds final_response flags |
| api/agents/analysis_agent.py | Updates analysis agent to accept and use memory context |
| api/agents/follow_up_agent.py | Refactored to generate conversational follow-up questions |
| api/agents/relevancy_agent.py | Updates relevancy checking with conversation context awareness |
| app/ts/modules/messages.ts | Removes result_history tracking from final result messages |
| app/ts/modules/chat.ts | Adds result_history tracking for final responses based on final_response flag |
| Pipfile | Adds graphiti-core dependency |
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
api/agents/analysis_agent.py (1)

14-21: Make `instructions` and `memory_context` keyword-only and update callsites

- Change the `get_analysis` signature to:

```python
def get_analysis(
    self,
    user_query: str,
    combined_tables: list,
    db_description: str,
    *,
    instructions: str | None = None,
    memory_context: str | None = None,
) -> dict:
```

- Update the call in `api/routes/graphs.py:376–378` to:

```python
answer_an = agent_an.get_analysis(
    queries_history[-1],
    result,
    db_description,
    instructions=instructions,
    memory_context=memory_context,
)
```

- Apply the same keyword-only refactor to `_build_prompt` and its invocation in `analysis_agent.py`.

api/agents/follow_up_agent.py (1)
80-83: Remove the unused exception variable and shorten the fallback (pylint F841/C0301).

```diff
-        except Exception as e:
-            # Fallback response if LLM call fails
-            return "I'm having trouble generating a follow-up question right now. Could you try rephrasing your question or providing more specific details about what you're looking for?"
+        except Exception:
+            # Fallback response if LLM call fails
+            return ("Sorry—I couldn’t generate a follow‑up. Could you rephrase or share the "
+                    "specific details you’re after?")
```

api/agents/relevancy_agent.py (1)
84-89: Async method calls a sync LLM; switch to `acompletion` to avoid blocking. This prevents event-loop blocking and aligns with the async API.

```diff
-from litellm import completion
+from litellm import acompletion
@@
-        completion_result = completion(
+        completion_result = await acompletion(
             model=Config.COMPLETION_MODEL,
             messages=self.messages,
             temperature=0,
         )
```
♻️ Duplicate comments (2)
api/agents/follow_up_agent.py (1)
71-75: Lower temperature for consistency; prior feedback suggested ≤0.8. Set it to 0.8 for more predictable follow-ups.

```diff
-            temperature=0.9
+            temperature=0.8
```

api/routes/graphs.py (1)
610-613: Enhance callback error handling to ensure proper string formatting. The callback should handle the case where `t.exception()` might return `None` more explicitly.
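A done callback on an asyncio task has three terminal states to consider: success, failure, and cancellation (`task.exception()` raises `CancelledError` on a cancelled task and returns `None` on success). A minimal sketch of a callback that handles all three, illustrating the guard the comment asks for:

```python
import asyncio
import logging

def log_result(task: asyncio.Task) -> None:
    """Done-callback that handles all three terminal states."""
    if task.cancelled():
        logging.warning("memory save cancelled")
        return
    exc = task.exception()  # None when the task succeeded
    if exc is not None:
        logging.error("memory save failed: %s", exc)
    else:
        logging.info("memory save succeeded")

async def flaky(fail: bool) -> str:
    if fail:
        raise RuntimeError("backend unavailable")
    return "ok"

async def demo():
    ok = asyncio.create_task(flaky(False))
    bad = asyncio.create_task(flaky(True))
    ok.add_done_callback(log_result)
    bad.add_done_callback(log_result)
    # return_exceptions=True keeps the failing task from propagating here
    return await asyncio.gather(ok, bad, return_exceptions=True)

results = asyncio.run(demo())
print(results[0])  # ok
```

Attaching the callback also marks the exception as retrieved, which silences asyncio's "exception was never retrieved" warning for fire-and-forget tasks.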
🧹 Nitpick comments (9)
api/agents/analysis_agent.py (3)
121-140: Fix trailing whitespace and long lines (pylint C0303/C0301). Trim trailing spaces and wrap overlong prompt bullets.

```diff
-        
-        # Include memory context in the prompt if available
+        # Include memory context in the prompt if available
@@
-    13. CRITICAL PERSONALIZATION CHECK: If missing user identification/personalization is a significant or primary component of the query (e.g., "show my orders", "my account balance", "my recent purchases", "how many employees I have", "products I own") AND no user identification is available in memory context or schema, set "is_sql_translatable" to false. However, if memory context contains user identification (like user name or previous successful personal queries), then personal queries ARE translatable even if they are the primary component of the query.
+    13. CRITICAL PERSONALIZATION CHECK:
+        - If the query is primarily personal (e.g., "my orders", "my account balance")
+          AND no user identification exists in memory or schema → set
+          "is_sql_translatable" to false.
+        - If memory context identifies the user (name/ID/prior personal queries) →
+          treat such queries as translatable.
```

Also applies to: 235-239
52-103: Too many locals in `_format_schema` (pylint R0914). Extract helpers. A small extraction reduces locals and improves readability.

```diff
     def _format_schema(self, schema_data: List) -> str:
+        def _format_column(column: dict) -> str:
+            col_name = column.get("columnName", "")
+            col_type = column.get("dataType", None)
+            col_description = column.get("description", "")
+            col_key = column.get("keyType", None)
+            nullable = column.get("nullable", False)
+            key_info = (
+                ", PRIMARY KEY" if col_key == "PRI" else ", FOREIGN KEY" if col_key == "FK" else ""
+            )
+            return (f"  - {col_name} ({col_type},{key_info},{col_key},{nullable}): "
+                    f"{col_description}")
@@
-            for column in columns:
-                col_name = column.get("columnName", "")
-                col_type = column.get("dataType", None)
-                col_description = column.get("description", "")
-                col_key = column.get("keyType", None)
-                nullable = column.get("nullable", False)
-
-                key_info = (
-                    ", PRIMARY KEY"
-                    if col_key == "PRI"
-                    else ", FOREIGN KEY" if col_key == "FK" else ""
-                )
-                column_str = (f"  - {col_name} ({col_type},{key_info},{col_key},"
-                              f"{nullable}): {col_description}")
-                table_str += column_str + "\n"
+            for column in columns:
+                table_str += _format_column(column) + "\n"
```
49-50: Consider storing the full analysis JSON in history, not only the SQL. Appending only `sql_query` degrades future-context quality.

```diff
-        self.messages.append({"role": "assistant", "content": analysis["sql_query"]})
+        self.messages.append({"role": "assistant", "content": response})
```

api/agents/follow_up_agent.py (2)
56-60: Wrap long lines and strip trailing whitespace (pylint C0301/C0303). Also stringify list fields for a cleaner prompt.

```diff
-        is_translatable = analysis_result.get("is_sql_translatable", False) if analysis_result else False
-        missing_info = analysis_result.get("missing_information", []) if analysis_result else []
-        ambiguities = analysis_result.get("ambiguities", []) if analysis_result else []
-        explanation = analysis_result.get("explanation", "No detailed explanation available") if analysis_result else "No analysis result available"
+        is_translatable = (
+            analysis_result.get("is_sql_translatable", False)
+            if analysis_result else False
+        )
+        missing_info = (
+            ", ".join(analysis_result.get("missing_information", []))
+            if analysis_result else ""
+        )
+        ambiguities = (
+            ", ".join(analysis_result.get("ambiguities", []))
+            if analysis_result else ""
+        )
+        explanation = (
+            analysis_result.get("explanation", "No detailed explanation available")
+            if analysis_result else "No analysis result available"
+        )
@@
-            MISSING_INFO=missing_info,
-            AMBIGUITIES=ambiguities,
+            MISSING_INFO=missing_info,
+            AMBIGUITIES=ambiguities,
             EXPLANATION=explanation
         )
```

Also applies to: 69-79
36-41: `found_tables` is unused; either use it or mark it as intentionally unused. Minimal change: rename to `_found_tables`.

```diff
-        found_tables: list = None
+        _found_tables: list | None = None
```

api/memory/graphiti_tool.py (3)
225-225: Remove f-strings without placeholders (Ruff F541).

```diff
-            print(f"Query with same user_query and sql_query already exists, skipping creation")
+            print("Query with same user_query and sql_query already exists, skipping creation")
@@
-            memory_context += f"\n"
+            memory_context += "\n"
```

Also applies to: 484-486
376-377: Replace prints with logging; avoid noisy stdout in production. Introduce a module logger and use `logger.info/debug/exception`.

```diff
+import logging
+logger = logging.getLogger(__name__)
@@
-            print(f'Found database node: {node.name} with UUID: {node.uuid}')
+            logger.debug("Found database node: %s (uuid=%s)", node.name, node.uuid)
@@
-        print(f'\nDatabase Facts Search Results for {self.graph_id}:')
+        logger.debug("Database Facts search results for %s", self.graph_id)
```

Also applies to: 393-398
594-607: Avoid mutating the environment in `AzureOpenAIConfig`; derive from env only. Setting `MODEL_NAME` has side effects; remove it.

```diff
-    def __init__(self):
-        # Set the model name as requested
-        os.environ["MODEL_NAME"] = "gpt-4.1"
-
+    def __init__(self):
```

api/routes/graphs.py (1)
296-297: Consider adding error handling for memory tool initialization. The memory tool creation is awaited later but lacks explicit error handling; if creation fails, it could cause issues downstream.

Wrap the memory tool usage in error handling:

```python
try:
    memory_tool = await memory_tool_task
    search_memories = await memory_tool.search_memories(
        query=queries_history[-1]
    )
    memory_context = search_memories.get("memory_context", "")
except Exception as e:
    logging.error(f"Memory tool initialization or search failed: {e}")
    memory_context = ""  # Fallback to empty context
    memory_tool = None   # Set to None to skip memory operations later
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
`Pipfile.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (9)
- Pipfile (1 hunks)
- api/agents/analysis_agent.py (5 hunks)
- api/agents/follow_up_agent.py (1 hunks)
- api/agents/relevancy_agent.py (2 hunks)
- api/memory/__init__.py (1 hunks)
- api/memory/graphiti_tool.py (1 hunks)
- api/routes/graphs.py (18 hunks)
- app/ts/modules/chat.ts (1 hunks)
- app/ts/modules/messages.ts (0 hunks)
💤 Files with no reviewable changes (1)
- app/ts/modules/messages.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Adhere to pylint standards across all Python files (the repository uses `make lint`)
Files:
`api/memory/__init__.py`, `api/memory/graphiti_tool.py`, `api/agents/relevancy_agent.py`, `api/agents/analysis_agent.py`, `api/routes/graphs.py`, `api/agents/follow_up_agent.py`
{Pipfile,Pipfile.lock}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use Pipenv for dependency management; add and lock dependencies via `Pipfile` and `Pipfile.lock`
Files:
Pipfile
app/**
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Keep the TypeScript frontend sources in `app/` and build them before production runs
Files:
app/ts/modules/chat.ts
🧬 Code graph analysis (3)
app/ts/modules/chat.ts (1)
app/ts/modules/config.ts (1)
`state` (74-78)
api/memory/graphiti_tool.py (1)
api/config.py (3)
`Config` (47-137), `get_vector_size` (33-43), `embed` (18-31)
api/routes/graphs.py (1)
api/memory/graphiti_tool.py (6)
`MemoryTool` (29-591), `create` (47-56), `search_memories` (419-502), `save_query_memory` (166-253), `add_new_memory` (132-164), `clean_memory` (504-518)
🪛 Ruff (0.12.2)
api/memory/graphiti_tool.py
225-225: f-string without any placeholders
Remove extraneous f prefix
(F541)
245-245: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
484-484: f-string without any placeholders
Remove extraneous f prefix
(F541)
508-508: Local variable driver is assigned to but never used
Remove assignment to unused variable driver
(F841)
509-509: Local variable query is assigned to but never used
Remove assignment to unused variable query
(F841)
api/agents/follow_up_agent.py
80-80: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
🪛 GitHub Actions: Pylint
api/agents/analysis_agent.py
[error] 106-106: C0301: Line too long (115/100)
[error] 121-121: C0303: Trailing whitespace
[error] 140-140: C0303: Trailing whitespace
[error] 237-237: C0301: Line too long (206/100)
[error] 238-238: C0301: Line too long (573/100)
[error] 14-14: R0913: Too many arguments (6/5)
[error] 14-14: R0917: Too many positional arguments (6/5)
[error] 52-52: R0914: Too many local variables (21/15)
[error] 105-105: R0913: Too many arguments (6/5)
[error] 105-105: R0917: Too many positional arguments (6/5)
api/agents/follow_up_agent.py
[error] 37-37: C0303: Trailing whitespace
[error] 56-56: C0301: Line too long (105/100)
[error] 59-59: C0301: Line too long (148/100)
[error] 60-60: C0303: Trailing whitespace
[error] 69-69: C0303: Trailing whitespace
[error] 76-76: C0303: Trailing whitespace
[error] 79-79: C0303: Trailing whitespace
[error] 82-82: C0301: Line too long (187/100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: e2e-tests
- GitHub Check: e2e-tests
🔇 Additional comments (10)
Pipfile (1)
19-19: Fix the Pipenv VCS dependency spec and pin to a commit for reproducibility

Pipenv expects `git = "https://..."` (no `git+` prefix) and a stable commit SHA instead of a moving branch (`staging`). After updating, run `pipenv lock --clear` and verify the import locally, since pipenv isn't available in this environment.

Apply:

```diff
-graphiti-core = {ref = "staging", git = "git+https://github.com/FalkorDB/graphiti.git"}
+graphiti-core = {git = "https://github.com/FalkorDB/graphiti.git", ref = "<PINNED_COMMIT_SHA>"}
```
69-75: RelevancyAgent inherits BaseAgent's constructor; existing callsites are correct.

`RelevancyAgent` doesn't define its own `__init__` and inherits `BaseAgent.__init__(queries_history, result_history)` (api/agents/utils.py:10–13), so `RelevancyAgent(queries_history, result_history)` remains valid. Ignore the original constructor-removal warning.

Likely an incorrect or invalid review comment.
api/memory/__init__.py (1)
1-7: LGTM! Clean module initialization. The module correctly exposes the public API for memory tooling with a proper docstring and an explicit `__all__` declaration.

app/ts/modules/chat.ts (1)
111-115: Good implementation for tracking final responses. The logic correctly captures messages marked with `final_response: true` into the result history, ensuring memory persistence across different message types. This aligns well with the backend's memory context features.

api/routes/graphs.py (6)
367-374: Good integration of memory context into analysis.The memory search is properly integrated before SQL generation, providing relevant context from past interactions and similar queries.
561-575: Well-structured follow-up question generation for non-translatable queries.The implementation properly handles cases where SQL generation fails by providing helpful follow-up questions to guide the user.
616-619: Good async implementation for memory persistence.The background task for saving conversation memory is properly implemented with error handling callbacks, ensuring the main flow isn't blocked.
661-663: Good memory tool initialization for destructive operations.The memory tool is properly created to track confirmed destructive operations.
746-758: Consistent memory tracking for confirmed operations.The implementation properly saves successful confirmed queries to memory with appropriate error handling.
763-775: Good error recovery with memory persistence.Failed destructive operations are properly tracked in memory, which is important for audit trails and learning from failures.
Actionable comments posted: 7
♻️ Duplicate comments (6)
api/memory/graphiti_tool.py (4)
320-331: Fix docstring to match signature/return.Remove non-existent arg, clarify return.
Apply:
async def search_user_summary(self, limit: int = 5) -> str: - """ - Search for user node summary extracts the user's personal information and general preferences. - - Args: - query: Natural language query to search for - limit: Maximum number of results to return - - Returns: - List of user node summaries with metadata - """ + """ + Retrieve summary text for the current user node. + + Args: + limit: Maximum number of results to inspect. + + Returns: + A summary string if found; otherwise an empty string. + """
46-55: Initialize the user node and make vector index creation idempotent. Ensure the user node exists and ignore "already exists" errors to avoid startup failures.

Apply:

```diff
     @classmethod
     async def create(cls, user_id: str, graph_id: str) -> "MemoryTool":
         """Async factory to construct and initialize the tool."""
         self = cls(user_id, graph_id)
-        await self._ensure_database_node(graph_id, user_id)
+        await asyncio.gather(
+            self._ensure_user_node(user_id),
+            self._ensure_database_node(graph_id, user_id),
+        )
         vector_size = Config.EMBEDDING_MODEL.get_vector_size()
         driver = self.graphiti_client.driver
-        await driver.execute_query(f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}")
+        try:
+            await driver.execute_query(
+                f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) "
+                f"OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}"
+            )
+        except Exception as e:
+            if "already exists" not in str(e).lower():
+                raise
```
205-241: Parameterize the Cypher, remove manual escaping, and fix result handling. String interpolation risks injection, and the manual escaping breaks embeddings. Also treat the `execute_query` result as a tuple.

Apply:

```diff
-            relationship_type = "SUCCESS" if success else "FAILED"
-
-            # Escape quotes in the query and SQL for Cypher
-            escaped_query = query.replace("'", "\\'").replace('"', '\\"')
-            escaped_sql = sql_query.replace("'", "\\'").replace('"', '\\"')
-            escaped_error = error.replace("'", "\\'").replace('"', '\\"') if error else ""
-            embeddings = Config.EMBEDDING_MODEL.embed(escaped_query)[0]
+            relationship_type = "SUCCESS" if success else "FAILED"
+            embeddings = Config.EMBEDDING_MODEL.embed(query)[0]
@@
-            check_query = f"""
-            MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-            MATCH (db)-[r]->(q:Query)
-            WHERE q.user_query = "{escaped_query}" AND q.sql_query = "{escaped_sql}"
-            RETURN q.uuid as existing_query_uuid
-            LIMIT 1
-            """
+            check_query = """
+            MATCH (db:Entity {uuid: $db_uuid})-[]->(q:Query)
+            WHERE q.user_query = $user_query AND q.sql_query = $sql_query
+            RETURN q.uuid as existing_query_uuid
+            LIMIT 1
+            """
@@
-            check_result = await graph_driver.execute_query(check_query)
-
-            # If query already exists, don't create a duplicate
-            if check_result[0]:  # If records exist
+            records, _, _ = await graph_driver.execute_query(
+                check_query,
+                db_uuid=database_node_uuid,
+                user_query=query,
+                sql_query=sql_query,
+            )
+            if records:
                 print("Query with same user_query and sql_query already exists, skipping creation")
                 return True
@@
-            cypher_query = f"""
-            MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-            MERGE (q:Query {{
-                user_query: "{escaped_query}",
-                sql_query: "{escaped_sql}",
-                success: {str(success).lower()},
-                error: "{escaped_error}",
-                timestamp: timestamp(),
-                embeddings: vecf32($embedding)
-            }})
-            CREATE (db)-[:{relationship_type} {{timestamp: timestamp()}}]->(q)
-            RETURN q.uuid as query_uuid
-            """
+            cypher_query = """
+            MATCH (db:Entity {uuid: $db_uuid})
+            MERGE (q:Query {
+                user_query: $user_query,
+                sql_query: $sql_query,
+                success: $success,
+                error: $error,
+                timestamp: timestamp(),
+                embeddings: vecf32($embedding)
+            })
+            CREATE (db)-[:REL {type: $rel_type, timestamp: timestamp()}]->(q)
+            RETURN q.uuid as query_uuid
+            """
@@
-            result = await graph_driver.execute_query(cypher_query, embedding=embeddings)
-            return True
+            await graph_driver.execute_query(
+                cypher_query,
+                db_uuid=database_node_uuid,
+                user_query=query,
+                sql_query=sql_query,
+                success=bool(success),
+                error=error or "",
+                rel_type=relationship_type,
+                embedding=embeddings,
+            )
+            return True
         except Exception as cypher_error:
             print(f"Error executing Cypher query: {cypher_error}")
             return False
```

Also applies to: 243-249, 221-226
573-579: Don't block the event loop: use `litellm.acompletion` in an async context. Switch to the async API and await it.

Apply:

```diff
-from litellm import completion
+from litellm import acompletion
@@
-            response = completion(
+            response = await acompletion(
                 model=self.config.COMPLETION_MODEL,
                 messages=[{"role": "user", "content": prompt}],
                 temperature=0.1
             )
```

api/routes/graphs.py (2)
620-622: Done callbacks: guard `None` and stringify exceptions. Avoid an f-string on `None` exceptions.

Apply:

```diff
-    save_query_task.add_done_callback(
-        lambda t: logging.error(f"Query memory save failed: {t.exception()}")
-        if t.exception() else logging.info("Query memory saved successfully")
-    )
+    save_query_task.add_done_callback(
+        lambda t: logging.error(f"Query memory save failed: {str(t.exception())}")
+        if t.exception() else logging.info("Query memory saved successfully")
+    )
@@
-    save_task.add_done_callback(lambda t: logging.error(f"Memory save failed: {t.exception()}") if t.exception() else logging.info("Conversation saved to memory tool"))
+    save_task.add_done_callback(
+        lambda t: logging.error(f"Memory save failed: {str(t.exception())}")
+        if t.exception() else logging.info("Conversation saved to memory tool")
+    )
```

Also applies to: 627-627
630-636: Cleanup scheduling runs on every request; throttle it. The comment says "once per week," but it runs per request. Add simple rate limiting until a scheduler is introduced.

Apply:

```diff
-    # Clean old memory in background (once per week cleanup)
-    clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
-    clean_memory_task.add_done_callback(
-        lambda t: logging.error(f"Memory cleanup failed: {t.exception()}")
-        if t.exception() else logging.info("Memory cleanup completed successfully")
-    )
+    # Clean old memory occasionally; replace with a real scheduler later
+    import random
+    if random.random() < 0.01:
+        clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+        clean_memory_task.add_done_callback(
+            lambda t: logging.error(f"Memory cleanup failed: {str(t.exception())}")
+            if t.exception() else logging.info("Memory cleanup completed successfully")
+        )
```
🧹 Nitpick comments (4)
api/memory/graphiti_tool.py (2)
58-94: Replace prints with logging and avoid bare `except Exception`. Use `logging.{info,warning,exception}` and catch narrower exceptions where possible; this improves observability and lint compliance.
Also applies to: 95-131, 160-163
291-305: Use parameters for k/limit and avoid f-strings; align k with the requested limit. This prevents injection and keeps results bounded.

Apply:

```diff
-            cypher_query = f"""
-            CALL db.idx.vector.queryNodes('Query', 'embeddings', 10, vecf32($embedding))
+            cypher_query = """
+            CALL db.idx.vector.queryNodes('Query', 'embeddings', $k, vecf32($embedding))
             YIELD node, score
-            MATCH (db:Entity {{uuid: $database_node_uuid}})-[r]->(node)
+            MATCH (db:Entity {uuid: $database_node_uuid})-[r]->(node)
             RETURN node { .user_query, .sql_query, .success, .error } AS query, score
             ORDER BY score ASC
-            LIMIT {limit}
+            LIMIT $limit
             """
@@
-            records, header, _ = await graph_driver.execute_query(cypher_query, embedding=query_embedding, database_node_uuid=database_node_uuid)
+            records, header, _ = await graph_driver.execute_query(
+                cypher_query,
+                embedding=query_embedding,
+                database_node_uuid=database_node_uuid,
+                k=max(10, int(limit)),
+                limit=int(limit),
+            )
```

api/routes/graphs.py (2)
379-387: Pass `memory_context` safely; guard against non-str returns. Given the `MemoryTool` fix above, this should always be a string; if not, coerce to an empty string to be defensive.

Apply:

```diff
         memory_context = await memory_tool.search_memories(
             query=queries_history[-1]
         )
+        if not isinstance(memory_context, str):
+            memory_context = ""
```
768-783: Use `logging.exception` and stringify callback exceptions. Improves debuggability.

Apply:

```diff
-    except Exception as e:
-        logging.error("Error executing confirmed SQL query: %s", str(e))
+    except Exception:
+        logging.exception("Error executing confirmed SQL query")
@@
-    save_query_task.add_done_callback(
-        lambda t: logging.error(f"Failed confirmed query memory save failed: {t.exception()}")
-        if t.exception() else logging.info("Failed confirmed query memory saved successfully")
-    )
+    save_query_task.add_done_callback(
+        lambda t: logging.error(f"Failed confirmed query memory save failed: {str(t.exception())}")
+        if t.exception() else logging.info("Failed confirmed query memory saved successfully")
+    )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- api/config.py (1 hunks)
- api/memory/graphiti_tool.py (1 hunks)
- api/routes/graphs.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
api/config.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Define AI model configuration and system prompts in `api/config.py` rather than scattering them across modules
Files:
api/config.py
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Adhere to pylint standards across all Python files (repository uses
make lint)
Files:
`api/config.py`, `api/routes/graphs.py`, `api/memory/graphiti_tool.py`
🧬 Code graph analysis (2)
api/routes/graphs.py (4)
api/agents/analysis_agent.py (2)
api/agents/analysis_agent.py (2)
`AnalysisAgent` (9-241), `get_analysis` (14-50)
api/agents/follow_up_agent.py (2)
`FollowUpAgent` (33-82), `generate_follow_up_question` (36-82)
api/config.py (1)
`Config` (47-130)
api/memory/graphiti_tool.py (6)
`MemoryTool` (29-590), `create` (47-56), `search_memories` (419-497), `save_query_memory` (166-253), `add_new_memory` (132-164), `clean_memory` (499-518)
api/memory/graphiti_tool.py (1)
api/config.py (3)
`Config` (47-130), `get_vector_size` (33-43), `embed` (18-31)
🪛 Ruff (0.12.2)
api/config.py
58-58: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
api/routes/graphs.py
559-559: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
768-768: Do not catch blind exception: Exception
(BLE001)
769-769: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
781-781: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
api/memory/graphiti_tool.py
89-89: Consider moving this statement to an else block
(TRY300)
91-91: Do not catch blind exception: Exception
(BLE001)
126-126: Consider moving this statement to an else block
(TRY300)
128-128: Do not catch blind exception: Exception
(BLE001)
160-160: Do not catch blind exception: Exception
(BLE001)
166-166: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
225-225: f-string without any placeholders
Remove extraneous f prefix
(F541)
245-245: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
246-246: Consider moving this statement to an else block
(TRY300)
247-247: Do not catch blind exception: Exception
(BLE001)
251-251: Do not catch blind exception: Exception
(BLE001)
310-310: Consider moving this statement to an else block
(TRY300)
312-312: Do not catch blind exception: Exception
(BLE001)
316-316: Do not catch blind exception: Exception
(BLE001)
346-346: Consider moving this statement to an else block
(TRY300)
348-348: Do not catch blind exception: Exception
(BLE001)
415-415: Do not catch blind exception: Exception
(BLE001)
484-484: f-string without any placeholders
Remove extraneous f prefix
(F541)
488-488: Consider moving this statement to an else block
(TRY300)
490-490: Do not catch blind exception: Exception
(BLE001)
516-516: Do not catch blind exception: Exception
(BLE001)
582-584: Consider moving this statement to an else block
(TRY300)
586-586: Do not catch blind exception: Exception
(BLE001)
622-622: Avoid specifying long messages outside the exception class
(TRY003)
624-624: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (2)
api/config.py (1)
57-57: Good addition: bounded short-term memory length.
Clear, sensible default for trimming chat history.
api/routes/graphs.py (1)
295-305: History truncation logic looks correct.
Keeps N latest queries and aligns results size accordingly.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 1
♻️ Duplicate comments (9)
api/routes/graphs.py (2)
621-631: Harden done_callbacks: guard None exception and stringify.
Avoid NoneType access and ensure readable logs.
- save_query_task.add_done_callback(
-     lambda t: logging.error(f"Query memory save failed: {t.exception()}")
-     if t.exception() else logging.info("Query memory saved successfully")
- )
+ save_query_task.add_done_callback(
+     lambda t: logging.error(f"Query memory save failed: {str(t.exception())}")
+     if t.exception() else logging.info("Query memory saved successfully")
+ )
@@
- save_task.add_done_callback(lambda t: logging.error(f"Memory save failed: {t.exception()}") if t.exception() else logging.info("Conversation saved to memory tool"))
+ save_task.add_done_callback(
+     lambda t: logging.error(f"Memory save failed: {str(t.exception())}")
+     if t.exception() else logging.info("Conversation saved to memory tool")
+ )
@@
- lambda t: logging.error(f"Confirmed query memory save failed: {t.exception()}")
+ lambda t: logging.error(f"Confirmed query memory save failed: {str(t.exception())}")

Also applies to: 766-769
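Rather than repeating the lambda at each call site, this pattern could be factored into a helper. The sketch below is illustrative, not from the PR; function and message names are assumptions:

```python
import asyncio
import functools
import logging

def log_task_result(task: asyncio.Task, ok_msg: str, err_msg: str) -> None:
    """Done-callback: stringify the exception so logs never print a bare None."""
    exc = task.exception()
    if exc is not None:
        logging.error("%s: %s", err_msg, str(exc))
    else:
        logging.info(ok_msg)

async def main():
    async def ok():
        return 1

    async def boom():
        raise ValueError("save failed")

    t1 = asyncio.create_task(ok())
    t2 = asyncio.create_task(boom())
    # functools.partial binds the messages; the task arrives as the positional arg
    t1.add_done_callback(functools.partial(log_task_result, ok_msg="saved", err_msg="save error"))
    t2.add_done_callback(functools.partial(log_task_result, ok_msg="saved", err_msg="save error"))
    await asyncio.gather(t1, t2, return_exceptions=True)
    await asyncio.sleep(0)  # let the done-callbacks run before the loop closes
    return t1.exception() is None, t2.exception() is not None
```

One helper then replaces every inline lambda in the route.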
632-637: Run memory cleanup periodically, not per request.
The comment says weekly but the code triggers on every call; add a simple rate limit or scheduler.
- # Clean old memory in background (once per week cleanup)
- clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
- clean_memory_task.add_done_callback(
-     lambda t: logging.error(f"Memory cleanup failed: {t.exception()}")
-     if t.exception() else logging.info("Memory cleanup completed successfully")
- )
+ # Clean memory occasionally; TODO: move to scheduler/startup task
+ import random
+ if random.random() < 0.01:
+     clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+     clean_memory_task.add_done_callback(
+         lambda t: logging.error(f"Memory cleanup failed: {str(t.exception())}")
+         if t.exception() else logging.info("Memory cleanup completed successfully")
+     )
api/memory/graphiti_tool.py (7)
32-36: Sanitize user_id before composing DB name.
Prevents invalid identifiers and abuse.
- user_memory_db = f"{user_id}-memory"
+ import re
+ def _sanitize(s: str) -> str:
+     return re.sub(r"[^A-Za-z0-9_-]", "-", s)[:64] or "user"
+ user_memory_db = f"{_sanitize(user_id)}-memory"
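Standalone, the suggested sanitizer behaves like this; the 64-character cap and "user" fallback are the review's choices, not project constants:

```python
import re

def sanitize_user_id(s: str) -> str:
    """Replace anything outside [A-Za-z0-9_-] with '-' and cap the length."""
    return re.sub(r"[^A-Za-z0-9_-]", "-", s)[:64] or "user"

# Dots, slashes, and '@' collapse to hyphens before composing the DB name
db_name = f"{sanitize_user_id('alice@example.com')}-memory"
```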
320-331: Fix docstring: remove non-existent `query` arg and clarify return.
- async def search_user_summary(self, limit: int = 5) -> str:
-     """
-     Search for user node summary extracts the user's personal information and general preferences.
-
-     Args:
-         query: Natural language query to search for
-         limit: Maximum number of results to return
-
-     Returns:
-         List of user node summaries with metadata
-     """
+ async def search_user_summary(self, limit: int = 5) -> str:
+     """
+     Retrieve summary text for the current user node.
+
+     Args:
+         limit: Max results to inspect.
+
+     Returns:
+         A summary string; empty if none.
+     """
419-431: Return type docstring mismatch.
Function returns a string; update docstring accordingly.
- Returns:
-     Dict containing user_summary, database_facts, similar_queries, and memory_context
+ Returns:
+     memory_context: a single string composed of user, database, and similar-query context
591-604: Centralize model names via Config to avoid drift.
Remove hardcoded "gpt-4.1"/"text-embedding-ada-002" and read from api.config.Config.
  class AzureOpenAIConfig:
@@
-     # Set the model name as requested
-     os.environ["MODEL_NAME"] = "gpt-4.1"
-     self.api_key = os.getenv('AZURE_API_KEY')
-     self.endpoint = os.getenv('AZURE_API_BASE')
+     self.endpoint = os.getenv('AZURE_API_BASE')
      self.api_version = os.getenv('AZURE_API_VERSION', '2024-02-01')
-     self.model_choice = "gpt-4.1"  # Use the model name directly
-     self.embedding_model = "text-embedding-ada-002"  # Use model name, not deployment
+     # Derive names from Config (strip optional provider prefix like "azure/")
+     self.model_choice = Config.COMPLETION_MODEL.split("/", 1)[-1]
+     self.embedding_model = Config.EMBEDDING_MODEL_NAME.split("/", 1)[-1]
      self.small_model = os.getenv('AZURE_SMALL_MODEL', 'gpt-4o-mini')
@@
-     small_model=config.small_model_deployment,
-     model=config.llm_deployment,
+     small_model=config.small_model_deployment,
+     model=config.llm_deployment,
@@
-     config=OpenAIEmbedderConfig(embedding_model=config.embedding_deployment),
+     config=OpenAIEmbedderConfig(embedding_model=config.embedding_deployment),

Also applies to: 645-648, 655-663
46-55: Make initialization idempotent and ensure User node exists.
Create user+db nodes together and guard vector index creation.
  async def create(cls, user_id: str, graph_id: str) -> "MemoryTool":
@@
-     await self._ensure_database_node(graph_id, user_id)
-
-     vector_size = Config.EMBEDDING_MODEL.get_vector_size()
-     driver = self.graphiti_client.driver
-     await driver.execute_query(f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}")
+     await asyncio.gather(
+         self._ensure_user_node(user_id),
+         self._ensure_database_node(graph_id, user_id),
+     )
+     vector_size = Config.EMBEDDING_MODEL.get_vector_size()
+     driver = self.graphiti_client.driver
+     try:
+         await driver.execute_query(
+             f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) "
+             f"OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}"
+         )
+     except Exception as e:
+         if "already exists" not in str(e).lower():
+             raise
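The try/except guard in the suggestion can be exercised without a live FalkorDB driver. This sketch fakes `execute_query` and assumes only that a duplicate-index error message contains "already exists":

```python
import asyncio

async def ensure_index(execute_query) -> bool:
    """Create the index, swallowing only 'already exists' errors.

    Returns True if the index was created, False if it already existed;
    any other error is re-raised.
    """
    try:
        await execute_query("CREATE VECTOR INDEX ...")  # query text elided
        return True
    except Exception as e:
        if "already exists" not in str(e).lower():
            raise
        return False

async def main():
    async def fresh(q):
        pass  # index created without error

    async def duplicate(q):
        raise RuntimeError("Index already EXISTS on :Query(embeddings)")

    async def broken(q):
        raise RuntimeError("connection refused")

    created = await ensure_index(fresh)
    skipped = await ensure_index(duplicate)
    try:
        await ensure_index(broken)
        reraised = False
    except RuntimeError:
        reraised = True
    return created, skipped, reraised
```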
571-576: Do not block event loop: use async LLM call.
Switch to acompletion in async context.
- from litellm import completion
+ from litellm import acompletion
@@
-     response = completion(
+     response = await acompletion(
          model=self.config.COMPLETION_MODEL,
          messages=[{"role": "user", "content": prompt}],
          temperature=0.1
      )
166-254: Cypher injection and duplicate-check bugs in save_query_memory.
Manual escaping and f-strings are unsafe; the code also mis-handles the execute_query result tuple and leaves an unused variable. Parameterize and normalize the relationship type.
@@
-     relationship_type = "SUCCESS" if success else "FAILED"
-
-     # Escape quotes in the query and SQL for Cypher
-     escaped_query = query.replace("'", "\\'").replace('"', '\\"')
-     escaped_sql = sql_query.replace("'", "\\'").replace('"', '\\"')
-     escaped_error = error.replace("'", "\\'").replace('"', '\\"') if error else ""
-     embeddings = Config.EMBEDDING_MODEL.embed(escaped_query)[0]
+     relationship_type = "SUCCESS" if success else "FAILED"
+     embeddings = Config.EMBEDDING_MODEL.embed(query)[0]
@@
-     check_query = f"""
-     MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-     MATCH (db)-[r]->(q:Query)
-     WHERE q.user_query = "{escaped_query}" AND q.sql_query = "{escaped_sql}"
-     RETURN q.uuid as existing_query_uuid
-     LIMIT 1
-     """
+     check_query = """
+     MATCH (db:Entity {uuid: $db_uuid})-[]->(q:Query)
+     WHERE q.user_query = $user_query AND q.sql_query = $sql_query
+     RETURN q.uuid as existing_query_uuid
+     LIMIT 1
+     """
@@
-     check_result = await graph_driver.execute_query(check_query)
-
-     # If query already exists, don't create a duplicate
-     if check_result[0]:  # If records exist
+     records, _, _ = await graph_driver.execute_query(
+         check_query,
+         db_uuid=database_node_uuid,
+         user_query=query,
+         sql_query=sql_query,
+     )
+     if records:
          print("Query with same user_query and sql_query already exists, skipping creation")
          return True
@@
-     cypher_query = f"""
-     MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-     MERGE (q:Query {{
-         user_query: "{escaped_query}",
-         sql_query: "{escaped_sql}",
-         success: {str(success).lower()},
-         error: "{escaped_error}",
-         timestamp: timestamp(),
-         embeddings: vecf32($embedding)
-     }})
-     CREATE (db)-[:{relationship_type} {{timestamp: timestamp()}}]->(q)
-     RETURN q.uuid as query_uuid
-     """
+     cypher_query = """
+     MATCH (db:Entity {uuid: $db_uuid})
+     MERGE (q:Query {
+         user_query: $user_query,
+         sql_query: $sql_query,
+         success: $success,
+         error: $error,
+         timestamp: timestamp(),
+         embeddings: vecf32($embedding)
+     })
+     CREATE (db)-[:REL {type: $rel_type, timestamp: timestamp()}]->(q)
+     RETURN q.uuid as query_uuid
+     """
@@
-     result = await graph_driver.execute_query(cypher_query, embedding=embeddings)
-     return True
+     await graph_driver.execute_query(
+         cypher_query,
+         db_uuid=database_node_uuid,
+         user_query=query,
+         sql_query=sql_query,
+         success=bool(success),
+         error=error or "",
+         rel_type=relationship_type,
+         embedding=embeddings,
+     )
+     return True
🧹 Nitpick comments (3)
api/routes/graphs.py (2)
391-395: Normalize error handling to Optional[str] and avoid bool/string mixing.
Use None as the sentinel and keep a consistent type; this simplifies truthiness checks and serialization.
@@
-     # Initialize response variables
-     user_readable_response = ""
-     follow_up_result = ""
-     execution_error = False
+     # Initialize response variables
+     user_readable_response = ""
+     follow_up_result = ""
+     execution_error: Optional[str] = None
@@
-     except Exception as e:
-         execution_error = str(e)
+     except Exception as e:
+         execution_error = str(e)
@@
-     ) + MESSAGE_DELIMITER
+     ) + MESSAGE_DELIMITER
@@
-     else:
-         execution_error = "Missing information"
+     else:
+         execution_error = "Missing information"
@@
-     if execution_error:
+     if execution_error:
          full_response["error"] = execution_error
          full_response["success"] = False
      else:
          full_response["success"] = True
@@
-     memory_tool.save_query_memory(
+     memory_tool.save_query_memory(
          query=queries_history[-1],
          sql_query=answer_an["sql_query"],
          success=full_response["success"],
-         error=execution_error
+         error=execution_error
      )

Add import (top of file):
from typing import Optional

Also applies to: 559-559, 567-568, 571-577, 605-611, 614-620
561-561: Prefer logging.exception for errors with stack traces.
Improves observability and aligns with linter hint TRY400.
- logging.error("Error executing SQL query: %s", str(e))
+ logging.exception("Error executing SQL query")
@@
- logging.error("Error executing confirmed SQL query: %s", str(e))
+ logging.exception("Error executing confirmed SQL query")
@@
- save_query_task.add_done_callback(
-     lambda t: logging.error(f"Failed confirmed query memory save failed: {t.exception()}")
+ save_query_task.add_done_callback(
+     lambda t: logging.error(f"Failed confirmed query memory save failed: {str(t.exception())}")

Also applies to: 772-772, 784-784
api/memory/graphiti_tool.py (1)
290-305: Parameterize/guard LIMIT and clamp result size.
Avoid f-string LIMIT; enforce integer bounds.
- cypher_query = f"""
+ cypher_query = """
      CALL db.idx.vector.queryNodes('Query', 'embeddings', 10, vecf32($embedding)) YIELD node, score
      MATCH (db:Entity {{uuid: $database_node_uuid}})-[r]->(node)
      RETURN node { .user_query, .sql_query, .success, .error } AS query, score
      ORDER BY score ASC
-     LIMIT {limit}
+     LIMIT $limit
  """
@@
- records, header, _ = await graph_driver.execute_query(cypher_query, embedding=query_embedding, database_node_uuid=database_node_uuid)
+ records, header, _ = await graph_driver.execute_query(
+     cypher_query,
+     embedding=query_embedding,
+     database_node_uuid=database_node_uuid,
+     limit=int(max(0, min(limit, 50))),
+ )

Also applies to: 308-314
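The suggested bound can live in a tiny helper; the 0 to 50 range mirrors the review's example, and coercing the input to int before comparing (so a numeric string doesn't break min/max) is an addition here:

```python
def clamp_limit(limit, lo: int = 0, hi: int = 50) -> int:
    """Coerce to int and keep within [lo, hi] before binding to a Cypher LIMIT parameter."""
    return max(lo, min(int(limit), hi))
```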
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
Pipfile.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
Pipfile (1 hunks)
api/memory/graphiti_tool.py (1 hunks)
api/routes/graphs.py (17 hunks)
app/ts/modules/chat.ts (1 hunks)
app/ts/modules/messages.ts (0 hunks)
💤 Files with no reviewable changes (1)
- app/ts/modules/messages.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- Pipfile
- app/ts/modules/chat.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Adhere to pylint standards across all Python files (repository uses
make lint)
Files:
api/routes/graphs.py
api/memory/graphiti_tool.py
🧠 Learnings (1)
📚 Learning: 2025-08-24T17:15:21.337Z
Learnt from: CR
PR: FalkorDB/QueryWeaver#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-24T17:15:21.337Z
Learning: Applies to api/config.py : Define AI model configuration and system prompts in `api/config.py` rather than scattering them across modules
Applied to files:
api/memory/graphiti_tool.py
🧬 Code graph analysis (2)
api/routes/graphs.py (4)
api/agents/analysis_agent.py (2)
AnalysisAgent (9-241), get_analysis (14-50)
api/agents/follow_up_agent.py (2)
FollowUpAgent (33-82), generate_follow_up_question (36-82)
api/config.py (1)
Config (47-130)
api/memory/graphiti_tool.py (6)
MemoryTool (29-588), create (47-56), search_memories (419-492), save_query_memory (166-253), add_new_memory (132-164), clean_memory (494-517)
api/memory/graphiti_tool.py (1)
api/config.py (3)
Config (47-130), get_vector_size (33-43), embed (18-31)
🪛 Ruff (0.12.2)
api/routes/graphs.py
561-561: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
771-771: Do not catch blind exception: Exception
(BLE001)
772-772: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
784-784: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
api/memory/graphiti_tool.py
89-89: Consider moving this statement to an else block
(TRY300)
91-91: Do not catch blind exception: Exception
(BLE001)
126-126: Consider moving this statement to an else block
(TRY300)
128-128: Do not catch blind exception: Exception
(BLE001)
160-160: Do not catch blind exception: Exception
(BLE001)
225-225: f-string without any placeholders
Remove extraneous f prefix
(F541)
245-245: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
246-246: Consider moving this statement to an else block
(TRY300)
247-247: Do not catch blind exception: Exception
(BLE001)
251-251: Do not catch blind exception: Exception
(BLE001)
310-310: Consider moving this statement to an else block
(TRY300)
312-312: Do not catch blind exception: Exception
(BLE001)
316-316: Do not catch blind exception: Exception
(BLE001)
346-346: Consider moving this statement to an else block
(TRY300)
348-348: Do not catch blind exception: Exception
(BLE001)
415-415: Do not catch blind exception: Exception
(BLE001)
484-484: f-string without any placeholders
Remove extraneous f prefix
(F541)
488-488: Consider moving this statement to an else block
(TRY300)
490-490: Do not catch blind exception: Exception
(BLE001)
514-514: Consider moving this statement to an else block
(TRY300)
515-515: Do not catch blind exception: Exception
(BLE001)
580-582: Consider moving this statement to an else block
(TRY300)
584-584: Do not catch blind exception: Exception
(BLE001)
620-620: Avoid specifying long messages outside the exception class
(TRY003)
622-622: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: e2e-tests
- GitHub Check: e2e-tests
🔇 Additional comments (3)
api/routes/graphs.py (2)
297-307: Short-term memory truncation logic looks correct.
History/results are trimmed consistently with Config.SHORT_MEMORY_LENGTH.
310-311: Good: warm up MemoryTool early.
Creating the tool concurrently hides its latency before first use.
api/memory/graphiti_tool.py (1)
498-515: Cleanup query looks good.
Correct ordering (DESC) and parameters; safe fallback return.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
api/routes/graphs.py (3)
475-485: Avoid blocking the event loop on DB execution
`execute_sql_query` appears synchronous; calling it inside the async generator can block the FastAPI event loop. Offload to a thread.
- is_schema_modifying, operation_type = (
-     loader_class.is_schema_modifying_query(sql_query)
- )
-
- query_results = loader_class.execute_sql_query(answer_an["sql_query"], db_url)
+ is_schema_modifying, operation_type = loader_class.is_schema_modifying_query(sql_query)
+ query_results = await asyncio.to_thread(
+     loader_class.execute_sql_query, answer_an["sql_query"], db_url
+ )
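`asyncio.to_thread` (Python 3.9+) moves the blocking call onto the default thread-pool executor while the loop keeps running. A stand-in blocking function shows the shape; `blocking_query` is illustrative, not the project's loader:

```python
import asyncio
import time

def blocking_query(sql: str) -> list:
    """Stand-in for a synchronous DB call that would otherwise block the loop."""
    time.sleep(0.05)
    return [("row", sql)]

async def main():
    # The heartbeat task can still make progress while the query runs in a thread.
    heartbeat = asyncio.create_task(asyncio.sleep(0.01))
    rows = await asyncio.to_thread(blocking_query, "SELECT 1")
    await heartbeat
    return rows
```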
51-60: Fix mutable default in Pydantic model
`chat: list = []` creates shared state across requests. Use `Field(default_factory=list)`.
- from pydantic import BaseModel
+ from pydantic import BaseModel, Field
@@
  class ConfirmRequest(BaseModel):
@@
-     chat: list = []
+     chat: list = Field(default_factory=list)
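Recent Pydantic versions copy field defaults per instance, so the severity depends on the Pydantic version in use; the underlying Python pitfall the comment guards against is easy to demonstrate with a plain function (a deliberately buggy sketch, not project code):

```python
def append_unsafe(item, bucket=[]):  # deliberately buggy: mutable default argument
    """The SAME list object is reused across calls that omit `bucket`."""
    bucket.append(item)
    return bucket

def append_safe(item, bucket=None):
    """Standard fix: use None as sentinel and build a fresh list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

`Field(default_factory=list)` plays the same role as the None-sentinel idiom: a new list per model instance.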
684-689: Stream consistency: include final_response flags in confirm route
Align with the main route and frontend expectations by flagging interim vs final messages.
- if not loader_class: - yield json.dumps({ - "type": "error", - "message": "Unable to determine database type" - }) + MESSAGE_DELIMITER + if not loader_class: + yield json.dumps({ + "type": "error", + "final_response": True, + "message": "Unable to determine database type" + }) + MESSAGE_DELIMITER return @@ - step = {"type": "reasoning_step", - "message": "Step 2: Executing confirmed SQL query"} + step = {"type": "reasoning_step", + "final_response": False, + "message": "Step 2: Executing confirmed SQL query"} yield json.dumps(step) + MESSAGE_DELIMITER @@ - yield json.dumps( + yield json.dumps( { "type": "query_result", "data": query_results, + "final_response": False } ) + MESSAGE_DELIMITER @@ - step = {"type": "reasoning_step", - "message": "Step 3: Schema change detected - refreshing graph..."} + step = {"type": "reasoning_step", + "final_response": False, + "message": "Step 3: Schema change detected - refreshing graph..."} yield json.dumps(step) + MESSAGE_DELIMITER @@ - yield json.dumps( + yield json.dumps( { "type": "schema_refresh", + "final_response": False, "message": (f"✅ Schema change detected ({operation_type} " "operation)\n\n🔄 Graph schema has been automatically " "refreshed with the latest database structure."), "refresh_status": "success" } ) + MESSAGE_DELIMITER @@ - yield json.dumps( + yield json.dumps( { "type": "schema_refresh", + "final_response": False, "message": (f"⚠️ Schema was modified but graph refresh failed: " f"{refresh_message}"), "refresh_status": "failed" } ) + MESSAGE_DELIMITER @@ - yield json.dumps( + yield json.dumps( { "type": "ai_response", + "final_response": True, "message": user_readable_response, } ) + MESSAGE_DELIMITER @@ - yield json.dumps( - {"type": "error", "message": "Error executing query"} + yield json.dumps( + {"type": "error", "final_response": True, "message": "Error executing query"} ) + MESSAGE_DELIMITERAlso applies to: 691-694, 700-705, 717-727, 728-735, 751-756, 789-791
♻️ Duplicate comments (3)
api/routes/graphs.py (3)
391-395: Normalize error sentinel to Optional[str] (follow prior suggestion)
Use None instead of False for "no error" to avoid type ambiguity.
- execution_error = False
+ execution_error: Optional[str] = None

Additional import needed:
from typing import Optional
633-639: Throttle memory cleanup (weekly) or move to a scheduler
Cleanup currently runs on every request, contradicting the comment and adding overhead. Gate by time or probability, or schedule at app startup.
Example (module-level throttle):
# at module top
_CLEAN_LAST_RUN = 0
_CLEAN_LOCK = asyncio.Lock()
_CLEAN_INTERVAL_SECS = 7 * 24 * 3600

- # Clean old memory in background (once per week cleanup)
- clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
- clean_memory_task.add_done_callback(
-     lambda t: logging.error(f"Memory cleanup failed: {t.exception()}")
-     if t.exception() else logging.info("Memory cleanup completed successfully")
- )
+ # Clean old memory occasionally (approx weekly)
+ now = time.time()
+ async with _CLEAN_LOCK:
+     if now - _CLEAN_LAST_RUN >= _CLEAN_INTERVAL_SECS:
+         _CLEAN_LAST_RUN = now
+         clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+         clean_memory_task.add_done_callback(
+             lambda t: logging.error(f"Memory cleanup failed: {t.exception()}")
+             if t.exception() else logging.info("Memory cleanup completed successfully")
+         )
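A self-contained variant of this throttle can be exercised end to end. The interval constant and names are illustrative; `time.monotonic` plus a None sentinel avoids comparing the first run against an arbitrary clock origin:

```python
import asyncio
import time
from typing import Optional

_CLEAN_INTERVAL_SECS = 7 * 24 * 3600  # roughly weekly; policy is an assumption
_clean_last_run: Optional[float] = None
_clean_lock = asyncio.Lock()

async def maybe_clean(clean_coro) -> bool:
    """Trigger cleanup only if the interval has elapsed; True when it actually ran."""
    global _clean_last_run
    async with _clean_lock:
        now = time.monotonic()
        if _clean_last_run is not None and now - _clean_last_run < _CLEAN_INTERVAL_SECS:
            return False
        _clean_last_run = now  # claim the slot under the lock to avoid double-runs
    await clean_coro()
    return True

async def main():
    ran = []

    async def clean():
        ran.append(True)

    first = await maybe_clean(clean)   # interval not yet started: runs
    second = await maybe_clean(clean)  # within the interval: skipped
    return first, second, len(ran)
```

A scheduler (or FastAPI startup task) remains the cleaner long-term home, as the comment notes.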
297-307: Remove stray extra blank line after truncation block
Minor style cleanup to keep linters happy.
- result_history = []
-
+ result_history = []
🧹 Nitpick comments (7)
api/routes/graphs.py (7)
310-311: Create MemoryTool only on on-topic path to avoid wasted work/leaks
The task is created before relevancy is known; on off-topic it's never awaited/cancelled. Instantiate only when needed.
- memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))
+ # Defer MemoryTool creation until query is confirmed on-topic
@@
- memory_tool = await memory_tool_task
+ memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
  memory_context = await memory_tool.search_memories(
      query=queries_history[-1]
  )

Also applies to: 381-385
559-566: Log stack trace for SQL execution failures
Prefer `logging.exception` to capture the traceback.
- except Exception as e:
-     execution_error = str(e)
+ except Exception as e:
+     execution_error = str(e)
      overall_elapsed = time.perf_counter() - overall_start
-     logging.error("Error executing SQL query: %s", str(e))
+     logging.exception("Error executing SQL query")
89-93: Docstring mismatch with implementation
The function does not wrap with repr(); either update the docstring or implement repr wrapping. Recommend updating the docstring for accuracy.
- Sanitize input for safe logging—remove newlines,
- carriage returns, tabs, and wrap in repr().
+ Sanitize input for safe logging — remove newlines, carriage returns, and tabs.
77-83: Broaden DB URL detection for MySQL variants
Support common SQLAlchemy-style prefixes.
- elif db_url_lower.startswith('mysql://'):
+ elif db_url_lower.startswith(('mysql://', 'mysql+pymysql://', 'mysql+asyncmy://')):
      return 'mysql', MySQLLoader
364-371: Optional: generate a conversational off-topic reply
When off-topic, consider using FollowUpAgent to propose helpful rephrasing/questions instead of only echoing the reason.
411-411: Nit: fix spelling: "postgress" → "Postgres".
- # If the SQL query is valid, execute it using the postgress database db_url
+ # If the SQL query is valid, execute it using the Postgres database db_url
666-671: Security: confirm route executes client-sent SQL
The confirm endpoint executes `sql_query` from the request body. Bind this to a server-issued token or hash of the previously generated SQL to prevent arbitrary injection via the confirmation call.
- Store the generated SQL (and a short-lived token) server-side when you emit the destructive_confirmation.
- In confirm route, validate the token and ensure the SQL matches what was issued before executing.
Also applies to: 699-700
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
api/routes/graphs.py(17 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Adhere to pylint standards across all Python files (repository uses
make lint)
Files:
api/routes/graphs.py
🧬 Code graph analysis (1)
api/routes/graphs.py (7)
api/agents/analysis_agent.py (1)
get_analysis (14-50)
api/agents/response_formatter_agent.py (1)
ResponseFormatterAgent (42-135)
api/agents/follow_up_agent.py (2)
FollowUpAgent (33-82), generate_follow_up_question (36-82)
api/auth/user_management.py (1)
token_required (254-294)
api/config.py (1)
Config (47-130)
api/graph.py (2)
find (241-331), get_db_description (39-54)
api/memory/graphiti_tool.py (6)
MemoryTool (29-588), create (47-56), search_memories (419-492), save_query_memory (166-253), add_new_memory (132-164), clean_memory (494-517)
🪛 Ruff (0.12.2)
api/routes/graphs.py
493-493: SyntaxError: Expected ':', found '='
560-560: SyntaxError: Expected ',', found '='
561-561: SyntaxError: Expected ':', found name
561-561: SyntaxError: Expected ',', found '='
562-562: SyntaxError: Expected ':', found name
563-563: SyntaxError: Expected ',', found name
567-567: SyntaxError: Expected ':', found 'yield'
570-570: SyntaxError: Expected ',', found 'else'
570-570: SyntaxError: Expected ',', found ':'
571-571: SyntaxError: Expected ':', found '='
573-573: SyntaxError: Expected ':', found name
573-573: SyntaxError: Expected ',', found '='
580-580: SyntaxError: Expected ':', found 'yield'
588-588: SyntaxError: Expected ',', found name
588-588: SyntaxError: Expected ':', found '='
589-589: SyntaxError: Expected ':', found name
596-596: SyntaxError: Expected ',', found name
596-596: SyntaxError: Expected ':', found '='
599-599: SyntaxError: Expected ':', found name
599-599: SyntaxError: Expected ',', found '='
606-606: SyntaxError: Expected 'else', found ':'
607-607: SyntaxError: Expected ',', found '='
608-608: SyntaxError: Expected ':', found name
608-608: SyntaxError: Expected ',', found '='
609-609: SyntaxError: Expected ':', found 'else'
609-609: SyntaxError: Expected ',', found ':'
610-610: SyntaxError: Expected ':', found '='
614-614: SyntaxError: Expected ':', found name
614-614: SyntaxError: Expected ',', found '='
622-622: SyntaxError: Expected ':', found name
628-628: SyntaxError: Expected ',', found name
628-628: SyntaxError: Expected ':', found '='
630-630: SyntaxError: Expected ':', found name
631-631: SyntaxError: Expected ',', found name
634-634: SyntaxError: Expected ':', found name
634-634: SyntaxError: Expected ',', found '='
635-635: SyntaxError: Expected ':', found name
🪛 GitHub Actions: Pylint
api/routes/graphs.py
[error] 492-492: Pylint E0001: Parsing failed: '{' was never closed
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: e2e-tests
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/routes/graphs.py (1)
59-59: Fix mutable default in ConfirmRequest.chat
Using a mutable default list can cause cross-request state leaks. Make it Optional and guard at use-site.
  class ConfirmRequest(BaseModel):
@@
-     chat: list = []
+     chat: Optional[List[Any]] = None

Also handle None when reading:
- queries_history = confirm_data.chat if hasattr(confirm_data, 'chat') else []
+ queries_history = (confirm_data.chat or []) if hasattr(confirm_data, 'chat') else []

Add missing imports near the top of the file:
from typing import Any, List, Optional

Also applies to: 666-667
♻️ Duplicate comments (2)
api/routes/graphs.py (2)
391-395: Normalize error handling: keep Exception, log stacktrace, serialize lazily
Use an Exception sentinel (None when no error), log with logging.exception, and convert to str only when persisting/serializing. This aligns with Ruff TRY400 and preserves type info.
- user_readable_response = ""
- follow_up_result = ""
- execution_error = False
+ user_readable_response = ""
+ follow_up_result = ""
+ execution_error: Optional[Exception] = None
@@
- except Exception as e:
-     execution_error = str(e)
+ except Exception as e:
+     execution_error = e
      overall_elapsed = time.perf_counter() - overall_start
-     logging.error("Error executing SQL query: %s", str(e))
+     logging.exception("Error executing SQL query")
@@
- else:
-     execution_error = "Missing information"
+ else:
+     execution_error = ValueError("Missing information")
      # SQL query is not valid/translatable - generate follow-up questions
      follow_up_result = follow_up_agent.generate_follow_up_question(
          user_question=queries_history[-1],
          analysis_result=answer_an,
          found_tables=result
      )
@@
- if execution_error:
-     full_response["error"] = execution_error
+ if execution_error:
+     full_response["error"] = str(execution_error)
      full_response["success"] = False
  else:
      full_response["success"] = True
@@
  memory_tool.save_query_memory(
      query=queries_history[-1],
      sql_query=answer_an["sql_query"],
      success=full_response["success"],
-     error=execution_error
+     error=str(execution_error) if execution_error else None
  )

Also applies to: 558-561, 570-576, 605-610, 613-620
621-624: Harden background task callbacks and throttle cleanup
- Convert exceptions to strings in callbacks to avoid None formatting issues.
- Cleanup runs on every request; throttle until a scheduler is added.
  save_query_task.add_done_callback(
-     lambda t: logging.error(f"Query memory save failed: {t.exception()}")
-     if t.exception() else logging.info("Query memory saved successfully")
+     lambda t: logging.error("Query memory save failed: %s", str(t.exception()))
+     if t.exception() else logging.info("Query memory saved successfully")
  )
@@
  save_task.add_done_callback(
-     lambda t: logging.error(f"Memory save failed: {t.exception()}") if t.exception() else logging.info("Conversation saved to memory tool")
+     lambda t: logging.error("Memory save failed: %s", str(t.exception()))
+     if t.exception() else logging.info("Conversation saved to memory tool")
  )
@@
- # Clean old memory in background (once per week cleanup)
- clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
- clean_memory_task.add_done_callback(
-     lambda t: logging.error(f"Memory cleanup failed: {t.exception()}")
-     if t.exception() else logging.info("Memory cleanup completed successfully")
- )
+ # Clean old memory in background (run occasionally, not on every request)
+ import random
+ if random.random() < 0.01:  # ~1% of requests
+     clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+     clean_memory_task.add_done_callback(
+         lambda t: logging.error("Memory cleanup failed: %s", str(t.exception()))
+         if t.exception() else logging.info("Memory cleanup completed successfully")
+     )

Also applies to: 627-631, 632-637
🧹 Nitpick comments (4)
api/routes/graphs.py (4)
364-371: Consider generating a follow-up suggestion even for off-topic queries
You already instantiate FollowUpAgent. Using it here can return a helpful redirect question rather than only the reason.
If you want, I can wire a minimal prompt that turns the relevancy reason into a clarifying question.
690-692: Make streaming payloads consistent: include final_response flags in confirm flow
Downstream consumers rely on final_response. Add False for mid-stream steps and True for terminal events.
- step = {"type": "reasoning_step",
-         "message": "Step 2: Executing confirmed SQL query"}
+ step = {"type": "reasoning_step",
+         "final_response": False,
+         "message": "Step 2: Executing confirmed SQL query"}
@@
- yield json.dumps(
-     {
-         "type": "query_result",
-         "data": query_results,
-     }
- ) + MESSAGE_DELIMITER
+ yield json.dumps(
+     {
+         "type": "query_result",
+         "data": query_results,
+         "final_response": False
+     }
+ ) + MESSAGE_DELIMITER
@@
- step = {"type": "reasoning_step",
-         "message": "Step 3: Schema change detected - refreshing graph..."}
+ step = {"type": "reasoning_step",
+         "final_response": False,
+         "message": "Step 3: Schema change detected - refreshing graph..."}
@@
- yield json.dumps(
-     {
-         "type": "schema_refresh",
-         "message": (f"✅ Schema change detected ({operation_type} "
-                     "operation)\n\n🔄 Graph schema has been automatically "
-                     "refreshed with the latest database structure."),
-         "refresh_status": "success"
-     }
- ) + MESSAGE_DELIMITER
+ yield json.dumps(
+     {
+         "type": "schema_refresh",
+         "message": (f"✅ Schema change detected ({operation_type} "
+                     "operation)\n\n🔄 Graph schema has been automatically "
+                     "refreshed with the latest database structure."),
+         "refresh_status": "success",
+         "final_response": False
+     }
+ ) + MESSAGE_DELIMITER
@@
- yield json.dumps(
-     {
-         "type": "schema_refresh",
-         "message": (f"⚠️ Schema was modified but graph refresh failed: "
-                     f"{refresh_message}"),
-         "refresh_status": "failed"
-     }
- ) + MESSAGE_DELIMITER
+ yield json.dumps(
+     {
+         "type": "schema_refresh",
+         "message": (f"⚠️ Schema was modified but graph refresh failed: "
+                     f"{refresh_message}"),
+         "refresh_status": "failed",
+         "final_response": False
+     }
+ ) + MESSAGE_DELIMITER
@@
- step = {"type": "reasoning_step",
-         "message": f"Step {step_num}: Generating user-friendly response"}
+ step = {"type": "reasoning_step",
+         "final_response": False,
+         "message": f"Step {step_num}: Generating user-friendly response"}
@@
- yield json.dumps(
-     {
-         "type": "ai_response",
-         "message": user_readable_response,
-     }
- ) + MESSAGE_DELIMITER
+ yield json.dumps(
+     {
+         "type": "ai_response",
+         "message": user_readable_response,
+         "final_response": True
+     }
+ ) + MESSAGE_DELIMITER
@@
- yield json.dumps(
-     {"type": "error", "message": "Error executing query"}
- ) + MESSAGE_DELIMITER
+ yield json.dumps(
+     {"type": "error", "message": "Error executing query", "final_response": True}
+ ) + MESSAGE_DELIMITER
@@
- yield json.dumps(
-     {
-         "type": "operation_cancelled",
-         "message": "Operation cancelled. The destructive SQL query was not executed."
-     }
- ) + MESSAGE_DELIMITER
+ yield json.dumps(
+     {
+         "type": "operation_cancelled",
+         "message": "Operation cancelled. The destructive SQL query was not executed.",
+         "final_response": True
+     }
+ ) + MESSAGE_DELIMITER

Also applies to: 699-705, 708-711, 717-725, 727-734, 738-740, 750-755, 789-791, 793-798
772-772: Log exceptions with stack traces in confirm flow

Use logging.exception in except paths to aid debugging and satisfy Ruff TRY400.
```diff
-    logging.error("Error executing confirmed SQL query: %s", str(e))
+    logging.exception("Error executing confirmed SQL query")
@@
-    lambda t: logging.error(f"Failed confirmed query memory save failed: {t.exception()}")
+    lambda t: logging.error("Failed confirmed query memory save failed: %s", str(t.exception()))
```

Also applies to: 784-784
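A standalone sketch of why TRY400 prefers `logging.exception` in except blocks: it is `logging.error` with `exc_info=True`, so the traceback is attached to the record automatically. The logger name and message below are illustrative, not the route's actual code.

```python
import io
import logging

# Capture log output in memory so we can inspect what gets recorded
buf = io.StringIO()
handler = logging.StreamHandler(buf)
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    # logging.exception == logging.error(..., exc_info=True):
    # the formatted traceback is appended to the log record
    logger.exception("Error executing confirmed SQL query")

output = buf.getvalue()
# output contains both the message and the full traceback
```

With plain `logger.error(...)` the traceback would be lost unless `exc_info=True` were passed explicitly.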
673-675: Initialize MemoryTool lazily in confirm flow

Create the MemoryTool only when the operation is confirmed to avoid unnecessary I/O on cancelled requests.
```diff
-    # Create memory tool for saving query results
-    memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
-
+    # Lazily create memory tool only when needed
@@
-    if confirmation == "CONFIRM":
+    if confirmation == "CONFIRM":
+        memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
```

Also applies to: 676-676
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
api/routes/graphs.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Adhere to pylint standards across all Python files (repository uses `make lint`)
Files:
api/routes/graphs.py
🧬 Code graph analysis (1)
api/routes/graphs.py (5)
- api/agents/analysis_agent.py (2): AnalysisAgent (9-241), get_analysis (14-50)
- api/agents/follow_up_agent.py (2): FollowUpAgent (33-82), generate_follow_up_question (36-82)
- api/auth/user_management.py (1): token_required (254-294)
- api/config.py (1): Config (47-130)
- api/memory/graphiti_tool.py (6): MemoryTool (29-588), create (47-56), search_memories (419-492), save_query_memory (166-253), add_new_memory (132-164), clean_memory (494-517)
🪛 Ruff (0.12.2)
api/routes/graphs.py
561-561: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
771-771: Do not catch blind exception: Exception
(BLE001)
772-772: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
784-784: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: e2e-tests
- GitHub Check: e2e-tests
🔇 Additional comments (2)
api/routes/graphs.py (2)
297-307: LGTM: short-term history truncation is correct

The slicing keeps chat/results aligned (N vs. N-1).
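The N vs. N-1 alignment this comment approves can be sketched as below. Variable names and the history length are illustrative, not the route's actual identifiers; the point is that the newest query has no result yet, so one fewer result is kept.

```python
# Keep the last N queries but only N-1 prior results: the most recent
# query is the one currently being answered, so it has no result yet.
MAX_HISTORY = 3

queries_history = ["q1", "q2", "q3", "q4", "q5"]
results_history = ["r1", "r2", "r3", "r4"]  # one fewer: q5 is unanswered

recent_queries = queries_history[-MAX_HISTORY:]
recent_results = results_history[-(MAX_HISTORY - 1):]

# Each kept result still pairs with the query that produced it
pairs = list(zip(recent_queries, recent_results))
```

If both lists were sliced with the same length, the pairing would drift by one and each answer would appear to belong to the wrong question.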
324-325: LGTM: consistent streaming final_response flags in main flow

Mid-stream steps are False; terminal responses are True. This prevents premature stream termination.
Also applies to: 340-340, 366-366, 460-461, 486-487, 493-496, 511-512, 522-523, 531-532, 546-546
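The contract these flags enable on the consumer side can be sketched as follows. This is a hypothetical client loop, not the project's frontend; the delimiter value and event shapes are assumptions.

```python
import json

MESSAGE_DELIMITER = "\n"  # assumed delimiter between streamed JSON payloads

def consume(raw_stream: str):
    """Yield parsed events until one is flagged as the final response."""
    for chunk in raw_stream.split(MESSAGE_DELIMITER):
        if not chunk.strip():
            continue
        event = json.loads(chunk)
        yield event
        if event.get("final_response"):
            break  # terminal event: stop reading further chunks

stream = (
    json.dumps({"type": "reasoning_step", "final_response": False}) + MESSAGE_DELIMITER
    + json.dumps({"type": "ai_response", "final_response": True}) + MESSAGE_DELIMITER
    + json.dumps({"type": "ignored", "final_response": False}) + MESSAGE_DELIMITER
)
events = list(consume(stream))
# the third payload is never yielded: consumption stops at final_response=True
```

This is why a mid-stream payload missing the flag (or carrying True by mistake) would terminate the stream early on the client.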
```python
memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))
```
🛠️ Refactor suggestion
Avoid spawning an unused MemoryTool task on off-topic queries
Creating MemoryTool up-front wastes work and can leak a background task when the query is off-topic. Instantiate it only for on-topic flows.
```diff
-    memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))
@@
-    memory_tool = await memory_tool_task
+    memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
     memory_context = await memory_tool.search_memories(
         query=queries_history[-1]
     )
```

Also applies to: 381-384
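The lazy-instantiation pattern the suggestion describes can be sketched in isolation. `MemoryTool` here is a stand-in stub with a call counter, not the project's real class; the on-topic flag stands in for the route's actual relevancy check.

```python
import asyncio

class MemoryTool:
    """Stub standing in for the real tool; counts factory calls."""
    created = 0

    @classmethod
    async def create(cls, user_id: str, graph_id: str) -> "MemoryTool":
        cls.created += 1  # pretend this does expensive I/O
        return cls()

async def handle(question: str, on_topic: bool) -> str:
    if not on_topic:
        return "follow-up"  # off-topic: no MemoryTool work at all
    # instantiated only inside the branch that actually needs it
    tool = await MemoryTool.create("user", "graph")
    return "answered"

async def main() -> int:
    await handle("hi", on_topic=False)
    await handle("sales by region?", on_topic=True)
    return MemoryTool.created

created = asyncio.run(main())
# created == 1: the off-topic request never paid the creation cost
```

Compared with `asyncio.create_task` up-front, this also avoids an orphaned background task when the request exits early.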
🤖 Prompt for AI Agents
In api/routes/graphs.py around lines 310-311 (and similarly at 381-384), the
code unconditionally spawns MemoryTool.create as a background asyncio task which
wastes work and can leak when the query is off-topic; change the flow to avoid
creating the MemoryTool task up-front and instead instantiate or create the
MemoryTool only after you determine the query is on-topic (e.g., await or call
MemoryTool.create inside the branch that handles on-topic queries), removing or
deferring asyncio.create_task usage so no background task is spawned for
off-topic requests.
Summary by CodeRabbit
- New Features
- Improvements
- UI
- Chores