
Conversation


@galshubeli galshubeli commented Aug 28, 2025

Summary by CodeRabbit

  • New Features

    • Memory-powered personalization influencing analysis/SQL generation; follow-up question generator for ambiguous/off-topic queries.
    • Persistent memory tooling for saving/searching queries, conversation summaries, and background maintenance.
  • Improvements

    • More accurate relevancy detection with standardized on/off-topic responses and suggestions.
    • Configurable short-term memory length for chat context.
  • UI

    • Final responses are persisted to result history while avoiding duplicate entries.
  • Chores

    • Added a VCS-sourced dependency; installation may require Git.
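
The configurable short-term memory length mentioned above can be pictured as a simple slice over the chat history. This is an illustrative sketch only: the value SHORT_MEMORY_LENGTH = 5 comes from api/config.py, while the helper name is hypothetical.

```python
SHORT_MEMORY_LENGTH = 5  # value added to api/config.py in this PR

def short_term_context(queries_history: list[str]) -> list[str]:
    """Keep only the most recent messages as chat context (hypothetical helper)."""
    return queries_history[-SHORT_MEMORY_LENGTH:]

# e.g. a 10-message history is trimmed to its last 5 entries
trimmed = short_term_context([f"msg {i}" for i in range(10)])
```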

@galshubeli galshubeli requested review from Copilot and gkorland August 28, 2025 15:03

coderabbitai bot commented Aug 28, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds Graphiti-backed memory tooling and Azure OpenAI wiring; threads memory_context into AnalysisAgent and the graphs route flows; refactors the follow-up and relevancy agents; persists query and conversation memories; updates frontend streaming to store final responses; and introduces a VCS dependency on graphiti-core.

Changes

Cohort / File(s): Summary

  • Dependencies (Pipfile): Adds VCS dependency graphiti-core from https://github.com/FalkorDB/graphiti.git (ref: "staging").
  • Analysis Agent, memory-aware prompts (api/agents/analysis_agent.py): Adds a memory_context parameter to get_analysis and _build_prompt; injects a memory section and personalization guidance into the prompt.
  • Follow-up Agent Refactor (api/agents/follow_up_agent.py): Replaces the JSON-based follow-up flow with natural-language generation, renames the prompt constant, subclasses BaseAgent, and changes the API to generate_follow_up_question(...) -> str.
  • Relevancy Agent Prompt/Init (api/agents/relevancy_agent.py): Removes the explicit __init__, rewrites RELEVANCY_PROMPT to require full conversation context, focuses on SQL-translatability, and standardizes output categories.
  • Memory Module Export (api/memory/__init__.py): Adds a module docstring, re-exports MemoryTool, and defines __all__.
  • Graphiti Memory Tool, new (api/memory/graphiti_tool.py): New MemoryTool class with an async factory, node-ensuring helpers, memory ingestion, query persistence/deduplication, similarity/search APIs, summarization, and cleanup routines, plus AzureOpenAIConfig, client wiring, and Graphiti client creation helpers.
  • Graphs Route Integration (api/routes/graphs.py): Creates/uses MemoryTool, obtains memory_context before analysis, passes it to AnalysisAgent.get_analysis, persists query/conversation memories, schedules cleanup, integrates FollowUpAgent, and extends streaming payloads with final_response flags.
  • Config (api/config.py): Adds SHORT_MEMORY_LENGTH = 5; removes several commented-out AWS credential lines.
  • Frontend Stream Handling (app/ts/modules/chat.ts): Stores any streamed step with final_response === true into state.result_history before other handling.
  • Frontend Message Handling (app/ts/modules/messages.ts): Removes pushing final-result messages into state.result_history (now handled by the stream handler).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant FE as Frontend
  participant API as API / graphs route
  participant Mem as MemoryTool
  participant Ana as AnalysisAgent
  participant DB as Database
  participant FU as FollowUpAgent

  U->>FE: Ask question
  FE->>API: Stream request (question)
  API->>Mem: create(user_id, graph_id)
  API->>Mem: search_memories(question)
  Mem-->>API: memory_context
  API->>Ana: get_analysis(question, tables, db_desc, instructions, memory_context)
  Ana-->>API: analysis_result (SQL / flags)
  alt On-topic
    API->>DB: Execute SQL
    DB-->>API: Results / Error
    API-->>FE: Stream query_result / ai_response (with final_response)
    par Persist memory
      API->>Mem: save_query_memory(query, sql, success/error)
      API->>Mem: add_new_memory(full_response)
      API->>Mem: clean_memory()
    end
  else Off-topic
    API->>FU: generate_follow_up_question(question, analysis_result, found_tables)
    FU-->>API: follow-up text
    API-->>FE: Stream followup_questions (final_response)
    par Persist off-topic attempt
      API->>Mem: save_query_memory(query, "", false, "off-topic")
      API->>Mem: add_new_memory(full_response)
      API->>Mem: clean_memory()
    end
  end
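
The on-topic path above can be condensed into a small asyncio sketch. The MemoryTool method names (create, search_memories, save_query_memory, add_new_memory, clean_memory) come from the walkthrough; everything else here is a stub standing in for the real agents and route.

```python
import asyncio

class MemoryTool:
    """Stub standing in for api/memory/graphiti_tool.MemoryTool (bodies are placeholders)."""

    @classmethod
    async def create(cls, user_id: str, graph_id: str) -> "MemoryTool":
        # The real factory also ensures user/database nodes and a vector index.
        return cls()

    async def search_memories(self, query: str) -> dict:
        return {"memory_context": f"similar past queries for: {query}"}

    async def save_query_memory(self, query: str, sql: str, success: bool, error: str = "") -> None:
        pass

    async def add_new_memory(self, full_response: str) -> None:
        pass

    async def clean_memory(self) -> None:
        pass

async def handle_question(user_id: str, graph_id: str, question: str) -> str:
    mem = await MemoryTool.create(user_id, graph_id)
    context = (await mem.search_memories(question))["memory_context"]
    # In the real route this is AnalysisAgent.get_analysis(question, tables,
    # db_description, instructions, memory_context=context).
    sql = f"SELECT 1 /* generated with context: {context} */"
    # In the real route persistence runs in the background so streaming isn't blocked.
    await asyncio.gather(
        mem.save_query_memory(question, sql, True),
        mem.add_new_memory(sql),
        mem.clean_memory(),
    )
    return sql
```

Running `asyncio.run(handle_question("u1", "g1", "list all users"))` yields the stubbed SQL with the retrieved context embedded.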
sequenceDiagram
  autonumber
  participant FE as Frontend
  participant API as API / destructive confirm
  participant Mem as MemoryTool
  participant DB as Database

  FE->>API: Confirm destructive operation
  API->>DB: Execute SQL (destructive)
  DB-->>API: Success / Failure
  API-->>FE: Stream destructive_confirmation + final_result (final_response)
  par Persist outcome
    API->>Mem: save_query_memory(query, sql, success, error?)
    API->>Mem: add_new_memory(full_response)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs

  • Async graph worker #98 — Modifies api/agents/relevancy_agent.py; related to prompt and init changes.
  • Staging #123 — Overlaps agents and api/routes/graphs.py edits; likely touches the same integration.
  • Fix login handler #128 — Also touches api/agents/relevancy_agent.py and output formatting; strong code-level overlap.

Suggested reviewers

  • swilly22

Poem

Thump-thump, I tuck your queries in my burrowed brain,
I nibble context crumbs so answers bloom again.
I hop to schemas, hum a tiny tune,
I store a memory, then fetch it soon.
Together we query, and hop toward noon. 🐇✨


github-actions bot commented Aug 28, 2025

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 1 package(s) with unknown licenses.
See the Details below.

License Issues

Pipfile.lock

Package   Version   License   Issue Type
posthog   6.7.0     Null      Unknown License

OpenSSF Scorecard

Package             Version   Score
pip/graphiti-core   Unknown   Unknown
pip/backoff         2.2.1     Unknown
pip/diskcache       5.6.3     Unknown
pip/hf-xet          1.1.9     Unknown
pip/neo4j           5.28.2    🟢 6.8
pip/numpy           2.3.2     🟢 7.2
pip/posthog         6.7.0     Unknown
pip/pytz            2025.2    🟢 4.6
pip/tenacity        9.1.2     🟢 5.8

Details: pip/neo4j 5.28.2

Check                 Score   Reason
Packaging             ⚠️ -1   packaging workflow not detected
Maintained            🟢 10   30 commit(s) and 2 issue activity found in the last 90 days -- score normalized to 10
Code-Review           🟢 10   all changesets reviewed
Dangerous-Workflow    ⚠️ -1   no workflows found
Token-Permissions     ⚠️ -1   No tokens found
CII-Best-Practices    ⚠️ 0    no effort to earn an OpenSSF best practices badge detected
Vulnerabilities       🟢 10   0 existing vulnerabilities detected
License               🟢 9    license file detected
Binary-Artifacts      🟢 10   no binaries found in the repo
Branch-Protection     ⚠️ -1   internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration
Signed-Releases       ⚠️ -1   no releases found
Security-Policy       🟢 10   security policy file detected
Fuzzing               ⚠️ 0    project is not fuzzed
Pinned-Dependencies   ⚠️ 0    dependency not pinned by hash detected -- score normalized to 0
SAST                  ⚠️ 0    SAST tool is not run on all commits -- score normalized to 0

Details: pip/numpy 2.3.2

Check                    Score   Reason
Code-Review              🟢 10   all changesets reviewed
Dependency-Update-Tool   🟢 10   update tool detected
Maintained               🟢 10   30 commit(s) and 16 issue activity found in the last 90 days -- score normalized to 10
Dangerous-Workflow       🟢 10   no dangerous workflow patterns detected
Packaging                ⚠️ -1   packaging workflow not detected
Binary-Artifacts         🟢 10   no binaries found in the repo
Token-Permissions        🟢 9    detected GitHub workflow tokens with excessive permissions
CII-Best-Practices       ⚠️ 0    no effort to earn an OpenSSF best practices badge detected
Pinned-Dependencies      🟢 3    dependency not pinned by hash detected -- score normalized to 3
License                  🟢 9    license file detected
Fuzzing                  🟢 10   project is fuzzed
Branch-Protection        🟢 3    branch protection is not maximal on development and all release branches
Vulnerabilities          ⚠️ 0    26 existing vulnerabilities detected
Signed-Releases          ⚠️ 0    Project has not signed or included provenance with any releases.
SAST                     🟢 9    SAST tool detected but not run on all commits
Security-Policy          🟢 9    security policy file detected
CI-Tests                 🟢 10   16 out of 16 merged PRs checked by a CI test -- score normalized to 10
Contributors             🟢 10   project has 99 contributing companies or organizations

Details: pip/pytz 2025.2

Check                 Score   Reason
Maintained            ⚠️ 0    0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Code-Review           ⚠️ 1    Found 3/25 approved changesets -- score normalized to 1
Security-Policy       🟢 9    security policy file detected
Dangerous-Workflow    🟢 10   no dangerous workflow patterns detected
Binary-Artifacts      🟢 10   no binaries found in the repo
Token-Permissions     ⚠️ 0    detected GitHub workflow tokens with excessive permissions
CII-Best-Practices    ⚠️ 0    no effort to earn an OpenSSF best practices badge detected
Vulnerabilities       🟢 10   0 existing vulnerabilities detected
License               🟢 10   license file detected
Pinned-Dependencies   ⚠️ 0    dependency not pinned by hash detected -- score normalized to 0
Fuzzing               🟢 10   project is fuzzed
Packaging             🟢 10   packaging workflow detected
Signed-Releases       ⚠️ 0    Project has not signed or included provenance with any releases.
Branch-Protection     ⚠️ 0    branch protection not enabled on development/release branches
SAST                  ⚠️ 0    SAST tool is not run on all commits -- score normalized to 0

Details: pip/tenacity 9.1.2

Check                 Score   Reason
Code-Review           🟢 8    Found 17/21 approved changesets -- score normalized to 8
Dangerous-Workflow    🟢 10   no dangerous workflow patterns detected
Maintained            ⚠️ 0    1 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Binary-Artifacts      🟢 10   no binaries found in the repo
Pinned-Dependencies   ⚠️ 0    dependency not pinned by hash detected -- score normalized to 0
CII-Best-Practices    ⚠️ 0    no effort to earn an OpenSSF best practices badge detected
Token-Permissions     🟢 9    detected GitHub workflow tokens with excessive permissions
Vulnerabilities       🟢 10   0 existing vulnerabilities detected
License               🟢 10   license file detected
Fuzzing               ⚠️ 0    project is not fuzzed
Signed-Releases       ⚠️ -1   no releases found
Packaging             🟢 10   packaging workflow detected
Security-Policy       ⚠️ 0    security policy file not detected
Branch-Protection     ⚠️ -1   internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration
SAST                  ⚠️ 0    SAST tool is not run on all commits -- score normalized to 0

Scanned Files

  • Pipfile
  • Pipfile.lock


Copilot AI left a comment


Pull Request Overview

This PR introduces a comprehensive memory component to QueryWeaver that stores and retrieves conversation history and database interactions using Graphiti (a graph-based memory system) with Azure OpenAI integration. The memory system helps improve SQL generation by providing context from previous user interactions and database queries.

  • Adds Graphiti-based memory storage for user conversations and database interactions
  • Integrates memory context into the SQL analysis pipeline to improve query generation
  • Implements follow-up question generation for failed or incomplete queries

Reviewed Changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

Summary per file

File                            Description
api/memory/graphiti_tool.py     New comprehensive memory management tool with Graphiti integration
api/memory/__init__.py          Memory module initialization
api/routes/graphs.py            Integrates the memory tool into the query-processing pipeline and adds final_response flags
api/agents/analysis_agent.py    Updates the analysis agent to accept and use memory context
api/agents/follow_up_agent.py   Refactored to generate conversational follow-up questions
api/agents/relevancy_agent.py   Updates relevancy checking with conversation-context awareness
app/ts/modules/messages.ts      Removes result_history tracking from final-result messages
app/ts/modules/chat.ts          Adds result_history tracking for final responses based on the final_response flag
Pipfile                         Adds the graphiti-core dependency


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
api/agents/analysis_agent.py (1)

14-21: Make instructions and memory_context keyword-only and update callsites

  • Change get_analysis signature to:
    def get_analysis(
        self,
        user_query: str,
        combined_tables: list,
        db_description: str,
        *,
        instructions: str | None = None,
        memory_context: str | None = None,
    ) -> dict:
  • Update the call in api/routes/graphs.py:376–378 to:
        answer_an = agent_an.get_analysis(
            queries_history[-1],
            result,
            db_description,
            instructions=instructions,
            memory_context=memory_context,
        )
  • Apply the same keyword-only refactor to _build_prompt and its invocation in analysis_agent.py.
api/agents/follow_up_agent.py (1)

80-83: Remove unused exception variable and shorten fallback (Ruff F841, pylint C0301).

-        except Exception as e:
-            # Fallback response if LLM call fails
-            return "I'm having trouble generating a follow-up question right now. Could you try rephrasing your question or providing more specific details about what you're looking for?"
+        except Exception:
+            # Fallback response if LLM call fails
+            return ("Sorry—I couldn’t generate a follow‑up. Could you rephrase or share the "
+                    "specific details you’re after?")
api/agents/relevancy_agent.py (1)

84-89: Async method calls sync LLM; switch to acompletion to avoid blocking.

This prevents event-loop blocking and aligns with async API.

-from litellm import completion
+from litellm import acompletion
@@
-        completion_result = completion(
+        completion_result = await acompletion(
             model=Config.COMPLETION_MODEL,
             messages=self.messages,
             temperature=0,
         )
♻️ Duplicate comments (2)
api/agents/follow_up_agent.py (1)

71-75: Lower temperature for consistency; prior feedback suggested ≤0.8.

Set to 0.8 for more predictable follow-ups.

-                temperature=0.9
+                temperature=0.8
api/routes/graphs.py (1)

610-613: Enhance callback error handling to ensure proper string formatting.

The callback should handle the case where t.exception() might return None more explicitly.

🧹 Nitpick comments (9)
api/agents/analysis_agent.py (3)

121-140: Fix trailing whitespace and long lines (pylint C0303/C0301).

Trim trailing spaces and wrap overlong prompt bullets.

-        
-        # Include memory context in the prompt if available
+        # Include memory context in the prompt if available
@@
-            13. CRITICAL PERSONALIZATION CHECK: If missing user identification/personalization is a significant or primary component of the query (e.g., "show my orders", "my account balance", "my recent purchases", "how many employees I have", "products I own") AND no user identification is available in memory context or schema, set "is_sql_translatable" to false. However, if memory context contains user identification (like user name or previous successful personal queries), then personal queries ARE translatable even if they are the primary component of the query.
+            13. CRITICAL PERSONALIZATION CHECK:
+                - If the query is primarily personal (e.g., "my orders", "my account balance")
+                  AND no user identification exists in memory or schema → set
+                  "is_sql_translatable" to false.
+                - If memory context identifies the user (name/ID/prior personal queries) →
+                  treat such queries as translatable.

Also applies to: 235-239


52-103: Too many locals in _format_schema (pylint R0914). Extract helpers.

Small extraction reduces locals and improves readability.

@@
-    def _format_schema(self, schema_data: List) -> str:
+    def _format_schema(self, schema_data: List) -> str:
+        def _format_column(column: dict) -> str:
+            col_name = column.get("columnName", "")
+            col_type = column.get("dataType", None)
+            col_description = column.get("description", "")
+            col_key = column.get("keyType", None)
+            nullable = column.get("nullable", False)
+            key_info = (
+                ", PRIMARY KEY" if col_key == "PRI" else ", FOREIGN KEY" if col_key == "FK" else ""
+            )
+            return (f"  - {col_name} ({col_type},{key_info},{col_key},{nullable}): "
+                    f"{col_description}")
@@
-            for column in columns:
-                col_name = column.get("columnName", "")
-                col_type = column.get("dataType", None)
-                col_description = column.get("description", "")
-                col_key = column.get("keyType", None)
-                nullable = column.get("nullable", False)
-
-                key_info = (
-                    ", PRIMARY KEY"
-                    if col_key == "PRI"
-                    else ", FOREIGN KEY" if col_key == "FK" else ""
-                )
-                column_str = (f"  - {col_name} ({col_type},{key_info},{col_key},"
-                             f"{nullable}): {col_description}")
-                table_str += column_str + "\n"
+            for column in columns:
+                table_str += _format_column(column) + "\n"

49-50: Consider storing full analysis JSON in history, not only SQL.

Appending only sql_query degrades future-context quality.

-        self.messages.append({"role": "assistant", "content": analysis["sql_query"]})
+        self.messages.append({"role": "assistant", "content": response})
api/agents/follow_up_agent.py (2)

56-60: Wrap long lines and strip trailing whitespace (pylint C0301/C0303).

Also stringify list fields for a cleaner prompt.

-        is_translatable = analysis_result.get("is_sql_translatable", False) if analysis_result else False
-        missing_info = analysis_result.get("missing_information", []) if analysis_result else []
-        ambiguities = analysis_result.get("ambiguities", []) if analysis_result else []
-        explanation = analysis_result.get("explanation", "No detailed explanation available") if analysis_result else "No analysis result available"
+        is_translatable = (
+            analysis_result.get("is_sql_translatable", False)
+            if analysis_result else False
+        )
+        missing_info = (
+            ", ".join(analysis_result.get("missing_information", []))
+            if analysis_result else ""
+        )
+        ambiguities = (
+            ", ".join(analysis_result.get("ambiguities", []))
+            if analysis_result else ""
+        )
+        explanation = (
+            analysis_result.get("explanation", "No detailed explanation available")
+            if analysis_result else "No analysis result available"
+        )
@@
-            MISSING_INFO=missing_info,
-            AMBIGUITIES=ambiguities,
+            MISSING_INFO=missing_info,
+            AMBIGUITIES=ambiguities,
             EXPLANATION=explanation
         )

Also applies to: 69-79


36-41: found_tables is unused; either use or mark as intentionally unused.

Minimal change: rename to _found_tables.

-        found_tables: list = None
+        _found_tables: list | None = None
api/memory/graphiti_tool.py (3)

225-225: Remove f-strings without placeholders (Ruff F541).

-                print(f"Query with same user_query and sql_query already exists, skipping creation")
+                print("Query with same user_query and sql_query already exists, skipping creation")
@@
-                memory_context += f"\n"
+                memory_context += "\n"

Also applies to: 484-486


376-377: Replace prints with logging; avoid noisy stdout in production.

Introduce a module logger and use logger.info/debug/exception.

+import logging
+logger = logging.getLogger(__name__)
@@
-                    print(f'Found database node: {node.name} with UUID: {node.uuid}')
+                    logger.debug("Found database node: %s (uuid=%s)", node.name, node.uuid)
@@
-                print(f'\nDatabase Facts Search Results for {self.graph_id}:')
+                logger.debug("Database Facts search results for %s", self.graph_id)

Also applies to: 393-398


594-607: Avoid mutating environment in AzureOpenAIConfig; derive from env only.

Setting MODEL_NAME has side effects; remove it.

-    def __init__(self):
-        # Set the model name as requested
-        os.environ["MODEL_NAME"] = "gpt-4.1"
-        
+    def __init__(self):
api/routes/graphs.py (1)

296-297: Consider adding error handling for memory tool initialization.

The memory tool creation is awaited later but lacks explicit error handling. If memory tool creation fails, it could cause issues downstream.

Add try-catch to handle potential memory tool initialization failures:

-    memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))
-
+    memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))

Then wrap the memory tool usage in error handling:

try:
    memory_tool = await memory_tool_task
    search_memories = await memory_tool.search_memories(
        query=queries_history[-1]
    )
    memory_context = search_memories.get("memory_context", "")
except Exception as e:
    logging.error(f"Memory tool initialization or search failed: {e}")
    memory_context = ""  # Fallback to empty context
    memory_tool = None  # Set to None to skip memory operations later
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d3a69e2 and a76f547.

⛔ Files ignored due to path filters (1)
  • Pipfile.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • Pipfile (1 hunks)
  • api/agents/analysis_agent.py (5 hunks)
  • api/agents/follow_up_agent.py (1 hunks)
  • api/agents/relevancy_agent.py (2 hunks)
  • api/memory/__init__.py (1 hunks)
  • api/memory/graphiti_tool.py (1 hunks)
  • api/routes/graphs.py (18 hunks)
  • app/ts/modules/chat.ts (1 hunks)
  • app/ts/modules/messages.ts (0 hunks)
💤 Files with no reviewable changes (1)
  • app/ts/modules/messages.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Adhere to pylint standards across all Python files (repository uses make lint)

Files:

  • api/memory/__init__.py
  • api/memory/graphiti_tool.py
  • api/agents/relevancy_agent.py
  • api/agents/analysis_agent.py
  • api/routes/graphs.py
  • api/agents/follow_up_agent.py
{Pipfile,Pipfile.lock}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use Pipenv for dependency management; add and lock dependencies via Pipfile and Pipfile.lock

Files:

  • Pipfile
app/**

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Keep the TypeScript frontend sources in app/ and build them before production runs

Files:

  • app/ts/modules/chat.ts
🧬 Code graph analysis (3)
app/ts/modules/chat.ts (1)
app/ts/modules/config.ts (1)
  • state (74-78)
api/memory/graphiti_tool.py (1)
api/config.py (3)
  • Config (47-137)
  • get_vector_size (33-43)
  • embed (18-31)
api/routes/graphs.py (1)
api/memory/graphiti_tool.py (6)
  • MemoryTool (29-591)
  • create (47-56)
  • search_memories (419-502)
  • save_query_memory (166-253)
  • add_new_memory (132-164)
  • clean_memory (504-518)
🪛 Ruff (0.12.2)
api/memory/graphiti_tool.py

225-225: f-string without any placeholders

Remove extraneous f prefix

(F541)


245-245: Local variable result is assigned to but never used

Remove assignment to unused variable result

(F841)


484-484: f-string without any placeholders

Remove extraneous f prefix

(F541)


508-508: Local variable driver is assigned to but never used

Remove assignment to unused variable driver

(F841)


509-509: Local variable query is assigned to but never used

Remove assignment to unused variable query

(F841)

api/agents/follow_up_agent.py

80-80: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🪛 GitHub Actions: Pylint
api/agents/analysis_agent.py

[error] 106-106: C0301: Line too long (115/100)


[error] 121-121: C0303: Trailing whitespace


[error] 140-140: C0303: Trailing whitespace


[error] 237-237: C0301: Line too long (206/100)


[error] 238-238: C0301: Line too long (573/100)


[error] 14-14: R0913: Too many arguments (6/5)


[error] 14-14: R0917: Too many positional arguments (6/5)


[error] 52-52: R0914: Too many local variables (21/15)


[error] 105-105: R0913: Too many arguments (6/5)


[error] 105-105: R0917: Too many positional arguments (6/5)

api/agents/follow_up_agent.py

[error] 37-37: C0303: Trailing whitespace


[error] 56-56: C0301: Line too long (105/100)


[error] 59-59: C0301: Line too long (148/100)


[error] 60-60: C0303: Trailing whitespace


[error] 69-69: C0303: Trailing whitespace


[error] 76-76: C0303: Trailing whitespace


[error] 79-79: C0303: Trailing whitespace


[error] 82-82: C0301: Line too long (187/100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e-tests
  • GitHub Check: e2e-tests
🔇 Additional comments (10)
Pipfile (1)

19-19: Fix Pipenv VCS dependency spec and pin to a commit for reproducibility
Pipenv expects git = "https://..." (no git+) and a stable commit SHA instead of a moving branch (staging). After updating, run pipenv lock --clear and verify the import locally since pipenv isn’t available in this environment.
Apply:

- graphiti-core = {ref = "staging", git = "git+https://github.com/FalkorDB/graphiti.git"}
+ graphiti-core = {git = "https://github.com/FalkorDB/graphiti.git", ref = "<PINNED_COMMIT_SHA>"}
api/agents/relevancy_agent.py (1)

69-75: RelevancyAgent inherits BaseAgent’s constructor; existing callsites are correct.
RelevancyAgent doesn’t define its own __init__ and inherits BaseAgent.__init__(queries_history, result_history) (api/agents/utils.py:10–13), so RelevancyAgent(queries_history, result_history) remains valid. Ignore the original constructor-removal warning.

Likely an incorrect or invalid review comment.

api/memory/__init__.py (1)

1-7: LGTM! Clean module initialization.

The module correctly exposes the public API for memory tooling with proper docstring and explicit __all__ declaration.

app/ts/modules/chat.ts (1)

111-115: Good implementation for tracking final responses.

The logic correctly captures messages marked with final_response: true into the result history, ensuring memory persistence across different message types. This aligns well with the backend's memory context features.

api/routes/graphs.py (6)

367-374: Good integration of memory context into analysis.

The memory search is properly integrated before SQL generation, providing relevant context from past interactions and similar queries.


561-575: Well-structured follow-up question generation for non-translatable queries.

The implementation properly handles cases where SQL generation fails by providing helpful follow-up questions to guide the user.


616-619: Good async implementation for memory persistence.

The background task for saving conversation memory is properly implemented with error handling callbacks, ensuring the main flow isn't blocked.


661-663: Good memory tool initialization for destructive operations.

The memory tool is properly created to track confirmed destructive operations.


746-758: Consistent memory tracking for confirmed operations.

The implementation properly saves successful confirmed queries to memory with appropriate error handling.


763-775: Good error recovery with memory persistence.

Failed destructive operations are properly tracked in memory, which is important for audit trails and learning from failures.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

♻️ Duplicate comments (6)
api/memory/graphiti_tool.py (4)

320-331: Fix docstring to match signature/return.

Remove non-existent arg, clarify return.

Apply:

     async def search_user_summary(self, limit: int = 5) -> str:
-        """
-        Search for user node summary extracts the user's personal information and general preferences.
-        
-        Args:
-            query: Natural language query to search for
-            limit: Maximum number of results to return
-            
-        Returns:
-            List of user node summaries with metadata
-        """
+        """
+        Retrieve summary text for the current user node.
+
+        Args:
+            limit: Maximum number of results to inspect.
+
+        Returns:
+            A summary string if found; otherwise an empty string.
+        """

46-55: Initialize user node and make vector index creation idempotent.

Ensure user node exists and ignore “already exists” errors to avoid startup failures.

Apply:

     @classmethod
     async def create(cls, user_id: str, graph_id: str) -> "MemoryTool":
         """Async factory to construct and initialize the tool."""
         self = cls(user_id, graph_id)
-        await self._ensure_database_node(graph_id, user_id)
+        await asyncio.gather(
+            self._ensure_user_node(user_id),
+            self._ensure_database_node(graph_id, user_id),
+        )
 
         vector_size = Config.EMBEDDING_MODEL.get_vector_size()
         driver = self.graphiti_client.driver
-        await driver.execute_query(f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}")
+        try:
+            await driver.execute_query(
+                f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) "
+                f"OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}"
+            )
+        except Exception as e:
+            if "already exists" not in str(e).lower():
+                raise

205-241: Parameterize Cypher, remove manual escaping, and fix result handling.

String interpolation risks injection, and manual escaping corrupts the text fed to embeddings. Also treat the execute_query result as a tuple.

Apply:

-            relationship_type = "SUCCESS" if success else "FAILED"
-            
-            # Escape quotes in the query and SQL for Cypher
-            escaped_query = query.replace("'", "\\'").replace('"', '\\"')
-            escaped_sql = sql_query.replace("'", "\\'").replace('"', '\\"')
-            escaped_error = error.replace("'", "\\'").replace('"', '\\"') if error else ""
-            embeddings = Config.EMBEDDING_MODEL.embed(escaped_query)[0]
+            relationship_type = "SUCCESS" if success else "FAILED"
+            embeddings = Config.EMBEDDING_MODEL.embed(query)[0]
@@
-            check_query = f"""
-            MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-            MATCH (db)-[r]->(q:Query)
-            WHERE q.user_query = "{escaped_query}" AND q.sql_query = "{escaped_sql}"
-            RETURN q.uuid as existing_query_uuid
-            LIMIT 1
-            """
+            check_query = """
+            MATCH (db:Entity {uuid: $db_uuid})-[]->(q:Query)
+            WHERE q.user_query = $user_query AND q.sql_query = $sql_query
+            RETURN q.uuid as existing_query_uuid
+            LIMIT 1
+            """
@@
-            check_result = await graph_driver.execute_query(check_query)
-            
-            # If query already exists, don't create a duplicate
-            if check_result[0]:  # If records exist
+            records, _, _ = await graph_driver.execute_query(
+                check_query,
+                db_uuid=database_node_uuid,
+                user_query=query,
+                sql_query=sql_query,
+            )
+            if records:
                 print(f"Query with same user_query and sql_query already exists, skipping creation")
                 return True
@@
-            cypher_query = f"""
-            MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-            MERGE (q:Query {{
-                user_query: "{escaped_query}",
-                sql_query: "{escaped_sql}",
-                success: {str(success).lower()},
-                error: "{escaped_error}",
-                timestamp: timestamp(),
-                embeddings: vecf32($embedding)
-            }})
-            CREATE (db)-[:{relationship_type} {{timestamp: timestamp()}}]->(q)
-            RETURN q.uuid as query_uuid
-            """
+            cypher_query = """
+            MATCH (db:Entity {uuid: $db_uuid})
+            MERGE (q:Query {
+                user_query: $user_query,
+                sql_query: $sql_query,
+                success: $success,
+                error: $error,
+                timestamp: timestamp(),
+                embeddings: vecf32($embedding)
+            })
+            CREATE (db)-[:REL {type: $rel_type, timestamp: timestamp()}]->(q)
+            RETURN q.uuid as query_uuid
+            """
@@
-                result = await graph_driver.execute_query(cypher_query, embedding=embeddings)
-                return True
+                await graph_driver.execute_query(
+                    cypher_query,
+                    db_uuid=database_node_uuid,
+                    user_query=query,
+                    sql_query=sql_query,
+                    success=bool(success),
+                    error=error or "",
+                    rel_type=relationship_type,
+                    embedding=embeddings,
+                )
+                return True
             except Exception as cypher_error:
                 print(f"Error executing Cypher query: {cypher_error}")
                 return False

Also applies to: 243-249, 221-226


573-579: Don’t block the event loop: use litellm.acompletion in async context.

Switch to the async API and await it.

Apply:

-from litellm import completion
+from litellm import acompletion
@@
-            response = completion(
+            response = await acompletion(
                 model=self.config.COMPLETION_MODEL,
                 messages=[{"role": "user", "content": prompt}],
                 temperature=0.1
             )
api/routes/graphs.py (2)

620-622: Done callbacks: guard None and stringify exceptions.

Avoid interpolating a possibly-None exception into the f-string.

Apply:

-            save_query_task.add_done_callback(
-                lambda t: logging.error(f"Query memory save failed: {t.exception()}") 
-                if t.exception() else logging.info("Query memory saved successfully")
-            )
+            save_query_task.add_done_callback(
+                lambda t: logging.error(f"Query memory save failed: {str(t.exception())}")
+                if t.exception() else logging.info("Query memory saved successfully")
+            )
@@
-            save_task.add_done_callback(lambda t: logging.error(f"Memory save failed: {t.exception()}") if t.exception() else logging.info("Conversation saved to memory tool"))
+            save_task.add_done_callback(
+                lambda t: logging.error(f"Memory save failed: {str(t.exception())}")
+                if t.exception() else logging.info("Conversation saved to memory tool")
+            )

Also applies to: 627-627


630-636: Cleanup scheduling runs on every request; throttle it.

Comment says “once per week,” but it’s per request. Add simple rate limiting until a scheduler is introduced.

Apply:

-            # Clean old memory in background (once per week cleanup)
-            clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
-            clean_memory_task.add_done_callback(
-                lambda t: logging.error(f"Memory cleanup failed: {t.exception()}") 
-                if t.exception() else logging.info("Memory cleanup completed successfully")
-            )
+            # Clean old memory occasionally; replace with a real scheduler later
+            import random
+            if random.random() < 0.01:
+                clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+                clean_memory_task.add_done_callback(
+                    lambda t: logging.error(f"Memory cleanup failed: {str(t.exception())}") 
+                    if t.exception() else logging.info("Memory cleanup completed successfully")
+                )
🧹 Nitpick comments (4)
api/memory/graphiti_tool.py (2)

58-94: Replace prints with logging and avoid bare except Exception.

Use logging.{info,warning,exception} and catch narrower exceptions where possible. Improves observability and lint compliance.

Also applies to: 95-131, 160-163


291-305: Use parameters for k/limit and avoid f-strings; align k with requested limit.

Prevents injection and keeps results bounded.

Apply:

-            cypher_query = f"""
-                    CALL db.idx.vector.queryNodes('Query', 'embeddings', 10, vecf32($embedding))
+            cypher_query = """
+                    CALL db.idx.vector.queryNodes('Query', 'embeddings', $k, vecf32($embedding))
                         YIELD node, score
-                        MATCH (db:Entity {{uuid: $database_node_uuid}})-[r]->(node)
+                        MATCH (db:Entity {uuid: $database_node_uuid})-[r]->(node)
                         RETURN node {
                             .user_query,
                             .sql_query,
                             .success,
                             .error
                         } AS query,
                         score
                         ORDER BY score ASC
-                        LIMIT {limit}
+                        LIMIT $limit
             """
@@
-                records, header, _  = await graph_driver.execute_query(cypher_query, embedding=query_embedding, database_node_uuid=database_node_uuid)
+                records, header, _  = await graph_driver.execute_query(
+                    cypher_query,
+                    embedding=query_embedding,
+                    database_node_uuid=database_node_uuid,
+                    k=max(10, int(limit)),
+                    limit=int(limit),
+                )
api/routes/graphs.py (2)

379-387: Pass memory_context safely; guard against non-str returns.

Given the MemoryTool fix above, this should always be a string; if not, coerce it to an empty string to be defensive.

Apply:

-            memory_context = await memory_tool.search_memories(
+            memory_context = await memory_tool.search_memories(
                 query=queries_history[-1]
             )
+            if not isinstance(memory_context, str):
+                memory_context = ""

768-783: Use logging.exception and stringify callback exceptions.

Improves debuggability.

Apply:

-            except Exception as e:
-                logging.error("Error executing confirmed SQL query: %s", str(e))
+            except Exception:
+                logging.exception("Error executing confirmed SQL query")
@@
-                save_query_task.add_done_callback(
-                    lambda t: logging.error(f"Failed confirmed query memory save failed: {t.exception()}") 
-                    if t.exception() else logging.info("Failed confirmed query memory saved successfully")
-                )
+                save_query_task.add_done_callback(
+                    lambda t: logging.error(f"Failed confirmed query memory save failed: {str(t.exception())}") 
+                    if t.exception() else logging.info("Failed confirmed query memory saved successfully")
+                )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a76f547 and c9f75c9.

📒 Files selected for processing (3)
  • api/config.py (1 hunks)
  • api/memory/graphiti_tool.py (1 hunks)
  • api/routes/graphs.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
api/config.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Define AI model configuration and system prompts in api/config.py rather than scattering them across modules

Files:

  • api/config.py
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Adhere to pylint standards across all Python files (repository uses make lint)

Files:

  • api/config.py
  • api/routes/graphs.py
  • api/memory/graphiti_tool.py
🧬 Code graph analysis (2)
api/routes/graphs.py (4)
api/agents/analysis_agent.py (2)
  • AnalysisAgent (9-241)
  • get_analysis (14-50)
api/agents/follow_up_agent.py (2)
  • FollowUpAgent (33-82)
  • generate_follow_up_question (36-82)
api/config.py (1)
  • Config (47-130)
api/memory/graphiti_tool.py (6)
  • MemoryTool (29-590)
  • create (47-56)
  • search_memories (419-497)
  • save_query_memory (166-253)
  • add_new_memory (132-164)
  • clean_memory (499-518)
api/memory/graphiti_tool.py (1)
api/config.py (3)
  • Config (47-130)
  • get_vector_size (33-43)
  • embed (18-31)
🪛 Ruff (0.12.2)
api/config.py

58-58: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

api/routes/graphs.py

559-559: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


768-768: Do not catch blind exception: Exception

(BLE001)


769-769: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


781-781: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

api/memory/graphiti_tool.py

89-89: Consider moving this statement to an else block

(TRY300)


91-91: Do not catch blind exception: Exception

(BLE001)


126-126: Consider moving this statement to an else block

(TRY300)


128-128: Do not catch blind exception: Exception

(BLE001)


160-160: Do not catch blind exception: Exception

(BLE001)


166-166: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


225-225: f-string without any placeholders

Remove extraneous f prefix

(F541)


245-245: Local variable result is assigned to but never used

Remove assignment to unused variable result

(F841)


246-246: Consider moving this statement to an else block

(TRY300)


247-247: Do not catch blind exception: Exception

(BLE001)


251-251: Do not catch blind exception: Exception

(BLE001)


310-310: Consider moving this statement to an else block

(TRY300)


312-312: Do not catch blind exception: Exception

(BLE001)


316-316: Do not catch blind exception: Exception

(BLE001)


346-346: Consider moving this statement to an else block

(TRY300)


348-348: Do not catch blind exception: Exception

(BLE001)


415-415: Do not catch blind exception: Exception

(BLE001)


484-484: f-string without any placeholders

Remove extraneous f prefix

(F541)


488-488: Consider moving this statement to an else block

(TRY300)


490-490: Do not catch blind exception: Exception

(BLE001)


516-516: Do not catch blind exception: Exception

(BLE001)


582-584: Consider moving this statement to an else block

(TRY300)


586-586: Do not catch blind exception: Exception

(BLE001)


622-622: Avoid specifying long messages outside the exception class

(TRY003)


624-624: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (2)
api/config.py (1)

57-57: Good addition: bounded short-term memory length.

Clear, sensible default for trimming chat history.

api/routes/graphs.py (1)

295-305: History truncation logic looks correct.

Keeps N latest queries and aligns results size accordingly.
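
The truncation behavior being approved can be illustrated with a small standalone sketch (`truncate_history` is a hypothetical helper for illustration, not the route's actual code):

```python
def truncate_history(queries, results, max_len):
    """Keep the newest max_len queries and align results to the same window."""
    if max_len <= 0:
        return [], []
    queries = queries[-max_len:]
    # Results may be shorter than queries; align to however many queries remain
    results = results[-len(queries):] if queries else []
    return queries, results

q, r = truncate_history([1, 2, 3, 4], ["a", "b", "c", "d"], 2)
```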

galshubeli and others added 3 commits August 31, 2025 18:27
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (9)
api/routes/graphs.py (2)

621-631: Harden done callbacks: guard against a None exception and stringify it.

Avoid NoneType access and ensure readable logs.

-            save_query_task.add_done_callback(
-                lambda t: logging.error(f"Query memory save failed: {t.exception()}") 
-                if t.exception() else logging.info("Query memory saved successfully")
-            )
+            save_query_task.add_done_callback(
+                lambda t: logging.error(f"Query memory save failed: {str(t.exception())}") 
+                if t.exception() else logging.info("Query memory saved successfully")
+            )
@@
-            save_task.add_done_callback(lambda t: logging.error(f"Memory save failed: {t.exception()}") if t.exception() else logging.info("Conversation saved to memory tool"))
+            save_task.add_done_callback(
+                lambda t: logging.error(f"Memory save failed: {str(t.exception())}")
+                if t.exception() else logging.info("Conversation saved to memory tool")
+            )
@@
-                lambda t: logging.error(f"Confirmed query memory save failed: {t.exception()}") 
+                lambda t: logging.error(f"Confirmed query memory save failed: {str(t.exception())}") 

Also applies to: 766-769
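
Rather than repeating the guarded lambda at every call site, the suggested hardening can be factored into a small callback factory — a sketch under the same assumptions as the diff, not the project's code:

```python
import asyncio
import logging

def make_logging_callback(label: str):
    """Return a done callback that logs success or failure for a task."""
    def _callback(task: asyncio.Task) -> None:
        exc = task.exception()  # None when the task succeeded
        if exc is not None:
            logging.error("%s failed: %s", label, str(exc))
        else:
            logging.info("%s completed successfully", label)
    return _callback

async def flaky(fail: bool) -> None:
    if fail:
        raise RuntimeError("boom")

async def main():
    ok = asyncio.create_task(flaky(False))
    ok.add_done_callback(make_logging_callback("Query memory save"))
    bad = asyncio.create_task(flaky(True))
    bad.add_done_callback(make_logging_callback("Memory cleanup"))
    # return_exceptions=True keeps the failed task from aborting the gather
    return await asyncio.gather(ok, bad, return_exceptions=True)

results = asyncio.run(main())
```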


632-637: Run memory cleanup periodically, not per request.

Comment says weekly but code triggers every call; add simple rate-limit/scheduler.

-            # Clean old memory in background (once per week cleanup)
-            clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
-            clean_memory_task.add_done_callback(
-                lambda t: logging.error(f"Memory cleanup failed: {t.exception()}") 
-                if t.exception() else logging.info("Memory cleanup completed successfully")
-            )
+            # Clean memory occasionally; TODO: move to scheduler/startup task
+            import random
+            if random.random() < 0.01:
+                clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+                clean_memory_task.add_done_callback(
+                    lambda t: logging.error(f"Memory cleanup failed: {str(t.exception())}") 
+                    if t.exception() else logging.info("Memory cleanup completed successfully")
+                )
api/memory/graphiti_tool.py (7)

32-36: Sanitize user_id before composing DB name.

Prevents invalid identifiers and abuse.

-        user_memory_db = f"{user_id}-memory"
+        import re
+        def _sanitize(s: str) -> str:
+            return re.sub(r"[^A-Za-z0-9_-]", "-", s)[:64] or "user"
+        user_memory_db = f"{_sanitize(user_id)}-memory"

320-331: Fix docstring: remove non-existent query arg and clarify return.

-    async def search_user_summary(self, limit: int = 5) -> str:
-        """
-        Search for user node summary extracts the user's personal information and general preferences.
-        
-        Args:
-            query: Natural language query to search for
-            limit: Maximum number of results to return
-            
-        Returns:
-            List of user node summaries with metadata
-        """
+    async def search_user_summary(self, limit: int = 5) -> str:
+        """
+        Retrieve summary text for the current user node.
+
+        Args:
+            limit: Max results to inspect.
+
+        Returns:
+            A summary string; empty if none.
+        """

419-431: Return type docstring mismatch.

Function returns a string; update docstring accordingly.

-        Returns:
-            Dict containing user_summary, database_facts, similar_queries, and memory_context
+        Returns:
+            memory_context: a single string composed of user, database, and similar-query context

591-604: Centralize model names via Config to avoid drift.

Remove hardcoded "gpt-4.1"/"text-embedding-ada-002" and read from api.config.Config.

 class AzureOpenAIConfig:
@@
-        # Set the model name as requested
-        os.environ["MODEL_NAME"] = "gpt-4.1"
-        
         self.api_key = os.getenv('AZURE_API_KEY')
-        self.endpoint = os.getenv('AZURE_API_BASE') 
+        self.endpoint = os.getenv('AZURE_API_BASE')
         self.api_version = os.getenv('AZURE_API_VERSION', '2024-02-01')
-        self.model_choice = "gpt-4.1"  # Use the model name directly
-        self.embedding_model = "text-embedding-ada-002"  # Use model name, not deployment
+        # Derive names from Config (strip optional provider prefix like "azure/")
+        self.model_choice = Config.COMPLETION_MODEL.split("/", 1)[-1]
+        self.embedding_model = Config.EMBEDDING_MODEL_NAME.split("/", 1)[-1]
         self.small_model = os.getenv('AZURE_SMALL_MODEL', 'gpt-4o-mini')
@@
-        small_model=config.small_model_deployment,
-        model=config.llm_deployment,
+        small_model=config.small_model_deployment,
+        model=config.llm_deployment,
@@
-            config=OpenAIEmbedderConfig(embedding_model=config.embedding_deployment),
+            config=OpenAIEmbedderConfig(embedding_model=config.embedding_deployment),

Also applies to: 645-648, 655-663


46-55: Make initialization idempotent and ensure User node exists.

Create user+db nodes together and guard vector index creation.

     async def create(cls, user_id: str, graph_id: str) -> "MemoryTool":
@@
-        await self._ensure_database_node(graph_id, user_id)
-
-        vector_size = Config.EMBEDDING_MODEL.get_vector_size()
-        driver = self.graphiti_client.driver
-        await driver.execute_query(f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}")
+        await asyncio.gather(
+            self._ensure_user_node(user_id),
+            self._ensure_database_node(graph_id, user_id),
+        )
+        vector_size = Config.EMBEDDING_MODEL.get_vector_size()
+        driver = self.graphiti_client.driver
+        try:
+            await driver.execute_query(
+                f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) "
+                f"OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}"
+            )
+        except Exception as e:
+            if "already exists" not in str(e).lower():
+                raise

571-576: Don't block the event loop: use the async LLM call.

Switch to acompletion in async context.

-from litellm import completion
+from litellm import acompletion
@@
-            response = completion(
+            response = await acompletion(
                 model=self.config.COMPLETION_MODEL,
                 messages=[{"role": "user", "content": prompt}],
                 temperature=0.1
             )

166-254: Cypher injection and duplicate-check bugs in save_query_memory.

Manual escaping and f-strings are unsafe; the code also mishandles the execute_query result tuple and leaves an unused variable. Parameterize the query and normalize the relationship type.

@@
-            relationship_type = "SUCCESS" if success else "FAILED"
-            
-            # Escape quotes in the query and SQL for Cypher
-            escaped_query = query.replace("'", "\\'").replace('"', '\\"')
-            escaped_sql = sql_query.replace("'", "\\'").replace('"', '\\"')
-            escaped_error = error.replace("'", "\\'").replace('"', '\\"') if error else ""
-            embeddings = Config.EMBEDDING_MODEL.embed(escaped_query)[0]
+            relationship_type = "SUCCESS" if success else "FAILED"
+            embeddings = Config.EMBEDDING_MODEL.embed(query)[0]
@@
-            check_query = f"""
-            MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-            MATCH (db)-[r]->(q:Query)
-            WHERE q.user_query = "{escaped_query}" AND q.sql_query = "{escaped_sql}"
-            RETURN q.uuid as existing_query_uuid
-            LIMIT 1
-            """
+            check_query = """
+            MATCH (db:Entity {uuid: $db_uuid})-[]->(q:Query)
+            WHERE q.user_query = $user_query AND q.sql_query = $sql_query
+            RETURN q.uuid as existing_query_uuid
+            LIMIT 1
+            """
@@
-            check_result = await graph_driver.execute_query(check_query)
-            
-            # If query already exists, don't create a duplicate
-            if check_result[0]:  # If records exist
+            records, _, _ = await graph_driver.execute_query(
+                check_query,
+                db_uuid=database_node_uuid,
+                user_query=query,
+                sql_query=sql_query,
+            )
+            if records:
                 print(f"Query with same user_query and sql_query already exists, skipping creation")
                 return True
@@
-            cypher_query = f"""
-            MATCH (db:Entity {{uuid: "{database_node_uuid}"}})
-            MERGE (q:Query {{
-                user_query: "{escaped_query}",
-                sql_query: "{escaped_sql}",
-                success: {str(success).lower()},
-                error: "{escaped_error}",
-                timestamp: timestamp(),
-                embeddings: vecf32($embedding)
-            }})
-            CREATE (db)-[:{relationship_type} {{timestamp: timestamp()}}]->(q)
-            RETURN q.uuid as query_uuid
-            """
+            cypher_query = """
+            MATCH (db:Entity {uuid: $db_uuid})
+            MERGE (q:Query {
+                user_query: $user_query,
+                sql_query: $sql_query,
+                success: $success,
+                error: $error,
+                timestamp: timestamp(),
+                embeddings: vecf32($embedding)
+            })
+            CREATE (db)-[:REL {type: $rel_type, timestamp: timestamp()}]->(q)
+            RETURN q.uuid as query_uuid
+            """
@@
-                result = await graph_driver.execute_query(cypher_query, embedding=embeddings)
-                return True
+                await graph_driver.execute_query(
+                    cypher_query,
+                    db_uuid=database_node_uuid,
+                    user_query=query,
+                    sql_query=sql_query,
+                    success=bool(success),
+                    error=error or "",
+                    rel_type=relationship_type,
+                    embedding=embeddings,
+                )
+                return True
🧹 Nitpick comments (3)
api/routes/graphs.py (2)

391-395: Normalize error handling to Optional[str] and avoid bool/string mixing.

Use None as the sentinel and keep a consistent type; this simplifies truthiness checks and serialization.

@@
-            # Initialize response variables
-            user_readable_response = ""
-            follow_up_result = ""
-            execution_error = False
+            # Initialize response variables
+            user_readable_response = ""
+            follow_up_result = ""
+            execution_error: Optional[str] = None
@@
-                except Exception as e:
-                    execution_error = str(e)
+                except Exception as e:
+                    execution_error = str(e)
@@
-                    ) + MESSAGE_DELIMITER
+                    ) + MESSAGE_DELIMITER
@@
-            else:
-                execution_error = "Missing information"
+            else:
+                execution_error = "Missing information"
@@
-            if execution_error:
+            if execution_error:
                 full_response["error"] = execution_error
                 full_response["success"] = False
             else:
                 full_response["success"] = True
@@
-                memory_tool.save_query_memory(
+                memory_tool.save_query_memory(
                     query=queries_history[-1],
                     sql_query=answer_an["sql_query"],
                     success=full_response["success"],
-                    error=execution_error
+                    error=execution_error
                 )

Add import (top of file):

from typing import Optional

Also applies to: 559-559, 567-568, 571-577, 605-611, 614-620


561-561: Prefer logging.exception for errors with stack traces.

Improves observability and aligns with linter hint TRY400.

-                    logging.error("Error executing SQL query: %s", str(e))
+                    logging.exception("Error executing SQL query")
@@
-                logging.error("Error executing confirmed SQL query: %s", str(e))
+                logging.exception("Error executing confirmed SQL query")
@@
-                save_query_task.add_done_callback(
-                    lambda t: logging.error(f"Failed confirmed query memory save failed: {t.exception()}") 
+                save_query_task.add_done_callback(
+                    lambda t: logging.error(f"Failed confirmed query memory save failed: {str(t.exception())}") 

Also applies to: 772-772, 784-784

api/memory/graphiti_tool.py (1)

290-305: Parameterize/guard LIMIT and clamp result size.

Avoid f-string LIMIT; enforce integer bounds.

-            cypher_query = f"""
+            cypher_query = """
                     CALL db.idx.vector.queryNodes('Query', 'embeddings', 10, vecf32($embedding))
                         YIELD node, score
                         MATCH (db:Entity {{uuid: $database_node_uuid}})-[r]->(node)
                         RETURN node {
                             .user_query,
                             .sql_query,
                             .success,
                             .error
                         } AS query,
                         score
                         ORDER BY score ASC
-                        LIMIT {limit}
+                        LIMIT $limit
             """
@@
-                records, header, _  = await graph_driver.execute_query(cypher_query, embedding=query_embedding, database_node_uuid=database_node_uuid)
+                records, header, _  = await graph_driver.execute_query(
+                    cypher_query,
+                    embedding=query_embedding,
+                    database_node_uuid=database_node_uuid,
+                    limit=int(max(0, min(limit, 50))),
+                )

Also applies to: 308-314

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between c9f75c9 and 70340ba.

⛔ Files ignored due to path filters (1)
  • Pipfile.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • Pipfile (1 hunks)
  • api/memory/graphiti_tool.py (1 hunks)
  • api/routes/graphs.py (17 hunks)
  • app/ts/modules/chat.ts (1 hunks)
  • app/ts/modules/messages.ts (0 hunks)
💤 Files with no reviewable changes (1)
  • app/ts/modules/messages.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • Pipfile
  • app/ts/modules/chat.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Adhere to pylint standards across all Python files (repository uses make lint)

Files:

  • api/routes/graphs.py
  • api/memory/graphiti_tool.py
🧠 Learnings (1)
📚 Learning: 2025-08-24T17:15:21.337Z
Learnt from: CR
PR: FalkorDB/QueryWeaver#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-08-24T17:15:21.337Z
Learning: Applies to api/config.py : Define AI model configuration and system prompts in `api/config.py` rather than scattering them across modules

Applied to files:

  • api/memory/graphiti_tool.py
🧬 Code graph analysis (2)
api/routes/graphs.py (4)
api/agents/analysis_agent.py (2)
  • AnalysisAgent (9-241)
  • get_analysis (14-50)
api/agents/follow_up_agent.py (2)
  • FollowUpAgent (33-82)
  • generate_follow_up_question (36-82)
api/config.py (1)
  • Config (47-130)
api/memory/graphiti_tool.py (6)
  • MemoryTool (29-588)
  • create (47-56)
  • search_memories (419-492)
  • save_query_memory (166-253)
  • add_new_memory (132-164)
  • clean_memory (494-517)
api/memory/graphiti_tool.py (1)
api/config.py (3)
  • Config (47-130)
  • get_vector_size (33-43)
  • embed (18-31)
🪛 Ruff (0.12.2)
api/routes/graphs.py

561-561: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


771-771: Do not catch blind exception: Exception

(BLE001)


772-772: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


784-784: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

api/memory/graphiti_tool.py

89-89: Consider moving this statement to an else block

(TRY300)


91-91: Do not catch blind exception: Exception

(BLE001)


126-126: Consider moving this statement to an else block

(TRY300)


128-128: Do not catch blind exception: Exception

(BLE001)


160-160: Do not catch blind exception: Exception

(BLE001)


225-225: f-string without any placeholders

Remove extraneous f prefix

(F541)


245-245: Local variable result is assigned to but never used

Remove assignment to unused variable result

(F841)


246-246: Consider moving this statement to an else block

(TRY300)


247-247: Do not catch blind exception: Exception

(BLE001)


251-251: Do not catch blind exception: Exception

(BLE001)


310-310: Consider moving this statement to an else block

(TRY300)


312-312: Do not catch blind exception: Exception

(BLE001)


316-316: Do not catch blind exception: Exception

(BLE001)


346-346: Consider moving this statement to an else block

(TRY300)


348-348: Do not catch blind exception: Exception

(BLE001)


415-415: Do not catch blind exception: Exception

(BLE001)


484-484: f-string without any placeholders

Remove extraneous f prefix

(F541)


488-488: Consider moving this statement to an else block

(TRY300)


490-490: Do not catch blind exception: Exception

(BLE001)


514-514: Consider moving this statement to an else block

(TRY300)


515-515: Do not catch blind exception: Exception

(BLE001)


580-582: Consider moving this statement to an else block

(TRY300)


584-584: Do not catch blind exception: Exception

(BLE001)


620-620: Avoid specifying long messages outside the exception class

(TRY003)


622-622: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e-tests
  • GitHub Check: e2e-tests
🔇 Additional comments (3)
api/routes/graphs.py (2)

297-307: Short-term memory truncation logic looks correct.

History/results are trimmed consistently with Config.SHORT_MEMORY_LENGTH.
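For readers outside the diff, the trimming pattern being approved looks roughly like this. A standalone sketch: `SHORT_MEMORY_LENGTH` stands in for `Config.SHORT_MEMORY_LENGTH`, and the variable names are assumptions from the PR context, not the actual implementation.

```python
SHORT_MEMORY_LENGTH = 3  # stand-in for Config.SHORT_MEMORY_LENGTH

def truncate_short_term_memory(queries_history: list, result_history: list):
    """Keep the last N queries and the N-1 results that pair with them.

    The newest query has no result yet, so results always lag queries by one.
    """
    queries = queries_history[-SHORT_MEMORY_LENGTH:]
    results = result_history[-(SHORT_MEMORY_LENGTH - 1):] if SHORT_MEMORY_LENGTH > 1 else []
    return queries, results

queries, results = truncate_short_term_memory(
    ["q1", "q2", "q3", "q4"], ["r1", "r2", "r3"]
)
# queries == ["q2", "q3", "q4"]; results == ["r2", "r3"] — q4 is still pending
```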


310-311: Good: warm up MemoryTool early.

Creating the tool concurrently hides its latency before first use.
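The warm-up pattern, reduced to a self-contained sketch. The coroutine body is a placeholder: the real `MemoryTool.create` does network I/O, which is exactly why starting it early pays off.

```python
import asyncio

async def create_memory_tool(user_id: str, graph_id: str) -> dict:
    # placeholder for MemoryTool.create(); the real version opens connections
    await asyncio.sleep(0.05)
    return {"user": user_id, "graph": graph_id}

async def handle_request(user_id: str, graph_id: str) -> dict:
    # kick off creation immediately so its latency overlaps the other steps
    tool_task = asyncio.create_task(create_memory_tool(user_id, graph_id))
    await asyncio.sleep(0.05)  # stands in for relevancy check / prompt building
    return await tool_task  # usually already resolved by the time it is needed

tool = asyncio.run(handle_request("user-1", "graph-1"))
```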

api/memory/graphiti_tool.py (1)

498-515: Cleanup query looks good.

Correct ordering (DESC) and parameters; safe fallback return.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
api/routes/graphs.py (3)

475-485: Avoid blocking the event loop on DB execution

execute_sql_query appears synchronous; calling it inside the async generator can block the FastAPI event loop. Offload to a thread.

-                    is_schema_modifying, operation_type = (
-                        loader_class.is_schema_modifying_query(sql_query)
-                    )
-
-                    query_results = loader_class.execute_sql_query(answer_an["sql_query"], db_url)
+                    is_schema_modifying, operation_type = loader_class.is_schema_modifying_query(sql_query)
+                    query_results = await asyncio.to_thread(
+                        loader_class.execute_sql_query, answer_an["sql_query"], db_url
+                    )

51-60: Fix mutable default in Pydantic model

chat: list = [] creates shared state across requests. Use Field(default_factory=list).

-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
 class ConfirmRequest(BaseModel):
@@
-    chat: list = []
+    chat: list = Field(default_factory=list)

684-689: Stream consistency: include final_response flags in confirm route

Align with the main route and frontend expectations by flagging interim vs final messages.

-                if not loader_class:
-                    yield json.dumps({
-                        "type": "error",
-                        "message": "Unable to determine database type"
-                    }) + MESSAGE_DELIMITER
+                if not loader_class:
+                    yield json.dumps({
+                        "type": "error",
+                        "final_response": True,
+                        "message": "Unable to determine database type"
+                    }) + MESSAGE_DELIMITER
                     return
@@
-                step = {"type": "reasoning_step",
-                       "message": "Step 2: Executing confirmed SQL query"}
+                step = {"type": "reasoning_step",
+                        "final_response": False,
+                        "message": "Step 2: Executing confirmed SQL query"}
                 yield json.dumps(step) + MESSAGE_DELIMITER
@@
-                yield json.dumps(
+                yield json.dumps(
                     {
                         "type": "query_result",
                         "data": query_results,
+                        "final_response": False
                     }
                 ) + MESSAGE_DELIMITER
@@
-                    step = {"type": "reasoning_step",
-                           "message": "Step 3: Schema change detected - refreshing graph..."}
+                    step = {"type": "reasoning_step",
+                            "final_response": False,
+                            "message": "Step 3: Schema change detected - refreshing graph..."}
                     yield json.dumps(step) + MESSAGE_DELIMITER
@@
-                        yield json.dumps(
+                        yield json.dumps(
                             {
                                 "type": "schema_refresh",
+                                "final_response": False,
                                 "message": (f"✅ Schema change detected ({operation_type} "
                                           "operation)\n\n🔄 Graph schema has been automatically "
                                           "refreshed with the latest database structure."),
                                 "refresh_status": "success"
                             }
                         ) + MESSAGE_DELIMITER
@@
-                        yield json.dumps(
+                        yield json.dumps(
                             {
                                 "type": "schema_refresh",
+                                "final_response": False,
                                 "message": (f"⚠️ Schema was modified but graph refresh failed: "
                                           f"{refresh_message}"),
                                 "refresh_status": "failed"
                             }
                         ) + MESSAGE_DELIMITER
@@
-                yield json.dumps(
+                yield json.dumps(
                     {
                         "type": "ai_response",
+                        "final_response": True,
                         "message": user_readable_response,
                     }
                 ) + MESSAGE_DELIMITER
@@
-                yield json.dumps(
-                    {"type": "error", "message": "Error executing query"}
+                yield json.dumps(
+                    {"type": "error", "final_response": True, "message": "Error executing query"}
                 ) + MESSAGE_DELIMITER

Also applies to: 691-694, 700-705, 717-727, 728-735, 751-756, 789-791

♻️ Duplicate comments (3)
api/routes/graphs.py (3)

391-395: Normalize error sentinel to Optional[str] (follow prior suggestion)

Use None instead of False for “no error” to avoid type ambiguity.

-            execution_error = False
+            execution_error: Optional[str] = None

Additional import needed:

from typing import Optional

633-639: Throttle memory cleanup (weekly) or move to a scheduler

Cleanup currently runs on every request, contradicting the comment and adding overhead. Gate by time or probability, or schedule at app startup.

Example (module-level throttle):

# at module top
_CLEAN_LAST_RUN = 0
_CLEAN_LOCK = asyncio.Lock()
_CLEAN_INTERVAL_SECS = 7 * 24 * 3600
-            # Clean old memory in background (once per week cleanup)
-            clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
-            clean_memory_task.add_done_callback(
-                lambda t: logging.error(f"Memory cleanup failed: {t.exception()}") 
-                if t.exception() else logging.info("Memory cleanup completed successfully")
-            )
+            # Clean old memory occasionally (approx weekly)
+            now = time.time()
+            async with _CLEAN_LOCK:
+                if now - _CLEAN_LAST_RUN >= _CLEAN_INTERVAL_SECS:
+                    _CLEAN_LAST_RUN = now
+                    clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+                    clean_memory_task.add_done_callback(
+                        lambda t: logging.error(f"Memory cleanup failed: {t.exception()}") 
+                        if t.exception() else logging.info("Memory cleanup completed successfully")
+                    )

297-307: Remove stray extra blank line after truncation block

Minor style cleanup to keep linters happy.

-                result_history = []
-
+                result_history = []
🧹 Nitpick comments (7)
api/routes/graphs.py (7)

310-311: Create MemoryTool only on on-topic path to avoid wasted work/leaks

The task is created before relevancy is known; on off-topic it’s never awaited/cancelled. Instantiate only when needed.

-    memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))
+    # Defer MemoryTool creation until query is confirmed on-topic
-            memory_tool = await memory_tool_task
+            memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
             memory_context = await memory_tool.search_memories(
                 query=queries_history[-1]
             )

Also applies to: 381-385


559-566: Log stack trace for SQL execution failures

Prefer logging.exception to capture traceback.

-                except Exception as e:
-                    execution_error = str(e)
+                except Exception as e:
+                    execution_error = str(e)
                     overall_elapsed = time.perf_counter() - overall_start
-                    logging.error("Error executing SQL query: %s", str(e))
+                    logging.exception("Error executing SQL query")

89-93: Docstring mismatch with implementation

The function does not wrap with repr(); either update docstring or implement repr wrapping. Recommend updating docstring for accuracy.

-    Sanitize input for safe logging—remove newlines, 
-    carriage returns, tabs, and wrap in repr().
+    Sanitize input for safe logging — remove newlines, carriage returns, and tabs.

77-83: Broaden DB URL detection for MySQL variants

Support common SQLAlchemy-style prefixes.

-    elif db_url_lower.startswith('mysql://'):
+    elif db_url_lower.startswith(('mysql://', 'mysql+pymysql://', 'mysql+asyncmy://')):
         return 'mysql', MySQLLoader

364-371: Optional: generate a conversational off-topic reply

When off-topic, consider using FollowUpAgent to propose helpful rephrasing/questions instead of only echoing the reason.


411-411: Nit: fix spelling

“postgress” → “Postgres”.

-            # If the SQL query is valid, execute it using the postgress database db_url
+            # If the SQL query is valid, execute it using the Postgres database db_url

666-671: Security: confirm route executes client-sent SQL

The confirm endpoint executes sql_query from the request body. Bind this to a server-issued token or hash of the previously generated SQL to prevent arbitrary injection via the confirmation call.

  • Store the generated SQL (and a short-lived token) server-side when you emit the destructive_confirmation.
  • In confirm route, validate the token and ensure the SQL matches what was issued before executing.

Also applies to: 699-700
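One way to implement the bind-to-issued-SQL idea is an HMAC token over the generated query. A hedged sketch, not the PR's code: the signing key handling, TTL, and function names are illustrative.

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # per-process signing key (assumption)
TOKEN_TTL = 300  # seconds a confirmation stays valid

def issue_confirmation_token(sql_query: str) -> str:
    """Issue a short-lived token bound to the exact SQL the server generated."""
    issued_at = str(int(time.time()))
    digest = hmac.new(SECRET, f"{issued_at}:{sql_query}".encode(), hashlib.sha256).hexdigest()
    return f"{issued_at}.{digest}"

def verify_confirmation_token(token: str, sql_query: str) -> bool:
    """Accept only if the token matches this SQL and has not expired."""
    try:
        issued_at, digest = token.split(".", 1)
        age = time.time() - int(issued_at)
    except ValueError:
        return False
    if age > TOKEN_TTL:
        return False
    expected = hmac.new(SECRET, f"{issued_at}:{sql_query}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(digest, expected)

token = issue_confirmation_token("DROP TABLE users;")
```

The server would emit `token` alongside the `destructive_confirmation` event and require it back in the confirm route before executing anything.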

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 70340ba and ef2ba17.

📒 Files selected for processing (1)
  • api/routes/graphs.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Adhere to pylint standards across all Python files (repository uses make lint)

Files:

  • api/routes/graphs.py
🧬 Code graph analysis (1)
api/routes/graphs.py (7)
api/agents/analysis_agent.py (1)
  • get_analysis (14-50)
api/agents/response_formatter_agent.py (1)
  • ResponseFormatterAgent (42-135)
api/agents/follow_up_agent.py (2)
  • FollowUpAgent (33-82)
  • generate_follow_up_question (36-82)
api/auth/user_management.py (1)
  • token_required (254-294)
api/config.py (1)
  • Config (47-130)
api/graph.py (2)
  • find (241-331)
  • get_db_description (39-54)
api/memory/graphiti_tool.py (6)
  • MemoryTool (29-588)
  • create (47-56)
  • search_memories (419-492)
  • save_query_memory (166-253)
  • add_new_memory (132-164)
  • clean_memory (494-517)
🪛 Ruff (0.12.2)
api/routes/graphs.py

493-493: SyntaxError: Expected ':', found '='


560-560: SyntaxError: Expected ',', found '='


561-561: SyntaxError: Expected ':', found name


561-561: SyntaxError: Expected ',', found '='


562-562: SyntaxError: Expected ':', found name


563-563: SyntaxError: Expected ',', found name


567-567: SyntaxError: Expected ':', found 'yield'


570-570: SyntaxError: Expected ',', found 'else'


570-570: SyntaxError: Expected ',', found ':'


571-571: SyntaxError: Expected ':', found '='


573-573: SyntaxError: Expected ':', found name


573-573: SyntaxError: Expected ',', found '='


580-580: SyntaxError: Expected ':', found 'yield'


588-588: SyntaxError: Expected ',', found name


588-588: SyntaxError: Expected ':', found '='


589-589: SyntaxError: Expected ':', found name


596-596: SyntaxError: Expected ',', found name


596-596: SyntaxError: Expected ':', found '='


599-599: SyntaxError: Expected ':', found name


599-599: SyntaxError: Expected ',', found '='


606-606: SyntaxError: Expected 'else', found ':'


607-607: SyntaxError: Expected ',', found '='


608-608: SyntaxError: Expected ':', found name


608-608: SyntaxError: Expected ',', found '='


609-609: SyntaxError: Expected ':', found 'else'


609-609: SyntaxError: Expected ',', found ':'


610-610: SyntaxError: Expected ':', found '='


614-614: SyntaxError: Expected ':', found name


614-614: SyntaxError: Expected ',', found '='


622-622: SyntaxError: Expected ':', found name


628-628: SyntaxError: Expected ',', found name


628-628: SyntaxError: Expected ':', found '='


630-630: SyntaxError: Expected ':', found name


631-631: SyntaxError: Expected ',', found name


634-634: SyntaxError: Expected ':', found name


634-634: SyntaxError: Expected ',', found '='


635-635: SyntaxError: Expected ':', found name

🪛 GitHub Actions: Pylint
api/routes/graphs.py

[error] 492-492: Pylint E0001: Parsing failed: '{' was never closed

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: e2e-tests


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/routes/graphs.py (1)

59-59: Fix mutable default in ConfirmRequest.chat

Using a mutable default list can cause cross-request state leaks. Make it Optional and guard at use-site.

 class ConfirmRequest(BaseModel):
@@
-    chat: list = []
+    chat: Optional[List[Any]] = None

Also handle None when reading:

-    queries_history = confirm_data.chat if hasattr(confirm_data, 'chat') else []
+    queries_history = (confirm_data.chat or []) if hasattr(confirm_data, 'chat') else []

Add missing imports near the top of the file:

from typing import Any, List, Optional

Also applies to: 666-667

♻️ Duplicate comments (2)
api/routes/graphs.py (2)

391-395: Normalize error handling: keep Exception, log stacktrace, serialize lazily

Use an Exception sentinel (None when no error), log with logging.exception, and convert to str only when persisting/serializing. This aligns with Ruff TRY400 and preserves type info.

-            user_readable_response = ""
-            follow_up_result = ""
-            execution_error = False
+            user_readable_response = ""
+            follow_up_result = ""
+            execution_error: Optional[Exception] = None
@@
-                except Exception as e:
-                    execution_error = str(e)
+                except Exception as e:
+                    execution_error = e
                     overall_elapsed = time.perf_counter() - overall_start
-                    logging.error("Error executing SQL query: %s", str(e))
+                    logging.exception("Error executing SQL query")
@@
-            else:
-                execution_error = "Missing information"
+            else:
+                execution_error = ValueError("Missing information")
                 # SQL query is not valid/translatable - generate follow-up questions
                 follow_up_result = follow_up_agent.generate_follow_up_question(
                     user_question=queries_history[-1],
                     analysis_result=answer_an,
                     found_tables=result
                 )
@@
-            if execution_error:
-                full_response["error"] = execution_error
+            if execution_error:
+                full_response["error"] = str(execution_error)
                 full_response["success"] = False
             else:
                 full_response["success"] = True
@@
                 memory_tool.save_query_memory(
                     query=queries_history[-1],
                     sql_query=answer_an["sql_query"],
                     success=full_response["success"],
-                    error=execution_error
+                    error=str(execution_error) if execution_error else None
                 )

Also applies to: 558-561, 570-576, 605-610, 613-620


621-624: Harden background task callbacks and throttle cleanup

  • Convert exceptions to strings in callbacks to avoid None formatting issues.
  • Cleanup runs on every request; throttle until a scheduler is added.
 save_query_task.add_done_callback(
-    lambda t: logging.error(f"Query memory save failed: {t.exception()}") 
-    if t.exception() else logging.info("Query memory saved successfully")
+    lambda t: logging.error("Query memory save failed: %s", str(t.exception()))
+    if t.exception() else logging.info("Query memory saved successfully")
 )
@@
 save_task.add_done_callback(
-    lambda t: logging.error(f"Memory save failed: {t.exception()}") if t.exception() else logging.info("Conversation saved to memory tool")
+    lambda t: logging.error("Memory save failed: %s", str(t.exception()))
+    if t.exception() else logging.info("Conversation saved to memory tool")
 )
@@
-# Clean old memory in background (once per week cleanup)
-clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
-clean_memory_task.add_done_callback(
-    lambda t: logging.error(f"Memory cleanup failed: {t.exception()}") 
-    if t.exception() else logging.info("Memory cleanup completed successfully")
-)
+# Clean old memory in background (run occasionally, not on every request)
+import random
+if random.random() < 0.01:  # ~1% of requests
+    clean_memory_task = asyncio.create_task(memory_tool.clean_memory())
+    clean_memory_task.add_done_callback(
+        lambda t: logging.error("Memory cleanup failed: %s", str(t.exception()))
+        if t.exception() else logging.info("Memory cleanup completed successfully")
+    )

Also applies to: 627-631, 632-637

🧹 Nitpick comments (4)
api/routes/graphs.py (4)

364-371: Consider generating a follow-up suggestion even for off-topic queries

You already instantiate FollowUpAgent. Using it here can return a helpful redirect question rather than only the reason.

If you want, I can wire a minimal prompt that turns the relevancy reason into a clarifying question.
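As a rough shape for that wiring: a pure helper that turns the relevancy-check reason plus suggestions into a redirect message. The function and its inputs are hypothetical; the real flow would presumably delegate to `FollowUpAgent`.

```python
def off_topic_reply(reason: str, suggestions: list[str]) -> str:
    """Turn a relevancy-check reason into a friendly redirect message."""
    lines = [f"That question is outside this database's scope: {reason}"]
    if suggestions:
        lines.append("Here are some questions I can help with:")
        lines.extend(f"- {s}" for s in suggestions)
    return "\n".join(lines)

msg = off_topic_reply(
    "the schema has no weather data",
    ["Which customers placed the most orders?", "What were last month's totals?"],
)
```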


690-692: Make streaming payloads consistent: include final_response flags in confirm flow

Downstream consumers rely on final_response. Add False for mid-stream steps and True for terminal events.

-                step = {"type": "reasoning_step",
-                       "message": "Step 2: Executing confirmed SQL query"}
+                step = {"type": "reasoning_step",
+                        "final_response": False,
+                        "message": "Step 2: Executing confirmed SQL query"}
@@
-                yield json.dumps(
-                    {
-                        "type": "query_result",
-                        "data": query_results,
-                    }
-                ) + MESSAGE_DELIMITER
+                yield json.dumps(
+                    {
+                        "type": "query_result",
+                        "data": query_results,
+                        "final_response": False
+                    }
+                ) + MESSAGE_DELIMITER
@@
-                    step = {"type": "reasoning_step",
-                           "message": "Step 3: Schema change detected - refreshing graph..."}
+                    step = {"type": "reasoning_step",
+                            "final_response": False,
+                            "message": "Step 3: Schema change detected - refreshing graph..."}
@@
-                        yield json.dumps(
-                            {
-                                "type": "schema_refresh",
-                                "message": (f"✅ Schema change detected ({operation_type} "
-                                          "operation)\n\n🔄 Graph schema has been automatically "
-                                          "refreshed with the latest database structure."),
-                                "refresh_status": "success"
-                            }
-                        ) + MESSAGE_DELIMITER
+                        yield json.dumps(
+                            {
+                                "type": "schema_refresh",
+                                "message": (f"✅ Schema change detected ({operation_type} "
+                                            "operation)\n\n🔄 Graph schema has been automatically "
+                                            "refreshed with the latest database structure."),
+                                "refresh_status": "success",
+                                "final_response": False
+                            }
+                        ) + MESSAGE_DELIMITER
@@
-                        yield json.dumps(
-                            {
-                                "type": "schema_refresh",
-                                "message": (f"⚠️ Schema was modified but graph refresh failed: "
-                                          f"{refresh_message}"),
-                                "refresh_status": "failed"
-                            }
-                        ) + MESSAGE_DELIMITER
+                        yield json.dumps(
+                            {
+                                "type": "schema_refresh",
+                                "message": (f"⚠️ Schema was modified but graph refresh failed: "
+                                            f"{refresh_message}"),
+                                "refresh_status": "failed",
+                                "final_response": False
+                            }
+                        ) + MESSAGE_DELIMITER
@@
-                step = {"type": "reasoning_step",
-                       "message": f"Step {step_num}: Generating user-friendly response"}
+                step = {"type": "reasoning_step",
+                        "final_response": False,
+                        "message": f"Step {step_num}: Generating user-friendly response"}
@@
-                yield json.dumps(
-                    {
-                        "type": "ai_response",
-                        "message": user_readable_response,
-                    }
-                ) + MESSAGE_DELIMITER
+                yield json.dumps(
+                    {
+                        "type": "ai_response",
+                        "message": user_readable_response,
+                        "final_response": True
+                    }
+                ) + MESSAGE_DELIMITER
@@
-                yield json.dumps(
-                    {"type": "error", "message": "Error executing query"}
-                ) + MESSAGE_DELIMITER
+                yield json.dumps(
+                    {"type": "error", "message": "Error executing query", "final_response": True}
+                ) + MESSAGE_DELIMITER
@@
-            yield json.dumps(
-                {
-                    "type": "operation_cancelled",
-                    "message": "Operation cancelled. The destructive SQL query was not executed."
-                }
-            ) + MESSAGE_DELIMITER
+            yield json.dumps(
+                {
+                    "type": "operation_cancelled",
+                    "message": "Operation cancelled. The destructive SQL query was not executed.",
+                    "final_response": True
+                }
+            ) + MESSAGE_DELIMITER

Also applies to: 699-705, 708-711, 717-725, 727-734, 738-740, 750-755, 789-791, 793-798


772-772: Log exceptions with stack traces in confirm flow

Use logging.exception in except paths to aid debugging and satisfy Ruff TRY400.

-                logging.error("Error executing confirmed SQL query: %s", str(e))
+                logging.exception("Error executing confirmed SQL query")
@@
-                    lambda t: logging.error(f"Failed confirmed query memory save failed: {t.exception()}") 
+                    lambda t: logging.error("Failed confirmed query memory save failed: %s", str(t.exception()))

Also applies to: 784-784


673-675: Initialize MemoryTool lazily in confirm flow

Create the MemoryTool only when the operation is confirmed to avoid unnecessary I/O on cancelled requests.

-        # Create memory tool for saving query results
-        memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
-        
+        # Lazily create memory tool only when needed
@@
-        if confirmation == "CONFIRM":
+        if confirmation == "CONFIRM":
+            memory_tool = await MemoryTool.create(request.state.user_id, graph_id)

Also applies to: 676-676

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between ef2ba17 and 050a9c2.

📒 Files selected for processing (1)
  • api/routes/graphs.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Adhere to pylint standards across all Python files (repository uses make lint)

Files:

  • api/routes/graphs.py
🧬 Code graph analysis (1)
api/routes/graphs.py (5)
api/agents/analysis_agent.py (2)
  • AnalysisAgent (9-241)
  • get_analysis (14-50)
api/agents/follow_up_agent.py (2)
  • FollowUpAgent (33-82)
  • generate_follow_up_question (36-82)
api/auth/user_management.py (1)
  • token_required (254-294)
api/config.py (1)
  • Config (47-130)
api/memory/graphiti_tool.py (6)
  • MemoryTool (29-588)
  • create (47-56)
  • search_memories (419-492)
  • save_query_memory (166-253)
  • add_new_memory (132-164)
  • clean_memory (494-517)
🪛 Ruff (0.12.2)
api/routes/graphs.py

561-561: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


771-771: Do not catch blind exception: Exception

(BLE001)


772-772: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


784-784: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e-tests
  • GitHub Check: e2e-tests
🔇 Additional comments (2)
api/routes/graphs.py (2)

297-307: LGTM: short-term history truncation is correct

The slicing keeps chat/results aligned (N vs. N-1).


324-325: LGTM: consistent streaming final_response flags in main flow

Mid-stream steps are False; terminal responses are True. This prevents premature stream termination.

Also applies to: 340-340, 366-366, 460-461, 486-487, 493-496, 511-512, 522-523, 531-532, 546-546

Comment on lines +310 to +311
memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))



🛠️ Refactor suggestion

Avoid spawning an unused MemoryTool task on off-topic queries

Creating MemoryTool up-front wastes work and can leak a background task when the query is off-topic. Instantiate it only for on-topic flows.

-    memory_tool_task = asyncio.create_task(MemoryTool.create(request.state.user_id, graph_id))
@@
-            memory_tool = await memory_tool_task
+            memory_tool = await MemoryTool.create(request.state.user_id, graph_id)
             memory_context = await memory_tool.search_memories(
                 query=queries_history[-1]
             )

Also applies to: 381-384

🤖 Prompt for AI Agents
In api/routes/graphs.py around lines 310-311 (and similarly at 381-384), the
code unconditionally spawns MemoryTool.create as a background asyncio task which
wastes work and can leak when the query is off-topic; change the flow to avoid
creating the MemoryTool task up-front and instead instantiate or create the
MemoryTool only after you determine the query is on-topic (e.g., await or call
MemoryTool.create inside the branch that handles on-topic queries), removing or
deferring asyncio.create_task usage so no background task is spawned for
off-topic requests.

@gkorland gkorland merged commit c0bbe12 into main Aug 31, 2025
9 of 11 checks passed
@gkorland gkorland deleted the memory-component branch August 31, 2025 17:04
This was referenced Sep 3, 2025