Fix Embeddings #64
Pull Request Overview
This PR rebrands the GraphRAG system as DeepGitAI throughout the codebase and adds support for semantic search through repository embeddings.
- Comprehensive renaming from GraphRAG to DeepGitAI across all files
- Added embedding functionality for semantic search on README content
- Enhanced dependency management with new machine learning packages
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| start.sh | Updated command line output to show DeepGitAI instead of GraphRAG |
| src/views/GraphRAGPanel.tsx | Renamed component to DeepGitAIPanel and updated all references from GraphRAG to DeepGitAI |
| src/views/ContextPanel.tsx | Updated imports and panel references to use DeepGitAI naming |
| src/styles/_layout.scss | Updated CSS class names and styling from graphrag to deepgit-ai |
| src/styles/_graph.scss | Updated CSS selectors and class names to reflect DeepGitAI branding |
| src/lib/config.ts | Renamed API endpoints from GraphRAG to DeepGitAI format |
| environment.yml | Added ML dependencies for embeddings: sentence-transformers and torch |
| backend/requirements.txt | Added sentence-transformers and torch for embedding functionality |
| backend/app/services/deepgit_ai_service.py | Major service rename with added embedding support and semantic search capabilities |
| backend/app/main.py | Updated all Flask endpoints and variable names from GraphRAG to DeepGitAI |
| backend/app/config_manager.py | Updated configuration keys from graphrag to deepgit_ai |
| backend/app/config.example.json | Updated example configuration with DeepGitAI settings |
Comments suppressed due to low confidence (1)
backend/app/services/deepgit_ai_service.py:1399
The query is missing the embedding fields that are now part of the data structure. It should also set r.readme_embedding = $embedding and r.embedding_provider = $embedding_provider to match the data being processed in the loop.
MERGE (r:Repository {id: $repo_id})
SET r.readme_content = $content, r.readme_length = $length
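A minimal sketch of what the extended query could look like, assuming the write goes through a parameterized statement. The helper name `build_readme_upsert` is hypothetical, and the placeholder names `$embedding` and `$embedding_provider` follow the comment above; the actual code at line 1399 is not shown on this page.

```python
from typing import Any, Optional, Sequence

# Hypothetical query text: the MERGE/SET shape comes from the review comment,
# with the two embedding properties added to the SET clause.
README_UPSERT_QUERY = """
MERGE (r:Repository {id: $repo_id})
SET r.readme_content = $content,
    r.readme_length = $length,
    r.readme_embedding = $embedding,
    r.embedding_provider = $embedding_provider
"""


def build_readme_upsert(
    repo_id: str,
    content: str,
    embedding: Optional[Sequence[float]] = None,
    embedding_provider: Optional[str] = None,
) -> tuple[str, dict[str, Any]]:
    """Return the query and a parameter dict covering every $placeholder."""
    params = {
        "repo_id": repo_id,
        "content": content,
        "length": len(content),
        # Copy to a plain list so any sequence type can be passed in.
        "embedding": list(embedding) if embedding is not None else None,
        "embedding_provider": embedding_provider,
    }
    return README_UPSERT_QUERY, params
```

Keeping the query and its parameters in one place makes it harder for a later copy of the statement, such as a retry path, to drift out of sync with the data being written.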
  is_archived, language_count, pull_requests, issues,
  primary_language, created_at_year, license_info, topics,
- contributors, stargazers, '', 0 # readme_content, readme_length
+ contributors, stargazers, '', 0, None, None # readme_content, readme_length, readme_embedding, embedding_provider
Copilot
AI
Sep 11, 2025
The inline comment with field descriptions should be updated to match the actual parameter order and names. The trailing fields appear to be readme_content, readme_length, readme_embedding, and embedding_provider, but a single end-of-line comment makes it hard to tell which value maps to which field.
Suggested change:
- contributors, stargazers, '', 0, None, None # readme_content, readme_length, readme_embedding, embedding_provider
+ contributors, stargazers, # up to stargazers
+ '', # readme_content
+ 0, # readme_length
+ None, # readme_embedding
+ None # embedding_provider
conn.execute("ALTER TABLE Repository ADD COLUMN readme_embedding DOUBLE[]")
conn.execute("ALTER TABLE Repository ADD COLUMN embedding_provider STRING")
Copilot
AI
Sep 11, 2025
These ALTER TABLE statements are executed inside the exception handler but the retry query at line 1412 is still missing the embedding fields. The retry should include both readme_embedding and embedding_provider in the SET clause.
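One way to keep the retry consistent with the first attempt is to reuse a single query for both, sketched below. The function name `update_with_retry`, the `execute` callable, and the query text are hypothetical stand-ins, and the sketch assumes any failure of the first attempt means the embedding columns are missing, which the real handler may check more carefully.

```python
from typing import Any, Callable, Mapping

# The ALTER TABLE text is taken from the diff above.
ADD_COLUMN_STATEMENTS = (
    "ALTER TABLE Repository ADD COLUMN readme_embedding DOUBLE[]",
    "ALTER TABLE Repository ADD COLUMN embedding_provider STRING",
)

# Hypothetical full update, including both embedding fields in the SET clause.
FULL_UPDATE_QUERY = (
    "MERGE (r:Repository {id: $repo_id}) "
    "SET r.readme_content = $content, r.readme_length = $length, "
    "r.readme_embedding = $embedding, r.embedding_provider = $embedding_provider"
)


def update_with_retry(execute: Callable[..., Any], params: Mapping[str, Any]) -> None:
    """Run the full update; on failure add the columns and retry the same query."""
    try:
        execute(FULL_UPDATE_QUERY, params)
    except Exception:
        # Assumption: the failure means the embedding columns do not exist yet.
        for statement in ADD_COLUMN_STATEMENTS:
            execute(statement)
        # Retry with the same query so the embedding fields are not dropped.
        execute(FULL_UPDATE_QUERY, params)
```

In the service, `execute` would presumably be the connection's own execute method; here it is left as a plain callable so the flow can be exercised without a database.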
  return (
- <div className="d-flex flex-column h-100 graphrag-chat">
+ <div className="d-flex flex-column h-100 deepgit-ai-chat" style={{ minHeight: 0 }}>
Copilot
AI
Sep 11, 2025
The CSS class name 'deepgit-ai-chat' should be consistent with the hyphenated naming convention used elsewhere in the codebase. Consider using 'deepgit-ai-chat' consistently or update the corresponding CSS to match.
Suggested change:
- <div className="d-flex flex-column h-100 deepgit-ai-chat" style={{ minHeight: 0 }}>
+ <div className="d-flex flex-column h-100 deepgitai-chat" style={{ minHeight: 0 }}>
contributors STRING,
stargazers STRING,
readme_content STRING,
readme_length INT64,
Copilot
AI
Sep 11, 2025
[nitpick] The database schema includes embedding fields but there's no validation to ensure the embedding array has a consistent dimension. Consider adding a constraint or comment documenting the expected embedding dimension (e.g., 384 for sentence-transformers/all-MiniLM-L6-v2).
Suggested change:
- readme_length INT64,
+ readme_length INT64,
+ -- readme_embedding is expected to be a 384-dimensional vector (e.g., from sentence-transformers/all-MiniLM-L6-v2)
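Beyond a schema comment, the dimension could also be checked in application code before the write. The sketch below is one possible shape for such a check; `validate_embedding` and `EXPECTED_EMBEDDING_DIM` are hypothetical names, and the value 384 assumes the all-MiniLM-L6-v2 model mentioned in the comment.

```python
import math
from typing import Optional, Sequence

# Assumption: 384 matches sentence-transformers/all-MiniLM-L6-v2, the example
# named in the review comment; another model would need a different value.
EXPECTED_EMBEDDING_DIM = 384


def validate_embedding(
    embedding: Optional[Sequence[float]],
    expected_dim: int = EXPECTED_EMBEDDING_DIM,
) -> Optional[list[float]]:
    """Return the embedding as a list of floats, or raise ValueError if malformed."""
    if embedding is None:
        # Repositories without a README embedding pass through unchanged.
        return None
    values = [float(x) for x in embedding]
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    if not all(math.isfinite(x) for x in values):
        raise ValueError("embedding contains NaN or infinite values")
    return values
```

Calling this just before the database write would surface a model or configuration mismatch as an explicit error instead of a silently inconsistent column.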
No description provided.