Feature/ca 18 allow mcp tools during pr analysis#124
Conversation
…egration - Introduced LLM reranking in the file review stages to improve issue prioritization based on changed files. - Added MCP tool execution capabilities for verifying HIGH/CRITICAL issues during the aggregation stage. - Updated prompt builder to conditionally include MCP tool instructions based on configuration. - Simplified scoring configuration by removing path-based and metadata boosts, focusing solely on content-type boosts. - Implemented unit tests for LLM reranker and scoring configuration to ensure correct functionality and boost application. - Enhanced collection manager to support on-disk vector storage configuration. - Qdrant vectors on disk
… for project deletion handling
|
/codecrow analyze |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Comment commands are not enabled for this project Check the job logs in CodeCrow for detailed error information. |
|
| Status | PASS WITH WARNINGS |
| Risk Level | MEDIUM |
| Review Coverage | 28 files analyzed in depth |
| Confidence | HIGH |
Executive Summary
This PR introduces Model Context Protocol (MCP) tool execution within the PR analysis workflow, spanning both the Java-based core services and the Python inference orchestrator. The changes implement necessary configuration DTOs, a new Reranker logic for the RAG pipeline, and the tool execution engine. While the core functionality is sound, there are concerns regarding data integrity during project deletion and potential logic discrepancies between the Java and Python ecosystems.
Recommendation
Decision: PASS WITH WARNINGS
The PR is architecturally sound but requires immediate attention to the project cleanup logic and a verification of cross-language DTO synchronization. Ensure that the manual deletion steps in ProjectService are updated or refactored before merging to prevent orphaned data.
Issues Overview
| Severity | Count | |
|---|---|---|
| 🟡 Medium | 3 | Issues that should be addressed |
| 🔵 Low | 1 | Minor issues and improvements |
Analysis completed on 2026-02-19 21:04:43 | View Full Report | Pull Request
📋 Detailed Issues (4)
🟡 Medium Severity Issues
Id on Platform: 3337
Category: 🐛 Bug Risk
File: .../rag/llm_reranker.py
Issue: When processing LLM rankings, the code adds 'missing items' that weren't in the LLM's ranked list. However, it doesn't filter out the results that WERE already included in the reranked list from the original results list if the LLM provided an incomplete list. This could lead to duplicate chunks in the final context if the LLM skips indices or the logic for 'included_ids' check is slightly offset by index management.
💡 Suggested Fix
Ensure the fallback loop clearly identifies which original indices have not been placed into the reranked list to avoid potential duplication of the first few items.
--- a/python-ecosystem/inference-orchestrator/service/rag/llm_reranker.py
+++ b/python-ecosystem/inference-orchestrator/service/rag/llm_reranker.py
@@ -211,7 +211,7 @@
# Add any missing items at the end
- included_ids = set(rankings)
+ included_ids = {idx for idx in rankings if isinstance(idx, int) and 0 <= idx < len(results)}
for i, result in enumerate(results):
- if i not in included_ids:
+ if i not in included_ids and len(reranked) < len(results):
result_copy = result.copy()
result_copy["_llm_rank"] = len(reranked) + 1Id on Platform: 3338
Category: ✨ Best Practices
File: .../service/ProjectService.java
Issue: When deleting a project, several child repositories are manually deleted by projectId. It is better to use JPA CascadeType.REMOVE on the Project entity if possible, or ensure these deletions are wrapped in the same transaction as the project deletion (which they currently are, but the list of manual deletions is growing long and prone to being missed).
💡 Suggested Fix
Consider moving these deletion logic into the Project entity using @onetomany(cascade = CascadeType.REMOVE) to ensure data integrity without manual boilerplate in the service layer.
No suggested fix providedId on Platform: 3340
Category: 🏗️ Architecture
File: .../analysis/CommentCommandRateLimitRepository.java:54
Issue: The implementation of deleteByProjectId is consistent with other repositories in the codebase (e.g., BranchRepository, RagIndexStatusRepository, AnalysisLockRepository) which use this naming convention for project cleanup operations. However, ensure that this method is annotated with @Modifying to execute as a DML statement.
💡 Suggested Fix
The method is correctly placed, but since it is a custom query using @query, it requires the @Modifying annotation for DELETE operations in Spring Data JPA.
--- a/java-ecosystem/libs/core/src/main/java/org/rostilos/codecrow/core/persistence/repository/analysis/CommentCommandRateLimitRepository.java
+++ b/java-ecosystem/libs/core/src/main/java/org/rostilos/codecrow/core/persistence/repository/analysis/CommentCommandRateLimitRepository.java
@@ -50,6 +50,7 @@
- @Modifying
@Query("DELETE FROM CommentCommandRateLimit r WHERE r.project.id = :projectId")
void deleteByProjectId(@Param("projectId") Long projectId);
}🔵 Low Severity Issues
Id on Platform: 3339
Category: 🧹 Code Quality
File: .../config/ProjectConfig.java:285
Issue: The ensureMainBranchInPatterns method contains a redundant check for null after it has already returned if 'main' is null at the start of the method.
💡 Suggested Fix
Remove the redundant null check for 'main'.
--- a/java-ecosystem/libs/core/src/main/java/org/rostilos/codecrow/core/model/project/config/ProjectConfig.java
+++ b/java-ecosystem/libs/core/src/main/java/org/rostilos/codecrow/core/model/project/config/ProjectConfig.java
@@ -284,3 +284,2 @@
public void ensureMainBranchInPatterns() {
String main = mainBranch();
- if (main == null)
- return;Files Affected
- .../config/ProjectConfig.java: 1 issue
- .../rag/llm_reranker.py: 1 issue
- .../analysis/CommentCommandRateLimitRepository.java: 1 issue
- .../service/ProjectService.java: 1 issue
This pull request introduces support for a new project-level configuration option,
useMcpTools, which enables LLMs to call VCS MCP tools during PR review stages. The changes span both backend configuration and the DTOs used for project and analysis requests, ensuring that this new option is available throughout the system. Minor formatting improvements were also made to several DTO records.Project configuration enhancements:
useMcpToolsboolean property toProjectConfig, including JSON serialization, default value, and updated constructors and accessors. This allows projects to specify whether MCP tools should be used during PR review. [1] [2] [3] [4] [5]ProjectConfig.javato describe the newuseMcpToolsoption and its intended use.DTO and request propagation:
useMcpToolsto theProjectDTOrecord and its construction logic, so project data reflects this new configuration. [1] [2] [3] [4]AiAnalysisRequestandAiAnalysisRequestImplto include theuseMcpToolsproperty, builder method, and accessor, allowing analysis requests to reference this option. [1] [2] [3] [4] [5] [6]Formatting improvements:
ProjectDTO.javafor improved readability and consistency. [1] [2] [3] [4] [5] [6] [7]Deployment configuration:
QDRANT_VECTORS_ON_DISK=trueto.env.samplefor vector storage configuration.