Copilot AI commented Oct 30, 2025

Integration tests were failing for two reasons: find_code_tool returned stub data, and the vector store output (SearchResult) was fundamentally incompatible with the API response type (CodeMatch).

Changes

Restored find_code implementation in app_bindings.py

  • Uncommented find_code import and replaced stub response with real pipeline call
  • Added error handling and statistics tracking
  • Fixed focus_languages type conversion (SemanticSearchLanguage → str)

Fixed SearchResult→CodeMatch type mismatch

Problem: find_code expected SearchResult to expose the CodeMatch interface, but the two types are incompatible.

Extended SearchResult (chunks.py):

  • Added mutable config (extra="allow", validate_assignment=False)
  • Added optional score tracking fields: dense_score, sparse_score, rerank_score, relevance_score
  • Added file property for chunk.file compatibility

Updated find_code pipeline (find_code.py):

  • Created _convert_search_result_to_code_match() to transform vector store output → API response
  • Handles str/CodeChunk content variants
  • Constructs DiscoveredFile from paths, extracts spans and symbols
  • Converts SearchResult list to CodeMatch list before returning

Added CLI serialization

  • Implemented CodeMatch.serialize_for_cli() (models.py) for CLI output formatting

Example

# Vector store returns SearchResult
results = await vector_store.search(vector=query_vector)
# SearchResult(content=CodeChunk, file_path=Path, score=0.95)

# find_code now converts to CodeMatch for API response
matches = [_convert_search_result_to_code_match(r) for r in results]
# CodeMatch(file=DiscoveredFile, content=CodeChunk, span=(1,50), 
#           relevance_score=0.95, match_type=SEMANTIC)
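
The conversion step in the example above might be sketched as follows. This is a minimal illustration under stated assumptions, not the code from find_code.py: the dataclasses are simplified stand-ins for the real models, and the `line_start`/`line_end` fields, the `"unknown"` path fallback, and the function name without its leading underscore are all hypothetical. Symbol extraction is left out.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


# Hypothetical stand-ins for the real CodeWeaver models. Field names follow
# the PR description where it gives them; everything else is assumed.
class MatchType(Enum):
    SEMANTIC = "semantic"


@dataclass
class CodeChunk:
    content: str
    line_start: int = 1  # assumed field
    line_end: int = 1    # assumed field


@dataclass
class DiscoveredFile:
    path: Path


@dataclass
class SearchResult:
    content: "str | CodeChunk"
    file_path: "Path | None"
    score: float


@dataclass
class CodeMatch:
    file: DiscoveredFile
    content: CodeChunk
    span: tuple
    relevance_score: float
    match_type: MatchType


def convert_search_result_to_code_match(result: SearchResult) -> CodeMatch:
    """Turn a vector-store hit into the API response shape."""
    chunk = result.content
    if isinstance(chunk, str):
        # Plain-string content: wrap it so callers always see a CodeChunk.
        line_count = chunk.count("\n") + 1
        chunk = CodeChunk(content=chunk, line_start=1, line_end=line_count)
    # Assumed fallback for results that carry no path.
    path = result.file_path or Path("unknown")
    return CodeMatch(
        file=DiscoveredFile(path=path),
        content=chunk,
        span=(chunk.line_start, chunk.line_end),
        relevance_score=result.score,
        match_type=MatchType.SEMANTIC,
    )
```

Normalizing the str/CodeChunk variants at the top of the function keeps the rest of the conversion free of type checks.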

Test Infrastructure Gap (Not Addressed)

Integration tests create test projects but don't index them before searching, resulting in empty vector stores. Tests need indexing fixtures added separately (detailed in NEXT_STEPS.md).

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • astral.sh
    • Triggering command: curl -LsSf REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

We recently completed the features for this branch but haven't tested (TDD); the spec and tasks are in specs/003-our-aim-to/ (spec.md and tasks.md)

The unit tests look pretty good, maybe one or two failing. The integration tests; not so much. There are, I believe, around 90 failing right now.

Your job is to:

  1. triage the failing tests
  2. look for common causes or underlying issues that might be causing widespread failures
  3. Try to identify groups of issues that could be safely worked on concurrently. Triage and identifying these is the main priority.
  4. resolve any issues you can clearly identify solutions for
  5. report on status for the next person who works on it

Make sure to commit your changes and update your findings as you go; the instance may not last long enough for you to finish so it is important to "save our progress" so to speak.



Copilot AI and others added 3 commits October 30, 2025 11:22
Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Triage and resolve failing integration tests" to "Fix find_code integration: restore implementation and resolve SearchResult→CodeMatch type mismatch" Oct 30, 2025
Copilot AI requested a review from bashandbone October 30, 2025 11:33
@bashandbone bashandbone marked this pull request as ready for review October 30, 2025 12:44
Copilot AI review requested due to automatic review settings October 30, 2025 12:44
@bashandbone bashandbone merged commit 107864c into 003-our-aim-to Oct 30, 2025
7 checks passed
@bashandbone bashandbone deleted the copilot/triage-integration-tests-issues branch October 30, 2025 12:44

Copilot AI left a comment

Pull Request Overview

This PR re-enables the find_code tool implementation after it was temporarily disabled. The changes restore real search functionality by replacing stub responses with actual vector search operations and add proper type conversion between internal SearchResult objects and the public CodeMatch API response model.

Key changes:

  • Restored the find_code_tool entrypoint to call the real implementation instead of returning stub data
  • Added type conversion from SearchResult to CodeMatch to fix structural incompatibilities
  • Extended SearchResult model to support dynamic score attributes for hybrid search workflows
  • Added serialize_for_cli() method to CodeMatch for CLI output formatting

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/codeweaver/server/app_bindings.py | Re-enabled find_code implementation by replacing stub with real search call and updated error handling |
| src/codeweaver/core/chunks.py | Extended SearchResult with mutable config, dynamic score fields, and file property for compatibility |
| src/codeweaver/agent_api/models.py | Added serialize_for_cli method to CodeMatch for CLI display formatting |
| src/codeweaver/agent_api/find_code.py | Added SearchResult-to-CodeMatch conversion function and updated response building logic |
| TRIAGE_FINDINGS.md | Documentation of integration test issues and fixes applied |
| NEXT_STEPS.md | Comprehensive documentation of problems, solutions, and remaining work |

Comments suppressed due to low confidence (1)

src/codeweaver/agent_api/find_code.py:170

  • Variable query_intent_obj is not used.
            query_intent_obj = None

Comment on lines +135 to +138
class _FileInfo:
    def __init__(self, path):
        self.path = path
return _FileInfo(self.file_path) if self.file_path else None
Copilot AI Oct 30, 2025

The inner class _FileInfo is defined inside a property method, which means a new class is created on every property access. Consider defining this as a module-level helper class or using a simpler approach like types.SimpleNamespace(path=self.file_path) for better performance and clarity.
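
The SimpleNamespace alternative mentioned in this comment could look like the sketch below. The `SearchResult` class here is a hypothetical, trimmed-down stand-in for the model in chunks.py, written only to show the property.

```python
from pathlib import Path
from types import SimpleNamespace


class SearchResult:
    """Hypothetical, trimmed-down stand-in for the model in chunks.py."""

    def __init__(self, file_path=None):
        self.file_path = file_path

    @property
    def file(self):
        # No class is defined per access; SimpleNamespace only carries `.path`
        # so that `result.file.path` keeps working.
        return SimpleNamespace(path=self.file_path) if self.file_path else None


result = SearchResult(Path("src/codeweaver/core/chunks.py"))
file_name = result.file.path.name
```

Each access still allocates a small object, but it no longer builds a new class every time.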

Copilot uses AI. Check for mistakes.
# Step 7: Sort and limit
candidates.sort(key=lambda x: x.relevance_score, reverse=True)
results = candidates[:max_results]
candidates.sort(key=lambda x: x.relevance_score if hasattr(x, 'relevance_score') and x.relevance_score is not None else x.score, reverse=True)
Copilot AI Oct 30, 2025

The sorting key lambda is complex and difficult to read. Consider extracting it into a helper function like get_sort_score(candidate) that returns the relevance_score or falls back to score. This would improve readability and make the logic easier to test.

Suggested change
candidates.sort(key=lambda x: x.relevance_score if hasattr(x, 'relevance_score') and x.relevance_score is not None else x.score, reverse=True)
def get_sort_score(candidate):
    if hasattr(candidate, 'relevance_score') and candidate.relevance_score is not None:
        return candidate.relevance_score
    return candidate.score
candidates.sort(key=get_sort_score, reverse=True)
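
Once the key is a named function, the fallback can be exercised on its own. The following is a sketch rather than the project's code: it uses `getattr` with a default in place of the `hasattr` check, and the candidates are hypothetical SimpleNamespace objects, not real SearchResult instances.

```python
from types import SimpleNamespace


def get_sort_score(candidate):
    """Prefer relevance_score; fall back to the raw vector-store score."""
    relevance = getattr(candidate, "relevance_score", None)
    return relevance if relevance is not None else candidate.score


# Hypothetical candidates covering the three cases the key has to handle.
candidates = [
    SimpleNamespace(score=0.40),                        # no relevance_score attribute
    SimpleNamespace(score=0.10, relevance_score=0.90),  # reranked
    SimpleNamespace(score=0.70, relevance_score=None),  # attribute present but unset
]
candidates.sort(key=get_sort_score, reverse=True)
ordered_scores = [get_sort_score(c) for c in candidates]
```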

if results:
    top_files = list({r.file.path.name for r in results[:3]})
if code_matches:
    top_files = list({m.file.path.name for m in code_matches[:3] if m.file and m.file.path})
Copilot AI Oct 30, 2025

Building a set with {...} and then calling list() on it discards the ordering of the top 3 files. Because the matches are already sorted by relevance, the result should keep that order. Deduplicate with a list comprehension instead: seen = set(); [m.file.path.name for m in code_matches[:3] if m.file and m.file.path and m.file.path.name not in seen and not seen.add(m.file.path.name)] or use dict.fromkeys(), which removes duplicates while preserving insertion order.

Suggested change
top_files = list({m.file.path.name for m in code_matches[:3] if m.file and m.file.path})
# Deduplicate while preserving order
top_files = []
seen = set()
for m in code_matches[:3]:
    if m.file and m.file.path:
        name = m.file.path.name
        if name not in seen:
            top_files.append(name)
            seen.add(name)
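
The dict.fromkeys() option named in the comment gives the same result in fewer lines, since dicts keep insertion order. A minimal sketch, using hypothetical SimpleNamespace objects in place of real CodeMatch instances:

```python
from pathlib import Path
from types import SimpleNamespace

# Hypothetical matches, already sorted by relevance (highest first).
code_matches = [
    SimpleNamespace(file=SimpleNamespace(path=Path("src/search.py"))),
    SimpleNamespace(file=SimpleNamespace(path=Path("src/index.py"))),
    SimpleNamespace(file=SimpleNamespace(path=Path("lib/search.py"))),  # duplicate name
    SimpleNamespace(file=None),  # outside the top three, so never inspected
]

# dict keys are unique and ordered by first insertion.
names = (m.file.path.name for m in code_matches[:3] if m.file and m.file.path)
top_files = list(dict.fromkeys(names))
```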

Comment on lines +112 to +122
def serialize_for_cli(self) -> dict:
    """Serialize code match for CLI display.
    Returns a dict suitable for rendering in CLI output formats.
    """
    return {
        "file_path": str(self.file.path),
        "span": self.span,
        "relevance_score": self.relevance_score,
        "match_type": self.match_type.value,
        "content_preview": self.content.content[:200] + "..." if len(self.content.content) > 200 else self.content.content,
Copilot AI Oct 30, 2025

[nitpick] The content preview truncation logic is embedded in the serialization method. Consider adding a max_preview_length parameter (defaulting to 200) to make this configurable and improve testability.

Suggested change
def serialize_for_cli(self) -> dict:
    """Serialize code match for CLI display.
    Returns a dict suitable for rendering in CLI output formats.
    """
    return {
        "file_path": str(self.file.path),
        "span": self.span,
        "relevance_score": self.relevance_score,
        "match_type": self.match_type.value,
        "content_preview": self.content.content[:200] + "..." if len(self.content.content) > 200 else self.content.content,
def serialize_for_cli(self, max_preview_length: int = 200) -> dict:
    """Serialize code match for CLI display.
    Returns a dict suitable for rendering in CLI output formats.
    """
    content = self.content.content
    preview = content[:max_preview_length] + "..." if len(content) > max_preview_length else content
    return {
        "file_path": str(self.file.path),
        "span": self.span,
        "relevance_score": self.relevance_score,
        "match_type": self.match_type.value,
        "content_preview": preview,
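
To make the testability point concrete, the truncation rule can be pulled into a standalone function and checked at its boundaries. `make_preview` is a hypothetical name used for this sketch; it does not appear in models.py.

```python
def make_preview(content: str, max_preview_length: int = 200) -> str:
    """Return content unchanged if it fits, else a truncated copy ending in '...'."""
    if len(content) > max_preview_length:
        return content[:max_preview_length] + "..."
    return content


short = make_preview("abc", max_preview_length=3)    # exactly at the limit
long = make_preview("abcdef", max_preview_length=3)  # over the limit
```

Note that the returned string can be up to three characters longer than `max_preview_length`, because the ellipsis is appended after slicing.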


# Track successful request in statistics
if context and context.request_context:
    request_id = getattr(context.request_context, "request_id", "unknown")
Copilot AI Oct 30, 2025

[nitpick] Using getattr() with a default value makes the code more defensive, but if request_id is expected to always exist on request_context, this might silently hide missing attributes. Consider whether 'unknown' is an appropriate fallback or if this should raise an error to catch configuration issues early.

Suggested change
request_id = getattr(context.request_context, "request_id", "unknown")
request_id = context.request_context.request_id
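
The trade-off between the two lines can be seen in a small, self-contained sketch. The `request_context` below is a hypothetical object that lacks `request_id`; it is not the framework's real context type.

```python
from types import SimpleNamespace

# Hypothetical request context that is missing `request_id`.
request_context = SimpleNamespace()

# Defensive lookup: succeeds, but the missing attribute goes unnoticed.
fallback_id = getattr(request_context, "request_id", "unknown")

# Direct access: fails loudly, which surfaces the misconfiguration early.
try:
    request_context.request_id
    raised = False
except AttributeError:
    raised = True
```

Which behavior is preferable depends on whether `request_id` is guaranteed by the framework; if it is optional, the fallback is the safer choice for statistics tracking.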
