
Add swe-bench test - Not to run in regular PRs #100

Merged
0xba1a merged 9 commits into main from bala/swe-bench
Feb 11, 2026

Conversation

0xba1a (Member) commented Jan 28, 2026

No description provided.

codecov-commenter commented Jan 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.84%. Comparing base (cf15005) to head (2700bd0).

Additional details and impacted files


@@            Coverage Diff             @@
##             main     #100      +/-   ##
==========================================
- Coverage   92.27%   91.84%   -0.43%     
==========================================
  Files          21       21              
  Lines         880      834      -46     
==========================================
- Hits          812      766      -46     
  Misses         68       68              
Flag           Coverage Δ
integration    60.55% <60.00%> (-4.00%) ⬇️
ollama_local   64.62% <33.33%> (+2.01%) ⬆️
slow-browser   53.71% <33.33%> (+2.23%) ⬆️
slow-other     70.74% <73.33%> (+4.37%) ⬆️
unit           65.22% <93.33%> (-10.12%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines              Coverage Δ
src/microbots/MicroBot.py             99.33% <100.00%> (-0.11%) ⬇️
src/microbots/llm/anthropic_api.py    100.00% <100.00%> (ø)
src/microbots/llm/llm.py              100.00% <100.00%> (ø)
src/microbots/llm/ollama_local.py     98.30% <100.00%> (+0.09%) ⬆️
src/microbots/llm/openai_api.py       100.00% <100.00%> (ø)

command: str = ""

class LLMInterface(ABC):
    def __init__(self, system_prompt: str, max_retries: int = 3):
Collaborator
LLMInterface.__init__ was removed, but _validate_llm_response still relies on self.retries, self.max_retries, and self.messages. This means any new subclass that forgets to initialize these manually will fail at runtime, and nothing in the interface signals that they are required.
Maybe it's better to restore a base __init__ and have subclasses call super().__init__(). The only subclass-specific part is whether messages includes a system-prompt entry, which can be handled after the super call by appending to messages in the subclass. This also removes the three duplicated lines in each subclass.
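The suggested refactor might look like the following minimal sketch. It is an illustration of the pattern, not the repo's actual code: the subclass name `ExampleBackend` and the message-dict shape are assumptions; only `LLMInterface`, `max_retries`, `retries`, and `messages` come from the review comment.

```python
from abc import ABC


class LLMInterface(ABC):
    """Base class owning the retry/message state that the validation
    helper (per the review, _validate_llm_response) depends on."""

    def __init__(self, system_prompt: str, max_retries: int = 3):
        self.system_prompt = system_prompt
        self.max_retries = max_retries
        self.retries = 0
        self.messages: list[dict] = []


class ExampleBackend(LLMInterface):
    """Hypothetical provider subclass (standing in for e.g. the
    openai_api / ollama_local backends)."""

    def __init__(self, system_prompt: str, max_retries: int = 3):
        # The shared state is initialized in one place via super();
        # subclasses can no longer forget retries/max_retries/messages.
        super().__init__(system_prompt, max_retries)
        # Only the provider-specific part remains: how (or whether) the
        # system prompt enters the message list for this provider.
        self.messages.append({"role": "system", "content": system_prompt})
```

With this shape, a subclass that skips the system-prompt append still gets valid `retries`/`messages` state from the base `__init__`, so the interface's requirements are enforced by construction rather than by convention.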

@0xba1a 0xba1a merged commit a68a96c into main Feb 11, 2026
12 of 13 checks passed
@0xba1a 0xba1a deleted the bala/swe-bench branch March 3, 2026 08:05
