AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.
Updated Jul 11, 2025 · Rust
This project explores how Large Language Models (LLMs) perform on real-world software engineering tasks, inspired by the SWE-Bench benchmark. Using locally hosted models like Llama 3 via Ollama, the tool evaluates code repair capabilities on Python repositories through custom test cases and a lightweight scoring framework.
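The description above mentions a "lightweight scoring framework" built on custom test cases, but the page does not show how scoring works. A minimal sketch of one plausible approach: run the project's custom tests after the model proposes a repair, then score the repair as the fraction of tests that pass. All names here (`TestResult`, `score_repair`) are hypothetical illustrations, not the project's actual API.

```python
from dataclasses import dataclass


@dataclass
class TestResult:
    """Outcome of one custom test run against a model-repaired repository."""
    name: str
    passed: bool


def score_repair(results: list[TestResult]) -> float:
    """Score a repair as the fraction of custom tests that pass (0.0-1.0)."""
    if not results:
        return 0.0
    return sum(r.passed for r in results) / len(results)


# Hypothetical run: three of four custom tests pass after the LLM's patch.
results = [
    TestResult("test_import", True),
    TestResult("test_fix_applied", True),
    TestResult("test_edge_case", False),
    TestResult("test_regression", True),
]
print(score_repair(results))  # 0.75
```

In a real harness, `TestResult` entries would come from executing the repository's test suite (e.g. via `pytest`) inside a sandbox after applying the patch generated by the locally hosted model.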