
SWE-bench

Organization for maintaining the SWE-bench/agent projects


This organization contains the source code for SWE-bench, a benchmark for evaluating AI systems on real-world GitHub issues.
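The benchmark's evaluation flow consumes a "predictions" file: one record per GitHub issue ("instance"), carrying the model's proposed fix as a unified diff. A minimal sketch of producing such a file, assuming the key names (`instance_id`, `model_name_or_path`, `model_patch`) described in the SWE-bench README; the instance ID shown is illustrative only:

```python
import json
import os
import tempfile

# Sketch of a SWE-bench-style predictions file. Key names assume the
# format documented in the SWE-bench README; the instance ID and patch
# below are placeholders, not real benchmark data.
predictions = [
    {
        "instance_id": "astropy__astropy-12907",   # illustrative instance ID
        "model_name_or_path": "my-model",          # label for the system under test
        "model_patch": "diff --git a/demo.py b/demo.py\n",  # model's proposed fix
    }
]

# Write the records as JSON, the shape the evaluation harness reads.
path = os.path.join(tempfile.mkdtemp(), "preds.json")
with open(path, "w") as f:
    json.dump(predictions, f, indent=2)
```

The harness in the main repository then applies each patch in an isolated environment and reruns the issue's test suite; see the repository README for the exact command-line invocation.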

Use the repositories in this organization to...

Also check out these related organizations:

  • SWE-bench-repos: Mirror clones of the repositories used for SWE-bench-style evaluations.
  • SWE-agent: Solve GitHub issues automatically with a language-model-powered agent!

Pinned

  1. SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-World GitHub Issues?

    Python · 2.9k stars · 485 forks

  2. experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell · 168 stars · 174 forks

  3. sb-cli Public

    Run SWE-bench evaluations remotely

    Python · 10 stars

  4. swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML · 4 stars · 6 forks

Repositories

Showing 6 of 6 repositories
  • SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-World GitHub Issues?

    Python · 2,867 stars · MIT license · 485 forks · 36 issues · 8 pull requests · Updated Apr 22, 2025
  • sb-cli Public

    Run SWE-bench evaluations remotely

    Python · 10 stars · MIT license · 0 forks · 3 issues · 0 pull requests · Updated Apr 18, 2025
  • swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML · 4 stars · 6 forks · 2 issues · 2 pull requests · Updated Mar 31, 2025
  • experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell · 168 stars · 174 forks · 6 issues · 13 pull requests · Updated Mar 31, 2025
  • .github Public
    0 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 26, 2025
  • humanevalfix-results Public archive

    Evaluation data + results for SWE-agent inference on HumanEvalFix task

    Jupyter Notebook · 0 stars · 0 forks · 0 issues · 0 pull requests · Updated Jul 12, 2024
