Run evaluation on full SWE-Bench #1693

@rezzie-rich

Love the progress so far!

Will you guys also run and publish results on the full SWE-bench and the 25% subset, rather than just SWE-bench Lite?

The auto-code-rover repo reports 22% on SWE-bench Lite and 16% on the full SWE-bench. However, you guys list ACR at 16% on SWE-bench Lite. Is that the result you measured yourselves, or is it a typo?

Labels: enhancement (New feature or request)
