
Checkpoint #4

Closed
Ruiying-Ma wants to merge 1 commit into main from dev

@Ruiying-Ma
Collaborator

Overwrite main with ruiying content; keep main history.

@Ruiying-Ma Ruiying-Ma closed this Dec 26, 2025
NuryeNigusMekonen pushed a commit to NuryeNigusMekonen/DataAgentBench that referenced this pull request Apr 22, 2026
…, Cohere (ucbepic#38) to leaderboard

Verified Pass@1 numbers were re-computed from the raw submission JSONs
using common_scaffold/validate/validate.py:

  PR ucbepic#31  Pi Coding Agent + Claude Opus 4.6      → 0.5603 (ucbepic#1)
  PR ucbepic#32  Oracle Forge (Tenacious) + Sonnet 4.6  → 0.4554 (ucbepic#4)
  PR ucbepic#38  Oracle Forge (Cohere) + Gemini 2.0 F.  → 0.128  (ucbepic#10)

Adds a Submission column on both the README table and the website
leaderboard linking each submission to its PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
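
For readers unfamiliar with the metric, the following is a minimal sketch of how a Pass@1 score could be computed from per-query outcomes. It is not the logic of common_scaffold/validate/validate.py, which is not shown here; the function name `pass_at_1`, the input layout (query id mapped to a list of pass/fail trials), and the example data are all hypothetical and chosen only for illustration.

```python
from typing import Mapping, Sequence


def pass_at_1(results: Mapping[str, Sequence[bool]]) -> float:
    """Average the per-query success rate across all queries.

    Hypothetical input layout: query id -> list of pass/fail trial outcomes.
    With exactly one trial per query this reduces to the fraction of
    queries answered correctly.
    """
    if not results:
        raise ValueError("no results to score")
    per_query = []
    for query_id, trials in results.items():
        if not trials:
            raise ValueError(f"query {query_id!r} has no trials")
        # True counts as 1 and False as 0, so this is the success fraction.
        per_query.append(sum(trials) / len(trials))
    return sum(per_query) / len(per_query)


# Illustrative data only; these are not results from any submission.
example = {
    "q1": [True, False],
    "q2": [True, True],
    "q3": [False, False],
    "q4": [True, False],
}
score = pass_at_1(example)
print(round(score, 4))
```

Averaging per query first, rather than pooling all trials, keeps a query with many trials from outweighing one with few; whether the actual validator does this is an assumption.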
