Cleaned repo #10

Merged
Ruiying-Ma merged 30 commits into main from shreya-civic-unstructured on Mar 4, 2026

@Ruiying-Ma (Collaborator) commented Mar 4, 2026

  • Cleaned the agent scaffold, datasets, and queries
  • Updated READMEs
  • Added scripts for stats and plotting (for reproducibility)

@Ruiying-Ma changed the title from "Shreya civic unstructured" to "Cleaned repo" on Mar 4, 2026
@Ruiying-Ma merged commit 67bebc0 into main on Mar 4, 2026
NuryeNigusMekonen pushed a commit to NuryeNigusMekonen/DataAgentBench that referenced this pull request Apr 22, 2026
…, Cohere (ucbepic#38) to leaderboard

Verified Pass@1 numbers were re-computed from the raw submission JSONs
using common_scaffold/validate/validate.py:

  PR ucbepic#31  Pi Coding Agent + Claude Opus 4.6      → 0.5603 (ucbepic#1)
  PR ucbepic#32  Oracle Forge (Tenacious) + Sonnet 4.6  → 0.4554 (ucbepic#4)
  PR ucbepic#38  Oracle Forge (Cohere) + Gemini 2.0 F.  → 0.128  (ucbepic#10)

Adds a Submission column on both the README table and the website
leaderboard linking each submission to its PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
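
For readers unfamiliar with the metric, the Pass@1 figures in the referenced commit can be read as the fraction of queries whose first attempt was judged correct. The sketch below illustrates that arithmetic only. It is not the logic of common_scaffold/validate/validate.py, which is not shown here; the record layout (a list of objects with a boolean `correct` field for the first attempt) and the function name `pass_at_1` are assumptions made for illustration.

```python
from typing import Iterable, Mapping


def pass_at_1(records: Iterable[Mapping[str, object]]) -> float:
    """Return the share of records whose first attempt is marked correct.

    Assumed (hypothetical) layout: each record has a boolean "correct" key
    describing the first attempt for one query.
    """
    records = list(records)
    if not records:
        raise ValueError("cannot compute Pass@1 over zero records")
    # Count only records explicitly marked correct; anything else is a miss.
    passed = sum(1 for record in records if record.get("correct") is True)
    return passed / len(records)


if __name__ == "__main__":
    # Hypothetical sample: one correct first attempt out of four queries.
    sample = [
        {"query_id": "q1", "correct": True},
        {"query_id": "q2", "correct": False},
        {"query_id": "q3", "correct": False},
        {"query_id": "q4", "correct": False},
    ]
    print(pass_at_1(sample))  # 0.25
```

Under this reading, a score such as 0.128 would mean that roughly 13 in 100 queries were answered correctly on the first attempt.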