Improve wording for queries and DB descriptions#1

Merged
Ruiqi-Chen-0216 merged 1 commit into ucbepic:main from shreyashankar:main
Jul 8, 2025

Conversation

@shreyashankar
Collaborator

The queries are awesome!!

This PR addresses grammar, clarity, and ambiguity issues in query files and database descriptions across all query
datasets.

Key changes:

  • Made queries unambiguous with clear instructions
  • Fixed grammar issues and standardized terminology in database descriptions
  • Centralized duplicate db_description.txt files into single files per dataset
  • Improved hints with clearer explanations

Outstanding issues to address:

  1. Vague hints in the Yelp dataset: The hint "The datasets contain five tables total. Carefully identify which tables and columns contain the information required to answer the query" should specify which tables are needed for each query type.
  2. MongoDB symbol definitions: The stockmarket_symboldefinition/SymbolDirectoryDefinitions.bson file should be converted to text format and either included directly in the hint or provided as a proper database connection, like the other databases.
  3. Missing ground truth validation: Although ground truth query scripts exist, we should also have a file of all query answers plus validation functions that check whether an agent's predicted answer is correct. For example, if the answer is a list of companies and quantities, the function should check that each company and quantity appears in the agent's string response.
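To make item 3 concrete, here is a minimal sketch of what such a validation function could look like. This assumes ground-truth answers are stored as (company, quantity) pairs; the function name and signature are illustrative, not an existing API in this repository.

```python
def validate_answer(expected, response):
    """Return True if every expected company name and quantity
    appears somewhere in the agent's free-text response.

    expected: list of (company, quantity) tuples from the ground truth
    response: the agent's raw string answer
    """
    text = response.lower()
    for company, quantity in expected:
        # Match company names case-insensitively.
        if company.lower() not in text:
            return False
        # Quantities are compared as substrings of the response.
        if str(quantity) not in text:
            return False
    return True
```

A per-query variant of this check, plus an answers file keyed by query ID, would let `common_scaffold/validate/validate.py` score submissions automatically.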

@Ruiqi-Chen-0216 Ruiqi-Chen-0216 merged commit e5cf7c8 into ucbepic:main Jul 8, 2025
Ruiying-Ma pushed a commit that referenced this pull request Dec 5, 2025
Improve wording for queries and DB descriptions
NuryeNigusMekonen pushed a commit to NuryeNigusMekonen/DataAgentBench that referenced this pull request Apr 22, 2026
…, Cohere (ucbepic#38) to leaderboard

Verified Pass@1 numbers were re-computed from the raw submission JSONs
using common_scaffold/validate/validate.py:

  PR ucbepic#31  Pi Coding Agent + Claude Opus 4.6      → 0.5603 (ucbepic#1)
  PR ucbepic#32  Oracle Forge (Tenacious) + Sonnet 4.6  → 0.4554 (ucbepic#4)
  PR ucbepic#38  Oracle Forge (Cohere) + Gemini 2.0 F.  → 0.128  (ucbepic#10)

Adds a Submission column on both the README table and the website
leaderboard linking each submission to its PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>