Leaderboard submission: PromptQL (Opus 4.6, n=5) #23

Merged
Ruiying-Ma merged 2 commits into main from submission/promptql-opus46-wiki4-pp2-n5
Mar 19, 2026

Conversation

@anushruthasura
Collaborator

Submission

  • Agent: PromptQL (Hasura)
  • Backbone LLM: Claude Opus 4.6 (Anthropic)
  • Dataset hints: Yes
  • Trials: 5 (approved by Aditya)
  • Datasets: 12 (civic and paper excluded per team guidance)
  • Additional notes: Agent uses a wiki-based knowledge system for data exploration and an LLM post-processing step for answer formatting.
  • File: submissions/promptql_opus46_wiki4_pp2_n5.json (270 entries)
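The entry count above follows from 54 queries × 5 trials = 270. A minimal sketch of how that arithmetic could be sanity-checked before submitting is shown below. The schema of the submission file is not described in this PR, so the `query_id` field and the `check_trial_counts` helper are hypothetical, and the sample data is synthetic rather than taken from the actual file.

```python
from collections import Counter


def check_trial_counts(entries, expected_queries=54, expected_trials=5, key="query_id"):
    """Return a list of problems; an empty list means the counts are consistent.

    `key` names the per-entry query identifier. It is an assumed field name,
    not one confirmed by the benchmark's submission format.
    """
    problems = []
    counts = Counter(entry[key] for entry in entries)

    if len(counts) != expected_queries:
        problems.append(
            f"expected {expected_queries} distinct queries, found {len(counts)}"
        )

    for query, n in sorted(counts.items()):
        if n != expected_trials:
            problems.append(
                f"query {query!r}: expected {expected_trials} trials, found {n}"
            )

    expected_total = expected_queries * expected_trials
    if len(entries) != expected_total:
        problems.append(f"expected {expected_total} entries, found {len(entries)}")

    return problems


# Synthetic stand-in for the real file: 54 queries, 5 trials each.
sample = [{"query_id": f"q{i}", "trial": t} for i in range(54) for t in range(5)]
problems = check_trial_counts(sample)
```

With a real submission, `entries` would be loaded from the JSON file and `key` adjusted to whatever identifier the file actually uses.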

Anushrut Gupta and others added 2 commits March 19, 2026 00:57
Agent: PromptQL (Hasura)
Backbone LLM: Claude Opus 4.6 (Anthropic)
Dataset hints: No per-query hints. Wiki learnings from 4 rounds of
  general-purpose data exploration prompts (identical across all 12
  projects). Domain Overview pages manually maintained.
Post-processing: LLM reformatting (Claude Opus 4.6) for validator
  compliance — formatting only, no answer changes.
Trials: 5 independent runs per query (n=5, approved by Aditya)
Datasets: 12 (civic and paper excluded per UCB team)
Total entries: 270 (54 queries × 5 runs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Agent: PromptQL (Hasura)
Backbone LLM: Claude Opus 4.6 (Anthropic)
Dataset hints: No per-query hints. Wiki learnings from 4 rounds of
  general-purpose data exploration prompts (identical across all 12
  projects). Domain Overview pages manually maintained.
Post-processing: LLM reformatting (Claude Opus 4.6) applied uniformly
  to ALL trials — formatting only, no answer changes, no
  result-informed selection.
Trials: 5 independent runs per query (n=5, approved by Aditya)
Datasets: 12 (civic and paper excluded per UCB team)
Total entries: 270 (54 queries × 5 runs)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Ruiying-Ma Ruiying-Ma merged commit a25d69e into main Mar 19, 2026

2 participants