CocoaBench Website

File Structure

cocoabench/
├── index.html              # Home page (Introduction, Examples, Gallery)
├── leaderboard.html        # Leaderboard page
├── blog.html               # Blog page
└── assets/
    ├── css/
    │   └── style.css       # Global styles
    ├── js/
    │   ├── logo.js         # Logo animation
    │   ├── table.js        # Leaderboard table rendering
    │   ├── chart.js        # Performance bar chart
    │   ├── gallery.js      # Solution gallery component
    │   ├── examples.js     # Example showcase component
    │   └── toc.js          # Table of contents
    ├── data/
    │   ├── data.json       # Leaderboard & chart data
    │   └── examples.json   # Example tasks & model solutions
    └── logos/              # Model provider logos
        ├── anthropic.png
        ├── openai.png
        ├── google.png
        ├── meta.png
        ├── deepseek.png
        └── alibaba.png

Adding a New Model

Edit assets/data/data.json and add a new entry:

{
    "rank": 7,
    "model": "Model-Name",
    "organization": "CompanyName",
    "pass_rate": 50.0,
    "results": [1, 0, 1, ...],  // N results, 1=pass, 0=fail
    "links": [
        "https://example.com/q1/model",
        ...  // N links
    ]
}

Note: The organization name, lowercased, must match a logo filename in assets/logos/ (for example, "Google" maps to google.png). Add a new logo file if you are introducing a new provider.

Adding a New Logo for Model Provider

1. Prepare a PNG logo (recommended size: about 100x100 px).
2. Name the file {organization_lowercase}.png.
3. Place it in assets/logos/.
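The naming rule above can be expressed as a small helper. This is only a sketch: logoPathFor is a hypothetical name that does not come from the site's code, and it assumes the lookup is nothing more than lowercasing the organization value and appending .png.

```javascript
// Hypothetical helper (not part of the site's scripts): derives the logo path
// that a data.json "organization" value is expected to resolve to.
// Assumption: the filename is the lowercased organization name plus ".png".
function logoPathFor(organization) {
    return `assets/logos/${organization.toLowerCase()}.png`;
}

console.log(logoPathFor('Google')); // assets/logos/google.png
```

Running this against every organization value in data.json is a quick way to list the logo files that must exist.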

Local Preview

Open the HTML files directly in a browser, or serve them from a local server:

python -m http.server 8000
# Then visit http://localhost:8000

Data Structure Reference

Overview

| File | Location | Used By |
|---|---|---|
| `data.json` | `assets/data/` | Leaderboard table, Performance bar chart |
| `examples.json` | `assets/data/` | Example showcase, Solution gallery |

`data.json` — Leaderboard Data

Each entry represents one model row in the leaderboard:

{
    "rank": 2,
    "model": "Gemini-3 Pro",
    "subtitle": "Thinking",          // Optional - displays below model name
    "organization": "Google",
    "pass_rate": 36.0,
    "results": [0, 1, 1, 0, ...],    // 1=pass, 0=fail for each question
    "links": ["https://...", ...]    // Link for each result cell
}

Fields

| Field | Required | Description |
|---|---|---|
| `rank` | ✓ | Display rank (integer) |
| `model` | ✓ | Model display name |
| `subtitle` | ✗ | Secondary label (e.g., "extended", "Thinking"); rendered in italics below the model name |
| `organization` | ✓ | Lowercased, must match a logo filename in `assets/logos/` |
| `pass_rate` | ✓ | Score percentage |
| `results` | ✓ | Array of 0/1 values, one per question |
| `links` | ✓ | Array of URLs (same length as `results`) |
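The constraints listed above can be checked before committing a change to data.json. The following is a minimal sketch, not part of the site's code: validateEntry and REQUIRED_FIELDS are hypothetical names, the sample entry is made up, and only rules stated in this README are checked.

```javascript
// Hypothetical validator for a single data.json entry (illustration only).
const REQUIRED_FIELDS = ['rank', 'model', 'organization', 'pass_rate', 'results', 'links'];

function validateEntry(entry) {
    const errors = [];
    // Every required field from the Fields table must be present.
    for (const field of REQUIRED_FIELDS) {
        if (!(field in entry)) errors.push(`missing required field: ${field}`);
    }
    if ('rank' in entry && !Number.isInteger(entry.rank)) {
        errors.push('rank must be an integer');
    }
    if (Array.isArray(entry.results)) {
        // Each result cell is 1 (pass) or 0 (fail).
        if (!entry.results.every((r) => r === 0 || r === 1)) {
            errors.push('results must contain only 0 or 1');
        }
        // One link per result cell.
        if (Array.isArray(entry.links) && entry.links.length !== entry.results.length) {
            errors.push('links must have the same length as results');
        }
    }
    return errors;
}

// Made-up entry with one link too few.
const sampleEntry = {
    rank: 7,
    model: 'Model-Name',
    organization: 'CompanyName',
    pass_rate: 50.0,
    results: [1, 0],
    links: ['https://example.com/q1/model'],
};
console.log(validateEntry(sampleEntry));
```

An empty array means the entry satisfies the checks in this sketch; it does not verify that the logo file exists or that the links resolve.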

Handling Model Variants

To list the same model in different modes (e.g., GPT-4o vs. GPT-4o with thinking), add a separate entry for each mode and use subtitle to tell them apart:

{ "rank": 3, "model": "GPT-4o", "organization": "OpenAI", ... },
{ "rank": 4, "model": "GPT-4o", "subtitle": "Thinking", "organization": "OpenAI", ... }

`examples.json` — Example Tasks & Model Solutions

Structure:

{
  "examples": [
    {
      "id": 1,
      "title": "Task Title",
      "type": "text" | "embed",
      "content": { ... },
      "answer": "...",
      "reasoning": "...",
      "model_solutions": {
        "Model-Key": {
          "status": "pass" | "fail",
          "solution": "Markdown content..."
        }
      }
    }
  ]
}
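As an illustration of how this structure can be traversed, the sketch below tallies pass and fail statuses per model key. countStatuses is a hypothetical helper rather than something gallery.js or examples.js is known to contain, and the sample object is made-up data that only mirrors the shape shown above.

```javascript
// Hypothetical helper: counts "pass" / "fail" statuses per model_solutions key.
function countStatuses(data) {
    const counts = {};
    for (const example of data.examples) {
        // Tolerate examples that have no model_solutions yet.
        const solutions = example.model_solutions || {};
        for (const [key, solution] of Object.entries(solutions)) {
            if (!counts[key]) counts[key] = { pass: 0, fail: 0 };
            if (solution.status === 'pass') counts[key].pass += 1;
            else if (solution.status === 'fail') counts[key].fail += 1;
        }
    }
    return counts;
}

// Made-up sample in the examples.json shape.
const sampleExamples = {
    examples: [
        { id: 1, model_solutions: { 'Model-A': { status: 'pass', solution: '...' } } },
        { id: 2, model_solutions: { 'Model-A': { status: 'fail', solution: '...' } } },
    ],
};
console.log(countStatuses(sampleExamples));
```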

Model Solutions Keys

The model_solutions keys are looked up by gallery.js. To configure which models appear and their display names, edit getModelOrder() in assets/js/gallery.js:

function getModelOrder() {
    return [
        'Claude-3.5-Sonnet',                                              // Simple: key = display name
        { name: 'GPT-4o', subtitle: 'Thinking', key: 'GPT-4o-thinking' }, // With subtitle
        'Gemini-2.0-Pro',
    ];
}

| Property | Description |
|---|---|
| `name` | Display name |
| `subtitle` | Optional secondary label |
| `key` | Key to match in `model_solutions` (defaults to `name`) |
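The two entry forms (plain string and object) can be reduced to one shape before lookup. The sketch below shows one way to do that under the defaulting rule described above; normalizeModelEntry is a hypothetical name, and the actual handling inside gallery.js may differ.

```javascript
// Hypothetical normalizer for entries returned by getModelOrder().
// A plain string is both the display name and the lookup key;
// an object may omit "key", in which case it falls back to "name".
function normalizeModelEntry(entry) {
    if (typeof entry === 'string') {
        return { name: entry, subtitle: null, key: entry };
    }
    return {
        name: entry.name,
        subtitle: entry.subtitle ?? null,
        key: entry.key ?? entry.name,
    };
}

const order = [
    'Claude-3.5-Sonnet',
    { name: 'GPT-4o', subtitle: 'Thinking', key: 'GPT-4o-thinking' },
];
console.log(order.map(normalizeModelEntry));
```

With every entry in this shape, the key property can be used directly to index model_solutions for a given example.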

Example: Adding a Model Variant

In examples.json, add solutions with a unique key:

"model_solutions": {
    "GPT-4o": { "status": "pass", "solution": "..." },
    "GPT-4o-thinking": { "status": "pass", "solution": "..." }
}

In gallery.js, update getModelOrder():

return [
    { name: 'GPT-4o', key: 'GPT-4o' },
    { name: 'GPT-4o', subtitle: 'Thinking', key: 'GPT-4o-thinking' },
];

Visual Identity & Color System

The design of CocoaBench follows a "Warm & Organic" aesthetic, mimicking the tones of cocoa, paper, and ink. When adding new charts or visualizations, please adhere to the following color system to maintain consistency.

1. Semantic Colors (Performance)

Used for pass/fail states, heatmaps, and result grids. These should be distinct but not harsh.

| State | Color Name | Hex | Usage |
|---|---|---|---|
| Pass | Matcha | `#66BB6A` | Successful test cases, high scores. |
| Fail | Berry | `#EF5350` | Failed test cases, errors. |
| Neutral | Milk Foam | `#E0E0E0` | N/A results, empty states, borders. |
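A new visualization could map result values to these colors as sketched below. SEMANTIC_COLORS and colorForResult are hypothetical names, and treating any value other than 1 or 0 as Neutral is an assumption of this sketch.

```javascript
// Semantic colors from the table above.
const SEMANTIC_COLORS = {
    pass: '#66BB6A',    // Matcha
    fail: '#EF5350',    // Berry
    neutral: '#E0E0E0', // Milk Foam
};

// Hypothetical helper: maps a results[] cell (1 = pass, 0 = fail) to a color.
// Assumption: anything else (null, undefined, ...) is shown as Neutral.
function colorForResult(result) {
    if (result === 1) return SEMANTIC_COLORS.pass;
    if (result === 0) return SEMANTIC_COLORS.fail;
    return SEMANTIC_COLORS.neutral;
}

console.log([1, 0, null].map(colorForResult));
```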

2. Categorical Palette (Charts & Comparison)

Use this sequence when plotting multiple models or categories. The colors are chosen to contrast well with the warm background.

| Sequence | Color | Hex |
|---|---|---|
| 1 | Cocoa | `#5D4037` |
| 2 | Gold | `#FFB74D` |
| 3 | Teal | `#26A69A` |
| 4 | Plum | `#AB47BC` |
| 5 | Slate | `#78909C` |
| 6 | Terra | `#FF7043` |

Visualization Tip: Avoid using pure black (#000000) for charts. Use Cocoa Dark (#3E2723) or a darker slate (#455A64) for axes and text.
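One way to apply the sequence when plotting is to index into the palette by series position. In this sketch, CATEGORICAL_PALETTE and colorForSeries are hypothetical names, and wrapping around after the sixth series is an assumption, since the palette defines only six colors.

```javascript
// Categorical palette in the documented order:
// Cocoa, Gold, Teal, Plum, Slate, Terra.
const CATEGORICAL_PALETTE = ['#5D4037', '#FFB74D', '#26A69A', '#AB47BC', '#78909C', '#FF7043'];

// Hypothetical helper: returns the color for the i-th series (0-based).
// Assumption: colors repeat from the start once the palette is exhausted.
function colorForSeries(index) {
    return CATEGORICAL_PALETTE[index % CATEGORICAL_PALETTE.length];
}

console.log(colorForSeries(0)); // #5D4037
console.log(colorForSeries(6)); // #5D4037 (wraps around)
```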

3. Sequential Scale (Heatmaps)

For visualizing density or score magnitude (Low → High).


| Low | | | | High |
|---|---|---|---|---|
| `#D7CCC8` | `#A1887F` | `#795548` | `#5D4037` | `#3E2723` |
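For a heatmap, a score can be mapped onto the five steps of this scale. The sketch below assumes scores are percentages from 0 to 100 split into five equal-width buckets; both the bucketing rule and the names SEQUENTIAL_SCALE and colorForScore are illustrative choices, not something this README prescribes.

```javascript
// Sequential scale from Low to High, as listed above.
const SEQUENTIAL_SCALE = ['#D7CCC8', '#A1887F', '#795548', '#5D4037', '#3E2723'];

// Hypothetical helper: maps a 0-100 score to one of the five steps.
// Assumption: equal-width buckets of 20 points; out-of-range values are clamped.
function colorForScore(score) {
    const bucket = Math.floor(score / 20);
    const index = Math.max(0, Math.min(SEQUENTIAL_SCALE.length - 1, bucket));
    return SEQUENTIAL_SCALE[index];
}

console.log(colorForScore(36.0)); // #A1887F
```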

4. Background Hierarchy

Maintain visual depth using these neutral tones.

Canvas: #FFFFFF (Main background)
Paper: #FAFAFA (Cards, Sections)
Divider: #EEEEEE (Lines, Borders)
