cocoabench/
├── index.html              # Home page (Introduction, Examples, Gallery)
├── leaderboard.html        # Leaderboard page
├── blog.html               # Blog page
└── assets/
    ├── css/
    │   └── style.css       # Global styles
    ├── js/
    │   ├── logo.js         # Logo animation
    │   ├── table.js        # Leaderboard table rendering
    │   ├── chart.js        # Performance bar chart
    │   ├── gallery.js      # Solution gallery component
    │   ├── examples.js     # Example showcase component
    │   └── toc.js          # Table of contents
    ├── data/
    │   ├── data.json       # Leaderboard & chart data
    │   └── examples.json   # Example tasks & model solutions
    └── logos/              # Model provider logos
        ├── anthropic.png
        ├── openai.png
        ├── google.png
        ├── meta.png
        ├── deepseek.png
        └── alibaba.png

Edit assets/data/data.json and add a new entry:
{
  "rank": 7,
  "model": "Model-Name",
  "organization": "CompanyName",
  "pass_rate": 50.0,
  "results": [1, 0, 1, ...], // N results, 1=pass, 0=fail
  "links": [
    "https://example.com/q1/model",
    ... // N links
  ]
}

Note: The organization name must match a logo filename in assets/logos/ (lowercase). Add a new logo file if introducing a new provider.
- Prepare a PNG logo (recommended ~100x100px)
- Name the file {organization_lowercase}.png
- Place it in assets/logos/
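The naming rule above can be sketched as a small helper. This is an illustration only: `logoPathFor` is a hypothetical name, not a function from the site's scripts, and it assumes the filename is simply the organization name lowercased with no other changes.

```javascript
// Hypothetical helper: derive the expected logo path from an entry's
// "organization" value. Assumes the only transformation is lowercasing.
function logoPathFor(organization) {
  return `assets/logos/${organization.toLowerCase()}.png`;
}

console.log(logoPathFor("OpenAI")); // assets/logos/openai.png
```

Running this against a new entry's organization before committing is a quick way to confirm that the logo file you added has the name the page will look for.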
Open the HTML files directly in a browser, or serve them from a local server:
python -m http.server 8000
# Then visit http://localhost:8000

| File | Location | Used By |
|---|---|---|
| data.json | assets/data/ | Leaderboard table, Performance bar chart |
| examples.json | assets/data/ | Example showcase, Solution gallery |

Each entry represents one model row in the leaderboard:
{
  "rank": 2,
  "model": "Gemini-3 Pro",
  "subtitle": "Thinking", // Optional - displays below model name
  "organization": "Google",
  "pass_rate": 36.0,
  "results": [0, 1, 1, 0, ...], // 1=pass, 0=fail for each question
  "links": ["https://...", ...] // Link for each result cell
}

| Field | Required | Description |
|---|---|---|
| rank | ✓ | Display rank (integer) |
| model | ✓ | Model display name |
| subtitle | ✗ | Secondary label (e.g., "extended", "Thinking"); renders in italic below the model name |
| organization | ✓ | Must match a logo filename in assets/logos/ |
| pass_rate | ✓ | Score percentage |
| results | ✓ | Array of 0/1 for each question |
| links | ✓ | Array of URLs (same length as results) |

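The field rules in the table can be checked mechanically before an entry is added. The following is a minimal sketch, not part of the site's code: `validateEntry` is a hypothetical helper, and it only enforces what the table states (required fields, integer rank, 0/1 results, and links matching results in length).

```javascript
// Hypothetical validator for one data.json entry.
// Returns a list of human-readable problems; an empty list means the entry
// satisfies the rules stated in the field table.
function validateEntry(entry) {
  const problems = [];
  const required = ["rank", "model", "organization", "pass_rate", "results", "links"];

  for (const field of required) {
    if (!(field in entry)) problems.push(`missing field: ${field}`);
  }
  if ("rank" in entry && !Number.isInteger(entry.rank)) {
    problems.push("rank must be an integer");
  }
  if (Array.isArray(entry.results)) {
    if (!entry.results.every((r) => r === 0 || r === 1)) {
      problems.push("results must contain only 0 or 1");
    }
    if (Array.isArray(entry.links) && entry.links.length !== entry.results.length) {
      problems.push("links must have the same length as results");
    }
  }
  return problems;
}

// Example with placeholder values (not real leaderboard data).
const sample = {
  rank: 7,
  model: "Model-Name",
  organization: "CompanyName",
  pass_rate: 50.0,
  results: [1, 0],
  links: ["https://example.com/q1/model"],
};
console.log(validateEntry(sample)); // [ 'links must have the same length as results' ]
```

The optional subtitle field is deliberately not checked, since the table marks it as not required.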
For the same model with different modes (e.g., GPT-4o vs GPT-4o with thinking), add separate entries:
{ "rank": 3, "model": "GPT-4o", "organization": "OpenAI", ... },
{ "rank": 4, "model": "GPT-4o", "subtitle": "Thinking", "organization": "OpenAI", ... }

Structure:
{
  "examples": [
    {
      "id": 1,
      "title": "Task Title",
      "type": "text" | "embed",
      "content": { ... },
      "answer": "...",
      "reasoning": "...",
      "model_solutions": {
        "Model-Key": {
          "status": "pass" | "fail",
          "solution": "Markdown content..."
        }
      }
    }
  ]
}

The model_solutions keys are looked up by gallery.js. To configure which models appear and their display names, edit getModelOrder() in assets/js/gallery.js:
function getModelOrder() {
  return [
    'Claude-3.5-Sonnet', // Simple: key = display name
    { name: 'GPT-4o', subtitle: 'Thinking', key: 'GPT-4o-thinking' }, // With subtitle
    'Gemini-2.0-Pro',
  ];
}

| Property | Description |
|---|---|
| name | Display name |
| subtitle | Optional secondary label |
| key | Key to match in model_solutions (defaults to name) |

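To make the two entry forms concrete, here is a minimal sketch of how a string entry and an object entry could be reduced to the same shape. `normalizeModelEntry` is a hypothetical helper written for illustration; the actual logic in gallery.js may differ. It assumes only what the table states: a plain string serves as both name and key, and key falls back to name when omitted.

```javascript
// Hypothetical normalizer for one getModelOrder() entry.
// Strings become { name, key } with key equal to name; objects keep their
// own values, with key defaulting to name and subtitle defaulting to null.
function normalizeModelEntry(entry) {
  if (typeof entry === "string") {
    return { name: entry, subtitle: null, key: entry };
  }
  return {
    name: entry.name,
    subtitle: entry.subtitle ?? null,
    key: entry.key ?? entry.name,
  };
}

const order = [
  "Claude-3.5-Sonnet",
  { name: "GPT-4o", subtitle: "Thinking", key: "GPT-4o-thinking" },
].map(normalizeModelEntry);

console.log(order[0].key); // Claude-3.5-Sonnet
console.log(order[1].key); // GPT-4o-thinking
```

With every entry in this shape, a solution can be fetched as `example.model_solutions[entry.key]` regardless of which form was used in the list.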
- In examples.json, add solutions with a unique key:

      "model_solutions": {
        "GPT-4o": { "status": "pass", "solution": "..." },
        "GPT-4o-thinking": { "status": "pass", "solution": "..." }
      }

- In gallery.js, update getModelOrder():

      return [
        { name: 'GPT-4o', key: 'GPT-4o' },
        { name: 'GPT-4o', subtitle: 'Thinking', key: 'GPT-4o-thinking' },
      ];

The design of CocoaBench follows a "Warm & Organic" aesthetic, mimicking the tones of cocoa, paper, and ink. When adding new charts or visualizations, please adhere to the following color system to maintain consistency.
Used for pass/fail states, heatmaps, and result grids. These should be distinct but not harsh.
Use this sequence when plotting multiple models or categories. The colors are chosen to contrast well with the warm background.
| Sequence | Color | Hex |
|---|---|---|
| 1 | Cocoa | #5D4037 |
| 2 | Gold | #FFB74D |
| 3 | Teal | #26A69A |
| 4 | Plum | #AB47BC |
| 5 | Slate | #78909C |
| 6 | Terra | #FF7043 |

Visualization Tip: Avoid using pure black (#000000) for charts. Use Cocoa Dark (#3E2723) or Slate (#455A64) for axes and text.
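As a sketch of how the categorical sequence might be applied in chart code, the constants below copy the hex values listed above. The names `CATEGORICAL_PALETTE`, `AXIS_COLOR`, and `seriesColor` are hypothetical, and wrapping around after the sixth color is an assumption for charts with more than six series, not a rule stated in this guide.

```javascript
// Categorical sequence from the color table, in plotting order.
const CATEGORICAL_PALETTE = [
  "#5D4037", // 1 Cocoa
  "#FFB74D", // 2 Gold
  "#26A69A", // 3 Teal
  "#AB47BC", // 4 Plum
  "#78909C", // 5 Slate
  "#FF7043", // 6 Terra
];

// Cocoa Dark, one of the two tones suggested for axes and text instead of pure black.
const AXIS_COLOR = "#3E2723";

// Hypothetical helper: pick the color for the n-th series (0-based).
// Assumption: the palette repeats once all six colors are used.
function seriesColor(index) {
  return CATEGORICAL_PALETTE[index % CATEGORICAL_PALETTE.length];
}

console.log(seriesColor(0)); // #5D4037
console.log(seriesColor(2)); // #26A69A
```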
For visualizing density or score magnitude (Low → High).
| #D7CCC8 | #A1887F | #795548 | #5D4037 | #3E2723 |
|---|---|---|---|---|
| Low |  |  |  | High |

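One possible way to use the sequential scale is to bucket a normalized score into the five steps. This is a sketch under assumptions: `sequentialColor` is a hypothetical helper, the input is taken to be a value between 0 and 1, and equal-width buckets are just one reasonable binning choice.

```javascript
// Sequential scale from low to high, copied from the values above.
const SEQUENTIAL_SCALE = ["#D7CCC8", "#A1887F", "#795548", "#5D4037", "#3E2723"];

// Hypothetical helper: map a value in [0, 1] to one of the five steps.
// Values outside the range are clamped; buckets are equal width.
function sequentialColor(value) {
  const clamped = Math.min(1, Math.max(0, value));
  const index = Math.min(
    SEQUENTIAL_SCALE.length - 1,
    Math.floor(clamped * SEQUENTIAL_SCALE.length)
  );
  return SEQUENTIAL_SCALE[index];
}

console.log(sequentialColor(0));   // #D7CCC8
console.log(sequentialColor(0.5)); // #795548
console.log(sequentialColor(1));   // #3E2723
```

A percentage such as pass_rate would need to be divided by 100 before being passed in.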
Maintain visual depth using these neutral tones.
- Canvas: #FFFFFF (Main background)
- Paper: #FAFAFA (Cards, Sections)
- Divider: #EEEEEE (Lines, Borders)