A streamlined, crab-themed benchmarking leaderboard for comparing LLMs as OpenClaw coding agents. Built with Next.js 16, React 19, and Tailwind CSS.
- Clean tabbed interface for Success Rate, Speed, and Cost views
- Crab-themed rankings - Lobster for #1, Crab for #2, Shrimp for #3
- Visual bar chart showing model performance at a glance
- Simplified table with essential metrics only
- Color-coded scores (green/yellow/red) for quick assessment
- Provider color coding with brand colors (Anthropic, OpenAI, Google, etc.)
- Minimal, data-focused design inspired by SkateBench
- Circular score gauge showing overall performance percentage
- Category breakdown with scores by task type (Calendar, Coding, Research, etc.)
- Expandable task cards with detailed criterion-by-criterion scoring
- Grading type badges (Automated, LLM Judge, Hybrid)
- Status indicators for success, warnings, and timeouts
- Metadata display including OpenClaw version and submission timestamp
- Dark, minimal theme with focus on data clarity
- Crab and lobster emojis throughout for a fun, themed experience
- Streamlined layout - removed unnecessary sections
- Tab-based navigation for different metric views
- Simple bar charts for quick visual comparison
- Clean typography with monospace code fonts
- Mobile-responsive design
- Framework: Next.js 16 (App Router)
- UI Library: React 19.2
- Styling: Tailwind CSS with custom design tokens
- Components: shadcn/ui (Radix UI primitives)
- Icons: Lucide React
- Date Formatting: date-fns
- Type Safety: TypeScript
├── app/
│ ├── page.tsx # Main leaderboard page
│ ├── submission/[id]/page.tsx # Submission detail page
│ ├── layout.tsx # Root layout
│ └── globals.css # Global styles & theme
├── components/
│ ├── leaderboard-table.tsx # Main table component
│ ├── task-breakdown.tsx # Expandable task list
│ ├── score-gauge.tsx # Circular score visualization
│ └── ui/ # shadcn/ui components
├── lib/
│ ├── types.ts # TypeScript interfaces
│ ├── mock-data.ts # Sample benchmark data
│ └── utils.ts # Utility functions
- Node.js 18+
- pnpm (recommended)
- Install dependencies:
  pnpm install
- Run the development server:
  pnpm dev
- Open your browser: Navigate to http://localhost:3000
pnpm build
pnpm start

This project is a Next.js App Router app and should be deployed with Cloudflare Pages, not Workers. The error "Missing entry-point to Worker script or to assets directory" occurs when Wrangler is run without a Worker entry-point.
- Framework preset: Next.js
- Build command: pnpm run build
- Build output directory: .next
- Root directory: /
- Do not use npx wrangler deploy for Pages. Pages builds and deploys automatically from the repo.
- If you already created a Workers project, create a new Pages project and connect this repo.
- Node.js 18+ and pnpm are supported by Pages; no extra config is required for this repo.
Currently, the app uses mock data from lib/mock-data.ts. To connect to real benchmark APIs:
Create files in app/api/:
// app/api/leaderboard/route.ts
export async function GET() {
  const response = await fetch("YOUR_API_ENDPOINT/leaderboard");
  const data = await response.json();
  return Response.json(data);
}
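
The handler above hard-codes YOUR_API_ENDPOINT. As a minimal sketch, assuming the base URL is supplied through configuration (for example an environment variable), a hypothetical helper `upstreamUrl` could build the upstream address and percent-encode dynamic segments such as a submission id. The helper name and the example domain are illustrative and not part of this repo.

```typescript
// Hypothetical helper (not in this repo): join a base URL and path segments.
// Trailing slashes on the base are dropped and every segment is
// percent-encoded, so an id containing "/" or spaces cannot change the route.
function upstreamUrl(base: string, ...segments: string[]): string {
  const trimmedBase = base.replace(/\/+$/, "");
  const path = segments.map((segment) => encodeURIComponent(segment)).join("/");
  return `${trimmedBase}/${path}`;
}

// Example: the base would typically come from configuration.
const leaderboardUrl = upstreamUrl("https://your-api-domain.com/", "leaderboard");
const resultUrl = upstreamUrl("https://your-api-domain.com", "results", "run 42");
```

Encoding each segment separately keeps the route shape fixed even when an id comes straight from the request path.
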
// app/api/results/[id]/route.ts
export async function GET(
  request: Request,
  { params }: { params: Promise<{ id: string }> },
) {
  const { id } = await params;
  const response = await fetch(`YOUR_API_ENDPOINT/results/${id}`);
  const data = await response.json();
  return Response.json(data);
}

Update components to fetch from API routes instead of importing mock data:
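// In app/page.tsx
// Note: a relative URL like this resolves only in the browser (Client Components);
// a server-side fetch needs an absolute URL.
const response = await fetch("/api/leaderboard");
const entries = await response.json();

Create .env.local: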
NEXT_PUBLIC_API_URL=https://your-api-domain.com
API_SECRET_KEY=your-secret-key

Edit app/globals.css to customize the color scheme:
:root {
  --background: 0 0% 7%;
  --primary: 217 91% 60%;
  --success: 142 71% 45%;
  --warning: 38 92% 50%;
  /* ... */
}

Edit lib/types.ts to add/modify provider brand colors:
export const PROVIDER_COLORS: Record<string, string> = {
  anthropic: "#d97757",
  openai: "#10a37f",
  google: "#4285f4",
  // Add more providers
};

Add new task categories in lib/types.ts:
export const CATEGORY_ICONS: Record<string, string> = {
  calendar: "📅",
  coding: "💻",
  // Add more categories
};

Responsive table with sorting, filtering, and mobile card layout.
Props:
- entries: LeaderboardEntry[] - Array of leaderboard entries
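
The sorting and filtering mentioned above can be sketched as two pure functions. This is only an illustration: `EntrySketch` and its fields are assumptions made for the example, and the real `LeaderboardEntry` in lib/types.ts may be shaped differently.

```typescript
// Hypothetical row shape for illustration; not the real LeaderboardEntry.
interface EntrySketch {
  model: string;
  provider: string;
  successRate: number; // percentage, higher is better
  avgSeconds: number; // lower is better
  costUsd: number; // lower is better
}

type Metric = "successRate" | "avgSeconds" | "costUsd";

// Return a sorted copy: success rate descending, speed and cost ascending.
function sortEntries(entries: EntrySketch[], metric: Metric): EntrySketch[] {
  const direction = metric === "successRate" ? -1 : 1;
  return [...entries].sort((a, b) => direction * (a[metric] - b[metric]));
}

// Keep rows from one provider (case-insensitive); an empty filter keeps all rows.
function filterByProvider(entries: EntrySketch[], provider: string): EntrySketch[] {
  const wanted = provider.trim().toLowerCase();
  if (wanted === "") return entries;
  return entries.filter((entry) => entry.provider.toLowerCase() === wanted);
}
```

Copying the array before sorting leaves the `entries` prop untouched, which matters in React because props should not be mutated.
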
Expandable accordion showing detailed task results with criterion breakdowns.
Props:
- tasks: TaskResult[] - Array of task results
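
One way a criterion breakdown could roll up into a per-task score is sketched here. The `CriterionSketch` and `TaskSketch` shapes are assumptions made for the example, not the actual `TaskResult` interface, and the real component may aggregate differently.

```typescript
// Hypothetical shapes for illustration; the real TaskResult may differ.
interface CriterionSketch {
  name: string;
  score: number;
  maxScore: number;
}

interface TaskSketch {
  id: string;
  criteria: CriterionSketch[];
}

// Sum criterion scores into a task total and a rounded percentage.
// A task whose criteria have no points available reports 0 percent.
function summarizeTask(task: TaskSketch): { score: number; maxScore: number; percent: number } {
  const score = task.criteria.reduce((sum, c) => sum + c.score, 0);
  const maxScore = task.criteria.reduce((sum, c) => sum + c.maxScore, 0);
  const percent = maxScore > 0 ? Math.round((score / maxScore) * 100) : 0;
  return { score, maxScore, percent };
}
```
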
Circular progress indicator showing overall score percentage.
Props:
- score: number - Current score
- maxScore: number - Maximum possible score
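
A circular gauge is commonly drawn as an SVG circle whose `stroke-dasharray` equals the circumference and whose `stroke-dashoffset` hides the unfilled part. The sketch below shows that arithmetic for `score` and `maxScore`. The function names and the `radius` parameter are assumptions for illustration, and components/score-gauge.tsx may compute this differently.

```typescript
// Clamp score / maxScore to a fraction between 0 and 1.
// A non-positive maxScore or a non-finite score yields 0.
function scoreFraction(score: number, maxScore: number): number {
  if (!(maxScore > 0) || !isFinite(score)) return 0;
  return Math.min(1, Math.max(0, score / maxScore));
}

// Compute stroke values for a circle of the given radius.
// dashOffset is the length of the ring left unfilled.
function gaugeStroke(
  score: number,
  maxScore: number,
  radius: number,
): { circumference: number; dashOffset: number; percent: number } {
  const circumference = 2 * Math.PI * radius;
  const fraction = scoreFraction(score, maxScore);
  return {
    circumference,
    dashOffset: circumference * (1 - fraction),
    percent: Math.round(fraction * 100),
  };
}
```

In JSX the values would be passed as `strokeDasharray={circumference}` and `strokeDashoffset={dashOffset}` on the `<circle>` element.
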
See lib/types.ts for complete TypeScript interfaces:
- LeaderboardEntry - Main leaderboard row data
- TaskResult - Individual task performance
- Submission - Complete submission with all task details
- Semantic HTML elements (<main>, <header>, <table>)
- Keyboard navigation support
- WCAG AA contrast ratios
- Screen reader friendly labels
- Color + icon indicators (not color alone)
- Server-side rendering with Next.js 16
- Optimized images and assets
- Code splitting and lazy loading
- Turbopack for faster builds
- Chrome/Edge (latest)
- Firefox (latest)
- Safari (latest)
- Mobile browsers (iOS Safari, Chrome Mobile)
MIT
Built with v0 for OpenClaw