Release v0.1.0 · tinycomputerai/bun-server-bench

First tagged release of bun-server-bench — a benchmark and trajectory dataset for evaluating AI coding agents on real-world Bun server engineering tasks.

Contents

50 versioned tasks across 15+ backend categories (auth, databases, idempotency, rate limiting, websockets, …), each engineered so a plausible-but-wrong solution passes public tests and fails the hidden ones.
120 verified SFT trajectories + 120 patch records (full-credit runs only; hidden tests and reference solutions excluded).

Published artifacts

Hugging Face dataset: https://huggingface.co/datasets/tinycomputerai/bun-server-bench-trajectories
Harbor dataset: https://hub.harborframework.com/datasets/tinycomputerai/bun-server-bench

Release assets below: full source tarball, SFT + patch JSONL, and the manifest (counts + source commit).
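
After downloading the JSONL assets, a quick sanity check is to parse each file and compare its record count with the count reported in the manifest. The sketch below is a minimal illustration, not part of the release tooling: the helper names `load_jsonl` and `check_count` are hypothetical, and it assumes only that each asset is standard JSONL (one JSON object per line). It makes no assumption about the record schema or the manifest format.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSONL file into a list of records, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank or trailing empty lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so a truncated download is easy to spot.
                raise ValueError(f"{path}:{line_number}: invalid JSON ({exc})") from exc
    return records


def check_count(path, expected):
    """Return True when the file holds exactly `expected` records."""
    return len(load_jsonl(path)) == expected
```

For example, `check_count("sft.jsonl", 120)` would confirm the 120 SFT trajectories listed above; the file name here is a placeholder, so substitute the actual asset name from the release.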
