skills

Skills to guide Claude Code, Codex, and other coding agents on using the Weights & Biases AI developer platform to train models and build agents.

For model training

Log metrics and rich media during model training and fine-tuning
Track model training experiments
Analyze runs and experiment results to understand how the model is learning
Tune hyperparameters

For agent building

Trace agentic AI applications
Analyze traces and classify them into failure modes
Evaluate models with labeled datasets
Run online evaluations for production monitoring

Getting Started

npx skills add wandb/skills

Then set your W&B API key:

export WANDB_API_KEY=<your-key>
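
Before launching an agent, it can help to confirm that the key is actually visible to the processes it starts. A minimal Python sketch follows; the helper name `wandb_key_is_set` is hypothetical and not part of any W&B package.

```python
import os


def wandb_key_is_set(env=None):
    """Return True if WANDB_API_KEY is present and non-empty in `env`."""
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if __name__ == "__main__":
    if wandb_key_is_set():
        print("WANDB_API_KEY is set")
    else:
        print("WANDB_API_KEY is missing; run: export WANDB_API_KEY=<your-key>")
```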

`npx skills` is a utility for installing skills into major coding agent CLIs. Use `--global` to install for all projects, or `--agent <name>` to target a specific agent. See the `npx skills` docs for more details.

Available Skills

Skill	Description	Status
`wandb-primary`	Comprehensive primary skill for agents working with Weights & Biases. Covers both the W&B and Weave SDKs.	claude-code: 32/35 (91%)

Benchmarks

We maintain a growing internal benchmark suite that evaluates each skill across coding agents and task categories. Skills are evaluated automatically on every merge to main.

Category	Tasks	Claude Code (`sonnet4.6`)	Codex (`gpt-5.3-codex`)
Weave analysis	26	97%*	63%*
Weave tooling	11	95%*	83%*
Model training	8	90%*	85%*
LLM finetuning & RL analysis	14	72%*	86%*
Failure & outlier detection	8	86%*	63%*

*Pass rates are +/- 3%. Many tasks span multiple categories.
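
To make the ±3% footnote concrete, the sketch below turns each reported pass rate into a low/high band, clamped to 0–100%. The numbers are copied from the table above. The helper `band` is hypothetical, and reading the footnote as a symmetric margin in percentage points is an assumption made for illustration.

```python
# Pass rates (%) copied from the benchmark table: (Claude Code, Codex).
PASS_RATES = {
    "Weave analysis": (97, 63),
    "Weave tooling": (95, 83),
    "Model training": (90, 85),
    "LLM finetuning & RL analysis": (72, 86),
    "Failure & outlier detection": (86, 63),
}

MARGIN = 3  # percentage points, from the table footnote (assumed symmetric)


def band(rate, margin=MARGIN):
    """Return the (low, high) band around `rate`, clamped to [0, 100]."""
    return max(0, rate - margin), min(100, rate + margin)


for category, (claude, codex) in PASS_RATES.items():
    print(f"{category}: Claude Code {band(claude)}, Codex {band(codex)}")
```

Under this reading, the two bands for Model training overlap (87–93 versus 82–88), so a gap of that size may not be meaningful.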

Contributing

See CONTRIBUTING.md.
