abronte/wordlebench

WordleBench

WordleBench is a benchmark for evaluating LLMs on their ability to solve Wordle puzzles.
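
For readers unfamiliar with the game, the feedback that a Wordle solver has to reason about can be sketched as follows. This is a minimal illustration of the standard Wordle rules, not code taken from this repository; the function name `score_guess` and the `G`/`Y`/`-` encoding are assumptions made for the example.

```python
from collections import Counter


def score_guess(guess: str, answer: str) -> str:
    """Return per-letter feedback for a guess (hypothetical helper).

    'G' = right letter, right position (green)
    'Y' = letter is in the answer, wrong position (yellow)
    '-' = letter not available in the answer (gray)
    """
    if len(guess) != len(answer):
        raise ValueError("guess and answer must have the same length")

    feedback = ["-"] * len(guess)

    # Letters of the answer that were not matched exactly; only these
    # can still earn a yellow mark.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"
        elif remaining[g] > 0:
            # Consume one unmatched copy so repeated letters in the
            # guess are not over-credited.
            feedback[i] = "Y"
            remaining[g] -= 1

    return "".join(feedback)
```

For example, `score_guess("speed", "abide")` returns `"--Y-Y"`: only the first `e` is marked yellow because the answer contains a single `e`.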

Results

https://www.wordlebench.com/

Running the Benchmark

# Install dependencies
uv sync

# Run the benchmark
uv run python main.py

# Analyze results
uv run python analyze.py
