Issues: clp-research/clembench
Issues list
[wordle] wordle h/h has a bug -- check if wordle clemgame does as well
#91 opened May 6, 2024 by davidschlangen
[games] Do another round of prompt criticism / unification. For clembench 1.6
#87 opened May 1, 2024 by davidschlangen
[leaderboard] rethink approach to deciding on which models to test
#83 opened Apr 19, 2024 by davidschlangen
[evaluation] check bencheval score if certain episodes don't have scores file
#80 opened Apr 16, 2024 by sherzod-hakimov
[framework] Integrate External API Backends (Together) that support many models
#78 opened Apr 11, 2024 by sherzod-hakimov
model responses differ when run locally and when run via fastchat API
#70 opened Mar 19, 2024 by davidschlangen
[structure] separate framework from benchmark (from games?) / make pip installable
#62 opened Feb 29, 2024 by davidschlangen
[framework] remove special handling of programmatic Players
Labels: breaking (Framework users need to adjust their code); enhancement (New feature or request)
#60 opened Feb 29, 2024 by phisad
[framework] future-proof: allow any kind of data in conversation
#56 opened Feb 21, 2024 by davidschlangen
"relaxed parsing mode" for games? relaxed move rules...
Labels: enhancement (New feature or request); games
#27 opened Nov 27, 2023 by davidschlangen
wordle game should use standard method for checking whether it runs over max context size
Labels: enhancement (New feature or request)
#23 opened Nov 21, 2023 by davidschlangen