The paper and benchmark for "ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs" will be uploaded soon.
forked from adiSimhi/ManagerBench
technion-cs-nlp/ManagerBench