Taboo

OpenReward Environment

Description

Taboo is an environment for evaluating agents on creative communication and word description under constraints. This environment wraps the Taboo implementation from TextArena, a framework for text-based game environments.

Capabilities

  • Creative description generation
  • Semantic reasoning with constraints
  • Team coordination through indirect communication
  • Vocabulary selection and paraphrasing

Compute Requirements

Taboo does not require a sandbox. It has minimal compute requirements.

License

MIT.

Tasks

There are two splits: train (1650 tasks) and test (1650 tasks). Each split contains 50 tasks for each of 33 variants:

  • Taboo-v0
  • Taboo-v0-train
  • Taboo-v0-raw
  • Taboo-v0-animals
  • Taboo-v0-animals-train
  • Taboo-v0-animals-raw
  • Taboo-v0-cars
  • Taboo-v0-cars-train
  • Taboo-v0-cars-raw
  • Taboo-v0-city/country
  • Taboo-v0-city/country-train
  • Taboo-v0-city/country-raw
  • Taboo-v0-food
  • Taboo-v0-food-train
  • Taboo-v0-food-raw
  • Taboo-v0-literature
  • Taboo-v0-literature-train
  • Taboo-v0-literature-raw
  • Taboo-v0-people
  • Taboo-v0-people-train
  • Taboo-v0-people-raw
  • Taboo-v0-tv
  • Taboo-v0-tv-train
  • Taboo-v0-tv-raw
  • Taboo-v0-long
  • Taboo-v0-long-train
  • Taboo-v0-long-raw
  • Taboo-v0-full
  • Taboo-v0-full-train
  • Taboo-v0-full-raw

Each task is seeded for reproducibility.

Reward Structure

This is a sparse-reward environment. TextArena's native rewards in {-1, 0, 1} are mapped to {0.0, 0.5, 1.0} via (raw + 1) / 2, so -1 becomes 0.0, 0 becomes 0.5, and 1 becomes 1.0.
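
The mapping can be illustrated with a minimal sketch. The function name `map_reward` and the input validation are assumptions made for illustration; they are not taken from this repository or from TextArena.

```python
def map_reward(raw: int) -> float:
    """Rescale a raw reward in {-1, 0, 1} to {0.0, 0.5, 1.0}.

    Hypothetical helper: only the formula (raw + 1) / 2 comes from the README.
    """
    if raw not in (-1, 0, 1):
        # Assumption: any other value is treated as an error.
        raise ValueError(f"unexpected raw reward: {raw}")
    return (raw + 1) / 2


# Each raw value and its rescaled counterpart.
for raw in (-1, 0, 1):
    print(raw, "->", map_reward(raw))
# -1 -> 0.0
# 0 -> 0.5
# 1 -> 1.0
```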

We do not use LLM graders for this environment; reward is determined programmatically.

Data

Game state is generated procedurally by the TextArena engine using seeded randomness. No external data files are required.

Tools

Agents are given a single tool:

  • send_message(message): Send a clue or guess. As Clue Giver, describe the word without using taboo words.
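
An agent may want to screen a clue locally before passing it to send_message. The sketch below is a hypothetical pre-check, not part of this environment or of TextArena: the helper name `violates_taboo` and the case-insensitive substring rule are assumptions, and the environment's own matching rule may differ.

```python
def violates_taboo(clue: str, taboo_words: list[str]) -> bool:
    """Return True if any taboo word appears in the clue.

    Hypothetical helper. Uses a case-insensitive substring match, so
    "barks" is flagged when "bark" is a taboo word.
    """
    lowered = clue.lower()
    return any(word.lower() in lowered for word in taboo_words)


# Example with made-up taboo words for illustration.
taboo = ["bark", "puppy"]
print(violates_taboo("It barks a lot", taboo))  # True
print(violates_taboo("A loyal pet", taboo))     # False
```

A substring match errs on the side of caution: it will also reject clues that merely contain a taboo word inside a longer word.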

Time Horizon

Taboo is a multi-turn environment.

Environment Difficulty

Medium. Agents play as the Clue Giver for Team 0, giving clues to help their teammate guess the target word while avoiding the use of taboo words.

Other Environment Requirements

This environment requires an OpenAI API key (passed via secrets) to power the LLM opponents.

Safety

Agents in Taboo interact only with a word description game and have no access to external systems, the internet, or sensitive data. The environment does not present safety risks.

Citations

@software{textarena2024,
  author    = {Guertler, Leon and Banting, Wilfried and Pignatelli, Eduardo},
  title     = {TextArena},
  year      = {2024},
  publisher = {GitHub},
  url       = {https://github.com/LeonGuertler/TextArena}
}
