# Getting Started with Codenames Bot

This notebook demonstrates the basic usage of the Codenames multi-agent environment and agents.

## Overview

The Codenames bot supports:
- **Multi-agent environments**: `WordBatchEnv` (word-based) and `VectorBatchEnv` (embedding-based)
- **Batched execution**: Run multiple games in parallel with `batch_size > 1`
- **Agent types**: Spymasters (give clues) and Guessers (select tiles)
- **Fixed agent IDs**: `["red_spy", "red_guess", "blue_spy", "blue_guess"]`

In [10]:
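import numpy as np
import os
import sys

# Add the parent of the current working directory to the import path so the
# `envs` and `agents` packages used below can be imported from this notebook.
sys.path.append(os.path.dirname(os.getcwd()))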

In [11]:
from envs.word_batch_env import WordBatchEnv
from envs.vector_batch_env import VectorBatchEnv
from agents.spymaster import RandomSpymaster, EmbeddingSpymaster, SpymasterParams
from agents.guesser import RandomGuesser, EmbeddingGuesser, GuesserParams

## 1. Creating a WordBatchEnv

The `WordBatchEnv` is a word-based environment where clues are strings and boards contain word lists.

In [12]:
# Create environment with batch_size=1 (single game)
env = WordBatchEnv(batch_size=1, seed=42)

print(f"Agent IDs: {env.agent_ids}")
print(f"Batch size: {env.batch_size}")
print(f"Board size: {env.board_size}")

Agent IDs: ['red_spy', 'red_guess', 'blue_spy', 'blue_guess']
Batch size: 1
Board size: 25


## 2. Resetting the Environment

`reset()` returns a dictionary of observations keyed by agent ID. Each role sees different information:
- **Spymasters**: See the board words and the tile colors (they know which tiles are red, blue, neutral, or the assassin)
- **Guessers**: See the board words, the revealed tiles, and the current clue; their `colors` entry is `None`

In [13]:
obs_dict = env.reset(seed=42)

print("\n=== Red Spymaster Observation ===")
print(f"Words: {obs_dict['red_spy']['words'][0][:5]}...")  # First 5 words
print(f"Colors: {obs_dict['red_spy']['colors'][0][:5]}...")  # 0=red, 1=blue, 2=neutral, 3=assassin
print(f"Revealed: {obs_dict['red_spy']['revealed'][0][:5]}...")

print("\n=== Red Guesser Observation ===")
print(f"Words: {obs_dict['red_guess']['words'][0][:5]}...")
print(f"Colors: {obs_dict['red_guess']['colors']}")  # None - guessers don't see colors!
print(f"Current clue: {obs_dict['red_guess']['current_clue']}")


=== Red Spymaster Observation ===
Words: ['BAND', 'CONDUCTOR', 'AIR', 'CZECH', 'CRASH']...
Colors: [0 1 2 2 2]...
Revealed: [False False False False False]...

=== Red Guesser Observation ===
Words: ['BAND', 'CONDUCTOR', 'AIR', 'CZECH', 'CRASH']...
Colors: None
Current clue: ['']
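
The outputs above suggest that a guesser observation is essentially the spymaster observation with the colors hidden and a clue attached. A minimal sketch of that relationship, assuming plain dictionaries; `to_guesser_view` is a hypothetical helper written for illustration and is not part of the environment's API:

```python
def to_guesser_view(spymaster_obs, current_clue):
    """Hypothetical helper: derive a guesser-style view from a spymaster-style one."""
    # Shallow copy so the spymaster observation itself is left untouched.
    view = dict(spymaster_obs)
    # Guessers must not see the color assignment.
    view["colors"] = None
    # One clue string per game in the batch ('' before any clue is given).
    view["current_clue"] = list(current_clue)
    return view


spy_obs = {
    "words": [["BAND", "CONDUCTOR", "AIR"]],
    "colors": [[0, 1, 2]],
    "revealed": [[False, False, False]],
}
guess_obs = to_guesser_view(spy_obs, [""])
print(guess_obs["colors"])  # prints None
```

How the environment actually builds each observation may differ; the sketch only mirrors the keys printed in the cell above.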


## 3. Creating Agents

We'll create both random and embedding-based agents.

In [14]:
# Random agents (baseline)
red_spy = RandomSpymaster(team="red")
red_guess = RandomGuesser(team="red")
blue_spy = RandomSpymaster(team="blue")
blue_guess = RandomGuesser(team="blue")

# Embedding agents (use semantic similarity)
try:
    red_spy_emb = EmbeddingSpymaster(
        team="red",
        params=SpymasterParams(
            n_candidate_clues=20,
            risk_tolerance=2.0,
            seed=42
        )
    )
    red_guess_emb = EmbeddingGuesser(
        team="red",
        params=GuesserParams(
            similarity_threshold=0.0,
            seed=42
        )
    )
    print("Embedding agents created successfully!")
except RuntimeError as e:
    print(f"Embedding agents unavailable: {e}")
    print("Install sentence-transformers: pip install sentence-transformers")

Embedding agents created successfully!


## 4. Running a Game Loop

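The loop below plays one complete game with the random agents. On every step, all four agents submit an action, and `env.step` returns the updated observations, rewards, done flags, and info dictionaries, each keyed by agent ID.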

In [15]:
# Reset environment
env = WordBatchEnv(batch_size=1, seed=42)
obs_dict = env.reset(seed=42)

# Run game
max_turns = 50
for turn in range(max_turns):
    # Check if game is over
    if env.game_state.game_over[0]:
        break
    
    # Get actions from all agents
    actions_dict = {
        "red_spy": red_spy.get_clue(obs_dict["red_spy"]),
        "red_guess": red_guess.get_guess(obs_dict["red_guess"]),
        "blue_spy": blue_spy.get_clue(obs_dict["blue_spy"]),
        "blue_guess": blue_guess.get_guess(obs_dict["blue_guess"]),
    }
    
    # Step environment
    obs_dict, rewards_dict, dones_dict, infos_dict = env.step(actions_dict)
    
    # Print turn info
    if turn % 5 == 0:
        print(f"Turn {turn}: Red cards left: {infos_dict['red_spy']['unrevealed_counts']['red'][0]}, "
              f"Blue cards left: {infos_dict['red_spy']['unrevealed_counts']['blue'][0]}")

# Game over
winner = infos_dict['red_spy']['winner'][0]
turn_count = infos_dict['red_spy']['turn_count'][0]
winner_name = "Red" if winner == 0 else "Blue" if winner == 1 else "None"

print(f"\nGame over! Winner: {winner_name}, Turns: {turn_count}")
print(f"Red total reward: {rewards_dict['red_spy'][0] + rewards_dict['red_guess'][0]}")
print(f"Blue total reward: {rewards_dict['blue_spy'][0] + rewards_dict['blue_guess'][0]}")

Turn 0: Red cards left: 9, Blue cards left: 8
Turn 5: Red cards left: 8, Blue cards left: 7

Game over! Winner: Blue, Turns: 2
Red total reward: -20.0
Blue total reward: 0.0
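
The game loop reads the winner as a per-game code (`0` for Red, `1` for Blue, anything else for no winner). When several games are played, those codes can be tallied into win counts. A minimal sketch using only the standard library; `winner_name` and `tally_winners` are hypothetical helpers, and the list of codes in the example is made up for illustration:

```python
from collections import Counter


def winner_name(code):
    """Map a winner code to a team name, using the same encoding as the game loop."""
    if code == 0:
        return "Red"
    if code == 1:
        return "Blue"
    return "None"


def tally_winners(winner_codes):
    """Count how many games each team won, given one winner code per game."""
    return Counter(winner_name(code) for code in winner_codes)


# Illustrative codes only, not results from a real run.
counts = tally_winners([0, 1, 1, -1])
print(counts["Blue"])  # prints 2
```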


## 5. Understanding Agent Actions

### Spymaster Actions
A spymaster action is a dictionary holding a clue word and a clue number, each with one entry per game in the batch:
```python
{
    "clue": ["FIRE"],  # List of strings (batch dimension)
    "clue_number": np.array([2])  # How many tiles relate to this clue
}
```

### Guesser Actions
A guesser action is a dictionary holding one tile index per game in the batch:
```python
{
    "tile_index": np.array([5])  # Index 0-24
}
```
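
Because every entry carries a batch dimension, a malformed action is easy to produce by accident. The sketch below checks a joint action dictionary against the formats described above before it is passed to `env.step`. `check_actions` is a hypothetical helper, not something the environment provides, and it assumes the 25-tile board and the four fixed agent IDs shown earlier:

```python
# Agent IDs and board size as listed earlier in this notebook.
AGENT_IDS = ["red_spy", "red_guess", "blue_spy", "blue_guess"]
BOARD_SIZE = 25


def check_actions(actions_dict, batch_size):
    """Hypothetical helper: list the problems found in a joint action dictionary."""
    problems = []
    for agent_id in AGENT_IDS:
        action = actions_dict.get(agent_id)
        if action is None:
            problems.append(f"missing action for {agent_id}")
            continue
        # Spymasters provide a clue and a number; guessers provide a tile index.
        if agent_id.endswith("_spy"):
            required = ("clue", "clue_number")
        else:
            required = ("tile_index",)
        for key in required:
            if key not in action:
                problems.append(f"{agent_id}: missing key '{key}'")
            elif len(action[key]) != batch_size:
                problems.append(f"{agent_id}: '{key}' should have {batch_size} entries")
        # Tile indices must point at a tile on the board.
        for index in action.get("tile_index", []):
            if not 0 <= int(index) < BOARD_SIZE:
                problems.append(f"{agent_id}: tile index {int(index)} is off the board")
    return problems


actions = {
    "red_spy": {"clue": ["FIRE"], "clue_number": [2]},
    "red_guess": {"tile_index": [5]},
    "blue_spy": {"clue": ["WATER"], "clue_number": [1]},
    "blue_guess": {"tile_index": [7]},
}
print(check_actions(actions, batch_size=1))  # prints []
```

The helper only uses `len()` and iteration, so it should work equally well with the NumPy arrays the agents return.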

In [16]:
# Example: Get a clue from random spymaster
env = WordBatchEnv(batch_size=1, seed=42)
obs_dict = env.reset(seed=42)

clue_action = red_spy.get_clue(obs_dict["red_spy"])
print(f"Spymaster clue action: {clue_action}")

guess_action = red_guess.get_guess(obs_dict["red_guess"])
print(f"Guesser action: {guess_action}")

Spymaster clue action: {'clue': ['CLUE'], 'clue_number': array([3])}
Guesser action: {'tile_index': array([4], dtype=int32)}


## 6. Batched Environments

You can run multiple games in parallel by setting `batch_size > 1`.

In [17]:
# Create batched environment (4 games in parallel)
batch_env = WordBatchEnv(batch_size=4, seed=42)
obs_dict = batch_env.reset(seed=42)

print(f"Batch size: {batch_env.batch_size}")
print(f"Number of games: {len(obs_dict['red_spy']['words'])}")  # 4 games
print(f"Words per game: {len(obs_dict['red_spy']['words'][0])}")  # 25 words each
print(f"Colors shape: {obs_dict['red_spy']['colors'].shape}")  # (4, 25)

# Agents automatically handle batched observations
clue_action = red_spy.get_clue(obs_dict["red_spy"])
print("\nBatched clue action:")
print(f"  Clues: {clue_action['clue']}")  # List of 4 clues
print(f"  Numbers: {clue_action['clue_number']}")  # Array of shape (4,)

Batch size: 4
Number of games: 4
Words per game: 25
Colors shape: (4, 25)

Batched clue action:
  Clues: ['TEST', 'TEST', 'RANDOM', 'WORD']
  Numbers: [3 3 2 2]
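
To make the batch dimension concrete, here is a rough sketch of how a guesser could produce one guess per game from a batched `revealed` mask. It is not the implementation of `RandomGuesser`; `random_unrevealed_guess` is a hypothetical function, and the fallback to index 0 when no tile is left is an assumption made for the sketch:

```python
import random


def random_unrevealed_guess(revealed_batch, seed=None):
    """Pick one unrevealed tile index per game in the batch."""
    rng = random.Random(seed)
    guesses = []
    for revealed in revealed_batch:
        # Indices of tiles that have not been revealed yet in this game.
        candidates = [i for i, is_revealed in enumerate(revealed) if not is_revealed]
        # Assumption: fall back to index 0 if every tile is already revealed.
        guesses.append(rng.choice(candidates) if candidates else 0)
    # A plain list is returned here; wrap it in np.array(...) if an array is required.
    return {"tile_index": guesses}


# Two games, each with a single unrevealed tile, so the choice is forced.
revealed_batch = [[True] * 24 + [False], [False] + [True] * 24]
print(random_unrevealed_guess(revealed_batch, seed=0))  # prints {'tile_index': [24, 0]}
```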


## 7. VectorBatchEnv

The `VectorBatchEnv` represents board tiles and clues as embedding vectors instead of words (384-dimensional in the output below). This is useful for training agents that work directly with semantic representations.

In [18]:
# Create vector environment
vec_env = VectorBatchEnv(batch_size=1, seed=42)
vec_obs = vec_env.reset(seed=42)

print("\n=== VectorBatchEnv Observations ===")
print(f"Board vectors shape: {vec_obs['red_spy']['board_vectors'].shape}")  # (1, 25, 384)
print(f"Colors shape: {vec_obs['red_spy']['colors'].shape}")  # (1, 25)

# Clues are also vectors in VectorBatchEnv
print(f"Current clue vector shape: {vec_obs['red_guess']['current_clue_vec'].shape}")  # (1, 384)


=== VectorBatchEnv Observations ===
Board vectors shape: (1, 25, 384)
Colors shape: (1, 25)
Current clue vector shape: (1, 384)
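
With vector observations, a simple guessing rule is to pick the unrevealed tile whose embedding is most similar to the clue vector. The sketch below uses cosine similarity on plain Python lists for a single game. It is an illustration only, not the logic of `EmbeddingGuesser`; `most_similar_tile` is a hypothetical function, and the `revealed` mask is assumed to be available alongside the board vectors:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_tile(clue_vec, board_vectors, revealed):
    """Return (index, score) of the unrevealed tile closest to the clue vector."""
    best_index, best_score = None, -math.inf
    for i, (tile_vec, is_revealed) in enumerate(zip(board_vectors, revealed)):
        if is_revealed:
            continue  # Revealed tiles cannot be guessed again.
        score = cosine_similarity(clue_vec, tile_vec)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Toy 2-dimensional example; real vectors here would have 384 dimensions.
clue = [1.0, 0.0]
board = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
revealed = [True, False, False]
index, score = most_similar_tile(clue, board, revealed)
print(index)  # prints 2
```

Tile 0 matches the clue exactly but is already revealed, so the best remaining candidate is tile 2.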


## Summary

You now know how to:
- Create `WordBatchEnv` and `VectorBatchEnv` environments
- Create random and embedding-based agents
- Run game loops with multi-agent actions
- Use batched environments for parallel execution

Next notebooks:
- **02_multi_agent_experiments.ipynb**: Run experiments with trackers
- **03_training_single_agent.ipynb**: Train a single agent with RL
- **04_parameter_sweeps.ipynb**: Optimize agent parameters