# Using LLMs as High-Level Planners for Multi-Agent Coordination

This notebook provides a step-by-step guide to customizing and interacting with the RL environment.

## For Submission
1. Fill in your code in `submit.py`. 
   - Add your code *only* in the TODO sections marked by the '#' delimiter lines. Do not modify any other parts of the script.
   - You should implement any helper functions/classes in a separate `helper.py` file and import them in `submit.py`.
1. Submit `out.log` and `results.csv` generated by the `submit.py` script.


In [45]:
# Import necessary libraries and modules
import gymnasium as gym
import multigrid.envs
import matplotlib.pyplot as plt
from agents import AgentCollection

%matplotlib inline
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


---
## Initial Plan Generation

An initial plan is generated by constructing a `PromptPlanner` with the LLM, the grid size, and the environment's initial observations, then calling its `initial_plan()` method.
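
The sketch below illustrates the plan shape that the notebook loop appears to expect: a mapping from agent index to an ordered list of actions, each exposing `serialize()`. The `MoveTo` and `SearchRegion` classes and their string format are hypothetical stand-ins for illustration only; the real action types in `planner` may look different.

```python
from dataclasses import dataclass


# Hypothetical action types; the real classes in `planner` may differ.
@dataclass
class MoveTo:
    x: int
    y: int

    def serialize(self) -> str:
        # Assumed wire format: a plain string instruction
        return f"move_to({self.x}, {self.y})"


@dataclass
class SearchRegion:
    x1: int
    y1: int
    x2: int
    y2: int

    def serialize(self) -> str:
        return f"search({self.x1}, {self.y1}, {self.x2}, {self.y2})"


# Assumed plan shape: agent index -> ordered list of high-level actions
plan = {
    0: [SearchRegion(1, 1, 5, 10)],
    1: [MoveTo(6, 1), SearchRegion(6, 1, 10, 10)],
}

# Flatten into one message per action, in the same order the notebook
# loop would pass them to `agents.tell`
messages = [
    {agent: action.serialize()}
    for agent, actions in plan.items()
    for action in actions
]
print(messages)
```

Keeping one message per action preserves the order of each agent's instructions, which matters when a move must finish before a search begins.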

In [46]:
from models import claude_llm as llm
from planner import PromptPlanner

In [48]:
N, M = 10, 2
env = multigrid.envs.EmptyEnvV2(
    size=N,  # Specify the size of the grid, N
    agents=M,  # Specify number of agents, M
    goals=[(3, 3), (5, 5)],  # Specify target positions for agents
    mission_space="All targets are contained within the region from (3, 3) to (5, 5).",
    render_mode="rgb_array",
    hidden_goals=True,
    # max_steps=50, # For debugging, you can set a maximum number of steps
)

# Always reset the environment before starting
observations, infos = env.reset()

# Create a group of M agents (must match the number of agents in the environment)
agents = AgentCollection(num=M)

planner = PromptPlanner(llm=llm, grid_size=N, observations=observations, infos=infos)

# Mission string shared by all agents (kept for inspection)
mission = observations[0]["mission"]

# Generate the initial plan and give each agent its high-level instructions
plan = planner.initial_plan()
print(plan)
for agent, actions in plan.items():
    for action in actions:
        agents.tell({agent: action.serialize()})

while not agents.all_idle() and not env.unwrapped.is_done():
    # Obtain the low-level action for current time step for all agents
    a = agents.act()

    # Step the environment with the actions
    observations, rewards, terminations, truncations, infos = env.step(a)
    print(observations)
    print(a, rewards, terminations, truncations)

    plan = planner.replan(
        agents, observations, rewards, terminations, truncations, infos
    )
    for agent, actions in plan.items():
        for action in actions:
            agents.tell({agent: action.serialize()})

    # Render the environment
    img = env.render()
    plt.figure(figsize=(5, 5))
    plt.imshow(img)
    plt.show()

env.close()

# Search Plan for 10×10 Grid with 2 Agents to Find 2 Targets

## Mission Analysis
- Grid size: 10×10
- Agents: 2 (starting at position (1,1))
- Targets: 2 (locations unknown, no specific hints provided)
- No additional constraints or probabilistic information given

## Strategy Overview
Since we have 2 agents and 2 targets in a 10×10 grid with no specific location hints, I'll implement a complete grid coverage strategy by dividing the grid into two equal rectangular regions. With equal-sized regions and no location hints, a simple horizontal split is most efficient.

## Region Assignment
- **Region 1**: Coordinates (1,1) to (5,10) - Left half of grid (50 cells)
- **Region 2**: Coordinates (6,1) to (10,10) - Right half of grid (50 cells)

## Agent Plans

### Agent 1
1. **Start**: Agent is already at (1,1), which is within Region 1
2. **Search**: From position (1,1), search the rectangular area from (1,1) to (5,10)

### Agent 2
1. **Movement**: Move from (1,1) to (6,1) by traveling east 

AttributeError: 'PromptPlanner' object has no attribute 'tracker'
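
The region assignment in the printed plan above (two equal column strips) can be reproduced with a small helper. This is a minimal sketch, assuming 1-indexed coordinates that run from 1 to `size` as in the printed plan; `split_columns` is a hypothetical name and is not part of `planner` or `multigrid`.

```python
def split_columns(size: int, num_agents: int):
    """Split a size x size grid (1-indexed) into vertical strips, one per agent.

    Returns a list of ((x_min, y_min), (x_max, y_max)) corner pairs.
    """
    base, extra = divmod(size, num_agents)
    regions = []
    x_start = 1
    for i in range(num_agents):
        # Spread any leftover columns over the first `extra` strips
        width = base + (1 if i < extra else 0)
        x_end = x_start + width - 1
        regions.append(((x_start, 1), (x_end, size)))
        x_start = x_end + 1
    return regions


regions = split_columns(10, 2)
print(regions)
```

With `size=10` and two agents this yields the same corners as Region 1 and Region 2 in the plan; with a size that does not divide evenly, the first strips are one column wider.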