# Wild Drone LLM Workshop: Part 2 - Drone Safari Agent

Welcome to Part 2 of the Wild Drone LLM Workshop! Now that you understand LLM fundamentals, let's build something more exciting: an AI agent that plays the Drone Safari game from natural language commands!

## What You'll Learn

1. **Game-Playing Agents**: Building agents that can control complex systems
2. **Natural Language Control**: Converting human commands to robot actions
3. **State Management**: How agents track and respond to changing environments
4. **Agent Specialization**: Creating domain-specific AI agents

## The Drone Safari Game

Drone Safari is a grid-based game in which you control a drone to photograph wildlife. Here are the rules:

### Objective:
Photograph all three animals (zebra, elephant, oryx) without crashing

### Game Rules:
- Don't crash into trees or fly outside boundaries
- Don't get too close to animals (they'll get scared!)
- Take each picture from exactly 2 cells away, with the animal directly in the direction you're facing
- You have only 5 pictures in total, so use them wisely!

### Available Actions:
- `move(direction)` - Move forward, left, right, or back (the manual example below passes `'f'` for forward)
- `turn(direction)` - Turn left or right  
- `take_picture()` - Take a photograph
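
To make the picture rule concrete, here is a minimal sketch of how the "exactly 2 cells away in the direction you're facing" check could be computed. The `(row, col)` coordinate convention, the compass names, and the helper names `photo_target` and `can_photograph` are assumptions made for illustration; the real game may represent positions and headings differently.

```python
from typing import Dict, Tuple

# Assumed convention (not confirmed by the game code): positions are (row, col)
# with row 0 at the top, and facing is one of these four compass names.
FACING_OFFSETS: Dict[str, Tuple[int, int]] = {
    "north": (-1, 0),
    "east": (0, 1),
    "south": (1, 0),
    "west": (0, -1),
}

PHOTO_DISTANCE = 2  # from the rules: exactly 2 cells away


def photo_target(position: Tuple[int, int], facing: str) -> Tuple[int, int]:
    """Return the cell a picture would capture from this position and heading."""
    d_row, d_col = FACING_OFFSETS[facing]
    row, col = position
    return (row + PHOTO_DISTANCE * d_row, col + PHOTO_DISTANCE * d_col)


def can_photograph(position: Tuple[int, int], facing: str,
                   animal_position: Tuple[int, int]) -> bool:
    """True only when the animal sits exactly on the target cell."""
    return photo_target(position, facing) == animal_position


print(photo_target((5, 5), "north"))           # (3, 5)
print(can_photograph((5, 5), "east", (5, 7)))  # True
print(can_photograph((5, 5), "east", (5, 6)))  # False: only 1 cell away
```

With only 5 pictures available, a check like this is worth running before every shot.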

Let's start by setting up the environment and exploring the game!

## Setup and Installation

Before we begin, let's set up our environment. If you completed Part 1, you should already have the required packages installed.

### API Key Setup

Make sure you have your **Gemini API key** configured from Part 1. If not, refer to the Part 1 notebook for setup instructions.

In [None]:
# Install required packages (if not already installed)
!pip install litellm matplotlib numpy python-dotenv

In [None]:
# Configuration and Setup
import os
import litellm
from dotenv import load_dotenv

# Load environment variables from .env file (if it exists)
load_dotenv()

# MODEL CONFIGURATION
# Get Gemini API key from environment variable
gemini_api_key = os.getenv('GEMINI_API_KEY')

if not gemini_api_key:
    print("GEMINI_API_KEY not found!")
    print("Please set your API key in one of these ways:")
    print("   1. Create a .env file with: GEMINI_API_KEY=your_key_here")
    print("   2. Set environment variable: export GEMINI_API_KEY=your_key_here")
    print("   3. Set it in this session: os.environ['GEMINI_API_KEY'] = 'your_key_here'")
    raise ValueError("GEMINI_API_KEY environment variable is required")

# Set the API key for litellm
os.environ['GEMINI_API_KEY'] = gemini_api_key

# Set the model to use throughout the workshop
model_name = "gemini/gemini-2.5-flash"  # Using stable model

try:
    # Import agent implementations
    from llm_agents import DroneAgent, StrategicPlanningAgent, MultiAgentSystem
    
    # Import the drone game
    from drone_safari_game import DroneSafariGame
    
    print("Part 2 setup complete!")
    print(f"Using model: {model_name}")
    print("API key loaded securely from environment")
    print("DroneAgent and game classes loaded successfully!")
    
except ImportError as e:
    print(f"Import error: {e}")
    print("Make sure all required files are in the same directory as this notebook")

# Exploring the Drone Safari Game

Let's first understand how the game works by exploring it manually, then we'll build an AI agent to play it.

In [None]:
# Create a game instance and explore it
game = DroneSafariGame()

# Show the initial game state
print("Initial Game State:")
print("="*40)
status = game.get_status()
print(f"Drone Position: {status['position']}")
print(f"Facing: {status['facing']}")
print(f"Pictures Remaining: {status['pictures_remaining']}")
print(f"Animals to Photograph: {list(status['animals_photographed'].keys())}")

# Visualize the game
game.visualize()

In [None]:
# Let's try some manual moves to understand the game
print("Testing Manual Game Actions:")
print("="*40)

# Move forward one step
result = game.move('f')
print(f"Move forward: {result}")

# Turn right
result = game.turn('right')  
print(f"Turn right: {result}")

# Check current status
status = game.get_status()
print(f"\nCurrent position: {status['position']}")
print(f"Facing: {status['facing']}")

# Visualize current state
game.visualize()

# Building the Drone Agent

Now let's create an AI agent that can play this game using natural language commands. The key is understanding how we translate natural language into specific game actions.
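
The translation step can be pictured as two pieces: a tool description the LLM sees, and a dispatcher that maps the tool call it returns onto a game method. The sketch below is a hedged illustration, not the code in `llm_agents.py`. `StubGame`, `make_dispatcher`, and `execute_tool_call` are hypothetical names, the tool names are borrowed from the example prompt later in this notebook, and the direction strings are assumed.

```python
import json
from typing import Callable, Dict, List, Tuple


class StubGame:
    """Hypothetical stand-in for DroneSafariGame so the sketch runs on its own."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, ...]] = []

    def move(self, direction: str) -> str:
        self.log.append(("move", direction))
        return f"moved {direction}"

    def turn(self, direction: str) -> str:
        self.log.append(("turn", direction))
        return f"turned {direction}"

    def take_picture(self) -> str:
        self.log.append(("take_picture",))
        return "picture taken"


# Tool description in the OpenAI-style function-calling format.
# The allowed direction values are an assumption.
MOVE_TOOL = {
    "type": "function",
    "function": {
        "name": "move_drone",
        "description": "Move the drone one cell.",
        "parameters": {
            "type": "object",
            "properties": {
                "direction": {
                    "type": "string",
                    "enum": ["forward", "left", "right", "back"],
                }
            },
            "required": ["direction"],
        },
    },
}


def make_dispatcher(game) -> Dict[str, Callable[..., str]]:
    """Map tool names (as the LLM would emit them) to game methods."""
    return {
        "move_drone": lambda direction: game.move(direction),
        "turn_drone": lambda direction: game.turn(direction),
        "take_picture_drone": lambda: game.take_picture(),
    }


def execute_tool_call(dispatcher: Dict[str, Callable[..., str]],
                      name: str, arguments_json: str) -> str:
    """Run one tool call; arguments arrive as a JSON string."""
    if name not in dispatcher:
        return f"unknown tool: {name}"
    args = json.loads(arguments_json or "{}")
    return dispatcher[name](**args)


game = StubGame()
dispatcher = make_dispatcher(game)
print(execute_tool_call(dispatcher, "turn_drone", '{"direction": "right"}'))  # turned right
print(execute_tool_call(dispatcher, "take_picture_drone", ""))                # picture taken
```

Returning a message for an unknown tool name, rather than raising, lets the result be fed back to the model so it can correct itself.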

In [None]:
# Understanding the Drone Agent

# Create a fresh game and agent (implementation details are in llm_agents.py)
game = DroneSafariGame()
drone_agent = DroneAgent(game, model=model_name)

print("Drone Agent created!")
print("\nKey Learning: The SYSTEM PROMPT that controls the drone:")
print("="*70)

# Show the actual system prompt used by the drone agent
drone_prompt = drone_agent.get_system_prompt()
print(drone_prompt)

print("="*70)
print("This prompt teaches the LLM:")
print("   • What the game rules are")
print("   • What actions are available") 
print("   • How to use the tool functions")
print("   • Strategic considerations")

print("\nAvailable Tools for the Drone:")
for tool in drone_agent.tools:
    # Fall back to an empty string so a tool without a docstring does not raise
    print(f"- {tool.__name__}: {(tool.__doc__ or '').strip()}")

print("\nReady to test natural language commands!")

In [None]:
# Test the drone agent with natural language commands
print("Testing Drone Agent with Natural Language:")
print("="*50)

# Test command 1 - Normal mode
command1 = "Move forward to start exploring"
print(f"Command: {command1}")
result1 = drone_agent.process_command(command1)
print(f"Result: {result1}")
print()

# Show current state after first command
game.visualize()

print("\n" + "="*60)
print("Debug Mode - See the agent's thought process:")
print("="*60)

# Test a multi-step command with debug mode
command2 = "Turn right and move forward twice"
print(f"Debug Command: {command2}")
result2 = drone_agent.process_command(command2, debug=True)

In [None]:
# Test more commands
print("Testing More Commands:")
print("="*30)

commands = [
    "Turn right to face east",
    "Move forward three times",
    "Take a picture to see what's there"
]

for i, command in enumerate(commands, 1):
    print(f"\nCommand {i}: {command}")
    result = drone_agent.process_command(command)
    print(f"Result: {result}")
    
    # Show game state after each command
    status = game.get_status()
    print(f"Position: {status['position']}, Facing: {status['facing']}, Pictures: {status['pictures_remaining']}")
    
    if status['game_over']:
        print("Game Over!")
        break

# Show final game state
game.visualize()

In [None]:
# Interactive testing - try your own commands!
print("Interactive Mode - Try Your Own Commands!")
print("="*50)
print("Example commands:")
print("- 'Navigate to the zebra'")
print("- 'Search for animals in the area'")
print("- 'Position for a photo of the elephant'")
print("- 'Avoid the trees and move safely'")

# Reset game for fresh testing
game = DroneSafariGame()
drone_agent = DroneAgent(game, model=model_name)

def test_command(command: str):
    """Helper function to test a command and show results"""
    print(f"\nCommand: {command}")
    result = drone_agent.process_command(command)
    print(f"Agent Response: {result}")
    
    status = game.get_status()
    print(f"Position: {status['position']}, Facing: {status['facing']}")
    print(f"Pictures remaining: {status['pictures_remaining']}")
    
    game.visualize()
    return status['game_over']

# Test a sample command
sample_command = "Move forward carefully to explore the area"
test_command(sample_command)

# Uncomment to test interactively:
# while not game.get_status()['game_over']:
#     user_command = input("\nYour command (or 'quit' to stop): ")
#     if user_command.lower() == 'quit':
#         break
#     test_command(user_command)

# Extra Exercise: Multi-Agent System (Advanced)

Ready for an advanced challenge? Let's create a multi-agent system where multiple AI agents work together to play the game autonomously!

## Multi-Agent Architecture

Our system will have two specialized agents:

### 1. Strategic Planning Agent
- **Role**: High-level game strategy and decision making
- **Responsibilities**: 
  - Analyze the game map and current state
  - Decide which animal to target next
  - Plan safe routes to avoid obstacles
  - Determine when and where to take pictures

### 2. Drone Control Agent (from above)
- **Role**: Execute specific game actions
- **Responsibilities**:
  - Translate natural language commands to game actions
  - Execute move, turn, and take_picture commands
  - Report action results back to the planning agent

## How They Work Together

1. **Planning Agent** observes the game state
2. **Planning Agent** decides what to do next in natural language
3. **Drone Control Agent** receives the command and executes the action
4. **Drone Control Agent** reports the result
5. Repeat until game completion

This separation of concerns makes the system more robust and easier to debug!
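
The five steps above can be sketched as a plain loop. This is a simplified illustration under stated assumptions, not the `MultiAgentSystem` implementation: `ScriptedPlanner`, `EchoExecutor`, and `run_loop` are hypothetical names, and both agents are stubs that never call an LLM.

```python
from typing import Dict, List


class ScriptedPlanner:
    """Hypothetical planner that replays fixed commands instead of calling an LLM."""

    def __init__(self, commands: List[str]) -> None:
        self._commands = list(commands)

    def decide_next_action(self, state: Dict) -> str:
        # A real planner would send `state` to an LLM; this stub ignores it.
        return self._commands.pop(0) if self._commands else "stop"


class EchoExecutor:
    """Hypothetical executor that records commands instead of moving a drone."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def process_command(self, command: str) -> str:
        self.executed.append(command)
        return f"done: {command}"


def run_loop(planner, executor, max_steps: int = 5) -> List[dict]:
    """Alternate planning and execution until the planner stops or the cap is hit."""
    step_log: List[dict] = []
    state: Dict = {"last_result": None}
    for step in range(1, max_steps + 1):
        command = planner.decide_next_action(state)   # steps 1 and 2: observe, decide
        if command == "stop":
            break
        result = executor.process_command(command)    # step 3: execute
        state["last_result"] = result                 # step 4: report back
        step_log.append({"step": step, "command": command, "result": result})
    return step_log                                   # step 5: the loop repeats


log = run_loop(ScriptedPlanner(["turn right", "move forward"]), EchoExecutor())
for entry in log:
    print(entry)
```

The `max_steps` cap matters once a real LLM is in the loop: it bounds both running time and API usage if the planner never decides to stop.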

In [None]:
# Understanding Strategic Planning Agent

# Create the strategic planning agent (implementation in llm_agents.py)
strategic_agent = StrategicPlanningAgent(model=model_name)

print("Strategic Planning Agent Created!")
print("\nKey Learning: The STRATEGIC PLANNING PROMPT:")
print("="*70)

# Show the strategic planning system prompt
strategic_prompt = strategic_agent.get_system_prompt()
print(strategic_prompt)

print("="*70)
print("This prompt teaches the strategic agent to:")
print("   • Analyze the game state comprehensively")
print("   • Plan efficient routes to all animals")
print("   • Consider safety and picture limitations")
print("   • Generate clear commands for the drone agent")

# Test the strategic planning
game = DroneSafariGame()
print("\nStrategic Agent Analysis Example:")
print("="*50)

decision = strategic_agent.decide_next_action(game)
print(f"Strategic Decision: {decision}")

# Show the game state
game.visualize()
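
Before a planning agent can reason about the game, the state has to be turned into text the LLM can read. Here is a minimal sketch of that step, using the status keys shown earlier in this notebook. The helper name `format_status`, the sample values, and the assumption that `animals_photographed` maps each animal name to a boolean are illustrative only.

```python
def format_status(status: dict) -> str:
    """Render a game status dict as a short text block for an LLM prompt."""
    photographed = status["animals_photographed"]
    # Assumption: values are booleans (True once the animal has been photographed)
    done = sorted(name for name, taken in photographed.items() if taken)
    todo = sorted(name for name, taken in photographed.items() if not taken)
    lines = [
        f"Position: {status['position']}",
        f"Facing: {status['facing']}",
        f"Pictures remaining: {status['pictures_remaining']}",
        f"Photographed: {', '.join(done) or 'none'}",
        f"Still needed: {', '.join(todo) or 'none'}",
    ]
    return "\n".join(lines)


# Hypothetical status values, made up for this example
sample_status = {
    "position": (3, 4),
    "facing": "north",
    "pictures_remaining": 4,
    "animals_photographed": {"zebra": True, "elephant": False, "oryx": False},
    "game_over": False,
}
print(format_status(sample_status))
```

Listing what is still needed, not just what is done, gives the planner an explicit goal to work toward on every turn.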

In [None]:
# Understanding Multi-Agent Coordination

# Create the multi-agent system (implementation in llm_agents.py)
game = DroneSafariGame()
multi_agent_system = MultiAgentSystem(game, model=model_name)

print("Multi-Agent System Created!")
print("="*60)
print("Strategic Agent: Analyzes game state and makes high-level decisions")
print("Drone Agent: Executes specific game actions based on natural language")
print("Coordination: Strategic agent commands → Drone agent actions")

print("\nKey Concept: Agent Specialization")
print("- Strategic Agent: 'What should we do?' (Planning)")
print("- Drone Agent: 'How do we do it?' (Execution)")
print("- This separation makes the system more robust and easier to debug")

print("\nThe system can now play the game autonomously!")
print("Each step involves:")
print("1. Strategic agent analyzes current state")
print("2. Strategic agent decides next action in natural language")
print("3. Drone agent receives command and executes it")
print("4. Repeat until game completion")

print("\nReady to run autonomous gameplay!")

In [None]:
# Test the Multi-Agent System
print("Testing Multi-Agent Autonomous Gameplay")
print("="*60)

# Create a fresh game and multi-agent system
game = DroneSafariGame()
multi_agent_system = MultiAgentSystem(game, model=model_name)

# Run a few steps to see how it works
print("Running autonomous game for 5 steps...")
results = multi_agent_system.run_autonomous_game(show_steps=True, max_steps=5)

print("\nQuick Results:")
print(f"Success: {results['success']}")
print(f"Steps taken: {results['steps_taken']}")
print(f"Pictures used: {results['pictures_used']}")

# Show performance summary
print("\n" + multi_agent_system.get_performance_summary())

In [None]:
# Run a complete autonomous game (uncomment to run)
print("Complete Autonomous Game Test")
print("="*40)

# Uncomment the lines below to run a full autonomous game:
# WARNING: This may take a while and use API credits!

# game_fresh = DroneSafariGame()
# multi_agent_complete = MultiAgentSystem(game_fresh, model=model_name)
# 
# print("Running complete autonomous game...")
# final_results = multi_agent_complete.run_autonomous_game(show_steps=False, max_steps=30)
# 
# print("FINAL RESULTS:")
# print("="*50)
# for key, value in final_results.items():
#     if key != 'step_log':  # Skip detailed step log for summary
#         print(f"{key}: {value}")

print("To run a complete game, uncomment the code above!")
print("Note: This will use API credits and may take several minutes.")

In [None]:
# Final Challenge: Experiment with Prompts!

print("PROMPT ENGINEERING CHALLENGE")
print("="*50)
print("The power of LLM agents comes from well-crafted prompts!")
print("Try modifying the prompts to change agent behavior.")

print("\nChallenge Options:")

print("\n1. Enhance the Drone Agent Prompt")
print("   • Add personality traits (cautious, aggressive, efficient)")
print("   • Include more detailed safety instructions")
print("   • Add specific strategies for different animals")

print("\n2. Improve the Strategic Agent Prompt") 
print("   • Add risk assessment criteria")
print("   • Include pathfinding hints")
print("   • Create contingency planning instructions")

print("\n3. Create a New Specialized Agent")
print("   • Photography expert (optimizes picture taking)")
print("   • Safety monitor (prevents crashes)")
print("   • Explorer (maps unknown areas)")

print("\nExample: Enhanced Drone Prompt")
print("-" * 40)

enhanced_drone_prompt = '''
You are a CAUTIOUS AI drone pilot in a safari photography mission. 
Your personality: Safety-first, methodical, and patient.

CORE RULES:
1. ALWAYS prioritize safety over speed
2. Take time to analyze each move carefully  
3. Avoid risky maneuvers near obstacles
4. Double-check animal distances before photos

AVAILABLE ACTIONS:
- move_drone(direction) - move in direction (check for obstacles first!)
- turn_drone(direction) - turn left/right (safe rotation)
- take_picture_drone() - photograph (verify 2-cell distance!)

RESPONSE FORMAT: Always explain your safety reasoning first, then use the appropriate tool.
Example: "Checking for obstacles ahead... path is clear, safe to move. Using move_drone."
'''

print(enhanced_drone_prompt)

print("\nTry creating your own prompt variations!")
print("Remember: The prompt is the 'brain' of your agent!")
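
For the "safety monitor" idea in challenge 3, one option is a rule-based check that runs before any move is executed, so crashes are caught even when the LLM plans badly. The sketch below is one possible starting point under stated assumptions: the function name `check_move`, the square grid, the `(row, col)` cells, and the meaning of "too close" (within one cell, diagonals included) are all made up for illustration and may not match the real game.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]


def check_move(cell: Cell, grid_size: int, trees: Set[Cell],
               animals: Set[Cell], min_animal_gap: int = 1) -> Tuple[bool, str]:
    """Return (is_safe, reason) for moving the drone onto `cell`."""
    row, col = cell
    if not (0 <= row < grid_size and 0 <= col < grid_size):
        return False, "outside boundaries"
    if cell in trees:
        return False, "tree in the way"
    for a_row, a_col in animals:
        # Chebyshev distance: diagonal neighbours count as distance 1
        if max(abs(a_row - row), abs(a_col - col)) <= min_animal_gap:
            return False, "too close to an animal"
    return True, "clear"


# Hypothetical map layout for the example
trees = {(2, 3)}
animals = {(5, 5)}
print(check_move((2, 3), 10, trees, animals))   # (False, 'tree in the way')
print(check_move((4, 4), 10, trees, animals))   # (False, 'too close to an animal')
print(check_move((3, 5), 10, trees, animals))   # (True, 'clear')
print(check_move((-1, 0), 10, trees, animals))  # (False, 'outside boundaries')
```

The reason string can be passed back to the agent as feedback, which gives it something specific to plan around on the next step.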

## Congratulations!

You've successfully built a complete LLM-powered drone safari agent and explored multi-agent systems!

### What You've Learned in Part 2:

1. **Game-Playing Agents**: How to build agents that can control complex systems
2. **Natural Language Control**: Converting human commands to robot actions
3. **Agent Architecture**: Building agents with specialized system prompts
4. **State Management**: How agents track and respond to changing environments
5. **Multi-Agent Coordination**: Strategic planning + execution separation
6. **Prompt Engineering**: The key to controlling agent behavior

### Key Takeaways:

1. **Prompts are Everything**: The system prompt defines your agent's personality, capabilities, and behavior
2. **Tool Integration**: LLMs become powerful when connected to external tools and APIs
3. **Agent Specialization**: Different agents for different tasks (planning vs. execution)
4. **Natural Language Interface**: Users can control complex systems with simple commands
5. **Iterative Improvement**: Agent behavior improves through prompt refinement

### Real-World Applications:

The techniques you learned apply directly to:
- **Autonomous drones** and robot swarms
- **Smart home** and IoT device control
- **Customer service** and support bots
- **Process automation** and workflow management
- **Research assistants** and data analysis tools

### Next Steps:

- **Experiment** with different prompts and personalities
- **Add new tools** and capabilities to your agents  
- **Try real APIs** (weather, maps, sensors) instead of mock data
- **Scale up** to more agents and complex coordination
- **Apply to real robotics** and drone systems

Keep experimenting and building amazing AI systems!

## Resources and Next Steps

### Useful Resources

#### LLM and Agent Frameworks
- **LiteLLM**: Universal LLM API - [docs](https://docs.litellm.ai/)
- **LangChain**: LLM application framework - [docs](https://python.langchain.com/)
- **CrewAI**: Multi-agent systems - [docs](https://docs.crewai.com/)
- **AutoGen**: Microsoft's multi-agent framework - [docs](https://microsoft.github.io/autogen/)

#### Advanced Topics
- **RAG (Retrieval Augmented Generation)**: Adding knowledge bases to LLMs
- **Function Calling**: Advanced tool integration patterns
- **Agent Memory**: Persistent memory systems for agents
- **Multi-Modal Agents**: Agents that work with images, audio, etc.

#### Model Providers
- **OpenAI**: GPT models via API
- **Anthropic**: Claude models
- **Ollama**: Local model hosting
- **Hugging Face**: Open source models

### What's Next?

1. **Experiment**: Try different models and prompting strategies
2. **Scale**: Build more complex multi-agent systems
3. **Specialize**: Create domain-specific agents for your use cases
4. **Learn**: Explore advanced frameworks like LangChain or CrewAI
5. **Share**: Contribute to the open source agent ecosystem!

---

**Thank you for completing the Wild Drone LLM Workshop!**

*Happy building!*