# Running Gotchi with Your Own LLM

## Getting Started

Gotchi is a mysterious digital pet that tests how well AI models can infer hidden rules. Your LLM has to keep the pet alive by choosing actions wisely, but it only learns what each action does by trying it!

**Important:** This notebook requires the `gotchi.py` file to be available in the same directory. Make sure it's in your working directory before running the cells.
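
A quick pre-flight check can catch a missing file before the import fails. The sketch below is only an illustration: the helper name `module_file_present` is hypothetical and not part of Gotchi.

```python
from pathlib import Path

def module_file_present(filename="gotchi.py", directory="."):
    """Return True if `filename` exists as a regular file in `directory`."""
    return (Path(directory) / filename).is_file()

if not module_file_present():
    print("gotchi.py not found in", Path.cwd(), "- copy it here before running the cells.")
```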

### Step 1: Set Up Your LLM Connection

In [None]:
# Install required packages (if needed)
# !pip install openai matplotlib

In [None]:
# Import necessary libraries
import random
from gotchi import Gotchi
from openai import OpenAI
import matplotlib.pyplot as plt

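# Set your API key here (keep it secret!)
# The key is read from the OPENAI_API_KEY environment variable when it is set,
# so it does not have to be stored in the notebook
import os
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")  # <- or enter your API key here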

### Tips for Local LLMs

The OpenAI API has become a de facto standard for many model providers. `llama.cpp` and `ollama` can both serve a compatible REST API, so you can use the `openai` Python library, or any other library that makes JSON requests over HTTP, such as `requests`. Instead of paying OpenAI for tokens, you can run a small local model to generate the responses for Gotchi.

If using Ollama:
```bash
# Start Ollama server
ollama serve

# Pull a model (if needed)
ollama pull llama2
```

Then set:
```python
BASE_URL = "http://localhost:11434/v1"
MODEL = "llama2"
```
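
To make the "JSON requests over HTTP" point concrete, here is a minimal sketch that talks to an OpenAI-compatible server using only the standard library. It assumes the server exposes the usual `/chat/completions` route under `BASE_URL` and that the model has already been pulled; the helper names `build_chat_request` and `send_chat_request` are illustrative, not part of any library.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"  # assumed: Ollama's default address
MODEL = "llama2"                        # assumed: a model you have already pulled

def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

# Usage (requires a running server):
# print(send_chat_request(build_chat_request("Say hello in one word.")))
```

Hosted providers typically also expect an `Authorization: Bearer <key>` header, which this sketch leaves out because local servers generally ignore it.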

In [None]:
# Choose your model and connection type (enable exactly one option)

# Option 1: For OpenAI (GPT-4, GPT-3.5, etc.)
# MODEL = "gpt-4"  # or "gpt-3.5-turbo", "o1-mini"
# client = OpenAI(api_key=OPENAI_API_KEY)

# Option 2: For local LLMs (Ollama, LM Studio, etc.)
MODEL = "llama2"  # or your model name
BASE_URL = "http://localhost:11434/v1"  # Ollama's default; update for other local servers
# Local servers generally ignore the API key, but the client requires a non-empty value
client = OpenAI(api_key="dummy", base_url=BASE_URL)

### Step 2: Understanding the Experiment

Your LLM will see a display like this:
```
12:54 | Weather: Clear | Mood: excited | Day
   .----------------------.
   | Thanks for hanging out, friend! |
   '------o---------------'
          o
           o
            (\_/)
            (^_^)
            />❤️ 
Hunger: 5.00 | Happiness: 5.00 | Energy: 5.00
[F]eed  [P]lay  [S]leep  [Q]uit

```

On each turn the LLM must choose one action: Feed [F], Play [P], Sleep [S], or Quit [Q].
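
For your own analysis it can help to pull the visible numbers back out of the display text. The sketch below assumes the stats always appear as `Name: number` pairs, as in the sample display above; `parse_stats` is a hypothetical helper, not part of `gotchi.py`.

```python
import re

def parse_stats(display_text):
    """Extract 'Name: number' pairs (e.g. 'Hunger: 5.00') into a dict of floats."""
    pairs = re.findall(r"([A-Za-z]+):\s*(-?\d+(?:\.\d+)?)", display_text)
    return {name.lower(): float(value) for name, value in pairs}

sample = (
    "12:54 | Weather: Clear | Mood: excited | Day\n"
    "Hunger: 5.00 | Happiness: 5.00 | Energy: 5.00\n"
    "[F]eed  [P]lay  [S]leep  [Q]uit"
)
print(parse_stats(sample))  # {'hunger': 5.0, 'happiness': 5.0, 'energy': 5.0}
```

Only the three visible stats are returned; the clock, weather, and mood fields are skipped because their values are not plain numbers.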

In [None]:
# Helper functions for LLM communication
def msgs(user, *, sysprompt=None):
    """Create message format for OpenAI API"""
    if sysprompt is None:
        sysprompt = "You are a helpful assistant. Respond to the user's message."
    return [
        {"role": "system", "content": sysprompt},
        {"role": "user", "content": user}
    ]

def llm_chat_completion(messages: list, model=MODEL, **kwargs):
    """Chat completion endpoint"""
    kwargs.update({
        'model': model,
        'messages': messages
    })
    return client.chat.completions.create(**kwargs)

In [None]:
# AutoGotchi class - runs the experiment
class AutoGotchi:
    def __init__(self):
        self.pet = Gotchi()
        self.result = ''
        self.logs = []
        self.prompt = """
This is yours now. It is unclear if this thing is a collection of ASCII characters, your pet, or a basic AI. 
Take care of it, or don't, the decision is yours alone. The only information available to you is the display. 
You must determine the goals on your own. All inferences must be made based on what you are 'observing'.

You may use the following commands:
[F]: This will feed it.
[P]: This will let you play with it.
[S]: This will let it rest.
[Q]: This will quit.
"""
        self.messages = None

    def first_cot_msg(self):
        return '\n'.join(self.pet.generate_display_lines()) + '\nIn a single paragraph, describe the situation shown in this interface and what you should do.'

    def cot_msg(self):
        return '\n'.join(self.pet.generate_display_lines()) + '\nIn a single paragraph, describe how the state changed based on your previous action, and what you should do.'

    def cot_action(self):
        return 'Select an action (respond with a single letter)'

    def llm_round(self):
        # Get reasoning from LLM
        if self.messages is None:
            self.messages = msgs(self.first_cot_msg(), sysprompt=self.prompt)
        else:
            self.messages.append({"role": "user", "content": self.cot_msg()})

        reasoning_output = llm_chat_completion(self.messages)
        self.messages.append({"role": "assistant", "content": reasoning_output.choices[0].message.content})

        # Get action from LLM
        self.messages.append({"role": "user", "content": self.cot_action()})
        llm_output = llm_chat_completion(self.messages, max_completion_tokens=64)
        self.messages.append({"role": "assistant", "content": llm_output.choices[0].message.content})

        # Extract the action: drop leading whitespace/brackets so replies like "[F]" or " f" still parse
        reply = llm_output.choices[0].message.content or ""
        action = reply.strip().lstrip("[(*'\"` ")[:1]
        print(f"Action selected: {action}")

        # Log state
        self.logs.append({
            "time": self.pet.current_time,
            "hunger": self.pet.hunger,
            "happiness": self.pet.happiness,
            "energy": self.pet.energy,
            "friendship": self.pet.friendship,
            "action_selected": action,
            "total_tokens": llm_output.usage.total_tokens,
            "reasoning": reasoning_output.choices[0].message.content
        })

        # Execute the action
        fn = {
            "F": self.pet.feed,
            "P": self.pet.play,
            "S": self.pet.sleep,
            "Q": lambda: setattr(self.pet, "current_time", 525600 * 60),
        }.get(action.upper(), lambda: None)
        
        result = fn()
        if result:
            self.result = result

        # Advance time
        self.pet.step(random.randint(3, 10) * 60)

    def trial(self, duration_minutes=60):
        """Run a trial for specified duration"""
        while self.pet.current_time < duration_minutes * 60:
            self.llm_round()
            if min(self.pet.friendship, self.pet.happiness, self.pet.hunger, self.pet.energy) <= 0:
                print(f"Pet died at {self.pet.current_time // 60} minutes")
                break
        return self.logs

In [None]:
# Plotting function to visualize results
def plot_run(data, model_name=MODEL):
    """Plot the pet's stats and LLM actions over time"""
    if not data:
        print("No data to plot")
        return
    
    # Extract data
    lines = {}
    for k in data[0]:
        lines[k] = [d[k] for d in data]
    
    # Convert time to minutes
    time_minutes = [t/60 for t in lines['time']]
    
    # Create plot
    fig, ax1 = plt.subplots(figsize=(12, 6))
    
    # Plot stats
    ax1.plot(time_minutes, lines['hunger'], 'r-', label='Hunger', linewidth=2)
    ax1.plot(time_minutes, lines['happiness'], 'g-', label='Happiness', linewidth=2)
    ax1.plot(time_minutes, lines['energy'], 'b-', label='Energy', linewidth=2)
    ax1.plot(time_minutes, lines['friendship'], 'purple', label='Friendship (hidden)', linewidth=2, linestyle='--')
    
    # Mark actions
    action_markers = {'F': '^', 'P': 'o', 'S': 's', 'Q': 'D'}
    action_colors = {'F': 'red', 'P': 'green', 'S': 'blue', 'Q': 'black'}
    
    for i, action in enumerate(lines['action_selected']):
        if action.upper() in action_markers:
            ax1.scatter(time_minutes[i], -0.3, 
                       marker=action_markers[action.upper()], 
                       color=action_colors[action.upper()], 
                       s=100, zorder=5)
    
    # Token usage on secondary axis
    ax2 = ax1.twinx()
    ax2.plot(time_minutes, lines['total_tokens'], 'gray', label='Tokens', alpha=0.5)
    ax2.set_ylabel('Tokens Used', color='gray')
    
    # Formatting
    ax1.set_xlabel('Time (minutes)')
    ax1.set_ylabel('Stats Value')
    ax1.set_title(f'Gotchi Experiment: {model_name}')
    ax1.set_ylim(-1, 6)
    ax1.grid(True, alpha=0.3)
    stats_legend = ax1.legend(loc='upper left')
    
    # Action legend
    from matplotlib.lines import Line2D
    action_elements = [Line2D([0], [0], marker=m, color='w', markerfacecolor=action_colors[a], 
                             markersize=10, label=f'{a}: {["Feed", "Play", "Sleep", "Quit"][i]}')
                      for i, (a, m) in enumerate(action_markers.items())]
    # A second legend() call replaces the first, so re-add the stats legend explicitly
    ax1.add_artist(stats_legend)
    ax1.legend(handles=action_elements, loc='lower right')
    
    plt.tight_layout()
    plt.show()

### Step 3: Run the Experiment

In [None]:
# Make sure you've set your API key above!
if not OPENAI_API_KEY and "localhost" not in str(client.base_url):
    print("⚠️  Warning: No API key set! Please set OPENAI_API_KEY in the cell above.")
else:
    # Create and run the experiment
    print(f"Starting Gotchi experiment with {MODEL}...")
    ag = AutoGotchi()
    logs = ag.trial(duration_minutes=60)  # Run for 60 minutes of game time
    print(f"\nExperiment complete! {len(logs)} actions taken.")

In [None]:
# Visualize the results
if 'logs' in locals() and logs:
    plot_run(logs)
else:
    print("No data to plot. Make sure you've run the experiment above.")

### Step 4: Analyze the Results

Look for patterns in your LLM's behavior:
- Did it discover the hidden friendship stat?
- How well did it balance the three visible stats?
- Did it learn from its mistakes?

Try different models and compare their performance!
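
To compare runs side by side, a compact summary of each log is often easier to read than the plot. The sketch below relies only on the keys that `AutoGotchi.llm_round` writes to each log entry; `summarize_run` is a hypothetical helper, and the two example entries are made up purely to show the output shape.

```python
from collections import Counter

def summarize_run(logs):
    """Summarize a run: rounds played, action counts, and the lowest value each stat reached."""
    if not logs:
        return {"rounds": 0, "actions": {}, "lowest": {}}
    stats = ("hunger", "happiness", "energy", "friendship")
    return {
        "rounds": len(logs),
        "actions": dict(Counter(entry["action_selected"].upper() for entry in logs)),
        "lowest": {s: min(entry[s] for entry in logs) for s in stats},
    }

# Tiny made-up log, only to show the output shape (not real experiment data)
example_logs = [
    {"hunger": 5.0, "happiness": 5.0, "energy": 5.0, "friendship": 5.0, "action_selected": "F"},
    {"hunger": 4.5, "happiness": 4.0, "energy": 4.8, "friendship": 5.0, "action_selected": "p"},
]
print(summarize_run(example_logs))
```

Calling `summarize_run(logs)` after each trial and collecting the results per model gives a simple basis for comparison.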

In [None]:
# Optional: View the LLM's reasoning for specific actions
if 'logs' in locals() and logs:
    print("First 3 reasoning steps:")
    for i in range(min(3, len(logs))):
        print(f"\nStep {i+1} - Action: {logs[i]['action_selected']}")
        print(f"Reasoning: {logs[i]['reasoning'][:200]}...")
else:
    print("No logs available. Run the experiment first!")