# `PokeAgent` LLM reasoning agents


**Note**: this notebook requires a locally running Pokémon Showdown server. Please see the [getting started section](../getting_started.rst) for help on how to set one up.


In [1]:
from poke_env import RandomPlayer
from poke_env.data import GenData

# The RandomPlayer is a basic agent that makes decisions randomly,
# serving as a starting point for more complex agent development.
random_player = RandomPlayer()

### Creating a Battle

To create a battle, let's create a second agent and use the `battle_against` method. It's an asynchronous method, so we need to `await` it.

In [4]:
second_player = RandomPlayer()

# The battle_against method initiates a battle between two players.
# Here we are using asynchronous programming (await) to start the battle.
await random_player.battle_against(second_player, n_battles=1)

If you want to look at this battle, you can open a browser at [http://localhost:8000](http://localhost:8000) - you should see the battle in the lobby.

### Inspecting the Result

Here are a couple of ways to inspect the result of this battle.

In [5]:
# n_won_battles and n_finished_battles

print(
    f"Player {random_player.username} won {random_player.n_won_battles} out of {random_player.n_finished_battles} played"
)
print(
    f"Player {second_player.username} won {second_player.n_won_battles} out of {second_player.n_finished_battles} played"
)

# Looping over battles

for battle_tag, battle in random_player.battles.items():
    print(battle_tag, battle.won)

Player RandomPlayer 1 won 0 out of 1 played
Player RandomPlayer 2 won 1 out of 1 played
battle-gen9randombattle-697 False


You can look at more properties of the [Player](../modules/player.rst) and [Battle](../modules/battle.rst) classes in the documentation.

### Running a Cross-Evaluation

`poke-env` provides a `cross_evaluate` function that runs a cross-evaluation between multiple agents. It runs a number of battles between each pair of agents and returns the results as a nested dictionary of win rates.

In [6]:
from poke_env import cross_evaluate

third_player = RandomPlayer()

players = [random_player, second_player, third_player]

cross_evaluation = await cross_evaluate(players, n_challenges=5)
cross_evaluation

{'RandomPlayer 1': {'RandomPlayer 1': None,
  'RandomPlayer 2': 0.16666666666666666,
  'RandomPlayer 3': 0.8},
 'RandomPlayer 2': {'RandomPlayer 1': 0.8333333333333334,
  'RandomPlayer 2': None,
  'RandomPlayer 3': 0.2},
 'RandomPlayer 3': {'RandomPlayer 1': 0.2,
  'RandomPlayer 2': 0.8,
  'RandomPlayer 3': None}}

Here's one way to pretty print the results of the cross evaluation using `tabulate`:

In [7]:
from tabulate import tabulate

table = [["-"] + [p.username for p in players]]
for p_1, results in cross_evaluation.items():
    table.append([p_1] + [cross_evaluation[p_1][p_2] for p_2 in results])

print(tabulate(table))

--------------  ------------------  -------------------  --------------
-               RandomPlayer 1      RandomPlayer 2       RandomPlayer 3
RandomPlayer 1                      0.16666666666666666  0.8
RandomPlayer 2  0.8333333333333334                       0.2
RandomPlayer 3  0.2                 0.8
--------------  ------------------  -------------------  --------------
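To condense the cross-evaluation into a single number per agent, one option is to average each row while skipping the `None` self-matchup. The following is a minimal sketch, assuming the nested-dictionary shape shown in the output above. The helper name `mean_win_rates` is hypothetical and not part of `poke-env`, and the dictionary is copied from the results above so that the block runs without a server.

```python
from statistics import mean

# Results copied from the cross-evaluation output above.
cross_evaluation = {
    "RandomPlayer 1": {
        "RandomPlayer 1": None,
        "RandomPlayer 2": 0.16666666666666666,
        "RandomPlayer 3": 0.8,
    },
    "RandomPlayer 2": {
        "RandomPlayer 1": 0.8333333333333334,
        "RandomPlayer 2": None,
        "RandomPlayer 3": 0.2,
    },
    "RandomPlayer 3": {
        "RandomPlayer 1": 0.2,
        "RandomPlayer 2": 0.8,
        "RandomPlayer 3": None,
    },
}


def mean_win_rates(results):
    """Average each agent's win rate over its opponents (hypothetical helper).

    The None entry on the diagonal (an agent against itself) is ignored.
    """
    return {
        player: mean(rate for rate in row.values() if rate is not None)
        for player, row in results.items()
    }


averages = mean_win_rates(cross_evaluation)

# Print agents from highest to lowest average win rate.
for player, rate in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{player}: {rate:.3f}")
```

With only a handful of battles per pairing, these averages are noisy; raising `n_challenges` should give a more stable picture.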


## Building a Max Damage Player

In this section, we introduce the `MaxDamagePlayer`, a custom agent designed to choose moves that maximize damage output.

### Implementing the MaxDamagePlayer Class

The primary task is to override the `choose_move` method. This method, defined as `choose_move(self, battle: Battle) -> str`, takes a `Battle` object representing the current game state and returns a move order as a string. This move order must adhere to the [showdown protocol](https://github.com/smogon/pokemon-showdown/blob/master/sim/SIM-PROTOCOL.md) format. The `poke-env` library provides the `create_order` method to format move orders directly from `Pokemon` and `Move` objects.

The `battle` parameter, a `Battle` object, encapsulates the agent's current knowledge of the game state. It provides various properties for easy access to game details, such as `active_pokemon`, `available_moves`, `available_switches`, `opponent_active_pokemon`, `opponent_team`, and `team`.

For this example, we'll utilize `available_moves`, which gives us a list of `Move` objects available in the current turn.

Our focus in implementing `MaxDamagePlayer` involves two key steps: interpreting the game state information from the battle object and then generating and returning a correctly formatted move order.

In [33]:
from poke_env.player import Player
from poke_env.environment.pokemon import Pokemon
from poke_env.environment.battle import Battle, AbstractBattle

def log_pokemon(pokemon: Pokemon, is_opponent: bool = False):
    print(f"[{pokemon.species} ({pokemon.name}) {'[FAINTED]' if pokemon.fainted else ''}]")
    print(f"Types: {[t.name for t in pokemon.types]}")
    if is_opponent:
        print(f"Possible Tera types {pokemon.tera_type}")
    print(f"HP: {pokemon.current_hp}/{pokemon.max_hp} ({pokemon.current_hp_fraction*100:.1f}%)")
    print(f"Base stats: {pokemon.base_stats}")
    print(f"Stats: {pokemon.stats}")
    print(f"{'Possible abilities' if is_opponent else 'Ability'}: {pokemon.ability}")
    print(f"{'Possible items' if is_opponent else 'Item'}: {pokemon.item}")
    print(f"Status: {pokemon.status}")
    if pokemon.status:
        print(f"Status turn count: {pokemon.status_counter}")
    print(f"Moves:")
    for move in pokemon.moves.values():
        print(f"  {move.id} Base Power: {move.base_power} Accuracy: {move.accuracy * 100}% PP: ({move.current_pp}/{move.max_pp}) Priority: {move.priority}  ")
    print(f"Stats: {pokemon.stats}")
    
    print(f"Boosts: {pokemon.boosts}")


def log_player_info(battle: AbstractBattle):
    print("== Player Info ==")
    print(f"Active pokemon:")
    log_pokemon(battle.active_pokemon)
    print("Tera Type:", battle.can_tera)
    print('-'*10)
    # print rest of team
    print(f"Team: {battle.team}")
    for _, mon in battle.team.items():
        if not mon.active:
            log_pokemon(mon)
            print()


def log_opponent_info(battle: AbstractBattle):
    print(f"== Opponent Info ==")
    print("Opponent active pokemon:")
    # log_pokemon prints its own output and returns None, so it is not wrapped in print()
    log_pokemon(battle.opponent_active_pokemon, is_opponent=True)
    print(f"Opponent team: {battle.opponent_team}")

def log_battle_info(battle: AbstractBattle):
    print(f"== Battle Info ==")
    print(f"Turn: {battle.turn}")
    # Field info
    if battle.weather:
        print(f"Weather: {battle.weather}")
    if battle.fields:
        print(f"Fields: {battle.fields}")
    if battle.side_conditions:
        print(f"Side conditions: {battle.side_conditions}")
    if battle.opponent_side_conditions:
        print(f"Opponent side conditions: {battle.opponent_side_conditions}")
    if battle.trapped:
        print(f"Trapped: {battle.trapped}")
    
    

class GPTPlayer(Player):
    def choose_move(self, battle):

        # battle
        log_battle_info(battle)

        # player
        log_player_info(battle)

        # opponent
        log_opponent_info(battle)


        # active pokemon
        print("== Active pokemon ==")
        log_pokemon(battle.active_pokemon)


        if battle.available_moves:
            # Iterating over available moves to find the one with the highest base power
            best_move = max(battle.available_moves, key=lambda move: move.base_power)
            # Creating an order for the selected move
            return self.create_order(best_move)
        else:
            print('ERROR: No moves available')
            # No move is available this turn, so fall back to a random valid order.
            # choose_random_move may pick either a move or a switch.
            return self.choose_random_move(battle)

In the `choose_move` method, after logging the game state, we check whether any moves are available for the current turn, as indicated by `battle.available_moves`. When at least one move is available, we select the one with the highest `base_power` and format our choice with `create_order`.

However, there are scenarios where no moves are available. In such cases, we use `choose_random_move(battle)`. This method randomly selects either a move or a switch, and guarantees that we will return a valid order.

The `Player.create_order` function is a crucial part of this process. It's a wrapper method that generates valid battle messages. It can take either a `Move` or a `Pokemon` object as its input. When passing a `Move` object, additional parameters such as `mega`, `z_move`, `dynamax`, or `terastallize` can be specified to indicate special battle actions.

The class above does not use these parameters yet. A natural adjustment is to terastallize at the earliest opportunity, by passing `terastallize=True` to `create_order` whenever `battle.can_tera` is set.
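The terastallize adjustment mentioned above could be sketched as follows. This is an illustration under stated assumptions, not the notebook's implementation: `pick_max_damage_action` is a hypothetical helper, and the moves are plain `SimpleNamespace` stand-ins (using ids and base powers from the battle log shown later in this notebook) so that the block runs without `poke-env` or a server.

```python
from types import SimpleNamespace


def pick_max_damage_action(available_moves, can_tera):
    """Return (best_move, terastallize), or None when no move is available.

    Hypothetical helper, not part of poke-env.
    available_moves: objects exposing a numeric `base_power` attribute.
    can_tera: truthy while terastallizing is still possible.
    """
    if not available_moves:
        # The caller would fall back to choose_random_move(battle) here.
        return None
    best_move = max(available_moves, key=lambda move: move.base_power)
    # Terastallize as soon as it is allowed.
    return best_move, bool(can_tera)


# Stand-ins for Move objects; only the attributes used above are provided.
moves = [
    SimpleNamespace(id="defog", base_power=0),
    SimpleNamespace(id="uturn", base_power=70),
    SimpleNamespace(id="foulplay", base_power=95),
    SimpleNamespace(id="roost", base_power=0),
]

best_move, terastallize = pick_max_damage_action(moves, can_tera="STEEL")
print(best_move.id, terastallize)

# Inside choose_move, the result would then be passed on roughly as:
#     return self.create_order(best_move, terastallize=terastallize)
```

Keeping the decision in a plain function makes it easy to check without running a battle; the `choose_move` override then only has to translate the result into an order.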

### Testing the MaxDamagePlayer

Next, we'll test our max-damage player (the `GPTPlayer` class defined above) against a `RandomPlayer` in a single battle:


In [38]:
# Creating players
random_player = RandomPlayer()
max_damage_player = GPTPlayer()

# Running battles
await max_damage_player.battle_against(random_player, n_battles=1)

# Displaying results
print(f"Max damage player won {max_damage_player.n_won_battles} / {max_damage_player.n_finished_battles} battles")

== Battle Info ==
Turn: 1
== Player Info ==
Active pokemon:
[mandibuzz (Mandibuzz) ]
Types: ['DARK', 'FLYING']
HP: 326/326 (100.0%)
Base stats: {'atk': 65, 'def': 105, 'hp': 110, 'spa': 55, 'spd': 95, 'spe': 80}
Stats: {'hp': 326, 'atk': 159, 'def': 227, 'spa': 142, 'spd': 210, 'spe': 185}
Ability: overcoat
Item: heavydutyboots
Status: None
Moves:
  defog Base Power: 0 Accuracy: 100.0% PP: (24/24) Priority: 0  
  uturn Base Power: 70 Accuracy: 100.0% PP: (32/32) Priority: 0  
  foulplay Base Power: 95 Accuracy: 100.0% PP: (24/24) Priority: 0  
  roost Base Power: 0 Accuracy: 100.0% PP: (8/8) Priority: 0  
Stats: {'hp': 326, 'atk': 159, 'def': 227, 'spa': 142, 'spd': 210, 'spe': 185}
Boosts: {'accuracy': 0, 'atk': 0, 'def': 0, 'evasion': 0, 'spa': 0, 'spd': 0, 'spe': 0}
Tera Type: STEEL (pokemon type) object
----------
Team: {'p1: Mandibuzz': mandibuzz (pokemon object) [Active: True, Status: None], 'p1: Grafaiai': grafaiai (pokemon object) [Active: False, Status: None], 'p1: Necrozma': 

## Other Initialization Options for `Player` Objects

### Specifying an Avatar

You can specify an `avatar` argument when initializing a `Player` object. This argument is a string, corresponding to the avatar's name.

You can find a [list of avatar names here](https://github.com/smogon/pokemon-showdown-client/blob/6d55434cb85e7bbe614caadada819238190214f6/play.pokemonshowdown.com/src/battle-dex-data.ts#L690). If the avatar you are looking for is not in this list, you can inspect the message the client is sending to the server by opening your browser's development console and selecting the avatar manually.


In [14]:
player_with_avatar = RandomPlayer(avatar="boarder")



### Saving Battle Replays

You can save battle replays by specifying a `save_replays` value when initializing a `Player` object. This argument can be either a boolean (if `True`, the replays will be saved in the `replays` directory) or a string, in which case the replays will be saved in the specified directory.

In [12]:
player_with_replays = RandomPlayer(save_replays="my_folder")

### Logging

Every `Player` instance has a custom logger. By default, it will only surface warnings and errors. You can change the logging level by specifying a `log_level` argument when initializing a `Player` object.

The two most relevant values are `logging.INFO` or 20, which will surface every message sent or received by the client (which is very useful when debugging) and 25, which is a custom level used by `poke-env` to surface only the most relevant events.

You can also use `logging.DEBUG` or 10, but the difference with `logging.INFO` should only be relevant for `poke-env` internals.
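To see where the level 25 sits relative to the standard levels, the sketch below uses only Python's standard `logging` module. The constant name `RELEVANT_EVENTS` and the helper `surfaced_levels` are hypothetical names chosen for this illustration; they are not `poke-env` identifiers.

```python
import logging

# Hypothetical name for the custom level 25 described above.
RELEVANT_EVENTS = 25

CANDIDATE_LEVELS = {
    "DEBUG": logging.DEBUG,      # 10
    "INFO": logging.INFO,        # 20
    "custom 25": RELEVANT_EVENTS,
    "WARNING": logging.WARNING,  # 30
    "ERROR": logging.ERROR,      # 40
}


def surfaced_levels(threshold):
    """List which candidate levels a logger set to `threshold` would emit."""
    logger = logging.getLogger(f"demo-{threshold}")
    logger.setLevel(threshold)
    return [
        name
        for name, level in CANDIDATE_LEVELS.items()
        if logger.isEnabledFor(level)
    ]


for threshold in (logging.INFO, RELEVANT_EVENTS, logging.WARNING):
    print(threshold, surfaced_levels(threshold))
```

A threshold of 25 therefore hides the per-message `INFO` traffic while still letting through anything logged at 25 or above.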

In [15]:
verbose_player = RandomPlayer(log_level=20)

from asyncio import sleep

await sleep(1)

2025-05-10 09:48:28,776 - RandomPlayer 8 - INFO - Starting listening to showdown websocket
2025-05-10 09:48:28,781 - RandomPlayer 8 - INFO - <<< |updateuser| Guest 15|0|101|{"blockChallenges":false,"blockPMs":false,"ignoreTickets":false,"hideBattlesFromTrainerCard":false,"blockInvites":false,"doNotDisturb":false,"blockFriendRequests":false,"allowFriendNotifications":false,"displayBattlesToFriends":false,"hideLogins":false,"hiddenNextBattle":false,"inviteOnlyNextBattle":false,"language":null}
|customgroups|[{"symbol":"~","name":"Administrator","type":"leadership"},{"symbol":"#","name":"Room Owner","type":"leadership"},{"symbol":"★","name":"Host","type":"leadership"},{"symbol":"@","name":"Moderator","type":"staff"},{"symbol":"%","name":"Driver","type":"staff"},{"symbol":"*","name":"Bot","type":"normal"},{"symbol":"☆","name":"Player","type":"normal"},{"symbol":"+","name":"Voice","type":"normal"},{"symbol":"^","name":"Prize Winner","type":"normal"},{"symbol":"whitelist","name"

### Concurrency

By default, a `poke-env` `Player` will only run a single battle at a time. You can change this behavior by specifying a `max_concurrent_battles` argument when initializing a `Player` object.

This argument is an integer, and represents the maximum number of battles a `Player` can run at the same time. If 0, no limit will be enforced.

This can provide a significant speedup when your process is not CPU bound.

In [14]:
import time

# Time to run 50 battles, one at a time
start = time.time()
await random_player.battle_against(second_player, n_battles=50)
end = time.time()
print(f"Time to run 50 battles, one at a time: {end - start:.2f} seconds")

Time to run 50 battles, one at a time: 8.12 seconds


In [15]:
unrestricted_random_player = RandomPlayer(max_concurrent_battles=0)
unrestricted_second_player = RandomPlayer(max_concurrent_battles=0)

# Time to run 50 battles, in parallel
start = time.time()
await unrestricted_random_player.battle_against(
    unrestricted_second_player, n_battles=50
)
end = time.time()
print(f"Time to run 50 battles, in parallel: {end - start:.2f} seconds")

Time to run 50 battles, in parallel: 4.22 seconds
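The effect of a concurrency cap can be illustrated with plain `asyncio`. The sketch below is not how `poke-env` implements `max_concurrent_battles`; it is a stand-alone model that assumes a semaphore-style limit, with `asyncio.sleep` standing in for time spent waiting on the server. The function `peak_concurrency` is a hypothetical name, and it follows the convention above that a limit of 0 means no limit.

```python
import asyncio


async def peak_concurrency(n_tasks: int, limit: int) -> int:
    """Run n_tasks fake battles and return the most that ran at the same time.

    limit == 0 means "no limit", mirroring the convention described above.
    """
    semaphore = asyncio.Semaphore(limit) if limit > 0 else None
    running = 0
    peak = 0

    async def fake_battle() -> None:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)  # stands in for waiting on the server
        running -= 1

    async def guarded_battle() -> None:
        if semaphore is None:
            await fake_battle()
        else:
            # At most `limit` fake battles can be inside this block at once.
            async with semaphore:
                await fake_battle()

    await asyncio.gather(*(guarded_battle() for _ in range(n_tasks)))
    return peak


# In a notebook with a running event loop, use `await peak_concurrency(...)` instead.
one_at_a_time = asyncio.run(peak_concurrency(n_tasks=5, limit=1))
unrestricted = asyncio.run(peak_concurrency(n_tasks=5, limit=0))
print(one_at_a_time, unrestricted)
```

Because the fake battles spend their time waiting rather than computing, overlapping them shortens the total run, which is the same reason the unrestricted players above finish sooner.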


Other options can also be used on the server side to make battles run faster.

### Pokemon Showdown Timer

You can turn on the Pokemon Showdown timer by setting `start_timer_on_battle_start` to `True` when initializing a `Player` object.

This is mostly relevant when pitting your agents against humans.

In [16]:
impatient_player = RandomPlayer(start_timer_on_battle_start=True)