Error encountered during player.ladder() #306

akashsara · 2022-07-25T23:40:37Z

Hi,
I was testing a model I trained on Pokemon Showdown (code snippet below) when I ran into this issue. I'm able to challenge the bot to a battle and play against it perfectly well, but when I call player.ladder(100) it errors out after completing a single battle.

2022-07-25 18:33:47,574 - UABGLSimpleDQN - ERROR - Unhandled exception raised while handling message:
>battle-gen8randombattle-1625188644
|-message|Nukkumatti lost due to inactivity.
|
|win|UABGLSimpleDQN
Traceback (most recent call last):
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py", line 131, in _handle_message
    await self._handle_battle_message(split_messages)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player.py", line 235, in _handle_battle_message
    self._battle_finished_callback(battle)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\env_player.py", line 106, in _battle_finished_callback
    self._observations[battle].put(self.embed_battle(battle))
KeyError: <poke_env.environment.battle.Gen8Battle object at 0x000001E1988D2EA0>
Task exception was never retrieved
future: <Task finished name='Task-39' coro=<PlayerNetwork._handle_message() done, defined at E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py:117> exception=KeyError(<poke_env.environment.battle.Gen8Battle object at 0x000001E1988D2EA0>)>
Traceback (most recent call last):
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py", line 177, in _handle_message
    raise exception
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py", line 131, in _handle_message
    await self._handle_battle_message(split_messages)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player.py", line 235, in _handle_battle_message
    self._battle_finished_callback(battle)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\env_player.py", line 106, in _battle_finished_callback
    self._observations[battle].put(self.embed_battle(battle))
KeyError: <poke_env.environment.battle.Gen8Battle object at 0x000001E1988D2EA0>

Model code:

class SimpleRLPlayerTesting(SimpleRLPlayer):
    def __init__(self, model, *args, **kwargs):
        SimpleRLPlayer.__init__(self, *args, **kwargs)
        self.model = model

    def choose_move(self, battle):
        state = self.embed_battle(battle)
        with torch.no_grad():
            predictions = self.model(state)
        action_mask = self.action_masks()
        action = np.argmax(predictions + action_mask)
        return self._action_to_move(action, battle)

Script:

async def main():
    ...
    player = simple_agent.SimpleRLPlayerTesting(
        model=model,
        player_configuration=PlayerConfiguration(USERNAME, PASSWORD),
        server_configuration=ShowdownServerConfiguration,
        start_timer_on_battle_start=True,
        **player_kwargs
    )
    print("Connecting to Pokemon Showdown...")
    await player.ladder(NUM_GAMES)
    # Print the rating of the player and its opponent after each battle
    for battle in player.battles.values():
        print(battle.rating, battle.opponent_rating)

if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())


hsahovic · 2022-07-26T12:09:21Z

Hey @akashsara,

Thanks for opening this issue. If you want to test a trained model on the ladder, I would recommend not inheriting from the gym player, but from the base player class.

akashsara · 2022-07-26T15:20:01Z

Do you mean poke_env.player.player.Player?

Edit: Looking at the code for the player above, this would mean that I would need to inherit from the base player class and reimplement an embed_battle(), choose_move() and _action_to_move(). This is not really a big issue or anything and implementing it is trivial but it feels kinda weird to have to do so in another class just to be able to test the model.

MatteoH2O1999 · 2022-07-28T03:21:39Z

Hey @akashsara,
I think you are using the old version of the gym player. Try to install from gh source as the new version should be able to do what you need pretty easily.
Let me know if you have any questions!!!

akashsara · 2022-07-28T20:18:17Z

Thank you, I'll try that!

akashsara · 2022-07-29T00:52:33Z

So it seems like there are a number of significant changes in the new version. I'm following examples/rl_with_new_open_ai_gym_wrapper.py right now and setting things up, but I was wondering when these changes will be pushed to the next public release? @hsahovic

I'm asking since due to some circumstances I'm running my code elsewhere and that machine has some issues haha. So I'd prefer being able to just install the latest version and run my code vs setting up some hacky bits in the meantime.

akashsara · 2022-07-29T20:57:41Z

Hey @MatteoH2O1999,
So I've updated my code to work with the new version. For completeness I trained a new agent as well. However I'm having issues with testing it on Showdown.

I can't get the player.start_laddering() function to run correctly. This is the code snippet I'm using:

async def main():
    # <model creation code>
    player = simple_agent.SimpleRLPlayerTesting(  # Inherits from env_player
        model=model,
        player_configuration=PlayerConfiguration(USERNAME, PASSWORD),
        server_configuration=ShowdownServerConfiguration,
        start_timer_on_battle_start=True,
        start_challenging=False,
    )
    player.start_laddering(NUM_GAMES)
    for battle in player.battles.values():
        print(battle.rating, battle.opponent_rating)

if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())

If I set start_challenging to True I get this output:

2022-07-29 16:52:32,957 - UABGLSimpleDQN - WARNING - Popup message received: |popup|The user 'randomplayer1' was not found.
Traceback (most recent call last):
  File "E:\Dev\meta-discovery\play_on_showdown.py", line 121, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "C:\Program Files\Python39\lib\asyncio\base_events.py", line 642, in run_until_complete
    return future.result()
  File "E:\Dev\meta-discovery\play_on_showdown.py", line 109, in main
    player.start_laddering(NUM_GAMES)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env-0.4.21-py3.9.egg\poke_env\player\openai_api.py", line 505, in start_laddering
    raise RuntimeError("Agent is already challenging")
RuntimeError: Agent is already challenging

If I set it to False the script just finishes running almost instantly. I'm monitoring the account on Showdown and it doesn't start any battles or anything.

I can't seem to find a way to send/accept challenges. The player class has accept_challenges and send_challenges functions, but there don't seem to be equivalent functions in env_player.

MatteoH2O1999 · 2022-07-30T10:04:10Z

Hi @akashsara,
to answer your questions I need to explain how the wrapper works: it is designed to be used as an environment and not as a normal player. It uses a custom player run in a background thread so that the OpenAI Gym API is exposed on the main thread.

Regarding your first question, the problem is that you are not actually using the environment. To use it you should have something like:

player = ...  # derive from env_player
...
player.start_laddering(NUM_GAMES)
for _ in range(NUM_GAMES):
    step = player.reset()
    while not step.done:
        action = model.action(step)
        step = player.step(action)
...
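To make the background-thread design described above more concrete, here is a toy sketch using only the standard library. It is not poke-env's implementation: `ToyBackgroundEnv`, its queues, and the fixed number of turns are all hypothetical stand-ins. It only illustrates why `start_laddering` returns immediately and why nothing progresses until the main thread calls `reset()` and `step()`.

```python
import queue
import threading


class ToyBackgroundEnv:
    """Toy stand-in for a Gym-style wrapper; NOT poke-env's implementation."""

    def __init__(self, turns_per_battle=3):
        self.turns_per_battle = turns_per_battle
        self._observations = queue.Queue()  # worker -> main thread
        self._actions = queue.Queue()       # main thread -> worker
        self.finished_battles = 0

    def start_laddering(self, n_battles):
        # Returns immediately: the battles themselves run in a daemon thread.
        worker = threading.Thread(target=self._run, args=(n_battles,), daemon=True)
        worker.start()

    def _run(self, n_battles):
        for _ in range(n_battles):
            for turn in range(self.turns_per_battle):
                self._observations.put((turn, False))
                self._actions.get()  # blocks until the main thread sends an action
            self.finished_battles += 1
            self._observations.put((self.turns_per_battle, True))  # terminal observation

    def reset(self):
        # First observation of the next battle.
        return self._observations.get()

    def step(self, action):
        self._actions.put(action)
        return self._observations.get()


env = ToyBackgroundEnv()
env.start_laddering(2)
print(env.finished_battles)  # 0: the worker is waiting for the first action

steps = 0
for _ in range(2):
    observation, done = env.reset()
    while not done:
        observation, done = env.step(0)  # a real agent would pick the action from a model
        steps += 1

print(env.finished_battles, steps)  # 2 6
```

If the script ended right after `start_laddering`, the daemon thread would be discarded with no battle played, which resembles the "script just finishes almost instantly" symptom reported earlier in the thread.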

If you wish to implement accept_challenges and send_challenges, I would advise subclassing EnvPlayer and defining the methods so that the model-prediction loop and the challenge loop run in parallel (use player.agent to access the custom background player).

akashsara · 2022-07-31T22:46:40Z

Ah I see. Apologies for the misunderstanding. Thank you for clarifying. It's working now!

I should note that there is something a little weird still going on. I get this error every time I run it, even though battles do seem to be starting and the agent I have seems to be playing:
2022-07-31 18:39:13,537 - RandomPlayer 1 - ERROR - [WinError 1225] The remote computer refused the network connection
Note: Neither my agent nor my Showdown account is called RandomPlayer 1 so I'm not sure where this is coming from.

On the challenges part - are there any plans to implement it/something similar on EnvPlayer for the time being? Or is it not on the roadmap? I'll take a crack at it either way, but just wanted to know.

MatteoH2O1999 · 2022-08-01T09:48:31Z

As a temporary workaround, use opponent='placeholder string' in your env player. This is because by default a random player gets created if opponent has a "falsy" value (like ''). I think we'll change that in the next patch.
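The "falsy" default mentioned in this comment can be illustrated with a small stand-alone sketch. `resolve_opponent` and the default name it returns are hypothetical and only mimic the behaviour described here; this is not poke-env's code.

```python
def resolve_opponent(opponent=None):
    """Hypothetical helper mimicking a falsy-value default; not poke-env code."""
    # `not opponent` is True for None, '' and any other falsy value,
    # so an empty string silently falls back to the default opponent.
    if not opponent:
        return "default random player"
    return opponent


print(resolve_opponent(""))                    # falls back to the default
print(resolve_opponent("placeholder string"))  # kept as given
```

Any non-empty string avoids the fallback, which is why a placeholder value works as a workaround.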

MatteoH2O1999 · 2022-08-01T09:56:05Z

Regarding the challenges part, the famous "it's a feature, not a bug" applies, because accept_challenges and send_challenges should have the same effect: once you await them, the battle completes. This is impossible to implement by default, as it would mean linking poke-env to a specific ML library.

hsahovic · 2022-08-01T11:46:56Z

@akashsara yeah I need to push a new release - I will take care of it this week.
What I meant by the base player class is something like this:

import numpy as np
import torch
from poke_env.player.random_player import RandomPlayer

class TrainedModelPlayer(RandomPlayer):
    def choose_move(self, battle):
        state = embed_battle(battle)
        with torch.no_grad():
            predictions = model(state)
        action_mask = SimpleRLPlayer.action_masks()
        action = np.argmax(predictions + action_mask)
        return SimpleRLPlayer._action_to_move(action, battle)

where embed_battle and model are standalone functions / objects - this should then work with ladder and other battling functions.
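The action-selection line in the snippet above adds a mask to the predictions before taking the argmax. Here is a minimal sketch of that single step in plain Python. It assumes the mask holds 0 for legal actions and negative infinity for illegal ones; the thread does not say what action_masks() actually returns, and `masked_argmax` is a hypothetical helper name.

```python
import math


def masked_argmax(predictions, action_mask):
    """Return the index of the best-scoring action after adding the mask."""
    scores = [p + m for p, m in zip(predictions, action_mask)]
    return max(range(len(scores)), key=lambda i: scores[i])


predictions = [0.2, 0.9, 0.5]
action_mask = [0.0, -math.inf, 0.0]  # assumed convention: -inf marks an illegal action
best = masked_argmax(predictions, action_mask)
print(best)  # 2: action 1 scores highest but is masked out
```

Keeping this step as a standalone function, like embed_battle and model, means it can be reused from any player class without inheriting from the gym wrapper.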

akashsara · 2022-08-01T20:01:45Z

@MatteoH2O1999 Thanks! Does this mean there are no plans to have a method to battle against the bot in a custom battle at all? (Apart from the method Haris mentioned above)

@hsahovic Got it, thank you. That should work for me.

MatteoH2O1999 · 2022-08-01T20:39:14Z

@akashsara,
For that there should be the method play_against, but it still requires you to manage the prediction loop, as it only starts the battle in the background. It should also work if you use:

player.set_opponent('username to challenge')
player.start_challenging(n_challenges)
for _ in range(n_challenges):
    step = player.reset()
    while not step.done:
        action = model.action(step)
        step = player.step(action)

akashsara · 2022-08-02T05:50:27Z

Ooh that works out for me. Thanks a lot @MatteoH2O1999

akashsara · 2022-08-25T01:45:26Z

So as an update for anyone running into this thread later on, for speed reasons I would recommend using the general API like Haris mentioned. The approach Matteo suggested works pretty well for self-play but in terms of pure speed, the general API is much, much, much faster.

From my own rough benchmarking, the general API works out to be 3-4 times as fast.

@hsahovic maybe we could include this somewhere in the documentation? I expected it to be faster but not this fast.

akashsara closed this as completed Aug 2, 2022

MatteoH2O1999 mentioned this issue Aug 2, 2022

Gym new step API update #311

Merged

akashsara mentioned this issue Aug 3, 2022

Selfplay? #314

Open
