Error encountered during player.ladder() #306

Closed
akashsara opened this issue Jul 25, 2022 · 15 comments

@akashsara
Contributor

Hi,
I was testing a model I trained on Pokemon Showdown (code snippet below) when I ran into this issue. I'm able to challenge the bot to a battle and play against it perfectly well but when I do player.ladder(100) it errors out after completing a single battle.

2022-07-25 18:33:47,574 - UABGLSimpleDQN - ERROR - Unhandled exception raised while handling message:
>battle-gen8randombattle-1625188644
|-message|Nukkumatti lost due to inactivity.
|
|win|UABGLSimpleDQN
Traceback (most recent call last):
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py", line 131, in _handle_message
    await self._handle_battle_message(split_messages)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player.py", line 235, in _handle_battle_message
    self._battle_finished_callback(battle)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\env_player.py", line 106, in _battle_finished_callback
    self._observations[battle].put(self.embed_battle(battle))
KeyError: <poke_env.environment.battle.Gen8Battle object at 0x000001E1988D2EA0>
Task exception was never retrieved
future: <Task finished name='Task-39' coro=<PlayerNetwork._handle_message() done, defined at E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py:117> exception=KeyError(<poke_env.environment.battle.Gen8Battle object at 0x000001E1988D2EA0>)>
Traceback (most recent call last):
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py", line 177, in _handle_message
    raise exception
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player_network_interface.py", line 131, in _handle_message
    await self._handle_battle_message(split_messages)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\player.py", line 235, in _handle_battle_message
    self._battle_finished_callback(battle)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env\player\env_player.py", line 106, in _battle_finished_callback
    self._observations[battle].put(self.embed_battle(battle))
KeyError: <poke_env.environment.battle.Gen8Battle object at 0x000001E1988D2EA0>

Model code:

class SimpleRLPlayerTesting(SimpleRLPlayer):
    def __init__(self, model, *args, **kwargs):
        SimpleRLPlayer.__init__(self, *args, **kwargs)
        self.model = model

    def choose_move(self, battle):
        state = self.embed_battle(battle)
        with torch.no_grad():
            predictions = self.model(state)
        action_mask = self.action_masks()
        action = np.argmax(predictions + action_mask)
        return self._action_to_move(action, battle)

Script:

async def main():
    ...
    player = simple_agent.SimpleRLPlayerTesting(
        model=model,
        player_configuration=PlayerConfiguration(USERNAME, PASSWORD),
        server_configuration=ShowdownServerConfiguration,
        start_timer_on_battle_start=True,
        **player_kwargs
    )
    print("Connecting to Pokemon Showdown...")
    await player.ladder(NUM_GAMES)
    # Print the rating of the player and its opponent after each battle
    for battle in player.battles.values():
        print(battle.rating, battle.opponent_rating)

if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())
@hsahovic
Owner

Hey @akashsara,

Thanks for opening this issue. If you want to test a trained model on the ladder, I would recommend not inheriting from the gym player, but from the base player class.

@akashsara
Contributor Author

akashsara commented Jul 26, 2022

Do you mean poke_env.player.player.Player?

Edit: Looking at the code for the player above, this would mean that I would need to inherit from the base player class and reimplement an embed_battle(), choose_move() and _action_to_move(). This is not really a big issue or anything and implementing it is trivial but it feels kinda weird to have to do so in another class just to be able to test the model.

@MatteoH2O1999
Contributor

Hey @akashsara,
I think you are using the old version of the gym player. Try to install from gh source as the new version should be able to do what you need pretty easily.
Let me know if you have any questions!!!

@akashsara
Contributor Author

Thank you, I'll try that!

@akashsara
Contributor Author

So it seems like there are a number of significant changes in the new version. I'm following examples/rl_with_new_open_ai_gym_wrapper.py right now and setting things up, but I was wondering when these changes would be pushed to the next public release? @hsahovic

I'm asking since due to some circumstances I'm running my code elsewhere and that machine has some issues haha. So I'd prefer being able to just install the latest version and run my code vs setting up some hacky bits in the meantime.

@akashsara
Contributor Author

Hey @MatteoH2O1999,
So I've updated my code to work with the new version. For completeness I trained a new agent as well. However I'm having issues with testing it on Showdown.

  1. I can't get the player.start_laddering() function to run correctly. This is the code snippet I'm using:
async def main():
    # <model creation code>
    player = simple_agent.SimpleRLPlayerTesting(  # Inherits from env_player
        model=model,
        player_configuration=PlayerConfiguration(USERNAME, PASSWORD),
        server_configuration=ShowdownServerConfiguration,
        start_timer_on_battle_start=True,
        start_challenging=False,
    )
    player.start_laddering(NUM_GAMES)
    for battle in player.battles.values():
        print(battle.rating, battle.opponent_rating)

if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())

If I set start_challenging to True I get this output:

2022-07-29 16:52:32,957 - UABGLSimpleDQN - WARNING - Popup message received: |popup|The user 'randomplayer1' was not found.
Traceback (most recent call last):
  File "E:\Dev\meta-discovery\play_on_showdown.py", line 121, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "C:\Program Files\Python39\lib\asyncio\base_events.py", line 642, in run_until_complete
    return future.result()
  File "E:\Dev\meta-discovery\play_on_showdown.py", line 109, in main
    player.start_laddering(NUM_GAMES)
  File "E:\Dev\meta-discovery\torch_env\lib\site-packages\poke_env-0.4.21-py3.9.egg\poke_env\player\openai_api.py", line 505, in start_laddering
    raise RuntimeError("Agent is already challenging")
RuntimeError: Agent is already challenging

If I set it to False the script just finishes running almost instantly. I'm monitoring the account on Showdown and it doesn't start any battles or anything.

  2. I can't seem to find a way to send or accept challenges. The player class has an accept_challenges and a send_challenges function, but there doesn't seem to be an equivalent in env_player.

@MatteoH2O1999
Contributor

MatteoH2O1999 commented Jul 30, 2022

Hi @akashsara,
to answer your questions I need to explain how the wrapper works: it is designed to be used as an environment, not as a normal player. It runs a custom player in a background thread so that the OpenAI Gym API is exposed on the main thread.
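
As a rough illustration of that design (not poke-env's actual code; the class, its queues and the fixed turn count are all hypothetical), a background thread can play the battle while the main thread drives it through reset() and step():

```python
import queue
import threading


class ToyBackgroundEnv:
    """Toy stand-in for an env wrapper; every name here is hypothetical."""

    def __init__(self, n_turns=3):
        self._actions = queue.Queue()       # main thread -> background player
        self._observations = queue.Queue()  # background player -> main thread
        self._n_turns = n_turns
        self._thread = None

    def _play_battle(self):
        # Runs in the background thread: publish a state, then block until
        # the main thread supplies an action, once per turn.
        for turn in range(self._n_turns):
            self._observations.put((turn, False))
            self._actions.get()
        self._observations.put((self._n_turns, True))  # terminal observation

    def reset(self):
        # Start a new "battle" in the background and return its first state.
        self._thread = threading.Thread(target=self._play_battle, daemon=True)
        self._thread.start()
        return self._observations.get()

    def step(self, action):
        # Hand the action to the background player, wait for the next state.
        self._actions.put(action)
        return self._observations.get()


env = ToyBackgroundEnv(n_turns=3)
observation, done = env.reset()
turns_played = 0
while not done:
    observation, done = env.step(action=0)  # a real agent would pick an action
    turns_played += 1
```

In a design like this, the background player only advances when the main thread calls reset() and step().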

Regarding your first question, the problem is that you are not actually using the environment. To use it you should have something like:

player = ...  # derive from env_player
...
player.start_laddering(NUM_GAMES)
for _ in range(NUM_GAMES):
    step = player.reset()
    while not step.done:
        action = model.action(step)
        step = player.step(action)
...

If you wish to implement accept_challenges and send_challenges, I would advise subclassing EnvPlayer and defining methods that run the model-prediction loop and the challenge loop in parallel (use player.agent to access the custom background player).

@akashsara
Copy link
Contributor Author

Ah I see. Apologies for the misunderstanding. Thank you for clarifying. It's working now!

I should note that there is something a little weird still going on. I get this error every time I run it, even though battles do seem to be starting and my agent seems to be playing:
2022-07-31 18:39:13,537 - RandomPlayer 1 - ERROR - [WinError 1225] The remote computer refused the network connection
Note: Neither my agent nor my Showdown account is called RandomPlayer 1 so I'm not sure where this is coming from.

On the challenges part - are there any plans to implement it, or something similar, on EnvPlayer for the time being? Or is it not on the roadmap? I'll take a crack at it either way, but just wanted to know.

@MatteoH2O1999
Contributor

MatteoH2O1999 commented Aug 1, 2022

As a temporary workaround, use opponent='placeholder string' in your env player. The error appears because, by default, a random player gets created if opponent has a "falsy" value (like ''). I think we'll change that in the next patch.
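
To picture why an empty opponent still produces a "RandomPlayer" log line, here is a minimal sketch of a falsy-value default. The helper and the constant are hypothetical and only mimic the behaviour described above; they are not taken from poke-env:

```python
# Hypothetical stand-in for the default opponent that gets created locally.
DEFAULT_OPPONENT = "<locally created random player>"


def resolve_opponent(opponent=None):
    # Any falsy value (None, "") falls through to the default opponent;
    # a non-empty string such as a placeholder is kept as-is.
    return opponent if opponent else DEFAULT_OPPONENT


unset = resolve_opponent()
empty = resolve_opponent("")
placeholder = resolve_opponent("placeholder string")
```

With a check like this, only a truthy value such as the placeholder string avoids creating the extra random player.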

@MatteoH2O1999
Contributor

Regarding the challenges part, the famous "it's a feature, not a bug" applies: accept_challenges and send_challenges should have the same effect, namely that once you await them, the battle completes. This is impossible to implement by default, as it would mean tying poke-env to a specific ML library.

@hsahovic
Owner

hsahovic commented Aug 1, 2022

@akashsara yeah, I need to push a new release - I will take care of it this week.
What I meant by the base player class is something like this:

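import numpy as np
import torch
# Import path as given in this comment; it may differ between poke-env versions.
from poke_env.player.baselines import RandomPlayer  # RandomPlayer subclasses Player

class TrainedModelPlayer(RandomPlayer):
    def choose_move(self, battle):
        # embed_battle and model are defined outside the class (see note below)
        state = embed_battle(battle)
        with torch.no_grad():
            predictions = model(state)
        action_mask = SimpleRLPlayer.action_masks()
        action = np.argmax(predictions + action_mask)
        return SimpleRLPlayer._action_to_move(action, battle)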

where embed_battle and model are standalone functions / objects - this should then work with ladder and other battling functions.
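
The np.argmax(predictions + action_mask) step can be sketched in plain Python. This assumes an additive mask that is 0 for legal actions and -inf for illegal ones; that convention, the helper name and the example scores are assumptions for illustration, not something this thread confirms about action_masks():

```python
import math


def masked_argmax(predictions, action_mask):
    """Return the index of the highest-scoring action after adding the mask."""
    scores = [p + m for p, m in zip(predictions, action_mask)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical model outputs for four actions; actions 0 and 3 are illegal.
predictions = [0.9, 0.2, 0.7, 0.1]
action_mask = [-math.inf, 0.0, 0.0, -math.inf]
best_action = masked_argmax(predictions, action_mask)
```

Action 0 has the highest raw score here, but the mask pushes it to -inf, so the best legal action (index 2) is chosen instead.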

@akashsara
Contributor Author

@MatteoH2O1999 Thanks! Does this mean there are no plans to have a method to battle against the bot in a custom battle at all? (Apart from the method Haris mentioned above)

@hsahovic Got it, thank you. That should work for me.

@MatteoH2O1999
Contributor

MatteoH2O1999 commented Aug 1, 2022

@akashsara,
For that there should be the method play_against, but it still requires you to manage the prediction loop, as it only starts the battle in the background. It should also work if you use

player.set_opponent('username to challenge')
player.start_challenging(n_challenges)
for _ in range(n_challenges):
    step = player.reset()
    while not step.done:
        action = model.action(step)
        step = player.step(action)

@akashsara
Contributor Author

Ooh that works out for me. Thanks a lot @MatteoH2O1999

@akashsara
Contributor Author

So as an update for anyone running into this thread later on, for speed reasons I would recommend using the general API like Haris mentioned. The approach Matteo suggested works pretty well for self-play but in terms of pure speed, the general API is much, much, much faster.

From my own rough benchmarking, the general API works out to be 3-4 times as fast.

@hsahovic maybe we could include this somewhere in the documentation? I expected it to be faster but not this fast.
