
Sending or accepting challenges gives an error about coroutines #43

Closed · Gummygamer opened this issue May 6, 2020 · 9 comments
Labels: bug (Something isn't working)

Gummygamer commented May 6, 2020

If you call player.send_challenges or player.accept_challenges, you get this error:

RuntimeWarning: coroutine 'final_tests' was never awaited
final_tests()

If you wrap it in an async function and call it with await, you get this:

RuntimeError: Task <Task pending coro=<final_tests() running at pokerl.py:98> cb=[_run_until_complete_cb() at C:\Users\Username\Anaconda3\lib\asyncio\base_events.py:158]> got Future attached to a different loop
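
For reference, a rough sketch of the two call patterns (env_player and the challenge arguments follow the RL example in the documentation):

# Calling the coroutine directly does not run it and only triggers the RuntimeWarning:
# env_player.send_challenges('Gummygamer', 100)

# Awaiting it from an async wrapper can still fail with the "different loop"
# error if the coroutine ends up running on a different event loop than the
# one the player was created on:
async def final_tests():
    await env_player.send_challenges('Gummygamer', 100)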

@hsahovic hsahovic self-assigned this May 6, 2020
@hsahovic hsahovic added the bug Something isn't working label May 6, 2020
hsahovic (Owner) commented May 6, 2020

Hi @Gummygamer ! Thanks for opening this issue.

I do not think that we have a final_tests coroutine in poke-env. Would you mind sharing a snippet to reproduce the issue?

Gummygamer (Author) commented:

I just defined that function in the RL test as a placeholder for sending or accepting challenges; it's just this:

async def final_tests():
    await env_player.send_challenges('Gummygamer', 100)

If I change it to accepting challenges, I get the same issue. I can send the whole code for further inspection, but it's almost identical to the RL example in the documentation.

Gummygamer (Author) commented:

pokerl.zip

hsahovic (Owner) commented May 6, 2020

Thanks, I'll take a look at it!

@hsahovic hsahovic added this to To do in Poke-env - general May 7, 2020
hsahovic (Owner) commented May 8, 2020

Hi!

I took a look at the code you sent: the problem is that you are trying to send challenges with an agent that uses the OpenAI Gym API. This API is only meant for training models and currently does not support this use case directly; it is planned for later, but requires additional work.

You can work around that by defining an agent similar to this one:

class EmbeddedRLPlayer(Player):
    def choose_move(self, battle):
        if np.random.rand() < 0.01:  # avoids infinite loops
            return self.choose_random_move(battle)
        # Reuse SimpleRLPlayer's battle embedding without inheriting from the env player
        embedding = SimpleRLPlayer.embed_battle(self, battle)
        # self.dqn is expected to hold the trained DQNAgent
        action = self.dqn.forward(embedding)
        return SimpleRLPlayer._action_to_move(self, action, battle)

Once instantiated, you can use this agent to challenge a human with send_challenges.
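
For example, a minimal usage sketch, assuming the trained dqn agent from the RL script, the same configuration objects used elsewhere in this thread, and a placeholder username; the trained DQNAgent is attached as self.dqn since the class above reads it from the instance:

import asyncio

emb_player = EmbeddedRLPlayer(
    player_configuration=PlayerConfiguration("Embedded RL Player", None),
    battle_format="gen8randombattle",
    server_configuration=LocalhostServerConfiguration,
)
emb_player.dqn = dqn  # assumed: the trained DQNAgent produced by the training script

async def challenge_human():
    # Send a single challenge to a human account on the same server
    await emb_player.send_challenges('your_username', 1)

# Run on the default event loop, i.e. the loop the player was created on
asyncio.get_event_loop().run_until_complete(challenge_human())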

@hsahovic hsahovic closed this as completed May 8, 2020
Poke-env - general automation moved this from To do to Done May 8, 2020
Gummygamer (Author) commented:

It still happens the same way with the embedded player.
pokerl.zip

@hsahovic hsahovic reopened this May 9, 2020
Poke-env - general automation moved this from Done to In progress May 9, 2020
hsahovic (Owner) commented May 9, 2020

@Gummygamer that's unexpected; I'll post a full version later today.

hsahovic (Owner) commented May 9, 2020

I can confirm that I was able to battle the DQN agent using the embedded player; here is the exact code (up to the username) that I used:

# -*- coding: utf-8 -*-
import numpy as np
import tensorflow as tf

from poke_env.player_configuration import PlayerConfiguration
from poke_env.player.env_player import Gen8EnvSinglePlayer
from poke_env.player.random_player import RandomPlayer
from poke_env.player.player import Player
from poke_env.server_configuration import LocalhostServerConfiguration

from rl.agents.dqn import DQNAgent
from rl.policy import LinearAnnealedPolicy, EpsGreedyQPolicy
from rl.memory import SequentialMemory
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

import asyncio


# We define our RL player
# It needs a state embedder and a reward computer, hence these two methods
class SimpleRLPlayer(Gen8EnvSinglePlayer):
    def embed_battle(self, battle):
        # -1 indicates that the move does not have a base power
        # or is not available
        moves_base_power = -np.ones(4)
        moves_dmg_multiplier = np.ones(4)
        for i, move in enumerate(battle.available_moves):
            moves_base_power[i] = (
                move.base_power / 100
            )  # Simple rescaling to facilitate learning
            if move.type:
                moves_dmg_multiplier[i] = move.type.damage_multiplier(
                    battle.opponent_active_pokemon.type_1,
                    battle.opponent_active_pokemon.type_2,
                )

        # We count how many pokemons have not fainted in each team
        remaining_mon_team = (
            len([mon for mon in battle.team.values() if not mon.fainted]) / 6
        )
        remaining_mon_opponent = (
            len([mon for mon in battle.opponent_team.values() if not mon.fainted]) / 6
        )

        # Final vector with 10 components
        return np.concatenate(
            [
                moves_base_power,
                moves_dmg_multiplier,
                [remaining_mon_team, remaining_mon_opponent],
            ]
        )

    def compute_reward(self, battle) -> float:
        return self.reward_computing_helper(
            battle, fainted_value=2, hp_value=1, victory_value=30
        )


class MaxDamagePlayer(RandomPlayer):
    def choose_move(self, battle):
        # If the player can attack, it will
        if battle.available_moves:
            # Finds the best move among available ones
            best_move = max(battle.available_moves, key=lambda move: move.base_power)
            return self.create_order(best_move)

        # If no attack is available, a random switch will be made
        else:
            return self.choose_random_move(battle)


NB_TRAINING_STEPS = 1
NB_EVALUATION_EPISODES = 1

tf.random.set_seed(0)
np.random.seed(0)


# This is the function that will be used to train the dqn
def dqn_training(player, dqn, nb_steps):
    dqn.fit(player, nb_steps=nb_steps)
    player.complete_current_battle()


def dqn_evaluation(player, dqn, nb_episodes):
    # Reset battle statistics
    player.reset_battles()
    dqn.test(player, nb_episodes=nb_episodes, visualize=False, verbose=False)

    print(
        "DQN Evaluation: %d victories out of %d episodes"
        % (player.n_won_battles, nb_episodes)
    )


async def final_tests():
    await emb_player.send_challenges('username', 100)


if __name__ == "__main__":
    env_player = SimpleRLPlayer(
        player_configuration=PlayerConfiguration("RL Player", None),
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
    )

    opponent = RandomPlayer(
        player_configuration=PlayerConfiguration("Random player", None),
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
    )

    second_opponent = MaxDamagePlayer(
        player_configuration=PlayerConfiguration("Max damage player", None),
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
    )

    # Output dimension
    n_action = len(env_player.action_space)

    model = Sequential()
    model.add(Dense(128, activation="elu", input_shape=(1, 10)))

    # Our embedding has shape (1, 10), which affects our hidden layer
    # dimension and output dimension
    # Flattening resolves potential issues that would arise otherwise
    model.add(Flatten())
    model.add(Dense(64, activation="elu"))
    model.add(Dense(n_action, activation="linear"))

    memory = SequentialMemory(limit=10000, window_length=1)

    # Simple epsilon greedy
    policy = LinearAnnealedPolicy(
        EpsGreedyQPolicy(),
        attr="eps",
        value_max=1.0,
        value_min=0.05,
        value_test=0,
        nb_steps=10000,
    )

    # Defining our DQN
    dqn = DQNAgent(
        model=model,
        nb_actions=len(env_player.action_space),
        policy=policy,
        memory=memory,
        nb_steps_warmup=1000,
        gamma=0.5,
        target_model_update=1,
        delta_clip=0.01,
        enable_double_dqn=True,
    )

    dqn.compile(Adam(lr=0.00025), metrics=["mae"])

    class EmbeddedRLPlayer(Player):
        def choose_move(self, battle):
            if np.random.rand() < 0.01:  # avoids infinite loops
                return self.choose_random_move(battle)
            embedding = SimpleRLPlayer.embed_battle(self, battle)
            action = dqn.forward(embedding)
            return SimpleRLPlayer._action_to_move(self, action, battle)

    emb_player = EmbeddedRLPlayer(
        player_configuration=PlayerConfiguration("Embedded RL Player", None),
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
    )

    # Training
    env_player.play_against(
        env_algorithm=dqn_training,
        opponent=opponent,
        env_algorithm_kwargs={"dqn": dqn, "nb_steps": NB_TRAINING_STEPS},
    )
    model.save("model_%d" % NB_TRAINING_STEPS)

    asyncio.get_event_loop().run_until_complete(final_tests())
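
One note on the final call: asyncio.get_event_loop().run_until_complete drives final_tests on the default event loop, which is also the loop the players were constructed on. Running the coroutine on a separate loop would likely reproduce the "Future attached to a different loop" error; a small sketch of the distinction (the second variant is shown only as the pattern to avoid, as an assumption about how the original error can arise):

# Works here: the coroutine runs on the default loop the players already use
asyncio.get_event_loop().run_until_complete(final_tests())

# Would likely reproduce "got Future attached to a different loop", because
# the coroutine runs on a freshly created loop instead:
# asyncio.new_event_loop().run_until_complete(final_tests())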

My virtualenv runs Python 3.6 with the following packages:

absl-py==0.9.0
aiologger==0.5.0
alabaster==0.7.12
appdirs==1.4.3
astor==0.8.1
astunparse==1.6.3
asynctest==0.13.0
attrs==19.3.0
Babel==2.8.0
black==19.10b0
bleach==3.1.5
cachetools==4.1.0
certifi==2020.4.5.1
cfgv==3.1.0
chardet==3.0.4
click==7.1.2
cloudpickle==1.3.0
coverage==5.1
dataclasses==0.7
distlib==0.3.0
docutils==0.16
entrypoints==0.3
filelock==3.0.12
flake8==3.7.9
future==0.18.2
gast==0.2.2
google-auth==1.14.2
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.28.1
gym==0.17.2
h5py==2.10.0
identify==1.4.15
idna==2.9
imagesize==1.2.0
importlib-metadata==1.6.0
importlib-resources==1.5.0
Jinja2==2.11.2
Keras==2.3.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
keras-rl2==1.0.3
keyring==21.2.1
libcst==0.3.4
Markdown==3.2.2
MarkupSafe==1.1.1
mccabe==0.6.1
more-itertools==8.2.0
mypy-extensions==0.4.3
nodeenv==1.3.5
numpy==1.18.4
oauthlib==3.1.0
opt-einsum==3.2.1
packaging==20.3
pathspec==0.8.0
pkginfo==1.5.0.1
pluggy==0.13.1
pre-commit==2.3.0
protobuf==3.11.3
psutil==5.7.0
py==1.8.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.5.0
pyflakes==2.1.1
pyglet==1.5.0
Pygments==2.6.1
pyparsing==2.4.7
pyre-check==0.0.46
pyre-extensions==0.0.18
pytest==5.4.2
pytest-asyncio==0.12.0
pytest-cov==2.8.1
pytest-timeout==1.3.4
pytz==2020.1
pywatchman==1.4.1
PyYAML==5.3.1
readme-renderer==26.0
regex==2020.5.7
requests==2.23.0
requests-oauthlib==1.3.0
requests-toolbelt==0.9.1
rsa==4.0
scipy==1.4.1
six==1.14.0
snowballstemmer==2.0.0
Sphinx==3.0.3
sphinx-rtd-theme==0.4.3
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==1.0.3
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.4
tabulate==0.8.7
tb-nightly==1.14.0a20190603
tensorboard==2.1.1
tensorboard-plugin-wit==1.6.0.post3
tensorflow==2.0.0b1
tensorflow-estimator==2.1.0
termcolor==1.1.0
tf-estimator-nightly==1.14.0.dev2019060501
toml==0.10.0
tqdm==4.46.0
twine==3.1.1
typed-ast==1.4.1
typing-extensions==3.7.4.2
typing-inspect==0.6.0
urllib3==1.25.9
virtualenv==20.0.20
wcwidth==0.1.9
webencodings==0.5.1
websockets==8.1
Werkzeug==1.0.1
wrapt==1.12.1
zipp==3.1.0

Gummygamer (Author) commented:

It worked just fine! It was the way I called the coroutine, it seems.

Poke-env - general automation moved this from In progress to Done May 10, 2020