
eval_average_episode_rewards is not defined in shared/mpe_runner.py #17

Closed
ConstantinosM opened this issue Jun 22, 2021 · 1 comment
@ConstantinosM

This is a bug that I fixed by looking at the other onpolicy code that you have. In shared/mpe_runner.py, in def eval(), near the end of the function:

        eval_episode_rewards = np.array(eval_episode_rewards)
        eval_env_infos = {}
        eval_env_infos['eval_average_episode_rewards'] = np.sum(np.array(eval_episode_rewards), axis=0)
        # print("eval average episode rewards of agent: " + str(eval_average_episode_rewards))

The variable eval_average_episode_rewards is not defined there, so the code exits with an error. Instead I used:

print("eval average episode rewards of agent: " + str(np.mean(eval_env_infos['eval_average_episode_rewards'])))

This differs from separated/mpe_runner.py:

        eval_train_infos = []
        for agent_id in range(self.num_agents):
            eval_average_episode_rewards = np.mean(np.sum(eval_episode_rewards[:, :, agent_id], axis=0))
            eval_train_infos.append({'eval_average_episode_rewards': eval_average_episode_rewards})
            print("eval average episode rewards of agent%i: " % agent_id + str(eval_average_episode_rewards))

but I guess the logic is that in the shared case agent1 and agent2 are the same (they share one policy), so averaging the reward across their performance is reasonable.
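
To see how this relates to the per-agent numbers from separated/mpe_runner.py, here is a small standalone comparison with random dummy rewards (shapes and values are illustrative only):

    import numpy as np

    rng = np.random.default_rng(0)
    # (episode_length, n_eval_rollout_threads, num_agents)
    eval_episode_rewards = rng.random((25, 4, 2))

    # shared runner: sum over the episode, then average over threads and agents
    shared_avg = np.mean(np.sum(eval_episode_rewards, axis=0))
    print("shared eval average episode rewards:", shared_avg)

    # separated runner: one average per agent
    for agent_id in range(eval_episode_rewards.shape[2]):
        per_agent = np.mean(np.sum(eval_episode_rewards[:, :, agent_id], axis=0))
        print("eval average episode rewards of agent%i:" % agent_id, per_agent)

    # the shared number is just the mean of the per-agent averages,
    # which is reasonable when all agents use the same policy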

@zoeyuchao
Member

Thx~ fixed the bug.
