[BUG] Issue with rendering Atari environment #302

Closed
indweller opened this issue Jul 21, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@indweller

I tried to render the Atari Pong environment, but I keep running into the following error.
Code:

from d3rlpy.datasets import get_atari
from d3rlpy.algos import DQNConfig
from d3rlpy.metrics import TDErrorEvaluator, EnvironmentEvaluator

dataset, env = get_atari(env_name='pong-expert-v4')
dqn = DQNConfig().create(device='cuda:0')
dqn.build_with_dataset(dataset)

td_error_evaluator = TDErrorEvaluator(episodes=dataset.episodes)
env_evaluator = EnvironmentEvaluator(env, render=True)
rewards = env_evaluator(dqn, dataset=None)

The output is as follows:

A.L.E: Arcade Learning Environment (version 0.8.1+53f58b7)
[Powered by Stella]
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:31: UserWarning: WARN: A Box observation space has an unconventional shape (neither an image, nor a 1D vector). We recommend flattening the observation to have only a 1D vector or use a custom policy to properly process the data. Actual observation shape: (84, 84)
  logger.warn(
loading /home/prashanth/.d4rl/datasets/Pong/5/50/observation.gz...
loading /home/prashanth/.d4rl/datasets/Pong/5/50/action.gz...
loading /home/prashanth/.d4rl/datasets/Pong/5/50/reward.gz...
loading /home/prashanth/.d4rl/datasets/Pong/5/50/terminal.gz...
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:174: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
  logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:187: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
  logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:233: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
  if not isinstance(terminated, (bool, np.bool8)):
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:289: UserWarning: WARN: No render fps was declared in the environment (env.metadata['render_fps'] is None or not defined), rendering may occur at inconsistent fps.
  logger.warn(
Traceback (most recent call last):
  File "trial.py", line 11, in <module>
    rewards = env_evaluator(dqn, dataset=None)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/d3rlpy/metrics/evaluators.py", line 540, in __call__
    return evaluate_qlearning_with_environment(
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/d3rlpy/metrics/utility.py", line 63, in evaluate_qlearning_with_environment
    env.render()
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/core.py", line 329, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/core.py", line 329, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 51, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/env_checker.py", line 53, in render
    return env_render_passive_checker(self.env, *args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py", line 316, in env_render_passive_checker
    result = env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/d4rl_atari/envs.py", line 48, in render
    self._env.render(mode)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/core.py", line 329, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 51, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/env_checker.py", line 53, in render
    return env_render_passive_checker(self.env, *args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py", line 316, in env_render_passive_checker
    result = env.render(*args, **kwargs)
TypeError: render() takes 1 positional argument but 2 were given

Additionally, I tried training without rendering, loading the trained model separately, and then rendering during evaluation with gym.make('pong-expert-v4', render_mode='human') and env.render(), but the same error appears. I didn't face any issues when rendering CartPole.

indweller added the bug label Jul 21, 2023
@takuseno
Owner

@indweller Thanks for reporting this. This happened because get_atari relies on another repository, d4rl-atari, and I have just fixed the issue there. I've also updated d3rlpy to support the latest render interface in these commits: a7207cd 2121edd. I'll release a patch that includes these fixes later today. When you try it, please reinstall d4rl-atari with: pip install -U git+https://github.com/takuseno/d4rl-atari
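
The mismatch behind the first traceback can be reproduced without gym. The sketch below is only an illustration, not the actual d4rl-atari or gym code: `NewStyleEnv`, `OldStyleWrapper`, `FixedWrapper`, and `try_render` are hypothetical names. It assumes the inner environment follows the newer interface, where the render mode is fixed at construction and `render()` takes no arguments, while the wrapper still forwards a `mode` argument, as in the traceback line `self._env.render(mode)`.

```python
class NewStyleEnv:
    """Hypothetical stand-in for an env using the newer render interface."""

    def __init__(self, render_mode=None):
        # The render mode is chosen once, at construction time.
        self.render_mode = render_mode

    def render(self):
        # No `mode` parameter: the env already knows how it should render.
        return f"rendered in mode={self.render_mode}"


class OldStyleWrapper:
    """Hypothetical wrapper that still forwards `mode` positionally."""

    def __init__(self, env):
        self._env = env

    def render(self, mode="human"):
        # Fails: NewStyleEnv.render() accepts only `self`.
        return self._env.render(mode)


class FixedWrapper(OldStyleWrapper):
    """Same wrapper, updated to call render() without arguments."""

    def render(self):
        return self._env.render()


def try_render(wrapper):
    """Return the render result, or the TypeError message if the call fails."""
    try:
        return wrapper.render()
    except TypeError as exc:
        return f"TypeError: {exc}"


env = NewStyleEnv(render_mode="human")
broken_result = try_render(OldStyleWrapper(env))
fixed_result = try_render(FixedWrapper(env))
print(broken_result)
print(fixed_result)
```

The first call fails with the same kind of message as the report (one positional argument expected, two given); the second succeeds because the wrapper no longer passes `mode`.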

@takuseno
Owner

The latest patch has been released.
https://github.com/takuseno/d3rlpy/releases/tag/v2.0.4
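
To confirm that the installed d3rlpy is at least the patched release, a check along these lines may help. It is a sketch under assumptions: `parse_version`, `has_patch`, and `installed_d3rlpy_version` are hypothetical helpers, the comparison only looks at the leading numeric components of the version string, and the package is assumed to be installed under the distribution name `d3rlpy`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '2.0.4' into (2, 0, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_patch(installed, required="2.0.4"):
    """True if `installed` is at least the release named in this thread."""
    return parse_version(installed) >= parse_version(required)


def installed_d3rlpy_version():
    """Return the installed d3rlpy version string, or None if it is absent."""
    try:
        return version("d3rlpy")
    except PackageNotFoundError:
        return None


found = installed_d3rlpy_version()
if found is None:
    print("d3rlpy is not installed in this environment")
else:
    print(f"d3rlpy {found}: patch included = {has_patch(found)}")
```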

@takuseno
Owner

Just as a reference, you can enable rendering like this:

from d3rlpy.datasets import get_atari
from d3rlpy.algos import DQNConfig
from d3rlpy.metrics import TDErrorEvaluator, EnvironmentEvaluator

dataset, env = get_atari(env_name='pong-expert-v4', render_mode="human")
dqn = DQNConfig().create(device='cuda:0')
dqn.build_with_dataset(dataset)

td_error_evaluator = TDErrorEvaluator(episodes=dataset.episodes)
env_evaluator = EnvironmentEvaluator(env)
rewards = env_evaluator(dqn, dataset=None)

@indweller
Author

indweller commented Jul 24, 2023

Thanks for your quick response, @takuseno! The code snippet that you shared seems to work fine. But when I use it as shown below, I get the following error.

import d3rlpy, d4rl_atari
import gym
import numpy as np

dqn = d3rlpy.load_learnable('./pongdqn.d3')

env = gym.make('pong-expert-v4', render_mode='human')
observations = env.reset()

observations = observations[0]

terminated = False
truncated = False
total = 0
positive_reward = 0

while not terminated and not truncated:
    action = dqn.predict(observations.reshape((1,1,84,84)))[0]
    observations, reward, terminated, truncated, info = env.step(action)
    env.render()
    print(f"Reward: {reward}\n")
    total += reward
    if reward > 0:
        positive_reward += reward

print(f"Total Reward: {total}, Positive Reward: {positive_reward}")

env.close()

Error:

2023-07-23 18:50:34 [warning  ] There might be incompatibility because of version mismatch. current_version=2.0.4 saved_version=2.0.3
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/envs/registration.py:623: UserWarning: WARN: The environment is being initialised with mode (human) that is not in the possible render_modes ([]).
  logger.warn(
A.L.E: Arcade Learning Environment (version 0.8.1+53f58b7)
[Powered by Stella]
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:31: UserWarning: WARN: A Box observation space has an unconventional shape (neither an image, nor a 1D vector). We recommend flattening the observation to have only a 1D vector or use a custom policy to properly process the data. Actual observation shape: (84, 84)
  logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:174: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
  logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:187: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
  logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:233: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
  if not isinstance(terminated, (bool, np.bool8)):
Traceback (most recent call last):
  File "testing.py", line 25, in <module>
    env.render()
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/core.py", line 329, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 51, in render
    return self.env.render(*args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/env_checker.py", line 53, in render
    return env_render_passive_checker(self.env, *args, **kwargs)
  File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py", line 307, in env_render_passive_checker
    assert (
AssertionError: With no render_modes, expects the Env.render_mode to be None, actual value: human
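
The assertion above appears to be a consistency check between the render modes an environment declares and the render mode it was created with. The sketch below is a simplified stand-in, not gym's actual checker: `FakeEnv`, `check_render_mode`, and `passes` are hypothetical names. It assumes, as the earlier warning suggests ("possible render_modes ([])"), that the environment created through gym.make declares an empty list of render modes while render_mode='human' was requested.

```python
class FakeEnv:
    """Hypothetical env exposing only what the check below reads."""

    def __init__(self, render_modes, render_mode):
        self.metadata = {"render_modes": render_modes}
        self.render_mode = render_mode


def check_render_mode(env):
    """Simplified stand-in for the consistency check seen in the traceback."""
    render_modes = env.metadata.get("render_modes", [])
    if len(render_modes) == 0:
        # Nothing is declared, so no render mode may be requested.
        if env.render_mode is not None:
            raise AssertionError(
                "With no render_modes, expects the Env.render_mode to be "
                f"None, actual value: {env.render_mode}"
            )
    elif env.render_mode is not None and env.render_mode not in render_modes:
        raise AssertionError(
            f"render_mode {env.render_mode!r} is not one of {render_modes}"
        )
    return True


def passes(env):
    """Run the check and report success as a boolean."""
    try:
        return check_render_mode(env)
    except AssertionError:
        return False


undeclared = FakeEnv(render_modes=[], render_mode="human")
declared = FakeEnv(render_modes=["human", "rgb_array"], render_mode="human")
print(passes(undeclared))
print(passes(declared))
```

In this thread, the snippet that did run obtained its environment from get_atari(..., render_mode="human") rather than from gym.make. Whether that environment also suits the manual loop above is not confirmed here.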
