Python Bug: lambda function refers only one environment #1155

Open
5 of 9 tasks
maguro27 opened this issue May 28, 2024 · 4 comments

Comments

@maguro27

maguro27 commented May 28, 2024

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
    • design request (i.e. "X should be changed to Y.")
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, gymnasium as gym, torch, numpy, sys
    print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)
    0.5.1 0.29.1 2.3.0a0+40ec155e58.nv24.03 1.24.4 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] linux

The tutorial repeatedly uses lambda functions to create callable environment factories.
However, I ran into what looks like a Python bug (Python 3.10.12) when using user-defined environments, as shown below:

train_envs: list[gym.Env] = making_my_env() # e.g., len(train_envs) == 4
ts_train_envs = DummyVectorEnv([lambda: env for env in train_envs])
tmp_envs = [lambda: env for env in train_envs]

for env in tmp_envs:
    print(env().reset())
for env in ts_train_envs._env_fns:
    print(env().reset())

With this code, every callable returns the same environment, so I get the same reset output four times.
Hence, I fixed the above code as follows:

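from typing import Callable

def callable_env(env: gym.Env) -> Callable[[], gym.Env]:
    # Bind `env` in its own function scope so each callable keeps its own environment.
    def _callable_env() -> gym.Env:
        return env

    return _callable_env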

train_envs: list[gym.Env] = making_my_env() # e.g., len(train_envs) == 4
ts_train_envs = DummyVectorEnv([callable_env(env) for env in train_envs])
tmp_envs = [callable_env(env) for env in train_envs]

for env in tmp_envs:
    print(env().reset())
for env in ts_train_envs._env_fns:
    print(env().reset())

This works properly.
In conclusion, I suggest that the tutorial should not use the lambda function.

@MischaPanch
Collaborator

You are using a very old version of Tianshou. Could you please try either 1.0.0 or the version on master?

@maguro27
Author

@MischaPanch
I updated the Python and Tianshou versions:

1.0.0 0.28.1 2.3.0+cu121 1.24.4 3.11.9 (main, Apr 6 2024, 17:59:24) [GCC 9.4.0] linux

However, this lambda function issue remains.

@dantp-ai
Contributor

dantp-ai commented Jun 3, 2024

Hi @maguro27,

It seems that env is not looked up until the lambda function is called. By the end of the loop, env is bound to the last element in the list, so you get the last environment four times. You can read more about it and closures with lambdas here.
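
A minimal, Tianshou-free sketch of this late-binding behavior, using plain strings as stand-ins for environments (the names `items` and `fns` are illustrative and do not come from the original code):

```python
# Stand-ins for environment objects; any distinct values would do.
items = ["env_0", "env_1", "env_2", "env_3"]

# Each lambda closes over the *variable* `item`, not its value at creation time.
fns = [lambda: item for item in items]

# By the time the lambdas are called, the comprehension has finished
# and `item` is bound to the last element.
results = [fn() for fn in fns]
print(results)  # ['env_3', 'env_3', 'env_3', 'env_3']
```

This is documented Python behavior rather than an interpreter bug, and it does not depend on Tianshou.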

This should now work as expected since the default value for env is evaluated when the lambda function is defined:

ts_train_envs = DummyVectorEnv([lambda env=env: env for env in train_envs])
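
Two common ways to bind the current value when the callable is created, sketched here with stand-in strings rather than real environments: a default argument on the lambda, and `functools.partial`. The helper `identity` and the list names are hypothetical, introduced only for illustration.

```python
from functools import partial

# Stand-ins for environment objects.
items = ["env_0", "env_1", "env_2", "env_3"]

# Option 1: a default argument. `item=item` is evaluated when each
# lambda is created, so every lambda keeps its own value.
by_default = [lambda item=item: item for item in items]


def identity(x):
    """Return the argument unchanged (hypothetical helper)."""
    return x


# Option 2: functools.partial stores the current value as a bound argument.
by_partial = [partial(identity, item) for item in items]

default_results = [fn() for fn in by_default]
partial_results = [fn() for fn in by_partial]
print(default_results)  # ['env_0', 'env_1', 'env_2', 'env_3']
print(partial_results)  # ['env_0', 'env_1', 'env_2', 'env_3']
```

With real environments, either list could presumably be passed to DummyVectorEnv in place of the original list of lambdas.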

Which Tianshou tutorial are you looking at?

@maguro27
Author

maguro27 commented Jun 7, 2024

@dantp-ai

Thank you for your comments.
I now see that I misunderstood the behavior of the lambda function.

The Tianshou tutorials only use default gym environments, which is why I misunderstood the behavior.
Therefore, I think the maintainers might want to add guidance on using custom environments (e.g., register the environment with the gymnasium register function and then use it, or use "lambda env=env: env").
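
A rough sketch of why the tutorial-style pattern does not hit this problem: when the lambda constructs the environment itself and refers to no loop variable, every call builds a fresh instance. `FakeEnv` and `make_env` below are hypothetical stand-ins for a gymnasium environment and a constructor call such as `gym.make`; they are not part of gymnasium or Tianshou.

```python
import itertools

_counter = itertools.count()


class FakeEnv:
    """Hypothetical stand-in for an environment; not a gymnasium class."""

    def __init__(self, name):
        self.name = name
        self.uid = next(_counter)  # unique id per constructed instance


def make_env(name):
    """Hypothetical stand-in for a constructor call such as gym.make(name)."""
    return FakeEnv(name)


# The lambda body uses no loop variable, so late binding has nothing to affect.
env_fns = [lambda: make_env("CartPole") for _ in range(4)]

envs = [fn() for fn in env_fns]
uids = [env.uid for env in envs]
print(uids)  # [0, 1, 2, 3]
```

Registering a custom environment and constructing it inside the callable would follow the same shape, which is one reason the registration route sidesteps the closure issue.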


3 participants