
Proposal: update rllib and pettingzoo examples to current release versions #113

Open
elliottower opened this issue Mar 1, 2023 · 33 comments

@elliottower
Contributor

I'm planning to do some testing with the newly released ray 2.3.0, and I noticed the pettingzoo example is 9 months old and still uses gym rather than gymnasium; both should probably be updated. I'll create a PR if I can get them working, but figured I'd create an issue first. I appreciate the repo and am very excited to use it.

@jagapiou
Member

jagapiou commented Mar 1, 2023

That sounds very useful, thank you! I don't know the Ray stuff that well, so @duenez is best placed to comment on any details.

@jagapiou jagapiou assigned jagapiou and duenez and unassigned jagapiou Mar 1, 2023
@duenez
Collaborator

duenez commented Mar 1, 2023

Hello Elliot,

That would be fantastic! Please let me know if there's anything I can do to help.

@elliottower
Contributor Author

elliottower commented Mar 2, 2023

Hello Elliot,

That would be fantastic! Please let me know if there's anything I can do to help.

Appreciate it, will reach out if I run into any issues. I made a good bit of progress today, although a major blocker to updating the entire pettingzoo tutorial is that stable-baselines3 doesn't officially support gymnasium yet (pettingzoo has been migrated since October 2022). There are some open pull requests which may get merged in the future (stable-baselines3, stable-baselines3-contrib), but the only way to get it working currently is manually installing the feature branch: pip install git+https://github.com/DLR-RM/stable-baselines3@feat/gymnasium-support. I can do this in the meantime for development, but I imagine merging the tutorial should wait on an official release.

Edit: looks like there's an API incompatibility with pettingzoo and the sb3 gymnasium support feature branch (link) so the pettingzoo tutorial will have to wait until that is fixed.

I got most of the pettingzoo code adapted, though. I chose to use SB3's frame stacking functionality rather than supersuit, because the latter required casting types to float, stacking frames, and then casting back to uint8 for SB3. I've also heard that supersuit will be wound down and its functionality migrated directly into gymnasium and pettingzoo, so I figure it's best not to rely on it too much if possible.

My changes so far: elliottower@64b38e9

Luckily ray/rllib 2.3.0 does have gymnasium support so that shouldn't be a blocker, planning to work on that tomorrow.

@elliottower
Contributor Author

elliottower commented Mar 2, 2023

One minor issue I had with this repo was running the pylint script: I didn't find any documentation on it, and for some reason running pylint --rcfile .pylintrc didn't work.

I'm currently running the pytest suite, but it seems to be going very slowly and says it has 4787 items. Edit: it took 34 minutes 30 seconds on a 2019 Intel MacBook Pro. The .sh install scripts worked great, though, and the tests there ran pretty quickly (less than 5 minutes, iirc).

@duenez
Collaborator

duenez commented Mar 2, 2023

Yes, some of our tests are very slow (the full substrate tests). I believe we don't run all the tests in the install.sh script.

@elliottower
Contributor Author

elliottower commented Mar 3, 2023

Is there a reason the pettingzoo env is defined with a separate superclass _ParallelEnv? I ran into some issues when trying to use wrapper functions from pettingzoo/gymnasium/SB3, because the base class you get from env.unwrapped is _ParallelEnv.

Moving the EzPickle init line into the regular class (which is of the correct type, a parallel pettingzoo env) led to an error saying it can't load the lab2d object from pickle. I don't have experience with lab2d or the EzPickle function; I imagine the specific definition of this ParallelEnv with the two superclasses (one for the env, one for EzPickle) is what allows it to load properly?

Edit: never mind, I got the EzPickle component to work by restructuring the code and using wrappers. I'll post a PR once I'm finished.

@elliottower
Contributor Author

Any idea on how to deal with these warnings? I'm getting them when running the pettingzoo API tests; I'm thinking they may be corner cases that the pettingzoo wrapper doesn't check for. @jagapiou @duenez
W @/Users/elliottower/Documents/GitHub/meltingpot//meltingpot/lua/modules/prefab_utils.lua:83] Character x not found in the charPrefabMap. Ignoring.

W @/Users/elliottower/Documents/GitHub/meltingpot//meltingpot/lua/modules/base_simulation.lua:225] GameObject 'scene' did not have a Transform component, but explicitly specifying one is strongly preferred. Using a default.

W @/Users/elliottower/Documents/GitHub/meltingpot//meltingpot/lua/modules/base_simulation.lua:219] GameObject 'scene' did not have a StateManager component, but explicitly specifying one is strongly preferred. Using a default.

Also, the PR is up: #117. I'm running final tests and will then make a PR on shimmy with melting pot compatibility. I think using shimmy is best, as it has comprehensive tests and is used for other API conversions, so I can update this PR to use shimmy once that gets added (as opposed to examples/pettingzoo/utils.py).

@duenez
Collaborator

duenez commented Mar 4, 2023 via email

@elliottower
Contributor Author

Will get the list of substrates later today when I rerun the tests.

@duenez Do you happen to know if it's possible to do rendering using PIL or something other than matplotlib? I'm not sure exactly how the rendering works internally, but on the shimmy PR they raised concerns about not wanting to import matplotlib (most Farama code uses pygame and/or PIL afaik; I think it might be a speed concern?).

And fwiw, the pickle code is still used in other Farama games like Atari, and their devs said it looks fine.

@duenez
Collaborator

duenez commented Mar 7, 2023

Yes, of course. The observation is produced as a raw numpy array of height × width × channels (3, RGB), so you can render this to anything you want. One possibility is imshow in matplotlib, but using PIL is equally possible.
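For instance, a minimal PIL sketch, using a synthetic array in place of a real substrate observation:

```python
import numpy as np
from PIL import Image

# Stand-in observation: substrates emit uint8 RGB arrays shaped
# (height, width, 3); here we fabricate one rather than build a substrate.
rgb_frame = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# PIL route, no matplotlib needed: wrap the array and upscale for viewing.
image = Image.fromarray(rgb_frame, mode="RGB")
image = image.resize((256, 256), resample=Image.NEAREST)
# image.show()  # or image.save("frame.png")
```

NEAREST resampling keeps the blocky pixel look of the raw observation instead of smoothing it.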

@elliottower
Contributor Author

@duenez thanks, I ended up searching through the code more and found an example using pygame rendering, which is what all other pettingzoo envs use, and mirrored that functionality in the shimmy wrapper.

@elliottower
Contributor Author

@duenez is it possible to reset the environment using a specific seed rather than re-initializing a new environment with a specified seed? The new gymnasium API requires that the reset() function take a seed argument; I've been trying to adapt the code, but it doesn't seem possible without initializing the environment again, which I've also had trouble figuring out how to do.

The only Python file I see using seeding is builder_test.py, but all of the example scripts and documentation seem to indicate that the higher-level substrate.build() or substrate.build_from_config() are preferred to manually calling builder.builder.

I think the correct code should be something like this, but it says that the config is missing the key simulation.

config = meltingpot.python.substrate.get_config(substrate_name)
env = builder.build(config, env_seed=seed)

I've been having a lot of difficulty understanding the inner workings of this code and whether it's even possible to do what I want. Maybe it's best to change the way the reset() function works internally to allow a seed to be specified?

@duenez
Collaborator

duenez commented Mar 8, 2023

You need to get the factory:
https://github.com/deepmind/meltingpot/blob/7de41d2db0e5eca31107312d405e20ff3a7da39e/meltingpot/python/substrate.py#L70

config = meltingpot.python.substrate.get_config(substrate_name)
factory = meltingpot.python.substrate.get_factory_from_config(config)
env = factory.build(config.default_player_roles)

If you need to explicitly set the seed, we currently don't expose it in the builder through the canonical way to build substrates. The reason we need a factory is because now the config doesn't need to know the number of players ahead of time. This means that we must have an intermediate step when we have the player roles to know what we need to build.

If this is critical, we can think about piping a seed through. What do you think?

@elliottower
Contributor Author

elliottower commented Mar 8, 2023

You need to get the factory:

https://github.com/deepmind/meltingpot/blob/7de41d2db0e5eca31107312d405e20ff3a7da39e/meltingpot/python/substrate.py#L70

config = meltingpot.python.substrate.get_config(substrate_name)
factory = meltingpot.python.substrate.get_factory_from_config(config)
env = factory.build(config.default_player_roles)

If you need to explicitly set the seed, we currently don't expose it in the builder through the canonical way to build substrates. The reason we need a factory is because now the config doesn't need to know the number of players ahead of time. This means that we must have an intermediate step when we have the player roles to know what we need to build.

If this is critical, we can think about piping a seed through. What do you think?

Thanks for the quick reply; that makes sense about the number of players. So I take it using the factory doesn't allow seeding either? It's a core piece of the current gymnasium and pettingzoo APIs, so if it's possible to make the reset method take a seed argument, that would be awesome.

In gymnasium/pettingzoo you have to call reset() on the env before using it and set the seed that way, but if your code doesn't do that, maybe it's better to pipe the seed through the factory build method, and I can make the reset method rebuild the substrate using that.

We are also planning to make a shimmy wrapper for lab2d and gymnasium, do you happen to know if that would run into the same issue?
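The rebuild-on-reset approach described above can be sketched generically; build_fn below is a hypothetical stand-in for whatever factory call constructs the substrate, not a real Melting Pot API:

```python
import random
from typing import Any, Callable, Optional


class RebuildOnResetWrapper:
    """Sketch of a gymnasium-style reset(seed=...) over a build-only API.

    build_fn is a hypothetical factory callable (not a real Melting Pot
    API) that returns a freshly built environment for a given seed.
    """

    def __init__(self, build_fn: Callable[[int], Any]):
        self._build_fn = build_fn
        self._env = None

    def reset(self, seed: Optional[int] = None) -> Any:
        if seed is None:
            # No seed requested: draw one, keeping behaviour stochastic.
            seed = random.randint(1, 2**31 - 1)
        # Rebuild from scratch, since creation-time randomness (spawn
        # points etc.) is decided when the substrate is built.
        self._env = self._build_fn(seed)
        return self._env


# Usage with a dummy builder: the same seed yields the same "environment".
env = RebuildOnResetWrapper(build_fn=lambda s: {"seed": s})
a = env.reset(seed=42)
b = env.reset(seed=42)
```

The cost is a full rebuild on every seeded reset, but it is the only way to honour the seed if creation-time randomness cannot be replayed.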

@duenez
Collaborator

duenez commented Mar 8, 2023

It depends on the shimmy API. We usually don't have a way to reset substrates without rebuilding the whole thing anyway, because some stochastic choices occur at the creation stage. This is why we have a reset wrapper that does this: https://github.com/deepmind/meltingpot/blob/7de41d2db0e5eca31107312d405e20ff3a7da39e/meltingpot/python/utils/substrates/wrappers/reset_wrapper.py

@elliottower
Contributor Author

It depends on the shimmy API. We usually don't have a way to reset substrates without rebuilding the whole thing anyway, because some stochastic choices occur at the creation stage. This is why we have a reset wrapper that does this: https://github.com/deepmind/meltingpot/blob/7de41d2db0e5eca31107312d405e20ff3a7da39e/meltingpot/python/utils/substrates/wrappers/reset_wrapper.py

Makes sense. I did some more testing and saw that the underlying dm_env code defines _rng to be a certain seed, so I tried setting env._rng in the shimmy wrapper, but then noticed what you mentioned about the stochastic environment creation. So is it possible to modify the reset wrapper to take in a seed? If you could help out with adding support for seeding, it would be greatly appreciated.

I could look into it myself as well (and potentially make a PR), but I'd need a better idea of where to look and which files to modify. It might be simpler for someone more familiar with the repo, though.

@duenez
Collaborator

duenez commented Mar 8, 2023 via email

@elliottower
Contributor Author

Appreciate it

@jagapiou
Member

jagapiou commented Mar 9, 2023

Fundamentally, there's an API incompatibility here: a gym reset(seed) cannot be forwarded to a dm_env reset(). So you'll have to rebuild the Substrate/Scenario on reset, as you say.

FYI, right now you can change the Substrate seed at training time. [EDIT: this was true in v1; it is not true in v2.]

But that won't help with Scenarios, and they'd need a seed to also be passed to the background bots when building them.

There may also be other sources of randomness (e.g. calls to random in Wrappers) or other sources of non-determinism (e.g. Python set iteration order varies per process) that might make things hard. So we'd need a bunch of determinism tests.

Finally, rather than adding an optional seed to all build methods, I think it might be better to add a seed (optional, so we're still dm_env compatible) to dmlab2d.Lab2d.reset (https://github.com/deepmind/lab2d/blob/main/dmlab2d/__init__.py), since this is where the seed is actually randomized and set. Our Substrate.reset method can then inherit and use that (and I think we can kill our ResetWrapper).

So it's not a small job. Alternatively, is there a quick hack?

  1. Ignore the seed argument and just let the environment be stochastic (trigger a warning or whatever).
  2. If reset(seed) accepts a default or a "randomize seed" sentinel, raise a NotImplementedError if a fixed seed has been requested.
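A minimal sketch of what those two hacks could look like, assuming a hypothetical wrapper class (QuickHackEnv and _internal_reset are illustrative names, not real Melting Pot APIs):

```python
import warnings
from typing import Optional


class QuickHackEnv:
    """Sketch of the two quick hacks for a reset(seed=...) signature over
    an environment that cannot honour fixed seeds."""

    def __init__(self, strict: bool = True):
        # strict=False -> option 1: warn and ignore the seed.
        # strict=True  -> option 2: reject fixed seeds outright.
        self._strict = strict

    def reset(self, seed: Optional[int] = None):
        if seed is not None:
            if self._strict:
                raise NotImplementedError(
                    "Fixed seeds are unsupported; the substrate is "
                    "re-randomized on every reset.")
            warnings.warn("reset(seed=...) ignored; env stays stochastic.")
        return self._internal_reset()

    def _internal_reset(self):
        # Stand-in for the underlying dm_env-style reset.
        return "timestep"  # placeholder for a real dm_env TimeStep
```

Option 1 silently breaks reproducibility expectations, so the warning matters; option 2 fails loudly, which is friendlier to downstream API-compliance tests.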

@jagapiou
Member

jagapiou commented May 4, 2023

Sorry, an update here: it's not currently possible to change the substrate seed. builder takes an env_seed that allows this, but it is not currently wired in.

@dimonenka
Contributor

I have locally updated the rllib example to use ray==2.5.1 instead of 2.0.0, should I submit a pull request?

@jagapiou
Member

Amazing, yes please!

@elliottower
Contributor Author

elliottower commented Jul 24, 2023

Just fwiw, when I get a chance I'm planning to do a PR updating the pettingzoo example to use shimmy and current pettingzoo. I've created official SB3 tutorials for pettingzoo in the most recent release, so it would be cool to show this on our docs site as another example using SB3.

Also, once RLlib merges my PR (they have some other blockers, but it should be soon), RLlib will be fully compatible with current gymnasium/pettingzoo, so users will be able to use RLlib through pettingzoo if they want, as well as directly like in these tutorials.

We have a full CleanRL training script for Atari games that I imagine could be similarly adapted to work on melting pot using a pettingzoo wrapper. I won't have time to fully adapt it, but when I at least update the pettingzoo tutorial I can mention something along the lines of "this enables you to use any sort of training framework, as shown in our tutorials" (we even have one using LangChain, though that's only suitable for games like chess; I'd imagine LLMs would do poorly on visual input like this).

@duenez
Collaborator

duenez commented Jul 24, 2023

Fantastic, let me know if there's anything on our side that's needed.

@jagapiou
Member

FYI: @dimonenka's PR updating rllib: #153

@itstyren

Hi, I found a minor typo in the file "example/pettingzoo/sb3_train.py" on line 86.

The environment name should be commons_harvest__open instead of commons_harvest_open. Could you please take a look? Thanks.

@elliottower
Contributor Author

Hey all, I've been planning it for a while, but I finally have time to make a PR updating this tutorial now that PettingZoo has official SB3 tutorials. I was also thinking about other possible tutorial options, such as AgileRL, which now supports PettingZoo directly.

It would also be nice to demonstrate the Shimmy conversion wrapper I made, which was based on this example but is more fully featured: it's tested against every underlying environment in our CI, supports serialization via pickle, etc. Maybe @duenez has thoughts on the best way to proceed?

@benbind

benbind commented May 17, 2024

It looks like I'm late to the party! Is it now possible to make commons_harvest__open deterministic? The "wired in" hyperlink seemed like it was useful, but it looks like that page has been taken down at this point.

@jzleibo
Collaborator

jzleibo commented May 17, 2024

It should be possible to make the substrates deterministic. You can see how the seed is set here:

if env_seed is None:
  # Select a long seed different than zero.
  env_seed = random.randint(1, _MAX_SEED)
env_seeds = (seed % (_MAX_SEED + 1) for seed in itertools.count(env_seed))

def build_environment():
  seed = next(env_seeds)
  lab2d_settings_dict["env_seed"] = str(seed)  # Sets the Lua seed.
  env_raw = dmlab2d.Lab2d(_DMLAB2D_ROOT, lab2d_settings_dict)
  observation_names = env_raw.observation_names()
  return dmlab2d.Environment(
      env=env_raw,
      observation_names=observation_names,
      seed=seed)

It should work if you just change that logic to pass a fixed seed.
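The determinism claim can be illustrated with a stdlib-only mimic of the seed generator above (no dmlab2d involved; the _MAX_SEED bound here is an assumption mirroring the snippet's constant):

```python
import itertools

_MAX_SEED = 2**31 - 1  # assumed bound; stands in for the module's _MAX_SEED


def make_seed_sequence(env_seed):
    # Same scheme as the quoted code: consecutive integers starting at
    # env_seed, wrapped into the valid seed range.
    return (seed % (_MAX_SEED + 1) for seed in itertools.count(env_seed))


# With env_seed fixed instead of drawn from random.randint, two separate
# runs draw identical seed sequences, so every build_environment() call
# in one run matches the corresponding call in the other.
gen_a = make_seed_sequence(42)
gen_b = make_seed_sequence(42)
run_a = [next(gen_a) for _ in range(3)]
run_b = [next(gen_b) for _ in range(3)]
# run_a == run_b == [42, 43, 44]
```

Note that each successive environment build still gets a different seed (42, 43, 44, ...); fixing env_seed only makes the sequence reproducible across runs, not identical across resets.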

@benbind

benbind commented May 17, 2024

Even with overwriting a seed, the agents spawn in different areas:
[screenshots: two runs showing agents spawning in different positions]

@duenez
Collaborator

duenez commented May 17, 2024

You might need to use a custom builder: the raw builder in builder.py (https://github.com/google-deepmind/meltingpot/blob/main/meltingpot/utils/substrates/builder.py) does allow a seed, and it does lead to deterministic results for me. But the normal substrate.build() does not, by design: https://github.com/google-deepmind/meltingpot/blob/main/meltingpot/substrate.py

@duenez
Collaborator

duenez commented May 17, 2024

Actually, that's where you were doing this. Don't use the zero seed, because that means to resample a seed; use a fixed, non-zero one (like 42 :) ). If that fails, try setting your Python random seed too, as some objects are built on the Python side, though I don't think avatars would be.

@benbind

benbind commented May 17, 2024

Still no dice:
[screenshot: agents still spawn in different positions]
In builder.py I'm directly setting seed = 42, as well as resetting random.seed(42) each time I call random.random().
