Error running example "self_play_train.py" #10
UPDATE: I tried narrowing down the issue and found that when converting the dm_env environment to a Gym environment, the observation dict has issues for the following keys. Env name: "allelopathic_harvest"
Basically, this happens only for dtype np.float64, since NumPy raises an overflow error when uniformly sampling over the full min-to-max range of np.float64. I worked around it by updating the line to `return spaces.Box(info.min/10, info.max/10, spec.shape, spec.dtype)`. I'm not sure whether this is the right fix, or whether it affects the environment in any way. After this workaround I ran into another error stating
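The overflow can be reproduced with plain NumPy, independent of Gym (a minimal sketch): uniformly sampling over the full float64 range is ill-defined because the range `hi - lo` itself overflows, which is the failure mode described above.

```python
import numpy as np

# Uniform sampling over the full float64 range: (hi - lo) overflows to inf,
# so NumPy either raises OverflowError (newer versions) or produces a
# non-finite sample (older versions).
lo = np.finfo(np.float64).min
hi = np.finfo(np.float64).max

try:
    value = np.random.uniform(lo, hi)
    failed = not np.isfinite(value)
except OverflowError:
    failed = True

# Either way, the full-range draw never yields a usable finite sample.
assert failed
```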
This seems to be because RLlib has no built-in model config for the observation space of the example environment.
Previously we used the limits of precision, but this caused issues (#10). Since the bounds were finite, spaces.Box would attempt uniform sampling. By passing np.inf, spaces.Box will instead sample using a Normal/Exponential distribution, as desired.
PiperOrigin-RevId: 421009922
Change-Id: Ie24571a8ab72fc2564b302ff0e9dbad9e5856a9e
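The behaviour the commit message relies on can be sketched in plain NumPy. This is a simplified, illustrative reimplementation of how `gym.spaces.Box.sample` treats bounds (the real implementation handles per-element bounds on arrays), not the library code itself:

```python
import numpy as np

def sample_scalar_box(low, high, rng=None):
    """Simplified sketch of gym.spaces.Box.sample() bound handling."""
    rng = rng or np.random.default_rng()
    bounded_below = np.isfinite(low)
    bounded_above = np.isfinite(high)
    if bounded_below and bounded_above:
        return rng.uniform(low, high)      # finite bounds: uniform sampling
    if not bounded_below and not bounded_above:
        return rng.standard_normal()       # unbounded: normal distribution
    if bounded_below:
        return low + rng.exponential()     # lower bound only: shifted exponential
    return high - rng.exponential()        # upper bound only: mirrored exponential

# With np.inf bounds the uniform (overflowing) path is never reached:
assert np.isfinite(sample_scalar_box(-np.inf, np.inf))
```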
Heya, thanks for raising these issues. Regarding your first point: note that these specs aren't precise. I'll leave this open so your second issue can be dealt with.
Hi, how did you set the
Hi both! I just submitted a fix for this: 6a3a5c2. The example should work now. Note, however, that in order to please RLlib I had to pick a different conv-net configuration from the one we used to run the baselines in the paper. See the explanation here. You might want to experiment with different sizes.
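For anyone wanting to experiment as suggested, RLlib lets you specify the conv net explicitly through its model config instead of relying on the built-in defaults. A hypothetical sketch (the `conv_filters` key is real in Ray 1.x; the layer values here are made up, not the ones from the fix):

```python
# Hypothetical model config telling RLlib how to build a conv net for a
# non-default observation shape. Each entry is
# [out_channels, kernel_size, stride]; these numbers are illustrative only.
config = {
    "model": {
        "conv_filters": [
            [16, [8, 8], 8],
            [128, [11, 11], 1],
        ],
    },
}
```

This dict would typically be merged into the trainer config passed to RLlib; the right filter sizes depend on the environment's observation shape.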
Dear Prof Leibo, thanks for your quick response. Could you please also specify the versions of Python, Ray and TensorFlow used in the repo? My laptop is a MacBook, and my versions of Python, Ray and TensorFlow are 3.9, 1.10.0 and 2.8 respectively. However, when running the example code, it returns the following error:
I guess there might be a version mismatch somewhere, as Ray keeps changing between releases. I would appreciate it if you could specify the versions of all the dependencies.
Hmm, I've never seen the error you got there. I tested it on a virtual machine with Python 3.7.12 and Ray 1.10.0. It's not clear to me which version of TensorFlow I had there; it would be whatever version came along when I installed Ray.
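For anyone pinning their environment, the combination reported to work above can be captured in a requirements fragment. A sketch: only the Ray version is confirmed in this thread; TensorFlow would be whatever `pip install` pulls in as a dependency.

```
# Interpreter reported to work: Python 3.7.12
ray[rllib]==1.10.0
```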
Dear Prof Leibo, thank you for the clarification and the response. I guess the issue came from TF, which caused the error in Ray. After changing line 434 of ray/rllib/models/modelv2.py to the following, the example code works now:

```python
# here the workaround is simply adding `or v is None`
v if (isinstance(v, int) or v is None) else v.value for v in obs.shape[:-1]
```

Thank you.
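The guard can be exercised in isolation. In TF, the leading (batch) dimension of an observation shape is often `None`, which is neither an `int` nor an object with a `.value` attribute, so it must be passed through unchanged. A minimal sketch with a made-up shape:

```python
# Hypothetical observation shape: unknown batch dim, then an 11x11 RGB frame.
shape = (None, 11, 11, 3)

# The patched comprehension with the `or v is None` guard: ints and None pass
# through unchanged, while TF1 Dimension-like objects would be unwrapped
# via .value.
dims = [v if (isinstance(v, int) or v is None) else v.value
        for v in shape[:-1]]

print(dims)  # [None, 11, 11]
```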
I'm getting the following error when I try to run the given example code:
Environment details: