Skip to content
This repository has been archived by the owner on Nov 1, 2021. It is now read-only.

API to use rlenvs (lua environments) #28 #29

Closed
wants to merge 41 commits into from
Closed

API to use rlenvs (lua environments) #28 #29

wants to merge 41 commits into from

Conversation

SeanNaren
Copy link

@SeanNaren SeanNaren commented Sep 29, 2016

Work in progress but base files to start implementing rlenvs into twrl! Current tasks:

  • Modify env_action_space_info and env_observation_space_info to return in the same format as openai-gym.
  • Add support to switch between openai-gym and rlenvs.
  • Unit tests for api
  • Add working example in /examples folder showing interaction with a Lua rlenv

@korymath I've added you as a contributor to the fork, feel free to make changes yourself! Again thanks for your work so far :)

@korymath
Copy link
Contributor

Great news! Loving the look of how this is coming along...

@Kaixhin
Copy link

Kaixhin commented Sep 30, 2016

Thanks for handling this - if it makes sense to change part of rlenvs instead, please let me know/send a PR. AFAIK Atari is the only major repo that uses rlenvs, so changes should be fine. Worst case we can make two rockspecs to handle API changes.

The API is also a bit rough, so using good ideas from gym would be helpful in the long run.

@SeanNaren
Copy link
Author

Hey @Kaixhin thanks for your library, glad I could help :) I've made some more changes here and there, will continue to plug along! One thing that's a bit iffy right now is the best way to support the displaying of the game which I think will need us to use qlua rather than th. Any suggestions?

@Kaixhin
Copy link

Kaixhin commented Sep 30, 2016

@SeanNaren Sorry had a think about this and can't come up with a much better solution. image.display is a little bit restrictive, but works really well for the cases most people care about e.g. ALE or GridWorld.

@korymath
Copy link
Contributor

I would argue that displaying of the games is important, but not necessarily critical, unless a visual interpretation of the game is necessary (for representation or human interaction).

As well, I really like the way that the gym wraps up their action/observation specifications. Named entities in dictionaries (or tables in Lua) helps to avoid mysterious indexing.

@Kaixhin
Copy link

Kaixhin commented Sep 30, 2016

@korymath Agreed on the way they do their spec. I'm making both of you collaborators on rlenvs - do you think you'd be able to overhaul the API to better match gym? I've got a copy of rlenvs-scm-1.rockspec pointing to a v1 branch, whilst rlenvs-scm-2.rockspec is pointing to master. You can either develop on a different branch or handle it via a PR.

As popular as gym is, rlenvs still serves a purpose for baseline RL tests for the Torch community. Rather than leaving it as it is, I think now with twrl it's worth rethinking the API. Any ideas for displays are welcome too.

@SeanNaren
Copy link
Author

@Kaixhin this is a cool idea, a lot of the wrapper seems to be just manipulating the output of rlenvs into the right format. Don't see why it couldn't just be done inside rlenvs! I'll open up a branch and get to work, thanks a tonne!

@korymath
Copy link
Contributor

Much appreciated!

On 30 September 2016 at 15:24, Sean Naren notifications@github.com wrote:

@Kaixhin https://github.com/Kaixhin this is a cool idea, a lot of the
wrapper seems to be just manipulating the output of rlenvs into the right
format. Don't see why it couldn't just be done inside rlenvs! I'll open
up a branch and get to work, thanks a tonne!


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAK3s2icYGVTqKIHoeAKI2FUwtKRBhK9ks5qvX36gaJpZM4KJx0L
.


Kory Mathewson

@SeanNaren
Copy link
Author

The rlenvs conversions are nearing an end! A few kinks to work out (maxSteps of environments, render support if any) then I'll get the unit tests done. Also will try to get policy gradients to work with Catch, that would be nice :)

@korymath
Copy link
Contributor

This is fantastic work, and a great update. Thank you for the commitment
and development thus far.

On 21 October 2016 at 08:24, Sean Naren notifications@github.com wrote:

The rlenvs conversions are nearing an end! A few kinks to work out
(maxSteps of environments, render support if any) then I'll get the unit
tests done. Also will try to get policy gradients to work with Catch, that
would be nice :)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAK3s2lDh_-NxCEdNRWV-TF5-FwvjeVKks5q2MsogaJpZM4KJx0L
.


Kory Mathewson

@SeanNaren
Copy link
Author

Thanks @korymath! Need some advice on how to approach this; for the rendering to work for rlenvs we need to use qlua instead of th. qlua is installed as an optional package in Torch so most people I assume will have it installed. How would you suggest approaching this?

@korymath
Copy link
Contributor

One way to handle it is a branch?
Alternatively, it can be optionally installed if individuals want to use
it. I think that we do not want to make it a requirement, but rather to
make it optional, for if you want to use pure Lua solutions.

K

On 26 October 2016 at 03:35, Sean Naren notifications@github.com wrote:

Thanks @korymath https://github.com/korymath! Need some advice on how
to approach this; for the rendering to work for rlenvs we need to use qlua
instead of th. qlua is installed as an optional
https://github.com/torch/distro/blob/master/install.sh#L126 package in
Torch so most people I assume will have it installed. How would you suggest
approaching this?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAK3s7d5KPhyYgZAcRR6cTR7P72k-tNRks5q3x7dgaJpZM4KJx0L
.


Kory Mathewson

@SeanNaren
Copy link
Author

I'll have a think as to the best way to implement this, I think a branch might be overkill for adding this functionality! I'm going to start adding unit tests to the rlenvs package first, and I think that might be enough coverage since the client wrapper is a very thin layer over the top of this!

Also started messing with a Catch example bash script, been having some difficulties to prevent it exploding and getting stuck on a certain action, but will keep plugging along!

@korymath
Copy link
Contributor

korymath commented Nov 1, 2016

I want to make sure that this software can be deployed on a server sans any
sort of visualization. So, I agree with you, and want to make sure that
environments are not required to render to run.

I will try the catch example if you provide a run code?

On 1 November 2016 at 08:14, Sean Naren notifications@github.com wrote:

I'll have a think as to the best way to implement this, I think a branch
might be overkill for adding this functionality! I'm going to start adding
unit tests to the rlenvs package first, and I think that might be enough
coverage since the client wrapper is a very thin layer over the top of
this!

Also started messing with a Catch example bash script, been having some
difficulties to prevent it exploding and getting stuck on a certain action,
but will keep plugging along!


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAK3s2yjZurPISiSYVbRMX_PBdvcSbZNks5q50kpgaJpZM4KJx0L
.


Kory Mathewson

@SeanNaren
Copy link
Author

SeanNaren commented Nov 2, 2016

I've added the catch example script, if you want to render switch the th call with qlua for now :)

EDIT: This should probably be a new issue, but I've noticed strange behaviour with -verboseUpdate param since it defaults to false. The other boolean variables return strings of true or false, however this returns an actual boolean. This is probably also the case for any variables that default to true!

@korymath
Copy link
Contributor

korymath commented Nov 2, 2016

Nice catch on the default value... perhaps you can open an issue to flag that?

@SeanNaren
Copy link
Author

Little update, just waiting on a code check from Kai and then the rlenvs integration portion is done! Then I'll finalise integration into torch-twrl!

@korymath
Copy link
Contributor

korymath commented Jan 7, 2017

Sounds good on this. Keeping an eye on it now that I am back at it.

@Kaixhin
Copy link

Kaixhin commented Jan 7, 2017

Just checking - the PR for rlenvs v2 is finished, so nothing more to do in that repo?

@SeanNaren
Copy link
Author

@Kaixhin yep should be! I'll get back to this next week, I think priority is getting the catch example working with policy gradients :)

@korymath
Copy link
Contributor

korymath commented Jan 7, 2017 via email

@korymath
Copy link
Contributor

Are we just about set on the @SeanNaren? How can I help?

@SeanNaren
Copy link
Author

@korymath sorry for the late response, the last thing I wanted to do was get the rlenvs-catch example working, having some difficulties with that...

@SeanNaren
Copy link
Author

Hey @korymath, thought it be nice to wrap this up :) Opening a new PR!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants