API to use rlenvs (lua environments) #28 #29

SeanNaren · 2016-09-29T09:35:48Z

Work in progress but base files to start implementing rlenvs into twrl! Current tasks:

Modify env_action_space_info and env_observation_space_info to return in the same format as openai-gym.
Add support to switch between openai-gym and rlenvs.
Unit tests for api
Add working example in /examples folder showing interaction with a Lua rlenv

@korymath I've added you as a contributor to the fork, feel free to make changes yourself! Again thanks for your work so far :)

korymath · 2016-09-29T15:13:04Z

Great news! Loving the look of how this is coming along...

Kaixhin · 2016-09-30T15:51:13Z

Thanks for handling this - if it makes sense to change part of rlenvs instead, please let me know/send a PR. AFAIK Atari is the only major repo that uses rlenvs, so changes should be fine. Worst case we can make two rockspecs to handle API changes.

The API is also a bit rough, so using good ideas from gym would be helpful in the long run.

SeanNaren · 2016-09-30T16:30:08Z

Hey @Kaixhin thanks for your library, glad I could help :) I've made some more changes here and there, will continue to plug along! One thing that's a bit iffy right now is the best way to support the displaying of the game which I think will need us to use qlua rather than th. Any suggestions?

Kaixhin · 2016-09-30T16:48:57Z

@SeanNaren Sorry had a think about this and can't come up with a much better solution. image.display is a little bit restrictive, but works really well for the cases most people care about e.g. ALE or GridWorld.

korymath · 2016-09-30T21:04:43Z

I would argue that displaying of the games is important, but not necessarily critical, unless a visual interpretation of the game is necessary (for representation or human interaction).

As well, I really like the way that gym wraps up its action/observation specifications. Named entities in dictionaries (or tables in Lua) help to avoid mysterious indexing.
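
To illustrate the point about named entities, here is a minimal sketch (not code from this PR) contrasting a positional space description with a named one. It is written in Python because gym is Python; the Lua equivalent would use a table with string keys. The field names `name`, `shape`, `low` and `high`, and both helper functions, are assumptions chosen for illustration.

```python
# Hypothetical space descriptions; the field names are illustrative assumptions.
positional_spec = ["Box", [2], [-1.0, -1.0], [1.0, 1.0]]

named_spec = {
    "name": "Box",
    "shape": [2],
    "low": [-1.0, -1.0],
    "high": [1.0, 1.0],
}


def upper_bounds_positional(spec):
    # The reader has to know that index 3 holds the upper bounds.
    return spec[3]


def upper_bounds_named(spec):
    # The key says what the value is.
    return spec["high"]
```

Both helpers return the same bounds, but only the named version stays readable if a field is later added or reordered.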

Kaixhin · 2016-09-30T21:20:04Z

@korymath Agreed on the way they do their spec. I'm making both of you collaborators on rlenvs - do you think you'd be able to overhaul the API to better match gym? I've got a copy of rlenvs-scm-1.rockspec pointing to a v1 branch, whilst rlenvs-scm-2.rockspec is pointing to master. You can either develop on a different branch or handle it via a PR.

As popular as gym is, rlenvs still serves a purpose for baseline RL tests for the Torch community. Rather than leaving it as it is, I think now with twrl it's worth rethinking the API. Any ideas for displays are welcome too.

SeanNaren · 2016-09-30T21:24:10Z

@Kaixhin this is a cool idea, a lot of the wrapper seems to be just manipulating the output of rlenvs into the right format. Don't see why it couldn't just be done inside rlenvs! I'll open up a branch and get to work, thanks a tonne!

korymath · 2016-09-30T21:25:31Z

Much appreciated!

SeanNaren · 2016-10-21T14:24:40Z

The rlenvs conversions are nearing an end! A few kinks to work out (maxSteps of environments, render support if any) then I'll get the unit tests done. Also will try to get policy gradients to work with Catch, that would be nice :)

korymath · 2016-10-25T18:26:30Z

This is fantastic work, and a great update. Thank you for the commitment and development thus far.

SeanNaren · 2016-10-26T09:35:25Z

Thanks @korymath! Need some advice on how to approach this; for the rendering to work for rlenvs we need to use qlua instead of th. qlua is installed as an optional package in Torch so most people I assume will have it installed. How would you suggest approaching this?

korymath · 2016-10-31T17:09:25Z

One way to handle it is a branch?
Alternatively, it can be optionally installed if individuals want to use it. I think that we do not want to make it a requirement, but rather to keep it optional, in case you want to use pure Lua solutions.

SeanNaren · 2016-11-01T14:14:01Z

I'll have a think as to the best way to implement this, I think a branch might be overkill for adding this functionality! I'm going to start adding unit tests to the rlenvs package first, and I think that might be enough coverage since the client wrapper is a very thin layer over the top of this!

Also started messing with a Catch example bash script. I've been having some difficulty keeping it from exploding and getting stuck on a certain action, but will keep plugging along!

korymath · 2016-11-01T17:39:08Z

I want to make sure that this software can be deployed on a server sans any sort of visualization. So, I agree with you, and want to make sure that environments are not required to render to run.

I will try the catch example if you provide the run code?

SeanNaren · 2016-11-02T10:40:43Z

I've added the catch example script, if you want to render switch the th call with qlua for now :)

EDIT: This should probably be a new issue, but I've noticed strange behaviour with -verboseUpdate param since it defaults to false. The other boolean variables return strings of true or false, however this returns an actual boolean. This is probably also the case for any variables that default to true!
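
One way to guard against this kind of mismatch is to normalize the value before using it. The sketch below is not the fix applied in this repository; it is a Python illustration of the idea (the actual code is Lua), and the helper name `to_bool` is an assumption. It accepts either a real boolean or the strings "true"/"false" and rejects anything else.

```python
def to_bool(value):
    """Normalize a flag that may arrive as a bool or as the string 'true'/'false'."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text == "true":
            return True
        if text == "false":
            return False
    # Anything else is treated as a configuration error rather than guessed at.
    raise ValueError(f"not a boolean flag value: {value!r}")
```

Raising on unknown input, rather than falling back to a default, keeps a mistyped flag from silently changing behaviour.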

korymath · 2016-11-02T18:05:37Z

Nice catch on the default value... perhaps you can open an issue to flag that?

SeanNaren · 2016-12-05T11:24:26Z

Little update, just waiting on a code check from Kai and then the rlenvs integration portion is done! Then I'll finalise integration into torch-twrl!

korymath · 2017-01-07T19:56:42Z

Sounds good on this. Keeping an eye on it now that I am back at it.

Kaixhin · 2017-01-07T20:21:29Z

Just checking - the PR for rlenvs v2 is finished, so nothing more to do in that repo?

SeanNaren · 2017-01-07T20:28:12Z

@Kaixhin yep should be! I'll get back to this next week, I think priority is getting the catch example working with policy gradients :)

korymath · 2017-01-07T20:39:03Z

Agree. Nice work on this, and happy to get back at it after shaking off holiday dust.

korymath · 2017-01-30T18:25:34Z

Are we just about set on this, @SeanNaren? How can I help?

SeanNaren · 2017-02-01T10:01:20Z

@korymath sorry for the late response, the last thing I wanted to do was get the rlenvs-catch example working, having some difficulties with that...

SeanNaren · 2017-05-27T15:54:56Z

Hey @korymath, thought it'd be nice to wrap this up :) Opening a new PR!

SeanNaren added 3 commits September 29, 2016 10:31

Base api for rlenvs

3734f2f

Modify to select the max value of the upper/lower bound

b5c3d5e

Transformation of specs into correct format

2db64ab

korymath mentioned this pull request Sep 29, 2016

Interaction with Lua/Torch specific environments #28

Closed

korymath and others added 3 commits September 29, 2016 17:06

updating the run code directory structure, and restructure for parameter sweeps

fbf8f39

remove old lua paramtersweep, use the shell based one instead

11c5d13

More changes to support rlenvs through run script

37b373c

Support for different envs

efe4e1e

Kory Mathewson and others added 4 commits September 30, 2016 11:16

sweep issues

655572f

sweep code fix for Python/python case-sensitive process kill

8e6fd27

remove env_close function, which was buggering things up

3a0829d

adding log directory

0fb9ad0

shell sweep script

ef4ae51

SeanNaren mentioned this pull request Oct 1, 2016

Modify envs to be compatible with twrl Kaixhin/rlenvs#8

Closed

korymath added 8 commits October 3, 2016 19:54

nicer sweep run script

5e486fb

fix verbose output for boolean or string

6e82e25

more sweep testing

1f8517a

adding testing script, updated submodule

d2c51b7

adding modified submodule

e37e8c2

adding a debug stacktrace to the proctected run call, working to fix tilecoding

5841704

more changes tuning tdLambda

1ed82d8

adding note about statescaling

dc2ebcc

Updated rlenvs api calls, added torch requires for qlua support

c85d630

Use render method exposed in rlenvs

1e141e2

Added rlenvs catch example, added zoom option for render

4b1f344

SeanNaren added 9 commits January 12, 2017 14:29

Base api for rlenvs

7e6b9a2

Modify to select the max value of the upper/lower bound

1004a33

Transformation of specs into correct format

9d57a6c

More changes to support rlenvs through run script

d69dafd

Support for different envs

92019cf

Updated rlenvs api calls, added torch requires for qlua support

44a6fe9

Use render method exposed in rlenvs

5f40801

Added rlenvs catch example, added zoom option for render

c759952

Merge branch 'master' of https://github.com/SeanNaren/torch-twrl

83e84ad

SeanNaren closed this May 27, 2017

SeanNaren mentioned this pull request May 27, 2017

Added rlenvs integration #36

Closed
