No more Q states #30
Conversation
Oh, I also rearranged the main setup logic for Teacher so that it's more likely to immediately crash if something doesn't work, rather than, say, do rollouts and then crash.
Hmmm, https://www.google.com/search?q="q+state"+mdp However, I think you're right that keeping the actions and observations explicitly separated until they are fed into the neural net is an improvement. It should make the system more accessible to newcomers. Have you tested that this PR doesn't cause any performance regressions on MuJoCo?
rl_teacher/teach.py (outdated)

    segement_alt_act = self.segment_alt_act_placeholder

    # A vanilla multi-layer perceptron maps a (state, action) pair to a reward (Q-value)
    mlp = FullyConnectedMLP(self.obs_shape, self.act_shape)
Good call moving mlp to a local variable 👌
I haven't tested on MuJoCo, as my license expired and I wasn't able to renew. :(
Made some small changes and am running regression tests currently.
I've never seen a "Q-State" outside of Teacher. Q functions take (state, action) pairs. Storing the concatenation of states and actions isn't particularly elegant and only saves us a tiny amount of computation.
Furthermore, concatenating states and actions only really works when they're the same rank. For Atari environments or other non-MuJoCo environments this will not always be the case. We can extend support for more environments by pulling them apart and adding some checks for environment dimensionality (see the sketch below).
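To make the rank argument concrete, here is a minimal sketch of the kind of per-environment dimensionality check that keeping observations and actions separate would allow. This is not code from the PR; check_env_shapes and the example shapes are hypothetical and purely illustrative.

    def check_env_shapes(obs_shape, act_shape):
        """Hypothetical check: decide how to combine obs and act based on their ranks."""
        if len(obs_shape) == 1 and len(act_shape) == 1:
            # MuJoCo-style: both are flat vectors, so a plain MLP over the pair works
            # (this is the only case where concatenating into a "Q-state" makes sense).
            return "mlp"
        if len(obs_shape) == 3 and len(act_shape) == 1:
            # Atari-style: the observation is an image (H, W, C) while the action is a
            # flat vector, so concatenation is meaningless; the observation needs its
            # own (e.g. convolutional) encoder before being combined with the action.
            return "conv_then_combine"
        raise ValueError("Unsupported shapes: obs %s, act %s" % (obs_shape, act_shape))

    # Example usage (shapes are illustrative, not taken from any real environment config):
    print(check_env_shapes((11,), (3,)))        # MuJoCo-like vectors -> "mlp"
    print(check_env_shapes((84, 84, 4), (6,)))  # Atari-like image obs -> "conv_then_combine"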