[rllib] Refactor pytorch custom model support #3634

ericl · 2018-12-25T13:05:39Z

What do these changes do?

Clean up the PyTorch model API to support RNNs and Dict / Tuple spaces
Unify the QMIX RNN model with the model catalog

I expect we'll have to make more changes (and add more tests) as we implement PyTorch support more fully; this is just an initial cleanup to better unify QMIX with pytorch A3C.

#3365

ericl · 2018-12-25T13:09:03Z

python/ray/rllib/agents/qmix/qmix_policy_graph.py

@@ -292,8 +301,8 @@ def to_batches(arr):
    @override(PolicyGraph)
    def get_initial_state(self):
        return [
-            self.model.init_hidden().numpy().squeeze()
-            for _ in range(self.n_agents)
+            s.expand([self.n_agents, -1]).numpy()


The main change here is that we now return a list of arrays of shape [num_agents, h_size_i], one per state tensor i, rather than one [h_size] array per agent as before (which was kind of unnatural and didn't generalize well to multiple state tensors).

Relatedly, I'm regretting allowing multiple hidden state tensors; we probably should have just required them to be fused into one element.
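
A minimal sketch of the state layout change described above, using NumPy as a stand-in for the torch tensors in the diff. The values `n_agents = 3` and `h_sizes = [4, 6]` are made up for illustration, `model_state` is a hypothetical placeholder for whatever the model returns as its initial state, and `np.broadcast_to` plays the role of `expand([self.n_agents, -1])`. This is not the actual policy graph code.

```python
import numpy as np

n_agents = 3        # hypothetical number of agents
h_sizes = [4, 6]    # hypothetical sizes of two state tensors

# Hypothetical initial state tensors, each assumed to have shape [1, h_size_i].
model_state = [np.zeros((1, h)) for h in h_sizes]

# Old layout: one [h_size] array per agent (a single state tensor only).
old_state = [model_state[0].squeeze() for _ in range(n_agents)]

# New layout: one [n_agents, h_size_i] array per state tensor.
new_state = [np.broadcast_to(s, (n_agents, s.shape[-1])) for s in model_state]

print([s.shape for s in old_state])  # [(4,), (4,), (4,)]
print([s.shape for s in new_state])  # [(3, 4), (3, 6)]
```

With the new layout the outer list length tracks the number of state tensors, so adding a second state tensor does not change how agents are indexed.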

AmplabJenkins · 2018-12-25T13:35:26Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10386/
Test FAILed.

AmplabJenkins · 2018-12-25T15:05:18Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10387/
Test FAILed.

AmplabJenkins · 2018-12-25T15:53:27Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10390/
Test FAILed.

AmplabJenkins · 2018-12-25T16:18:55Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10388/
Test FAILed.

AmplabJenkins · 2018-12-25T16:51:57Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10391/
Test FAILed.

AmplabJenkins · 2018-12-26T13:19:21Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10415/
Test FAILed.

AmplabJenkins · 2018-12-27T14:22:40Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10453/
Test FAILed.

AmplabJenkins · 2018-12-28T05:40:06Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10468/
Test PASSed.

python/ray/rllib/agents/a3c/a3c_torch_policy_graph.py

AmplabJenkins · 2018-12-28T10:07:49Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10475/
Test FAILed.

AmplabJenkins · 2018-12-28T10:42:36Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10474/
Test PASSed.

AmplabJenkins · 2018-12-28T11:25:34Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10472/
Test FAILed.

richardliaw · 2019-01-03T03:44:08Z

python/ray/rllib/models/pytorch/fcnet.py


+    def __init__(self, obs_space, num_outputs, options):
+        TorchModel.__init__(self, obs_space, num_outputs, options)
+        hiddens = options.get("fcnet_hiddens")


Should this have a default?

Right now, if it's not provided, this fails at line 26, in which case you may as well make this options["fcnet_hiddens"].

It's filled in by the agent, so it should be OK either way.
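
To illustrate the point raised in this thread, here is a small sketch of how `options.get("fcnet_hiddens")` and `options["fcnet_hiddens"]` differ when the key is missing. The helper functions below are hypothetical and only mimic iterating over the hidden sizes; they are not the code in fcnet.py.

```python
def layer_sizes_with_get(options):
    # .get() returns None for a missing key, so the failure is deferred
    # until `hiddens` is first used.
    hiddens = options.get("fcnet_hiddens")
    return [size for size in hiddens]


def layer_sizes_with_index(options):
    # Indexing fails immediately with a KeyError that names the missing key.
    hiddens = options["fcnet_hiddens"]
    return [size for size in hiddens]


def error_name(fn, options):
    # Return the name of the exception raised by fn(options), or None.
    try:
        fn(options)
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(layer_sizes_with_get, {}))    # TypeError
print(error_name(layer_sizes_with_index, {}))  # KeyError
print(layer_sizes_with_get({"fcnet_hiddens": [256, 256]}))  # [256, 256]
```

Both variants behave the same when the agent fills in the key, which matches the reply above; they differ only in where and how a missing key surfaces.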

richardliaw · 2019-01-03T03:51:23Z

python/ray/rllib/models/pytorch/visionnet.py

-            obs: observations and features"""
-        res = self._convs(obs)
+    def _hidden_layers(self, obs):
+        res = self._convs(obs.permute(0, 3, 1, 2))


Can this be documented somewhere?
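
For context, a sketch of what the `permute(0, 3, 1, 2)` call does, assuming observations arrive channels-last as [batch, height, width, channels] and the conv layers expect channels-first input. NumPy's `transpose` stands in for the torch `permute` here, and the 2x5x6x3 shape is made up for illustration.

```python
import numpy as np

# Hypothetical batch of channels-last observations: [batch, height, width, channels].
obs = np.arange(2 * 5 * 6 * 3).reshape(2, 5, 6, 3)

# Reorder the axes to channels-first: [batch, channels, height, width].
obs_nchw = np.transpose(obs, (0, 3, 1, 2))

print(obs.shape)       # (2, 5, 6, 3)
print(obs_nchw.shape)  # (2, 3, 5, 6)

# The same pixel is addressed as obs[b, h, w, c] and obs_nchw[b, c, h, w].
b, h, w, c = 1, 4, 2, 0
print(obs[b, h, w, c] == obs_nchw[b, c, h, w])  # True
```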

richardliaw · 2019-01-03T04:01:37Z

python/ray/rllib/agents/qmix/qmix_policy_graph.py


-        return (TupleActions(list(actions.transpose([1, 0]))),
-                hiddens.transpose([1, 0, 2]), {})
+        return TupleActions(list(actions.transpose([1, 0]))), hiddens, {}


is the second [1, 0, 2] not needed anymore?

Yeah (see comment above), hiddens are handled differently now.

ericl added 6 commits December 25, 2018 00:00

wip

ef0028d

clean up

23bdb76

wip

961d092

add reg

e1f09b9

rnn

abf73b6

mask test

a05b513

ericl assigned richardliaw Dec 25, 2018

ericl commented Dec 25, 2018


ericl added 2 commits December 25, 2018 22:12

doc

a9b6e12

warn

a667375

ericl added 6 commits December 25, 2018 22:41

add torch test

d10180b

diff name

b5c139e

16

0938960

tensor

5771af5

fmt

61c88a6

state

674cd80

Merge remote-tracking branch 'upstream/master' into pytorch-models

d703da1

remove channel major entirely

9c0d31d

fix

c10643a

ericl added the tests-ok label (The tagger certifies test failures are unrelated and assumes personal liability.) Dec 28, 2018

doc

18c2426

richardliaw reviewed Dec 28, 2018

python/ray/rllib/agents/a3c/a3c_torch_policy_graph.py (outdated, resolved)

ericl added 4 commits December 28, 2018 17:31

doc

a389d68

doc fix

8603a40

tf

7a7548b

remove to numpy

164a7cc

richardliaw reviewed Jan 3, 2019


richardliaw approved these changes Jan 3, 2019


comment

05a5c1c

ericl merged commit 47d36d7 into ray-project:master Jan 3, 2019
