[RLlib] Slate-Q +GPU torch bug fix. #23464
Conversation
Branch: …eq_torch_gpu_fix
Looks fine to me. Did you get a chance to run it on prod?

Yes, this was tested on a GPU machine in production.
# action.shape: [B, S]
actions = train_batch[SampleBatch.ACTIONS]

observation = convert_to_torch_tensor(
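A minimal sketch of what this change boils down to, assuming RLlib's convert_to_torch_tensor helper (ray.rllib.utils.torch_utils), which recursively converts nested dicts of arrays to tensors on a given device; the loss skeleton around it is illustrative, not the literal diff:

from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.utils.torch_utils import convert_to_torch_tensor

def loss(policy, model, dist_class, train_batch):
    # Flat columns such as actions already arrive as torch tensors on the
    # correct device via the train batch's get-interceptor.
    # action.shape: [B, S] (batch size x slate size).
    actions = train_batch[SampleBatch.ACTIONS]
    # The nested dict obs (preprocessor is off for SlateQ) bypasses that
    # interceptor, so convert it to device tensors explicitly.
    observation = convert_to_torch_tensor(
        train_batch[SampleBatch.OBS], device=policy.device
    )
    ...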
Just a thought: should we fix this in a more generic way for dict obs spaces? Not sure if this is also happening to other agents and they just don't use strange-looking obs by default.
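A hypothetical sketch of what a more generic fix could look like, assuming dm-tree (already an RLlib dependency) to walk arbitrarily nested obs; the helper name to_device is made up for illustration:

import torch
import tree  # dm-tree

def to_device(column, device):
    # Hypothetical helper: walk any nested dict/tuple/list and move every
    # leaf (numpy array or torch tensor) onto the given device.
    return tree.map_structure(lambda x: torch.as_tensor(x).to(device), column)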
@@ -487,7 +487,7 @@ slateq-interest-evolution-recsim-env:
convert_to_discrete_action_space: false
seed: 0
The framework was changed from torch to tf. Do you want to change it back, or run both?
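One way to exercise both, sketched with Ray Tune's grid search; note the env name below is a placeholder, not the literal string the tuned example registers:

from ray import tune

tune.run(
    "SlateQ",
    config={
        # Grid over both frameworks so neither code path regresses.
        "framework": tune.grid_search(["torch", "tf"]),
        "env": "interest-evolution-recsim-env",  # placeholder env name
        "num_gpus": 1,
    },
)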
Why are these changes needed?

The observations in SlateQ's torch loss do not pass through the SampleBatch's get_interceptor and therefore never get converted to (GPU) torch tensors. This is probably due to the nested dict structure (preprocessor off for SlateQ) arriving in the loss function.
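A rough illustration of the mechanism described above (close to, but not literally, RLlib's internals): TorchPolicy installs a get-interceptor on the train batch so that columns are lazily converted to device tensors on access, which a nested dict column can effectively bypass.

import functools
from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.utils.torch_utils import convert_to_torch_tensor

def illustrate(policy, train_batch):
    # TorchPolicy sets this interceptor before calling the loss:
    train_batch.set_get_interceptor(
        functools.partial(convert_to_torch_tensor, device=policy.device)
    )
    # Flat columns then come back as device tensors on access:
    actions = train_batch[SampleBatch.ACTIONS]
    # A nested dict obs, however, may come back unconverted (raw numpy);
    # mixing that with GPU tensors inside the loss is what failed here.
    return actions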
Related issue number
Checks

I've run scripts/format.sh to lint the changes in this PR.