Closed
Labels: bug (Something isn't working)
Description
Describe the bug
With target_entropy="auto", SACLoss does not correctly infer the dimensionality of the action space.
To Reproduce
In SACLoss, passing the following spec as action_spec:
    BoundedContinuous(
        shape=torch.Size([2]),
        space=ContinuousBox(
            low=Tensor(shape=torch.Size([2]), device=cpu, dtype=torch.float32, contiguous=True),
            high=Tensor(shape=torch.Size([2]), device=cpu, dtype=torch.float32, contiguous=True)),
        device=cpu,
        dtype=torch.float32,
        domain=continuous)
results in target_entropy = -1 instead of the expected -2.
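This is caused by the following lines:
    if not isinstance(action_spec, Composite):
        action_spec = Composite({self.tensor_keys.action: action_spec})
After this wrapping, action_spec refers to the Composite, whose shape is torch.Size([]) here. The subsequent computation reads action_spec.shape rather than the shape of the wrapped action leaf, and an empty shape has numel() == 1, hence target_entropy = -1. The original paper states that target_entropy should be -dim(A), where A is the action space.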
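The arithmetic behind the wrong value can be reproduced without torchrl. The sketch below is only an illustration: shapes are modeled as plain tuples and `numel` is a hypothetical helper standing in for `torch.Size.numel()`, so none of these names come from the library itself.

```python
from math import prod


def numel(shape):
    """Number of elements for a shape tuple; stand-in for torch.Size.numel()."""
    return prod(shape)  # the empty product is 1


leaf_shape = (2,)     # shape of the BoundedContinuous action spec
composite_shape = ()  # shape of the Composite wrapper (no batch dimensions)

# Non-nested action key: the container shape is the Composite's own shape.
container_shape = composite_shape

# What the reported code computes: it slices the Composite's shape.
buggy = -float(numel(composite_shape[len(container_shape):]))

# What -dim(A) requires: slice the shape of the action leaf instead.
expected = -float(numel(leaf_shape[len(container_shape):]))

print(buggy, expected)  # -1.0 -2.0
```

Slicing an empty shape yields an empty shape, whose element count is 1, which is why the result is -1 regardless of the action size.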
def target_entropy(self):
    target_entropy = self._buffers.get("_target_entropy", None)
    if target_entropy is not None:
        return target_entropy
    target_entropy = self._target_entropy
    action_spec = self._action_spec
    actor_network = self.actor_network
    device = next(self.parameters()).device
    if target_entropy == "auto":
        action_spec = (
            action_spec
            if action_spec is not None
            else getattr(actor_network, "spec", None)
        )
        if action_spec is None:
            raise RuntimeError(
                "Cannot infer the dimensionality of the action. Consider providing "
                "the target entropy explicitly or provide the spec of the "
                "action tensor in the actor network."
            )
        if not isinstance(action_spec, Composite):
            action_spec = Composite({self.tensor_keys.action: action_spec})
        if (
            isinstance(self.tensor_keys.action, tuple)
            and len(self.tensor_keys.action) > 1
        ):
            action_container_shape = action_spec[self.tensor_keys.action[:-1]].shape
        else:
            action_container_shape = action_spec.shape
        target_entropy = -float(
            action_spec.shape[len(action_container_shape) :].numel()
        )
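One possible direction for a fix is to take the element count from the action leaf's shape after stripping the container's leading (batch) dimensions. The following is a minimal sketch under that assumption, not the change adopted upstream: `auto_target_entropy` is a hypothetical helper, and shapes are plain tuples rather than `torch.Size` objects. In terms of the snippet above, it would roughly correspond to reading `action_spec[self.tensor_keys.action].shape` in place of `action_spec.shape` in the final expression.

```python
from math import prod


def auto_target_entropy(leaf_shape, container_shape=()):
    """Return -dim(A), ignoring leading dims shared with the container."""
    leaf_shape = tuple(leaf_shape)
    container_shape = tuple(container_shape)
    if leaf_shape[:len(container_shape)] != container_shape:
        raise ValueError("leaf shape must start with the container shape")
    # Only the trailing dimensions describe the action itself.
    action_dims = leaf_shape[len(container_shape):]
    return -float(prod(action_dims))


print(auto_target_entropy((2,)))             # -2.0
print(auto_target_entropy((4, 2), (4,)))     # -2.0
print(auto_target_entropy((4, 3, 2), (4,)))  # -6.0
```

The second and third calls show that a batched container (leading dimension 4) does not change the result; only the trailing action dimensions count.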
Checklist
- I have checked that there is no similar issue in the repo (required)
- I have read the documentation (required)
- I have provided a minimal working example to reproduce the bug (required)
wlruys