[BUG] SACLoss target_entropy='auto' ignores action space dimensionality, always returns -1 #3291

@MathieuFonsProjects

Describe the bug

With target_entropy="auto", SACLoss does not correctly infer the dimensionality of the action space.

To Reproduce

In SACLoss, passing a

BoundedContinuous(
    shape=torch.Size([2]),
    space=ContinuousBox(
        low=Tensor(shape=torch.Size([2]), device=cpu, dtype=torch.float32, contiguous=True),
        high=Tensor(shape=torch.Size([2]), device=cpu, dtype=torch.float32, contiguous=True)),
    device=cpu,
    dtype=torch.float32,
    domain=continuous)

as action_spec results in target_entropy = -1 instead of the expected -2.
This is because of:

if not isinstance(action_spec, Composite):
    action_spec = Composite({self.tensor_keys.action: action_spec})

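After this wrapping, action_spec.shape is always torch.Size([]): it is the shape of the Composite container, not the shape of the action leaf inside it. The slice action_spec.shape[len(action_container_shape):] is therefore empty, its numel() is 1, and target_entropy comes out as -1 regardless of the action dimension. The original paper states that target_entropy should be -dim(A), where A is the action space, so a 2-dimensional action should give -2. The full property is: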
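The arithmetic behind the -1 can be reproduced without torchrl. The sketch below is only an illustration: plain tuples stand in for torch.Size, and the `numel` helper is a hypothetical stand-in for torch.Size.numel() (product of the dimensions, which is 1 for an empty shape).

```python
import math


def numel(shape):
    """Stand-in for torch.Size.numel(): product of dims, 1 for an empty shape."""
    return math.prod(shape)


leaf_shape = (2,)      # shape of the BoundedContinuous action spec from the report
composite_shape = ()   # shape of the Composite wrapper, i.e. torch.Size([])

# Non-nested action key: the container shape is the Composite's own shape.
container_ndim = len(composite_shape)

# What the current code computes: it slices the Composite's shape, which is empty.
buggy = -float(numel(composite_shape[container_ndim:]))

# What -dim(A) should give: the same slice applied to the action leaf's shape.
expected = -float(numel(leaf_shape[container_ndim:]))

print(buggy, expected)  # -1.0 -2.0
```
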

def target_entropy(self):
        target_entropy = self._buffers.get("_target_entropy", None)
        if target_entropy is not None:
            return target_entropy
        target_entropy = self._target_entropy
        action_spec = self._action_spec
        actor_network = self.actor_network
        device = next(self.parameters()).device
        if target_entropy == "auto":
            action_spec = (
                action_spec
                if action_spec is not None
                else getattr(actor_network, "spec", None)
            )
            if action_spec is None:
                raise RuntimeError(
                    "Cannot infer the dimensionality of the action. Consider providing "
                    "the target entropy explicitly or provide the spec of the "
                    "action tensor in the actor network."
                )
            if not isinstance(action_spec, Composite):
                action_spec = Composite({self.tensor_keys.action: action_spec})
            if (
                isinstance(self.tensor_keys.action, tuple)
                and len(self.tensor_keys.action) > 1
            ):
                action_container_shape = action_spec[self.tensor_keys.action[:-1]].shape
            else:
                action_container_shape = action_spec.shape
            target_entropy = -float(
                action_spec.shape[len(action_container_shape) :].numel()
            )

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have read the documentation (required)
  • I have provided a minimal working example to reproduce the bug (required)
