retain_grad() documentation misleading about non-leaf tensors

The documentation says:

"If you call retain_grad() on a non-leaf node, it results in a no-op."

This is misleading. Calling retain_grad() on a non-leaf tensor with requires_grad=True is not a no-op: it makes autograd keep that tensor's gradient, so its .grad is populated after backward().
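
A minimal sketch of this behavior, assuming a recent PyTorch release. The tensor names x, y and z are made up for illustration. y calls retain_grad() and z does not, so only y should end up with a populated .grad:

```python
import warnings

import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # leaf tensor
y = x * 2  # non-leaf, requires_grad=True
z = x * 3  # non-leaf, used as a control without retain_grad()

y.retain_grad()  # ask autograd to keep y's gradient
(y.sum() + z.sum()).backward()

print(y.is_leaf, y.retains_grad)  # y is a non-leaf that retains its grad
print(y.grad)                     # populated after backward()

# Reading .grad on a non-leaf that did not retain it emits a UserWarning,
# so silence it here just to inspect the value.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    z_grad = z.grad
print(z_grad)  # not populated for z
```

If retain_grad() really were a no-op on non-leaf tensors, y.grad and z.grad would behave the same here.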

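The cases where retain_grad() does not retain anything are:

- On a leaf tensor with requires_grad=True, it is a no-op (the leaf's .grad is populated anyway).

- On a tensor with requires_grad=False, it throws an error.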
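
For completeness, a rough sketch of these two cases, again assuming a recent PyTorch release. The names leaf and frozen are illustrative only, and the exact error text may differ between versions, so the sketch only captures it:

```python
import torch

# Case 1: leaf tensor with requires_grad=True. retain_grad() changes nothing;
# the leaf's .grad is populated by backward() regardless.
leaf = torch.tensor([1.0, 2.0], requires_grad=True)
leaf.retain_grad()
(leaf * 2).sum().backward()
print(leaf.is_leaf, leaf.grad)

# Case 2: tensor with requires_grad=False. retain_grad() raises.
frozen = torch.tensor([1.0, 2.0])
try:
    frozen.retain_grad()
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)
```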