[BUG]: Activation Checkpointing Failed Test with PyTorch 1.9 #718

@FrankLeeeee

Description

🐛 Describe the bug

When running the unit tests with PyTorch 1.9, the following exception occurs in tests/test_utils/test_activation_checkpointing.py.

(Screenshot of the exception traceback omitted.)

Reasons

This bug occurs because of an invalid use of ctx.save_for_backward. In a torch autograd function, ctx.save_for_backward must be given either inputs or intermediate values of the function. However, when activation_offload=True, the current implementation copies the tensors to CPU, creating new tensors which are neither inputs nor intermediate activations (i.e. they do not have gradient accumulator objects associated with them).

for i, arg in enumerate(args):
    if torch.is_tensor(arg):
        if activation_offload:
            tensor_inputs.append(copy_to_device(arg, 'cpu'))
        else:
            tensor_inputs.append(arg)
        ctx.tensor_indices.append(i)
        ctx.inputs.append(None)
    else:
        ctx.inputs.append(arg)

ctx.save_for_backward(*tensor_inputs)
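To illustrate the point above, a minimal sketch (using `Tensor.to` with `copy=True` as a stand-in for `copy_to_device`, which is Colossal-AI's own helper): the offloaded copy is a brand-new tensor that is disconnected from the autograd graph, unlike the original input.

```python
import torch

x = torch.randn(2, requires_grad=True)

# What offloading effectively does: create a fresh CPU copy outside autograd
with torch.no_grad():
    cpu_copy = x.to('cpu', copy=True)

assert x.is_leaf and x.requires_grad  # original input participates in the graph
assert not cpu_copy.requires_grad     # the copy is a brand-new tensor...
assert cpu_copy.grad_fn is None       # ...with no connection to autograd
```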

This will trigger the assertion statement inside PyTorch. (Screenshot of the assertion error omitted.)

Solution

Use ctx.tensor_inputs = tensor_inputs instead of ctx.save_for_backward(*tensor_inputs) when activation_offload=True, so that the offloaded CPU copies are stored as a plain attribute on ctx rather than going through save_for_backward.
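A minimal sketch of how the fix could look end to end (hypothetical class and attribute names; this is not the actual Colossal-AI implementation, just the pattern): when offloading, the CPU copy is stashed on ctx as a plain attribute and moved back in backward; otherwise the input goes through save_for_backward as usual.

```python
import torch


class CheckpointWithOffload(torch.autograd.Function):
    """Sketch: checkpointing with optional CPU offload of the input."""

    @staticmethod
    def forward(ctx, run_function, activation_offload, x):
        ctx.run_function = run_function
        ctx.activation_offload = activation_offload
        if activation_offload:
            # CPU copy is a new tensor, so bypass save_for_backward
            ctx.tensor_inputs = [x.detach().to('cpu')]
        else:
            # x is a real input, so save_for_backward is valid here
            ctx.save_for_backward(x)
        with torch.no_grad():
            return run_function(x)

    @staticmethod
    def backward(ctx, grad_output):
        if ctx.activation_offload:
            x = ctx.tensor_inputs[0].to(grad_output.device)
        else:
            (x,) = ctx.saved_tensors
        # Recompute the forward pass with grad enabled
        x = x.detach().requires_grad_(True)
        with torch.enable_grad():
            out = ctx.run_function(x)
        out.backward(grad_output)
        return None, None, x.grad
```

A quick check that gradients still flow with offloading enabled:

```python
x = torch.randn(4, requires_grad=True)
y = CheckpointWithOffload.apply(torch.square, True, x)
y.sum().backward()  # x.grad == 2 * x, as for y = x**2
```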

Environment

CUDA: 11.1
PyTorch: 1.9.1
