Improve debuggability of activation checkpointing #102241

Closed
soulitzer wants to merge 6 commits

soulitzer commented May 25, 2023

Stack from ghstack (oldest at bottom):

This PR improves the debuggability of checkpointing:

  • error messages are clearer and easier to understand
  • errors are now raised as CheckpointError, which subclasses RuntimeError (only a CheckpointError triggers the debug message, see below)
  • error checking is stricter by default:
    • shapes, dtypes, and devices are compared
    • an error is now also raised when more tensors are saved for backward during recompute
    • NOTE: checks are relaxed if it is detected that you are doing backward within forward
  • shape, dtype, and device checking can be disabled by passing determinism_check="none"
  • new debug flag: passing debug=True produces a more helpful error message
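
To make the stricter checks listed above concrete, here is a minimal pure-Python sketch of the comparison. It is not the code in this PR: `TensorMeta`, `SketchCheckpointError`, and `compare_saved` are hypothetical names, and the assumption that the count check still applies when determinism_check="none" is inferred from the description (only the shape, dtype, and device comparison is said to be disabled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorMeta:
    """Hypothetical stand-in for the metadata of one saved tensor."""
    shape: tuple
    dtype: str
    device: str


class SketchCheckpointError(RuntimeError):
    """Hypothetical stand-in for CheckpointError (a RuntimeError subclass)."""


def compare_saved(saved, recomputed, determinism_check="default"):
    # Assumption: a different number of saved tensors is an error
    # regardless of determinism_check.
    if len(saved) != len(recomputed):
        raise SketchCheckpointError(
            f"{len(saved)} tensors saved in forward, "
            f"{len(recomputed)} saved in recompute"
        )
    if determinism_check == "none":
        return  # shape/dtype/device comparison is disabled
    # Collect every position whose metadata differs.
    report = [
        f"tensor at position {i}:\n"
        f"saved metadata: {s}\n"
        f"recomputed metadata: {r}"
        for i, (s, r) in enumerate(zip(saved, recomputed))
        if s != r
    ]
    if report:
        raise SketchCheckpointError("\n".join(report))


saved = [TensorMeta((1,), "float32", "cpu"), TensorMeta((1,), "float32", "cpu")]
recomputed = [TensorMeta((1,), "float32", "cpu"), TensorMeta((2,), "float32", "cpu")]

try:
    compare_saved(saved, recomputed)
    message = None
except SketchCheckpointError as e:
    message = str(e)
print(message)
```

With the default mode the second tensor is reported because its shape differs; passing determinism_check="none" to the same call returns without raising.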

Note:

  • the C++ stack trace is only included on x86 Linux machines
  • when the C++ stack trace is included, the error message can be quite long. For a checkpointed function with 8 operators, the log was around 1300 lines! (should this be hidden behind a flag?)
Error message when debug=True (Python stack trace only):
torch.utils.checkpoint.CheckpointError: torch.utils.checkpoint: Recomputed values for the following tensors have different metadata than during the forward pass.
tensor at position 1:
saved metadata: {'shape': torch.Size([1]), 'dtype': torch.float32, 'device': device(type='cpu')}
recomputed metadata: {'shape': torch.Size([2]), 'dtype': torch.float32, 'device': device(type='cpu')}


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/local/pytorch1/test/test_autograd.py", line 5692, in test_checkpoint_detects_non_determinism
    out.backward()
  File "/local/pytorch1/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/local/pytorch1/torch/autograd/__init__.py", line 204, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/local/pytorch1/torch/utils/checkpoint.py", line 1065, in unpack_hook_with_error_cb
    frame.unpack_error_cb(e)
  File "/local/pytorch1/torch/utils/checkpoint.py", line 936, in unpack_error_cb
    raise CheckpointError(
torch.utils.checkpoint.CheckpointError:  An error happened while unpacking tensors; dumping logs of latest computation
because you passed `debug=True` to `torch.utils.checkpoint.checkpoint()`.
Scroll all the way down for guidance on how to navigate these logs.

+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|        1. Stack traces of the operators that ran in the original forward     |
+------------------------------------------------------------------------------+

$1: f32[1] = torch._ops.aten.sin.default($0)   (1 of 2 in original)

/local/pytorch1/test/test_autograd.py:5648:save_2_tensors
/local/pytorch1/test/test_autograd.py:5658:fn
/local/pytorch1/torch/utils/checkpoint.py:1177:_checkpoint_without_reentrant
/local/pytorch1/torch/utils/checkpoint.py:431:checkpoint
/local/pytorch1/test/test_autograd.py:5691:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$2: f32[1] = torch._ops.aten.exp.default($1)   (2 of 2 in original)

/local/pytorch1/test/test_autograd.py:5648:save_2_tensors
/local/pytorch1/test/test_autograd.py:5658:fn
/local/pytorch1/torch/utils/checkpoint.py:1177:_checkpoint_without_reentrant
/local/pytorch1/torch/utils/checkpoint.py:431:checkpoint
/local/pytorch1/test/test_autograd.py:5691:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>


+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|        2. Stack traces of the operators that ran during recomputation        |
+------------------------------------------------------------------------------+

$1: f32[1] = torch._ops.aten.detach.default($0)   (1 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$2: f32[1] = torch._ops.aten.detach.default($1)   (2 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$3: f32[1] = torch._ops.aten.detach.default($0)   (3 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:1005:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$4: f32[1] = torch._ops.aten.detach.default($3)   (4 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:1005:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$5: f32[1] = torch._ops.aten.sin.default($0)   (5 of 8 in recompute)

/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$6: f32[2] = torch._ops.aten.lift_fresh.default($6)   (6 of 8 in recompute)

/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$7: f32[2] = torch._ops.aten.detach.default($6)   (7 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$8: f32[2] = torch._ops.aten.detach.default($7)   (8 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>


+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|       3. Log of operators in the original forward and recomputation          |
+------------------------------------------------------------------------------+
(Scroll up to correlate stack traces with each operation listed below. This
 helps identify their source in the code.)

IMPORTANT: Differences in "detach" calls between the original forward and the
           recomputation are expected. They are introduced by the checkpointing
           mechanism and can be ignored.

Operations executed during the original forward:

$1: f32[1] = torch._ops.aten.sin.default($0)
$2: f32[1] = torch._ops.aten.exp.default($1)

Operations executed during recomputation:

$1: f32[1] = torch._ops.aten.detach.default($0)
$2: f32[1] = torch._ops.aten.detach.default($1)
$3: f32[1] = torch._ops.aten.detach.default($0)
$4: f32[1] = torch._ops.aten.detach.default($3)
$5: f32[1] = torch._ops.aten.sin.default($0)
$6: f32[2] = torch._ops.aten.lift_fresh.default($6)
$7: f32[2] = torch._ops.aten.detach.default($6)
$8: f32[2] = torch._ops.aten.detach.default($7)

+------------------------------------------------------------------------------+
 ERROR: Detected non-determinism while running activation checkpointing

 You are seeing this error because you passed `debug=True` to checkpoint and
 tensors to be saved during the original forward and differ between those saved
 during recomputation. This can happen if different operators were ran in the
 original forward and in the recomputation.

 To identify where the mismatch may be coming from, you can do the following:

 1) Compare the operators ran during original forward and recomputation to
    see where they differ. These operators are printed above in the order they
    were executed.

 2) Review the stack trace for each operator to locate its invocation source.
    Each operator's stack trace is printed in their execution order.

 Note that the logs can be quite long. Here's how they are structured:
 (Tip: you can Ctrl-f for these headers)

 1. Stack traces of the operators that ran in the original forward
 2. Stack traces of the operators that ran during recomputation
 3. Log of operators in the original forward and recomputation
 4. Error message                                             <--- You are here
--------------------------------------------------------------------------------
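
Step 1 of the guidance in the message above (compare the operators, ignoring the "detach" calls) can also be done mechanically. A rough sketch using only the standard library: the two lists are copied from section 3 of the log, while `op_signature`, `without_detach`, and `first_mismatch` are hypothetical helpers, not part of torch.

```python
original = [
    "$1: f32[1] = torch._ops.aten.sin.default($0)",
    "$2: f32[1] = torch._ops.aten.exp.default($1)",
]
recompute = [
    "$1: f32[1] = torch._ops.aten.detach.default($0)",
    "$2: f32[1] = torch._ops.aten.detach.default($1)",
    "$3: f32[1] = torch._ops.aten.detach.default($0)",
    "$4: f32[1] = torch._ops.aten.detach.default($3)",
    "$5: f32[1] = torch._ops.aten.sin.default($0)",
    "$6: f32[2] = torch._ops.aten.lift_fresh.default($6)",
    "$7: f32[2] = torch._ops.aten.detach.default($6)",
    "$8: f32[2] = torch._ops.aten.detach.default($7)",
]


def op_signature(line):
    """Reduce a log line to 'type = operator', dropping $-ids and arguments."""
    lhs, rhs = line.split(" = ", 1)
    out_type = lhs.split(": ", 1)[1]
    op_name = rhs.split("(", 1)[0]
    return f"{out_type} = {op_name}"


def without_detach(lines):
    """Drop detach calls, which the log says are expected to differ."""
    return [op_signature(line) for line in lines if ".detach." not in line]


def first_mismatch(a, b):
    """Return (index, op_in_a, op_in_b) at the first difference, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    if len(a) != len(b):
        i = min(len(a), len(b))
        return i, (a[i] if i < len(a) else None), (b[i] if i < len(b) else None)
    return None


mismatch = first_mismatch(without_detach(original), without_detach(recompute))
print(mismatch)
```

For this log the first divergence is at index 1: exp with an f32[1] output in the original forward versus lift_fresh with an f32[2] output in the recomputation, which matches the shape mismatch reported for the tensor at position 1.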
Error message when debug=True (with Python and C++ stack traces):
======================================================================
ERROR: test_checkpoint_detects_non_determinism (__main__.TestAutograd)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py", line 1071, in unpack_hook_with_error_cb
    return unpack_hook(holder)
  File "/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py", line 1053, in unpack_hook
    frame.check_recomputed_tensors_match(gid)
  File "/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py", line 826, in check_recomputed_tensors_match
    raise CheckpointError(
torch.utils.checkpoint.CheckpointError: torch.utils.checkpoint: Recomputed values for the following tensors have different metadata than during the forward pass.
tensor at position 1:
saved metadata: {'shape': torch.Size([1]), 'dtype': torch.float32, 'device': device(type='cpu')}
recomputed metadata: {'shape': torch.Size([2]), 'dtype': torch.float32, 'device': device(type='cpu')}


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jw3468/local/a/pytorch/test/test_autograd.py", line 5695, in test_checkpoint_detects_non_determinism
    out.backward()
  File "/home/jw3468/local/a/pytorch/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/jw3468/local/a/pytorch/torch/autograd/__init__.py", line 204, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py", line 1073, in unpack_hook_with_error_cb
    frame.unpack_error_cb(e)
  File "/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py", line 944, in unpack_error_cb
    raise CheckpointError(
torch.utils.checkpoint.CheckpointError:  An error happened while unpacking tensors; dumping logs of latest computation
because you passed `debug=True` to `torch.utils.checkpoint.checkpoint()`.
Scroll all the way down for guidance on how to navigate these logs.

+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|        1. Stack traces of the operators that ran in the original forward     |
+------------------------------------------------------------------------------+

$1: f32[1] = torch._ops.aten.sin.default($0)   (1 of 2 in original)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::sin(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::sin>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_sin(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5648:save_2_tensors
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5659:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1185:_checkpoint_without_reentrant
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:431:checkpoint
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5694:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$2: f32[1] = torch._ops.aten.exp.default($1)   (2 of 2 in original)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::exp::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableType_2.cpp:0:torch::autograd::VariableType::(anonymous namespace)::exp(c10::DispatchKeySet, at::Tensor const&)
VariableType_2.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::exp>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::exp::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_exp(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5648:save_2_tensors
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5659:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1185:_checkpoint_without_reentrant
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:431:checkpoint
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5694:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start



+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|        2. Stack traces of the operators that ran during recomputation        |
+------------------------------------------------------------------------------+

$1: f32[1] = torch._ops.aten.detach.default($0)   (1 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:torch::ADInplaceOrView::detach(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::ADInplaceOrView::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, at::Tensor (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:torch::autograd::VariableType::(anonymous namespace)::detach(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_detach(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1006:pack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_pack_hook(at::Tensor const&)
??:0:torch::autograd::SavedVariable::set_hooks_and_pack_data(std::unique_ptr<torch::autograd::SavedVariableHooks, std::default_delete<torch::autograd::SavedVariableHooks> >&&, at::Tensor const&)
??:0:torch::autograd::SavedVariable::SavedVariable(at::Tensor const&, bool, bool)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::sin(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::sin>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_sin(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$2: f32[1] = torch._ops.aten.detach.default($1)   (2 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::torchDispatchFromTensorImpl(c10::TensorImpl const*, char const*, _object*, char const*, c10::SmallVector<pybind11::object, 1u>)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::detach(c10::TensorImpl const*) const
??:0:c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> > c10::TensorImpl::shallow_copy_and_detach_core<c10::VariableVersion const&>(c10::VariableVersion const&, bool) const
??:0:c10::TensorImpl::shallow_copy_and_detach(c10::VariableVersion const&, bool) const
offloadstuff.c:0:torch::autograd::make_variable_non_differentiable_view(at::Tensor, at::Tensor const&, bool)
offloadstuff.c:0:torch::autograd::as_view(at::Tensor const&, at::Tensor const&, bool, bool, std::function<at::Tensor (at::Tensor const&)>, torch::autograd::CreationMeta, bool)
offloadstuff.c:0:torch::ADInplaceOrView::detach(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::ADInplaceOrView::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, at::Tensor (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:torch::autograd::VariableType::(anonymous namespace)::detach(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_detach(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1006:pack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_pack_hook(at::Tensor const&)
??:0:torch::autograd::SavedVariable::set_hooks_and_pack_data(std::unique_ptr<torch::autograd::SavedVariableHooks, std::default_delete<torch::autograd::SavedVariableHooks> >&&, at::Tensor const&)
??:0:torch::autograd::SavedVariable::SavedVariable(at::Tensor const&, bool, bool)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::sin(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::sin>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_sin(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$3: f32[1] = torch._ops.aten.detach.default($0)   (3 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:torch::ADInplaceOrView::detach(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::ADInplaceOrView::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, at::Tensor (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:torch::autograd::VariableType::(anonymous namespace)::detach(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_detach(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1013:pack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_pack_hook(at::Tensor const&)
??:0:torch::autograd::SavedVariable::set_hooks_and_pack_data(std::unique_ptr<torch::autograd::SavedVariableHooks, std::default_delete<torch::autograd::SavedVariableHooks> >&&, at::Tensor const&)
??:0:torch::autograd::SavedVariable::SavedVariable(at::Tensor const&, bool, bool)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::sin(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::sin>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_sin(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$4: f32[1] = torch._ops.aten.detach.default($3)   (4 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::torchDispatchFromTensorImpl(c10::TensorImpl const*, char const*, _object*, char const*, c10::SmallVector<pybind11::object, 1u>)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::detach(c10::TensorImpl const*) const
??:0:c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> > c10::TensorImpl::shallow_copy_and_detach_core<c10::VariableVersion const&>(c10::VariableVersion const&, bool) const
??:0:c10::TensorImpl::shallow_copy_and_detach(c10::VariableVersion const&, bool) const
offloadstuff.c:0:torch::autograd::make_variable_non_differentiable_view(at::Tensor, at::Tensor const&, bool)
offloadstuff.c:0:torch::autograd::as_view(at::Tensor const&, at::Tensor const&, bool, bool, std::function<at::Tensor (at::Tensor const&)>, torch::autograd::CreationMeta, bool)
offloadstuff.c:0:torch::ADInplaceOrView::detach(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::ADInplaceOrView::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, at::Tensor (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:torch::autograd::VariableType::(anonymous namespace)::detach(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_detach(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1013:pack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_pack_hook(at::Tensor const&)
??:0:torch::autograd::SavedVariable::set_hooks_and_pack_data(std::unique_ptr<torch::autograd::SavedVariableHooks, std::default_delete<torch::autograd::SavedVariableHooks> >&&, at::Tensor const&)
??:0:torch::autograd::SavedVariable::SavedVariable(at::Tensor const&, bool, bool)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::sin(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::sin>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_sin(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$5: f32[1] = torch._ops.aten.sin.default($0)   (5 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::sin(c10::DispatchKeySet, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::sin>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::sin::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_sin(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$6: f32[2] = torch._ops.aten.lift_fresh.default($6)   (6 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::lift_fresh::call(at::Tensor const&)
tensor_new.cpp:0:torch::utils::(anonymous namespace)::internal_new_from_data(c10::TensorOptions, c10::ScalarType, c10::optional<c10::Device>, _object*, bool, bool, bool, bool)
??:0:torch::utils::tensor_ctor(c10::DispatchKey, c10::ScalarType, torch::PythonArgs&)
python_torch_functions_manual.cpp:0:torch::autograd::THPVariable_tensor(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$7: f32[2] = torch._ops.aten.detach.default($6)   (7 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
PythonFallbackKernel.cpp:0:(anonymous namespace)::pythonFallback(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:torch::ADInplaceOrView::detach(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::ADInplaceOrView::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, at::Tensor (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:torch::autograd::VariableType::(anonymous namespace)::detach(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_detach(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1006:pack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_pack_hook(at::Tensor const&)
??:0:torch::autograd::SavedVariable::set_hooks_and_pack_data(std::unique_ptr<torch::autograd::SavedVariableHooks, std::default_delete<torch::autograd::SavedVariableHooks> >&&, at::Tensor const&)
??:0:torch::autograd::SavedVariable::SavedVariable(at::Tensor const&, bool, bool)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::mul_Tensor(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::mul_Tensor>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&, at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
??:0:at::_ops::mul_Tensor::call(at::Tensor const&, at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_mul(_object*, _object*, _object*)
python_variable_methods.cpp:0:_object* torch::autograd::TypeError_to_NotImplemented_<&torch::autograd::THPVariable_mul>(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:344:method_vectorcall_VARARGS_KEYWORDS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7284:slot_nb_multiply
/usr/local/src/conda/python-3.10.11/Objects/abstract.c:891:binary_op1
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Python/ceval.c:2003:_PyEval_EvalFrameDefault
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start


$8: f32[2] = torch._ops.aten.detach.default($7)   (8 of 8 in recompute)

/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:577:PyObject_CallMethod
??:0:torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
PyInterpreter.cpp:0:(anonymous namespace)::torchDispatchFromTensorImpl(c10::TensorImpl const*, char const*, _object*, char const*, c10::SmallVector<pybind11::object, 1u>)
PyInterpreter.cpp:0:(anonymous namespace)::ConcretePyInterpreterVTable::detach(c10::TensorImpl const*) const
??:0:c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> > c10::TensorImpl::shallow_copy_and_detach_core<c10::VariableVersion const&>(c10::VariableVersion const&, bool) const
??:0:c10::TensorImpl::shallow_copy_and_detach(c10::VariableVersion const&, bool) const
offloadstuff.c:0:torch::autograd::make_variable_non_differentiable_view(at::Tensor, at::Tensor const&, bool)
offloadstuff.c:0:torch::autograd::as_view(at::Tensor const&, at::Tensor const&, bool, bool, std::function<at::Tensor (at::Tensor const&)>, torch::autograd::CreationMeta, bool)
offloadstuff.c:0:torch::ADInplaceOrView::detach(c10::DispatchKeySet, at::Tensor const&)
offloadstuff.c:0:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::ADInplaceOrView::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, at::Tensor (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::redispatch(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:torch::autograd::VariableType::(anonymous namespace)::detach(c10::DispatchKeySet, at::Tensor const&)
VariableTypeManual.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::detach>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&)
??:0:at::_ops::detach::call(at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_detach(_object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:432:method_vectorcall_NOARGS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1006:pack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_pack_hook(at::Tensor const&)
??:0:torch::autograd::SavedVariable::set_hooks_and_pack_data(std::unique_ptr<torch::autograd::SavedVariableHooks, std::default_delete<torch::autograd::SavedVariableHooks> >&&, at::Tensor const&)
??:0:torch::autograd::SavedVariable::SavedVariable(at::Tensor const&, bool, bool)
VariableType_0.cpp:0:torch::autograd::VariableType::(anonymous namespace)::mul_Tensor(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
VariableType_0.cpp:0:c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::mul_Tensor>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
PythonFallbackKernel.cpp:0:void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
offloadstuff.c:0:c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&, at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
??:0:at::_ops::mul_Tensor::call(at::Tensor const&, at::Tensor const&)
python_variable_methods.cpp:0:torch::autograd::THPVariable_mul(_object*, _object*, _object*)
python_variable_methods.cpp:0:_object* torch::autograd::TypeError_to_NotImplemented_<&torch::autograd::THPVariable_mul>(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/descrobject.c:344:method_vectorcall_VARARGS_KEYWORDS
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7284:slot_nb_multiply
/usr/local/src/conda/python-3.10.11/Objects/abstract.c:891:binary_op1
/home/jw3468/local/a/pytorch/test/test_autograd.py:5651:save_2_tensors_alt
/usr/local/src/conda/python-3.10.11/Python/ceval.c:2003:_PyEval_EvalFrameDefault
/home/jw3468/local/a/pytorch/test/test_autograd.py:5661:fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1169:recompute_fn
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1049:unpack_hook
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch/torch/utils/checkpoint.py:1071:unpack_hook_with_error_cb
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/autograd/__init__.py:204:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:841:PyObject_CallFunctionObjArgs
??:0:torch::autograd::PySavedVariableHooks::call_unpack_hook()
??:0:torch::autograd::SavedVariable::unpack(std::shared_ptr<torch::autograd::Node>) const
??:0:torch::autograd::generated::ExpBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
offloadstuff.c:0:torch::autograd::Node::operator()(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
??:0:torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
??:0:torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
??:0:torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>, torch::autograd::InputBuffer&&)
??:0:torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&)
??:0:THPEngine_run_backward(_object*, _object*, _object*)
/usr/local/src/conda/python-3.10.11/Objects/methodobject.c:543:cfunction_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/_tensor.py:488:backward
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:5695:test_checkpoint_detects_non_determinism
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:549:_callTestMethod
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:591:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:2319:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/case.py:650:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:122:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/suite.py:84:__call__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:5945:do_call_core
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/runner.py:184:run
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:7494:slot_tp_call
/usr/local/src/conda/python-3.10.11/Objects/call.c:215:_PyObject_MakeTpCall
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:271:runTests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch-env/lib/python3.10/unittest/main.py:101:__init__
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/torch/testing/_internal/common_utils.py:892:run_tests
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Objects/call.c:153:_PyObject_FastCallDictTstate
/usr/local/src/conda/python-3.10.11/Objects/call.c:431:_PyObject_Call_Prepend
/usr/local/src/conda/python-3.10.11/Objects/typeobject.c:1135:type_call
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:112:_PyObject_VectorcallTstate
/home/jw3468/local/a/pytorch/test/test_autograd.py:11251:<module>
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Include/cpython/abstract.h:114:_PyObject_VectorcallTstate
/usr/local/src/conda/python-3.10.11/Include/internal/pycore_ceval.h:46:_PyEval_EvalFrame
/usr/local/src/conda/python-3.10.11/Python/ceval.c:1134:PyEval_EvalCode
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1291:run_eval_code_obj
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1312:run_mod
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:1208:pyrun_file
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:456:_PyRun_SimpleFileObject
/usr/local/src/conda/python-3.10.11/Python/pythonrun.c:90:_PyRun_AnyFileObject
/usr/local/src/conda/python-3.10.11/Modules/main.c:357:pymain_run_file_obj
/usr/local/src/conda/python-3.10.11/Modules/main.c:1090:Py_BytesMain
??:0:__libc_start_call_main
:0:__libc_start_main_alias_2
??:0:_start



+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|       3. Log of operators in the original forward and recomputation          |
+------------------------------------------------------------------------------+
(Scroll up to correlate stack traces with each operation listed below. This
 helps identify their source in the code.)

IMPORTANT: Differences in "detach" calls between the original forward and the
           recomputation are expected. They are introduced by the checkpointing
           mechanism and can be ignored.

Operations executed during the original forward:

$1: f32[1] = torch._ops.aten.sin.default($0)
$2: f32[1] = torch._ops.aten.exp.default($1)

Operations executed during recomputation:

$1: f32[1] = torch._ops.aten.detach.default($0)
$2: f32[1] = torch._ops.aten.detach.default($1)
$3: f32[1] = torch._ops.aten.detach.default($0)
$4: f32[1] = torch._ops.aten.detach.default($3)
$5: f32[1] = torch._ops.aten.sin.default($0)
$6: f32[2] = torch._ops.aten.lift_fresh.default($6)
$7: f32[2] = torch._ops.aten.detach.default($6)
$8: f32[2] = torch._ops.aten.detach.default($7)

+------------------------------------------------------------------------------+
 ERROR: Detected non-determinism while running activation checkpointing

 You are seeing this error because you passed `debug=True` to checkpoint and
 tensors to be saved during the original forward and differ between those saved
 during recomputation. This can happen if different operators were ran in the
 original forward and in the recomputation.

 To identify where the mismatch may be coming from, you can do the following:

 1) Compare the operators ran during original forward and recomputation to
    see where they differ. These operators are printed above in the order they
    were executed.

 2) Review the stack trace for each operator to locate its invocation source.
    Each operator's stack trace is printed in their execution order.

 Note that the logs can be quite long. Here's how they are structured:
 (Tip: you can Ctrl-f for these headers)

 1. Stack traces of the operators that ran in the original forward
 2. Stack traces of the operators that ran during recomputation
 3. Log of operators in the original forward and recomputation
 4. Error message                                             <--- You are here
--------------------------------------------------------------------------------
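
Step 1 of the guidance above (compare the operators run during the original forward and during recomputation, ignoring the expected extra `detach` calls) can be sketched in a few lines of standard-library Python. This is only an illustration and not part of the PR: the helper names `parse_ops` and `first_divergence` are hypothetical, and the regular expression assumes the `$N: meta = torch._ops.<op>(...)` line format shown in section 3 of the log.

```python
import re

# Assumed line format, taken from section 3 of the log above, e.g.
#   $5: f32[1] = torch._ops.aten.sin.default($0)
OP_LINE = re.compile(r"^\$\d+: (?P<meta>\S+) = torch\._ops\.(?P<op>[\w.]+)\(")


def parse_ops(log_text, ignore=("aten.detach.default",)):
    """Return (op, metadata) pairs from an operator log, skipping ignored ops."""
    ops = []
    for line in log_text.splitlines():
        match = OP_LINE.match(line.strip())
        if match and match["op"] not in ignore:
            ops.append((match["op"], match["meta"]))
    return ops


def first_divergence(original, recomputed):
    """Return (index, original_entry, recomputed_entry) at the first mismatch, or None."""
    for i in range(max(len(original), len(recomputed))):
        a = original[i] if i < len(original) else None
        b = recomputed[i] if i < len(recomputed) else None
        if a != b:
            return i, a, b
    return None


# The two operator lists copied from section 3 of the log above.
forward_log = """
$1: f32[1] = torch._ops.aten.sin.default($0)
$2: f32[1] = torch._ops.aten.exp.default($1)
"""

recompute_log = """
$1: f32[1] = torch._ops.aten.detach.default($0)
$2: f32[1] = torch._ops.aten.detach.default($1)
$3: f32[1] = torch._ops.aten.detach.default($0)
$4: f32[1] = torch._ops.aten.detach.default($3)
$5: f32[1] = torch._ops.aten.sin.default($0)
$6: f32[2] = torch._ops.aten.lift_fresh.default($6)
$7: f32[2] = torch._ops.aten.detach.default($6)
$8: f32[2] = torch._ops.aten.detach.default($7)
"""

# Prints (1, ('aten.exp.default', 'f32[1]'), ('aten.lift_fresh.default', 'f32[2]'))
print(first_divergence(parse_ops(forward_log), parse_ops(recompute_log)))
```

With the `detach` calls filtered out, the first mismatch is the second operator: the original forward ran `exp` on an `f32[1]` tensor, while recomputation ran `lift_fresh` and produced an `f32[2]` tensor. That matches the shape mismatch reported for the tensor at position 1.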

pytorch-bot bot commented May 25, 2023

soulitzer added a commit that referenced this pull request May 25, 2023
ghstack-source-id: bc0782e3977d0844f71ffabaa2002c370bd6f3f6
Pull Request resolved: #102241
@soulitzer soulitzer added release notes: autograd release notes category topic: improvements topic category labels May 25, 2023
soulitzer added a commit that referenced this pull request May 25, 2023
ghstack-source-id: c99720afe0bdccf2edfdbe48bc4b47e66d658de5
Pull Request resolved: #102241
soulitzer added a commit that referenced this pull request May 25, 2023
ghstack-source-id: 35bd4d9e57ce663740a1a62d19103727aa8cdcca
Pull Request resolved: #102241
soulitzer added a commit that referenced this pull request Jun 2, 2023
ghstack-source-id: d07c48373a5d74f7bda8c1fdb382cdb88e0a5482
Pull Request resolved: #102241
soulitzer added a commit that referenced this pull request Jun 2, 2023
ghstack-source-id: b5e1491c330a4811e83490fe2cf409207c041ba0
Pull Request resolved: #102241
<details>

<summary>
Click to see the error message when `debug=True`
</summary>

```
torch.utils.checkpoint.CheckpointError: torch.utils.checkpoint: Recomputed values for the following tensors have different metadata than during the forward pass.
tensor at position 1:
saved metadata: {'shape': torch.Size([1]), 'dtype': torch.float32, 'device': device(type='cpu')}
recomputed metadata: {'shape': torch.Size([2]), 'dtype': torch.float32, 'device': device(type='cpu')}


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/local/pytorch1/test/test_autograd.py", line 5692, in test_checkpoint_detects_non_determinism
    out.backward()
  File "/local/pytorch1/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/local/pytorch1/torch/autograd/__init__.py", line 204, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/local/pytorch1/torch/utils/checkpoint.py", line 1065, in unpack_hook_with_error_cb
    frame.unpack_error_cb(e)
  File "/local/pytorch1/torch/utils/checkpoint.py", line 936, in unpack_error_cb
    raise CheckpointError(
torch.utils.checkpoint.CheckpointError:  An error happened while unpacking tensors; dumping logs of latest computation
because you passed `debug=True` to `torch.utils.checkpoint.checkpoint()`.
Scroll all the way down for guidance on how to navigate these logs.

+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|        1. Stack traces of the operators that ran in the original forward     |
+------------------------------------------------------------------------------+

$1: f32[1] = torch._ops.aten.sin.default($0)   (1 of 2 in original)

/local/pytorch1/test/test_autograd.py:5648:save_2_tensors
/local/pytorch1/test/test_autograd.py:5658:fn
/local/pytorch1/torch/utils/checkpoint.py:1177:_checkpoint_without_reentrant
/local/pytorch1/torch/utils/checkpoint.py:431:checkpoint
/local/pytorch1/test/test_autograd.py:5691:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$2: f32[1] = torch._ops.aten.exp.default($1)   (2 of 2 in original)

/local/pytorch1/test/test_autograd.py:5648:save_2_tensors
/local/pytorch1/test/test_autograd.py:5658:fn
/local/pytorch1/torch/utils/checkpoint.py:1177:_checkpoint_without_reentrant
/local/pytorch1/torch/utils/checkpoint.py:431:checkpoint
/local/pytorch1/test/test_autograd.py:5691:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>


+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|        2. Stack traces of the operators that ran during recomputation        |
+------------------------------------------------------------------------------+

$1: f32[1] = torch._ops.aten.detach.default($0)   (1 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$2: f32[1] = torch._ops.aten.detach.default($1)   (2 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$3: f32[1] = torch._ops.aten.detach.default($0)   (3 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:1005:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$4: f32[1] = torch._ops.aten.detach.default($3)   (4 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:1005:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$5: f32[1] = torch._ops.aten.sin.default($0)   (5 of 8 in recompute)

/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$6: f32[2] = torch._ops.aten.lift_fresh.default($6)   (6 of 8 in recompute)

/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$7: f32[2] = torch._ops.aten.detach.default($6)   (7 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>

$8: f32[2] = torch._ops.aten.detach.default($7)   (8 of 8 in recompute)

/local/pytorch1/torch/utils/checkpoint.py:998:pack_hook
/local/pytorch1/test/test_autograd.py:5651:save_2_tensors_alt
/local/pytorch1/test/test_autograd.py:5660:fn
/local/pytorch1/torch/utils/checkpoint.py:1161:recompute_fn
/local/pytorch1/torch/utils/checkpoint.py:1041:unpack_hook
/local/pytorch1/torch/utils/checkpoint.py:1063:unpack_hook_with_error_cb
/local/pytorch1/torch/autograd/__init__.py:204:backward
/local/pytorch1/torch/_tensor.py:488:backward
/local/pytorch1/test/test_autograd.py:5692:test_checkpoint_detects_non_determinism
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:550:_callTestMethod
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:592:run
/local/pytorch1/torch/testing/_internal/common_utils.py:2248:_run_with_retry
/local/pytorch1/torch/testing/_internal/common_utils.py:2319:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/case.py:651:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:122:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/suite.py:84:__call__
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/runner.py:184:run
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:271:runTests
/opt/miniconda3/envs/pytorch1/lib/python3.9/unittest/main.py:101:__init__
/local/pytorch1/torch/testing/_internal/common_utils.py:892:run_tests
/local/pytorch1/test/test_autograd.py:11248:<module>


+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|       3. Log of operators in the original forward and recomputation          |
+------------------------------------------------------------------------------+
(Scroll up to correlate stack traces with each operation listed below. This
 helps identify their source in the code.)

IMPORTANT: Differences in "detach" calls between the original forward and the
           recomputation are expected. They are introduced by the checkpointing
           mechanism and can be ignored.

Operations executed during the original forward:

$1: f32[1] = torch._ops.aten.sin.default($0)
$2: f32[1] = torch._ops.aten.exp.default($1)

Operations executed during recomputation:

$1: f32[1] = torch._ops.aten.detach.default($0)
$2: f32[1] = torch._ops.aten.detach.default($1)
$3: f32[1] = torch._ops.aten.detach.default($0)
$4: f32[1] = torch._ops.aten.detach.default($3)
$5: f32[1] = torch._ops.aten.sin.default($0)
$6: f32[2] = torch._ops.aten.lift_fresh.default($6)
$7: f32[2] = torch._ops.aten.detach.default($6)
$8: f32[2] = torch._ops.aten.detach.default($7)

+------------------------------------------------------------------------------+
 ERROR: Detected non-determinism while running activation checkpointing

 You are seeing this error because you passed `debug=True` to checkpoint and
 tensors to be saved during the original forward and differ between those saved
 during recomputation. This can happen if different operators were ran in the
 original forward and in the recomputation.

 To identify where the mismatch may be coming from, you can do the following:

 1) Compare the operators ran during original forward and recomputation to
    see where they differ. These operators are printed above in the order they
    were executed.

 2) Review the stack trace for each operator to locate its invocation source.
    Each operator's stack trace is printed in their execution order.

 Note that the logs can be quite long. Here's how they are structured:
 (Tip: you can Ctrl-f for these headers)

 1. Stack traces of the operators that ran in the original forward
 2. Stack traces of the operators that ran during recomputation
 3. Log of operators in the original forward and recomputation
 4. Error message                                             <--- You are here
--------------------------------------------------------------------------------
```

</details>
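
Step 1 of the error message (comparing the operators run in the original forward against those run during recomputation) can also be done mechanically. The following is a minimal sketch, not code from this PR: the helper names `parse_ops` and `first_divergence` and the `OP_LINE` regular expression are hypothetical, and the line format is assumed from section 3 of the log above. It skips `detach` calls, which the log says are introduced by the checkpointing mechanism, and reports the first remaining operator that differs.

```python
import re

# Operator logs copied from section 3 of the debug output above.
FORWARD_LOG = """\
$1: f32[1] = torch._ops.aten.sin.default($0)
$2: f32[1] = torch._ops.aten.exp.default($1)
"""

RECOMPUTE_LOG = """\
$1: f32[1] = torch._ops.aten.detach.default($0)
$2: f32[1] = torch._ops.aten.detach.default($1)
$3: f32[1] = torch._ops.aten.detach.default($0)
$4: f32[1] = torch._ops.aten.detach.default($3)
$5: f32[1] = torch._ops.aten.sin.default($0)
$6: f32[2] = torch._ops.aten.lift_fresh.default($6)
$7: f32[2] = torch._ops.aten.detach.default($6)
$8: f32[2] = torch._ops.aten.detach.default($7)
"""

# Assumed line format: "$N: <dtype[shape]> = <qualified op name>(<args>)".
OP_LINE = re.compile(r"^\$\d+: (?P<meta>\S+) = (?P<op>[\w.]+)\(")


def parse_ops(log):
    """Return (op_name, result_metadata) pairs, skipping detach calls."""
    ops = []
    for line in log.splitlines():
        match = OP_LINE.match(line.strip())
        if match is None:
            continue  # not an operator line
        if ".detach." in match["op"]:
            continue  # expected to differ between forward and recompute
        ops.append((match["op"], match["meta"]))
    return ops


def first_divergence(forward_log, recompute_log):
    """Return (index, forward_op, recompute_op) for the first mismatch, or None."""
    fwd = parse_ops(forward_log)
    rec = parse_ops(recompute_log)
    for index, (f_op, r_op) in enumerate(zip(fwd, rec)):
        if f_op != r_op:
            return index, f_op, r_op
    if len(fwd) != len(rec):
        # One log is a strict prefix of the other.
        index = min(len(fwd), len(rec))
        f_op = fwd[index] if index < len(fwd) else None
        r_op = rec[index] if index < len(rec) else None
        return index, f_op, r_op
    return None


if __name__ == "__main__":
    print(first_divergence(FORWARD_LOG, RECOMPUTE_LOG))
```

For the logs above, the first mismatch after filtering is `aten.exp` producing `f32[1]` in the original forward versus `aten.lift_fresh` producing `f32[2]` in the recomputation. The stack trace for that operator in section 2 then points at the line that invoked it.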

[ghstack-poisoned]
soulitzer added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: b4595e2473c1993d853d7bfb4fa039f3ba3bc837
Pull Request resolved: #102241
@soulitzer changed the title from "[WIP] Improve debuggability of activation checkpointing" to "Improve debuggability of activation checkpointing" on Jun 16, 2023
@soulitzer requested a review from albanD on Jun 16, 2023 at 02:35
@soulitzer closed this on Jun 16, 2023
@facebook-github-bot deleted the gh/soulitzer/210/head branch on Jul 16, 2023 at 14:17
Labels: release notes: autograd, topic: improvements