[AOTInductor] ProxyExecutor support ReinterpretView inputs #110451

SherlockNoMad · 2023-10-03T16:01:27Z

Summary:
See wrapper.codegen_reinterpret_view(): it returns a temporary handle for a tensor, which has the following problem.

            # NB, the return handle here represents a temporary tensor, which will be automatically
            # released.
            # Here's a sample usage in the cpp wrapper code:
            # ```
            # aoti_torch_addmm_out(
            #     buf1,
            #     arg1_1,
            #     RAIIAtenTensorHandle(tmp_tensor_handle_0),
            #     buf0,
            #     1L,
            #     1L));
            # ```
            # RAIIAtenTensorHandle(tmp_tensor_handle_0) will be released after the call to addmm_out.
            # This could be problematic when it's used in a different pattern, for example:
            # ````
            # AtenTensorHandle tensor_args[] = {RAIIAtenTensorHandle(tmp_tensor_handle_2), buf5, buf6};
            # aoti_torch_proxy_executor_call_function(..., tensor_args);
            # ````
            # RAIIAtenTensorHandle(tmp_tensor_handle_2) will be invalid when it's used in the latter
            # kernel call.
            return f"RAIIAtenTensorHandle({tmp_name})"

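As a result, ProxyExecutor would generate the following code, which causes an invalid memory access: the RAIIAtenTensorHandle temporary is destroyed at the end of the tensor_args declaration, so the array holds a released handle by the time the proxy executor call runs.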

Before:

    // Source Nodes: [fn_with_tuple_output], Original ATen: [fb.fn_with_tuple_output]
    AtenTensorHandle tmp_tensor_handle_2;
    AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch__reinterpret_tensor(buf3, 2, int_array_0, int_array_1, 0L, &tmp_tensor_handle_2));
    ...
    AtenTensorHandle tensor_args[] = {RAIIAtenTensorHandle(tmp_tensor_handle_2), buf5, buf6};
    int64_t int_args[] = {1};
    aoti_torch_proxy_executor_call_function(proxy_executor, 1, 1, int_args, 3, tensor_args);
    buf3.reset();

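With the fix in this diff, ProxyExecutor generates the following code, in which the temporaries are created inside the call expression and therefore stay alive until the call returns.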

After:

    // Source Nodes: [fn_with_tuple_output], Original ATen: [fb.fn_with_tuple_output]
    AtenTensorHandle tmp_tensor_handle_2;
    AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch__reinterpret_tensor(buf3, 2, int_array_0, int_array_1, 0L, &tmp_tensor_handle_2));
    ...
    aoti_torch_proxy_executor_call_function(proxy_executor, 1, 1, std::vector<int64_t>{1}.data(), 3, std::vector<AtenTensorHandle>{RAIIAtenTensorHandle(tmp_tensor_handle_2), buf5, buf6}.data());
    buf3.reset();

I am not exactly a big fan of using std::vector{...}.data() to create a temporary array, but I can't think of another fix.
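
The lifetime difference between the Before and After patterns can be reproduced in isolation. The sketch below is illustrative only: `FakeTensor`, `Handle`, `RAIIHandle`, `all_alive`, `before_pattern`, and `after_pattern` are hypothetical stand-ins, not the real AOTInductor runtime types or functions. Instead of freeing memory, the stand-in wrapper flips an `alive` flag in its destructor, so the effect can be observed without undefined behavior.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a tensor; `alive` records whether it was released.
struct FakeTensor {
  bool alive = true;
};
using Handle = FakeTensor*;  // stand-in for a raw, non-owning handle

// Hypothetical stand-in for an RAII wrapper: marks the tensor as released
// when the wrapper is destroyed.
class RAIIHandle {
 public:
  explicit RAIIHandle(Handle h) : h_(h) {}
  ~RAIIHandle() { h_->alive = false; }
  RAIIHandle(const RAIIHandle&) = delete;
  RAIIHandle& operator=(const RAIIHandle&) = delete;
  operator Handle() const { return h_; }  // implicit conversion to the raw handle

 private:
  Handle h_;
};

// Stand-in for the callee: reports whether every handle is still alive.
bool all_alive(int n, const Handle* args) {
  for (int i = 0; i < n; ++i) {
    if (!args[i]->alive) return false;
  }
  return true;
}

// "Before" pattern: the temporary wrapper dies at the end of the array
// declaration, so the callee sees an already released handle.
bool before_pattern() {
  FakeTensor view, other;
  Handle args[] = {RAIIHandle(&view), &other};
  return all_alive(2, args);
}

// "After" pattern: the temporaries are part of the call expression, so they
// are destroyed only after the callee has returned.
bool after_pattern() {
  FakeTensor view, other;
  return all_alive(2, std::vector<Handle>{RAIIHandle(&view), &other}.data());
}
```

Under these assumptions, `before_pattern()` reports a released handle while `after_pattern()` does not, because C++ destroys temporaries at the end of the full-expression that created them. In the first case that is the array declaration; in the second it is the whole call.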

Test Plan: buck2 run mode/dev-nosan deeplearning/aot_inductor/test:test_custom_ops

Reviewed By: desertfire

Differential Revision: D49758764

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler @avikchaudhuri @gmagogsfm

pytorch-bot · 2023-10-03T16:01:31Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110451

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 7f0e57b with merge base a8a31bc:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

trunk / linux-focal-rocm5.6-py3.8 / test (default, 3, 3, linux.rocm.gpu, unstable) (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2023-10-03T16:01:37Z

This pull request was exported from Phabricator. Differential Revision: D49758764

facebook-github-bot · 2023-10-03T19:52:03Z

This pull request was exported from Phabricator. Differential Revision: D49758764

facebook-github-bot · 2023-10-03T19:52:08Z

This pull request was exported from Phabricator. Differential Revision: D49758764

facebook-github-bot · 2023-10-04T02:18:35Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2023-10-04T02:20:21Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

facebook-github-bot added the fb-exported label Oct 3, 2023

github-actions bot added module: inductor ciflow/inductor module: export labels Oct 3, 2023

SherlockNoMad requested review from desertfire and chenyang78 October 3, 2023 16:01

desertfire approved these changes Oct 3, 2023

SherlockNoMad added the topic: not user facing topic category label Oct 3, 2023

SherlockNoMad force-pushed the export-D49758764 branch from 3bcb149 to 3061433 Compare October 3, 2023 19:52

SherlockNoMad force-pushed the export-D49758764 branch from 3061433 to 7f0e57b Compare October 3, 2023 19:52

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 4, 2023

pytorchmergebot added the merging label Oct 4, 2023

pytorchmergebot added Merged and removed merging labels Oct 4, 2023

pytorchmergebot closed this in 50054b1 Oct 4, 2023
