
perf(torch): fast but unsafe buildATen & eliminating dispatches #1271

Merged: 8 commits into DeepLink-org:main from llj/buildaten on Jun 26, 2024

Conversation

@lljbash (Contributor) commented on Jun 24, 2024:

Motivation and Context

Aims to speed up the torch impl.

Description

  1. Add a compile option that enables a fast but unsafe version of buildATen (not thread-safe, and it presumes the upper layer passes in torch Tensors). It is off by default; the intended usage is to turn it on in dipu. A minimal sketch follows this list.
  2. Remove most of the original dispatches by manually calling the post-dispatch functions directly; see the mm sketch after the measurement below.
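A minimal sketch of that fast path, under the contract spelled out in the reviewed code comment further down (the handle really is a reinterpret_cast of an at::Tensor*). The option name and the handle typedef here are illustrative stand-ins, not the actual DIOPI symbols:

#include <ATen/ATen.h>

// Stand-in for the DIOPI handle type; the real one comes from the DIOPI
// runtime headers.
struct diopiTensor;
using diopiConstTensorHandle_t = const diopiTensor*;

#ifdef DIOPI_FAST_UNSAFE_BUILD_ATEN  // hypothetical name for the new option
// UNSAFE fast path: valid only when the upper layer (e.g. dipu) passes a
// live at::Tensor* disguised as a handle and only uses it on one thread.
inline const at::Tensor& buildATen(diopiConstTensorHandle_t tensor) {
  return *reinterpret_cast<const at::Tensor*>(tensor);
}
#endif

No wrapper object is constructed at all, which is where the speedup over the safe path comes from.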

In testing, for a simple operator such as mm, per-operator CPU time is now almost exactly on par with native torch.
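A sketch of the dispatch elimination in item 2, again using mm. It assumes a PyTorch version whose codegen emits per-backend headers for structured ops; the wrapper function name is illustrative:

#include <ATen/ATen.h>
#include <ATen/ops/mm_cpu_dispatch.h>  // generated per-backend header

// at::mm(a, b) would round-trip through the dispatcher; at::cpu::mm (or
// at::cuda::mm from ATen/ops/mm_cuda_dispatch.h) jumps straight to the
// registered backend kernel and skips the dispatch bookkeeping.
at::Tensor mm_direct(const at::Tensor& a, const at::Tensor& b) {
  return at::cpu::mm(a, b);
}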

Use cases (Optional)

BC-breaking (Optional)

Checklist

Before PR:

  • I have read and followed the workflow described in Contributors.md to create this PR.
  • Pre-commit or the linting tools indicated in Contributors.md have been used to fix potential lint issues.
  • Bug fixes are covered by unit tests; the case that caused the bug is added to the unit tests.
  • New functionality is covered by complete unit tests; if not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstrings or example tutorials.

After PR:

  • The CLA has been signed by all committers in this PR.

@lljbash requested a review from yangbofun as a code owner on June 24, 2024.
@lljbash added the nvidia label on June 24, 2024.
}

// This macro is designed to avoid early destruction of the wrapper when building an optional at::Tensor.
#define DIOPI_IMPL_BUILD_ATEN_LIST(atTensor, diopiTensors, numTensors) \
Collaborator commented:
Suggested change:
- #define DIOPI_IMPL_BUILD_ATEN_LIST(atTensor, diopiTensors, numTensors) \
+ #define DIOPI_IMPL_BUILD_ATEN_LIST(atTensors, diopiTensors, numTensors) \

// WARNING: This function is UNSAFE. It is the caller's responsibility to ensure that:
// 1. The returned wrapper is not destroyed when its sliced at::Tensor is still in use in DIOPI.
// 2. The input diopiConstTensorHandle_t is actually a reinterpret_cast of an at::Tensor*.
// 3. The input tensor is only used in one thread.
Collaborator commented:

Why does this comment need to call out thread safety? PyTorch's at::Tensor does not guarantee thread safety either, so ours need not guarantee it; letting the upper-layer caller ensure it is enough.

@lljbash (Contributor, author) replied:

Generally, read-only access should be safe, but the way I do it here, even const gets forcibly cast away.

I also thought of another scenario: if a new view is created from a different tensor sharing the same storage, the device might end up wrong too.

@yangbofun (Collaborator) commented on Jun 25, 2024:

> Generally, read-only access should be safe, but the way I do it here, even const gets forcibly cast away.
>
> I also thought of another scenario: if a new view is created from a different tensor sharing the same storage, the device might end up wrong too.

That problem only arises if the storage is touched. So can we leave the storage alone and change the device inside the at::Tensor instead?

Collaborator commented:

If we override at::TensorImpl's member function device_type() in the wrapper subclass so that it returns cuda, wouldn't that solve this problem?

@lljbash (Contributor, author) replied:

> If we override at::TensorImpl's member function device_type() in the wrapper subclass so that it returns cuda, wouldn't that solve this problem?

That was the previous approach. The problem is that constructing an at::TensorImpl is fairly heavy, so that route was not efficient enough.
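For reference, a minimal sketch of that earlier wrapper-subclass idea, assuming a PyTorch 2.x-style c10::TensorImpl where device queries can be routed through the virtual device_custom() hook; the class name is hypothetical and the real DIOPI wrapper differs. It also shows why the approach is costly: every buildATen call pays for a full TensorImpl construction.

#include <c10/core/TensorImpl.h>

// Hypothetical wrapper that reports a fixed device for an existing storage.
class FixedDeviceTensorImpl : public c10::TensorImpl {
 public:
  FixedDeviceTensorImpl(c10::Storage storage, c10::DispatchKeySet keys,
                        caffe2::TypeMeta dtype, c10::Device device)
      : c10::TensorImpl(std::move(storage), keys, dtype), device_(device) {
    // Ask TensorImpl to answer device() via the device_custom() override.
    set_custom_device(true);
  }

  c10::Device device_custom() const override { return device_; }

 private:
  c10::Device device_;
};

Constructing one of these per DIOPI call (refcount header, sizes/strides, dispatch keys) is exactly the overhead the reinterpret_cast fast path in this PR avoids.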

@lljbash requested a review from yangbofun on June 25, 2024.
@yangbofun merged commit de8dfe7 into DeepLink-org:main on June 26, 2024 (14 of 16 checks passed).
@yangbofun deleted the llj/buildaten branch on June 26, 2024.