perf(torch): fast but unsafe buildATen & eliminating dispatches #1271
Conversation
impl/torch/build_aten.hpp
Outdated
```cpp
}

// These macros are designed to avoid early destruction of the wrapper when building an optional at::Tensor.
#define DIOPI_IMPL_BUILD_ATEN_LIST(atTensor, diopiTensors, numTensors) \
```
Suggested change:

```diff
-#define DIOPI_IMPL_BUILD_ATEN_LIST(atTensor, diopiTensors, numTensors) \
+#define DIOPI_IMPL_BUILD_ATEN_LIST(atTensors, diopiTensors, numTensors) \
```
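For readers outside the review thread, a minimal sketch of the shape such a macro can take (illustrative only; it assumes the existing `impl::aten::buildATen` converter and need not match the PR's actual expansion). The point of a macro rather than a helper function is that the result is declared at the expansion site, so the converted tensors are named locals that live until the end of the caller's scope instead of temporaries that die at the end of a single expression; the PR's version presumably keeps its wrappers alive the same way:

```cpp
#include <vector>
#include <ATen/ATen.h>

// Sketch: not wrapped in do { ... } while (0) on purpose, because the macro
// must declare `atTensors` in the caller's scope so it outlives the call site.
#define DIOPI_IMPL_BUILD_ATEN_LIST(atTensors, diopiTensors, numTensors)      \
    std::vector<at::Tensor> atTensors;                                       \
    (atTensors).reserve(numTensors);                                         \
    for (int64_t i_ = 0; i_ < static_cast<int64_t>(numTensors); ++i_) {      \
        (atTensors).emplace_back(impl::aten::buildATen((diopiTensors)[i_])); \
    }
```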
impl/torch/build_aten.hpp
Outdated
```cpp
// WARNING: This function is UNSAFE. It is the caller's responsibility to ensure that:
// 1. The returned wrapper is not destroyed while its sliced at::Tensor is still in use in DIOPI.
// 2. The input diopiConstTensorHandle_t is actually a reinterpret_cast of an at::Tensor*.
// 3. The input tensor is only used in one thread.
```
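A minimal sketch of the core trick this warning describes (the function name is hypothetical, and the real wrapper also fixes up device metadata, which is where conditions 1 and 3 come in). Given condition 2, the handle can simply be cast back, so no new TensorImpl is ever constructed:

```cpp
#include <ATen/ATen.h>
#include <diopi/diopirt.h>

// Hypothetical sketch; relies entirely on condition 2 above.
inline at::Tensor buildATenUnsafe(diopiConstTensorHandle_t tensor) {
    // Condition 2: the handle really is a reinterpret_cast of an at::Tensor*.
    auto* src = reinterpret_cast<const at::Tensor*>(tensor);
    // Copying an at::Tensor only bumps the TensorImpl refcount -- no data
    // copy and, crucially, no fresh TensorImpl construction.
    return src != nullptr ? *src : at::Tensor();
}
```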
Why call out thread safety here? PyTorch's at::Tensor does not guarantee thread safety either, so ours does not need to. Just leave that to the upper-level caller.
Generally speaking, read-only access should be safe, right? But with my approach even a const tensor ends up being forcibly modified.
I also thought of another scenario: if a different tensor sharing the same storage creates a new view of it, the device might end up wrong as well.
> Generally speaking, read-only access should be safe, right? But with my approach even a const tensor ends up being forcibly modified.
> I also thought of another scenario: if a different tensor sharing the same storage creates a new view of it, the device might end up wrong as well.
That problem only shows up if the storage is touched. So can we leave the storage alone and change the device inside the at::Tensor instead?
What if we override at::TensorImpl's member function device_type() in the wrapper subclass so that it returns CUDA? Wouldn't that solve this problem?
> What if we override at::TensorImpl's member function device_type() in the wrapper subclass so that it returns CUDA? Wouldn't that solve this problem?
That was the previous approach. The problem is that constructing an at::TensorImpl is fairly heavyweight, so that route is not efficient.
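For contrast, roughly what a safe conversion has to pay on every call (an illustrative at::from_blob-based flow, not the PR's exact code): wrapping the raw pointer constructs a fresh TensorImpl each time, which is exactly the heavyweight step mentioned above.

```cpp
#include <ATen/ATen.h>

// Illustrative "safe" baseline: every call builds a brand-new TensorImpl.
at::Tensor buildATenSafe(void* data, at::IntArrayRef sizes, at::IntArrayRef strides,
                         caffe2::TypeMeta dtype, at::Device device) {
    auto options = at::TensorOptions().dtype(dtype).device(device);
    // from_blob does not copy the data, but it still allocates and
    // initializes a new TensorImpl (plus Storage) on each call.
    return at::from_blob(data, sizes, strides, options);
}
```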
Motivation and Context
Aims to speed up the torch impl.
Description
Testing shows that for a simple operator such as mm, the single-operator CPU time almost exactly matches native torch.
Use cases (Optional)
BC-breaking (Optional)
Checklist
Before PR:
After PR: