-
Notifications
You must be signed in to change notification settings - Fork 756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Task infer blob desc support choosing method #10124
Conversation
oneflow/core/graph/task_node.h
Outdated
@@ -71,6 +71,9 @@ class TaskNode : public Node<TaskNode, TaskEdge> { | |||
DeviceType device_type() const; | |||
virtual const ParallelContext* parallel_ctx() const { return nullptr; } | |||
|
|||
// Different types of ExecNode choose different output BlobDesc inference methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExecNode 没有 type,这里说的是 TaskNode Type 吧
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExecNode 没有 type,这里说的是 TaskNode Type 吧
我改下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -72,7 +72,8 @@ class ExecNode final : public Node<ExecNode, ExecEdge> { | |||
std::string VisualStr() const override { return op_->op_name(); } | |||
void ToProto(const ParallelContext*, ExecNodeProto*) const; | |||
|
|||
void InferBlobDescs(const ParallelContext* parallel_ctx); | |||
typedef void (ExecNode::*InferBlobDescsMethod)(const ParallelContext*); | |||
void InferBlobDescsByInputs(const ParallelContext* parallel_ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里只提供了一种 method 吗
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里只提供了一种 method 吗
对,相当于这个分支不改变执行逻辑,只提供了接口
@@ -43,6 +43,10 @@ class CompTaskNode : public TaskNode { | |||
// op | |||
std::shared_ptr<const Operator> op() const { return op_node_->shared_op(); } | |||
|
|||
ExecNode::InferBlobDescsMethod GetInferBlobDescsMethod() const override { | |||
return &ExecNode::InferBlobDescsByInputs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里应该支持 from sbp 和 logical shape ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里应该支持 from sbp 和 logical shape ?
from sbp 的推理方法和编译模式关联了,所以就没加到这个分支
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
其实为了加速,master 编译这里也可以用 from sbp 吧,这样是不是会更快?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
其实为了加速,master 编译这里也可以用 from sbp 吧,这样是不是会更快?
当前估计使用 from sbp 不会变快:
- from sbp 本身的实现开销和之前做 infer physical blobdesc 估计差不多
- 只适用于 user op,改成通用的影响的地方比较多,有个后续 pr 在做这个
- master infer 的过程如果不使用并行,加速不明显
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
有的op不支持from sbp吧,比如涉及到求平均,求和或者求最大值的。(我记得当前有一个op是这样的,好像是叫bn?)
还有个问题就是如果sbp变动了会怎么样?当前是要重新推导一遍
比如自动并行会大规模修改sbp。这时候如果有个logical desc储存着应该好一点
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
有的op不支持from sbp吧,比如涉及到求平均,求和或者求最大值的。(我记得当前有一个op是这样的,好像是叫bn?)
user op 都是符合的,这里使用的场景,之前用 physical infer 推理后,也会再用 sbp 做下 check,所以也是符合的。
还有个问题就是如果sbp变动了会怎么样?当前是要重新推导一遍 比如自动并行会大规模修改sbp。这时候如果有个logical desc储存着应该好一点
因为这个推理发生在 plan 生成阶段,是在 自动不行之后,按说 sbp 已经稳定了。
Co-authored-by: Yipeng Li <jamesonli1313@gmail.com>
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Speed stats:
|
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/10124/ |
No description provided.