Decorator 4 disable recursive boxing call #5796
…neflow into broadcast_consistent_shape
…r checking consistent tensor meta.
#include "oneflow/core/rpc/include/global_process_ctx.h"

namespace oneflow {

/*static*/ Maybe<Symbol<RankGroup>> RankGroup::New(Symbol<ParallelDesc> parallel_desc) {
  return DECORATE(&RankGroup::RawNew, ThreadLocal)(parallel_desc);
Judging from the implementation of RawNew(Symbol parallel_desc), does it actually need caching? It could call New(const std::set<int64_t>& ranks) directly, and New(const std::set<int64_t>& ranks) already caches the result of RawNew(const std::set<int64_t>& ranks).
Constructing and looking up a std::set<int64_t> is not cheap either. A Symbol costs no more than an int64_t.
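The trade-off in this exchange can be sketched as a thread-local memoizer keyed by a cheap integer handle. This is only an illustration under stated assumptions: `ThreadLocalCached`, `RawNew(int64_t)`, and `RawCallCount` are hypothetical stand-ins, and the real `DECORATE(..., ThreadLocal)` machinery is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical counter so the sketch can show how often the raw function runs.
inline int64_t& RawCallCount() {
  thread_local int64_t count = 0;
  return count;
}

// Hypothetical stand-in for an expensive constructor. The key plays the role
// of a Symbol: hashing and comparing it costs about as much as one int64_t.
inline int64_t RawNew(int64_t symbol_id) {
  ++RawCallCount();
  return symbol_id * 2;
}

// Memoizes `func` per thread. Each lookup hashes a single integer, with no
// need to build or compare a std::set<int64_t> first.
template<int64_t (*func)(int64_t)>
int64_t ThreadLocalCached(int64_t key) {
  thread_local std::unordered_map<int64_t, int64_t> cache;
  auto it = cache.find(key);
  if (it == cache.end()) { it = cache.emplace(key, func(key)).first; }
  return it->second;
}
```

Calling `ThreadLocalCached<&RawNew>(21)` twice on the same thread runs `RawNew` only once; the second call is a single hash lookup on the integer key.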
sbp = (flow.sbp.broadcast,)
y = x.to_consistent(placement=placement, sbp=sbp)
y.check_meta_consistency()
print(y.shape, y.placement, y.sbp)
If this print is meant to check correctness, it should be changed to test_case.assertTrue(...).
Right. I probably forgot to delete it.
@@ -74,7 +72,6 @@ template<Maybe<void> (*SendOrRecv)(const TransportToken&, int64_t, void*, std::s
                          std::function<void()>*)>
Maybe<void> AccessToAllOtherRanks(Symbol<RankGroup> rank_group, const TransportToken& token,
                                  AsyncTransportCtx* ctx) {
  CHECK_OR_RETURN(rank_group->ContainingCurrentRank());
Why was this check moved into the outer function? Keeping it here seems more reasonable.
There was a bug here. In some cases rank0 sends data to rank{1, 2, 3}, and rank{1, 2, 3} obviously does not contain rank0.
Maybe I should rename the function.
++*recursive_depth;
RetT&& ret = func(arg0, arg1, outputs, args...);
--*recursive_depth;
if (*recursive_depth == 0) { JUST(InitConsistentId(outputs)); }
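Suggestion: consider using RAII here? That way you can simply return func(...), and if the region at line 40 later gains an early return (for example RetT&& ret = func(arg0, JUST(arg1), outputs, args...);), this code is still guaranteed to run.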
Sure.
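A minimal sketch of the RAII idea suggested above, assuming a plain `int` depth counter and a callback that runs when the outermost call exits. `RecursiveDepthGuard` and `CountDown` are hypothetical names for illustration, not code from this PR.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: increments a depth counter on construction and
// decrements it on destruction. When the outermost scope exits (the counter
// returns to 0), it runs the supplied callback.
class RecursiveDepthGuard {
 public:
  RecursiveDepthGuard(int* depth, std::function<void()> on_outermost_exit)
      : depth_(depth), on_outermost_exit_(std::move(on_outermost_exit)) {
    ++*depth_;
  }
  ~RecursiveDepthGuard() {
    assert(*depth_ > 0);
    if (--*depth_ == 0 && on_outermost_exit_) { on_outermost_exit_(); }
  }
  RecursiveDepthGuard(const RecursiveDepthGuard&) = delete;
  RecursiveDepthGuard& operator=(const RecursiveDepthGuard&) = delete;

 private:
  int* depth_;
  std::function<void()> on_outermost_exit_;
};

// Toy recursive function: it can return directly, and the early return on
// n == 0 still unwinds the guard.
inline int CountDown(int n, int* depth, int* finalize_calls) {
  RecursiveDepthGuard guard(depth, [finalize_calls] { ++*finalize_calls; });
  if (n == 0) { return 0; }
  return 1 + CountDown(n - 1, depth, finalize_calls);
}
```

One caveat for the code under review: the step that runs at depth 0 there is wrapped in `JUST(...)`, so it can fail, and a destructor cannot return that error. A real guard would need some way to surface the failure, which this sketch leaves out.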
struct CheckConsistentTensorMeta<RetT, const std::shared_ptr<one::Tensor>&, Args...> {
  static_assert(is_maybe<RetT>::value, "returned value type must be Maybe<T>.");
  template<RetT (*func)(const std::shared_ptr<one::Tensor>&, Args...)>
  static RetT Call(const std::shared_ptr<one::Tensor>& tensor, Args... args) {
Could this function simply return CheckConsistentTensorMeta<RetT, const one::Tensor&, Args...>::Call(*tensor, args...)?
Sure.
Mark. Temporarily removing automerge; I will add it back in two hours.
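This PR fixes a critical bug: when the eager consistent op interpreter is called recursively, it no longer sets the consistent id, because the call at that point is not a logical op.
This PR also adds a stronger check: boxing is forbidden when the eager consistent op interpreter is called recursively.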
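The second point can be illustrated with a thread-local depth counter that rejects boxing inside nested calls. This is a rough sketch under stated assumptions: `MutRecursiveDepth`, `DepthScope`, `BoxingAllowed`, and `Interpret` are hypothetical names, and the actual interpreter and its error handling are not shown in this thread.

```cpp
#include <cassert>
#include <functional>

// Hypothetical per-thread recursion depth of the interpreter.
inline int* MutRecursiveDepth() {
  thread_local int depth = 0;
  return &depth;
}

// Hypothetical scope object that tracks entry to and exit from the interpreter.
struct DepthScope {
  DepthScope() { ++*MutRecursiveDepth(); }
  ~DepthScope() {
    assert(*MutRecursiveDepth() > 0);
    --*MutRecursiveDepth();
  }
};

// Boxing is permitted only in the outermost call (depth 1).
inline bool BoxingAllowed() { return *MutRecursiveDepth() <= 1; }

// Toy interpreter: returns false if boxing is requested inside a nested call.
// `inner` models an optional recursive call made while handling this op.
inline bool Interpret(bool needs_boxing, const std::function<bool()>& inner = nullptr) {
  DepthScope scope;
  if (needs_boxing && !BoxingAllowed()) { return false; }
  if (inner) { return inner(); }
  return true;
}
```

In this sketch a top-level call may box, a nested call that needs no boxing succeeds, and a nested call that requests boxing is rejected. The same depth counter could also decide whether to set the consistent id, which the first point restricts to the outermost call.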