
rebase #5601

Merged
oneflow-ci-bot merged 48 commits into master from deduce_consistent_op_interpreter on Jul 30, 2021

Conversation

@lixinqi (Contributor) commented Jul 26, 2021

No description provided.


Maybe<void> CheckIsDeviceSupportedByOp(const ParallelDesc& parallel_desc,
                                       const std::string& op_type_name) {
  if (IsCpuOnly(op_type_name)) { CHECK_EQ_OR_RETURN(parallel_desc.device_tag(), "cpu"); }
Contributor:

Other devices (besides cpu and cuda) may be supported in the future, so this probably shouldn't check only for cpu.

Contributor Author:

We'll deal with other devices when support for them actually lands. Even if this check misses a case for now, nothing breaks.
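A minimal sketch of the more general check the reviewer has in mind, assuming a hypothetical registry query IsDeviceSupported (not an existing helper); the Maybe/CHECK_OR_RETURN machinery follows the snippet above:

Maybe<void> CheckIsDeviceSupportedByOp(const ParallelDesc& parallel_desc,
                                       const std::string& op_type_name) {
  // Ask the op registry which device tags this op supports instead of
  // special-casing "cpu". IsDeviceSupported is assumed, not a real API.
  CHECK_OR_RETURN(IsDeviceSupported(op_type_name, parallel_desc.device_tag()))
      << op_type_name << " does not support device " << parallel_desc.device_tag();
  return Maybe<void>::Ok();
}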

{
  // Infer OpArgMutConsistentTensorMeta.
  const auto& GetInputTensorMeta = [](int32_t i) {
    UNIMPLEMENTED();
@hjchen2 (Contributor) commented Jul 26, 2021:

InferLogicalShapeAndDType never calls this lambda; is that why it's a bare UNIMPLEMENTED() here?

Contributor Author:

Yes. This is a source op.
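For context, a source op has no inputs, so an input-indexed callback can never be reached. Schematically (the loop and names here are illustrative, not the PR's code):

// For a source op, input_size() == 0, so this loop body never runs and
// GetInputTensorMeta(i) is never invoked; UNIMPLEMENTED() is safe.
for (int32_t i = 0; i < user_op_expr.input_size(); ++i) {
  infer_ctx.SetInputTensorMeta(i, GetInputTensorMeta(i));
}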

@@ -37,7 +37,10 @@ namespace one {

namespace {

Maybe<Symbol<Device>> GetDefaultDevice() { return Device::New("cpu", 0); }
Maybe<Symbol<Device>> GetDefaultDevice(const OpExprInterpContext& ctx) {
  if (ctx.device.has_value()) { return ctx.device.value(); }
Contributor:

A global default_device_symbol_ would probably be better here. Also, calling flow.device("type:index") on the Python side shouldn't create a new device every time; it should return a singleton, since I've found flow.device("type:index") to be fairly expensive in Python.

Contributor Author:

Resolved.
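A sketch of the singleton idea under discussion: cache Device symbols by (type, index) so repeated lookups reuse one instance instead of constructing a new device each time. GetOrCreateDevice is an illustrative name, and Device::New is used as in the snippet above:

Maybe<Symbol<Device>> GetOrCreateDevice(const std::string& type, int64_t index) {
  // One cache per thread avoids locking; entries live for the thread's lifetime.
  static thread_local std::map<std::pair<std::string, int64_t>, Symbol<Device>> cache;
  const auto key = std::make_pair(type, index);
  auto it = cache.find(key);
  if (it == cache.end()) { it = cache.emplace(key, JUST(Device::New(type, index))).first; }
  return it->second;
}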

@@ -86,7 +93,9 @@ class TensorInfo final {
 private:
  std::shared_ptr<const Shape> shape_;
  DataType dtype_;
  // TODO: Add device info
  Maybe<Symbol<Device>> device_;  // for local tensor
Contributor:

This could be changed to Optional as well.

Contributor Author:

Resolved.
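For illustration, the member after the suggested change. Optional expresses "value may be absent" while Maybe is meant for error-or-value returns, which is why the reviewer prefers it here; the trailing comment is an assumption:

  std::shared_ptr<const Shape> shape_;
  DataType dtype_;
  Optional<Symbol<Device>> device_;  // set for local tensors only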

@@ -264,7 +264,11 @@
bind_python: True

- name: "constant"
signature: "Tensor Constant(*, Shape shape, Scalar value, DataType dtype)"
signature: "Tensor Constant(*, Shape shape, Scalar value, DataType dtype, Int64 device)"
Contributor:

device can default to None; following PyTorch's semantics, a None device should mean cpu:

signature: "Tensor Constant(*, Shape shape, Scalar value, DataType dtype, Int64 device=None)"

On the functor side it can be received as an Optional<Int64>.

Contributor Author:

Sounds good. I was actually having trouble handling the None case.

Contributor Author:

Resolved.
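A standalone sketch of the Optional-with-default pattern suggested above, with std::optional standing in for the framework's Optional<Int64> (ResolveDeviceId and the default id of 0 are illustrative):

#include <cstdint>
#include <iostream>
#include <optional>

// An absent device falls back to a default, mirroring "None means cpu".
int64_t ResolveDeviceId(const std::optional<int64_t>& device) {
  return device.has_value() ? *device : /*default cpu id*/ 0;
}

int main() {
  std::cout << ResolveDeviceId(std::nullopt) << "\n";  // prints 0 (default cpu)
  std::cout << ResolveDeviceId(3) << "\n";             // prints 3 (explicit device)
  return 0;
}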

}
const auto& parallel_distribution = JUST(MakeParallelDistribution(sbp_tuple));
if (!JUST(*Global<Maybe<bool>, MultiClient>::Get())) {
  JUST(attrs.SetAttr<std::string>("nd_sbp", parallel_distribution->DebugString()));
Contributor:

Using DebugString here feels a bit odd.

Contributor Author:

That's only because user_op_attr doesn't support the sbp types natively. If serialization is the only option, I'd rather serialize to text than to binary: the storage overhead is negligible, and readability matters far more.
I'll try changing this to PbMessage2TxtString.

Contributor:

Yes, PbMessage2TxtString works.

Contributor Author:

Resolved.
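For context, a PbMessage2TxtString-style helper boils down to protobuf's text format. A generic sketch (MessageToTxt is an illustrative name, not OneFlow's actual helper):

#include <string>
#include <google/protobuf/message.h>
#include <google/protobuf/text_format.h>

// Serialize any protobuf message to the human-readable text format
// instead of the binary wire format, trading a few bytes for readability.
std::string MessageToTxt(const google::protobuf::Message& msg) {
  std::string out;
  google::protobuf::TextFormat::PrintToString(msg, &out);
  return out;
}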

@hjchen2 hjchen2 self-requested a review July 26, 2021 13:27
@hjchen2 hjchen2 removed the automerge label Jul 26, 2021
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 29, 2021 12:23
@oneflow-ci-bot oneflow-ci-bot self-requested a review July 29, 2021 12:23
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 29, 2021 15:20
@oneflow-ci-bot oneflow-ci-bot self-requested a review July 29, 2021 17:08
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 29, 2021 18:09
@oneflow-ci-bot oneflow-ci-bot self-requested a review July 30, 2021 06:28
Comment on lines +32 to +35
placement: flow.placement = None,
sbp: Union[
flow._oneflow_internal.sbp.sbp, List[flow._oneflow_internal.sbp.sbp]
] = None,
Contributor Author:

These configure placement and sbp.

def test_consistent_naive(test_case):
    placement = flow.placement("cpu", {0: [0]})
    sbp = (flow.sbp.broadcast,)
    x = flow.ones((16, 16), placement=placement, sbp=sbp)
Contributor Author:

An example.

@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 30, 2021 06:53
@github-actions (Contributor) commented:

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 140.3ms (= 7013.5ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 126.7ms (= 6333.8ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 140.3ms / 126.7ms)

PyTorch resnet50 time: 81.2ms (= 4061.9ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.0ms (= 3701.4ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 81.2ms / 74.0ms)

PyTorch resnet50 time: 54.4ms (= 2718.4ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 47.5ms (= 2374.8ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.14 (= 54.4ms / 47.5ms)

PyTorch resnet50 time: 47.7ms (= 2386.0ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 43.4ms (= 2169.7ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 47.7ms / 43.4ms)

PyTorch resnet50 time: 41.3ms (= 2063.6ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 42.4ms (= 2119.6ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 0.97 (= 41.3ms / 42.4ms)

@oneflow-ci-bot oneflow-ci-bot merged commit bf4bdd6 into master Jul 30, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the deduce_consistent_op_interpreter branch July 30, 2021 09:04
Maybe<Symbol<cfg::ParallelDistribution>> MakeParallelDistribution(
    const std::vector<Symbol<cfg::SbpParallel>>& sbp_tuple) const {
  static thread_local std::map<std::vector<Symbol<cfg::SbpParallel>>,
                               Symbol<cfg::ParallelDistribution>>
@chengtbf (Contributor) commented Aug 3, 2021:

Shouldn't this function be extracted into a shared helper? It's needed by more than just this Constant functor. @lixinqi @hjchen2
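A sketch of that extraction: hoist the thread_local cache out of the functor into a free function any functor can call. GetOrCreateParallelDistribution and BuildParallelDistribution are placeholder names for illustration:

Maybe<Symbol<cfg::ParallelDistribution>> GetOrCreateParallelDistribution(
    const std::vector<Symbol<cfg::SbpParallel>>& sbp_tuple) {
  static thread_local std::map<std::vector<Symbol<cfg::SbpParallel>>,
                               Symbol<cfg::ParallelDistribution>>
      cache;
  auto it = cache.find(sbp_tuple);
  if (it == cache.end()) {
    // BuildParallelDistribution stands in for whatever the functor
    // currently does to assemble the proto from sbp_tuple.
    it = cache.emplace(sbp_tuple, JUST(BuildParallelDistribution(sbp_tuple))).first;
  }
  return it->second;
}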
