add flow.randn #5736

Merged: 17 commits, Aug 9, 2021

Conversation

VertexC (Contributor) commented Aug 5, 2021

oneflow/core/functional/impl/random_functor.cpp (outdated, resolved)
oneflow/core/functional/impl/random_functor.cpp (outdated, resolved)
oneflow/user/kernels/normal_generator.cpp (outdated, resolved)
oneflow/user/kernels/normal_generator.h (outdated, resolved)
oneflow/user/kernels/normal_generator.h (outdated, resolved)
oneflow/user/kernels/normal_generator.h (outdated, resolved)
oneflow/user/ops/random_op.cpp (outdated, resolved)
oneflow/user/ops/random_op.cpp (outdated, resolved)
super().__init__()
# TODO: make shape process as a util
assert size is not None, "shape must not be None!"
assert isinstance(
Contributor

These checks are actually unnecessary: the functional interface raises an error directly when the types do not match.

Contributor Author

Wouldn't the error raised by this kind of check be a bit friendlier to the user?
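For illustration, the kind of up-front check being debated, together with the "shape process as a util" TODO in the snippet above, could look roughly like the sketch below. The helper name `normalize_size` and its exact rules are hypothetical and are not taken from the PR; the only detail borrowed from the diff is the "shape must not be None!" message.

```python
from typing import Sequence, Tuple, Union


def normalize_size(size: Union[int, Sequence[int], None]) -> Tuple[int, ...]:
    """Hypothetical helper: validate `size` and return it as a tuple of ints."""
    if size is None:
        raise ValueError("shape must not be None!")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(size, int) and not isinstance(size, bool):
        size = (size,)
    if not isinstance(size, (tuple, list)):
        raise TypeError(
            f"shape must be an int, tuple, or list, got {type(size).__name__}"
        )
    for dim in size:
        if isinstance(dim, bool) or not isinstance(dim, int):
            raise TypeError(
                f"shape entries must be ints, got {type(dim).__name__}"
            )
        if dim < 0:
            raise ValueError(f"shape entries must be non-negative, got {dim}")
    return tuple(size)
```

The trade-off raised in the thread still applies: a check like this duplicates validation that a lower layer may already perform, in exchange for an error message phrased in terms of the Python argument.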

python/oneflow/nn/modules/random_ops.py (outdated, resolved)
python/oneflow/nn/modules/random_ops.py (outdated, resolved)
python/oneflow/nn/modules/random_ops.py (outdated, resolved)
VertexC (Contributor Author) commented Aug 8, 2021

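> This op is stateful, so it still needs to be wrapped in a module that holds the state.

The random op is stateful because it has a generator. But is there actually a need to create a Randn module and then create tensors from that module, as below?

m = flow.Randn(...)
tensor_1 = m()
tensor_2 = m()

If not, exposing only a flow.randn interface should be enough, and no module is needed.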
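The two API shapes under discussion can be contrasted with a small sketch. It uses only Python's standard `random` module and plain lists; the `Randn` class and `randn` function here are hypothetical stand-ins, not OneFlow's implementation.

```python
import random
from typing import List, Optional

# Assumed process-wide generator for the functional form.
_GLOBAL_GENERATOR = random.Random()


class Randn:
    """Hypothetical module form: the object owns its generator state."""

    def __init__(self, n: int, seed: Optional[int] = None):
        self.n = n
        self.generator = random.Random(seed)

    def __call__(self) -> List[float]:
        # Each call advances the generator held by this instance.
        return [self.generator.gauss(0.0, 1.0) for _ in range(self.n)]


def randn(n: int, generator: Optional[random.Random] = None) -> List[float]:
    """Hypothetical functional form: state lives in the generator passed in,
    or in the global generator when none is given."""
    gen = generator if generator is not None else _GLOBAL_GENERATOR
    return [gen.gauss(0.0, 1.0) for _ in range(n)]
```

In this sketch the two forms produce the same sequence when given the same seed; the difference is only where the generator state is kept.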

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 8, 2021 14:16
hjchen2 (Contributor) commented Aug 9, 2021

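> > This op is stateful, so it still needs to be wrapped in a module that holds the state.
>
> The random op is stateful because it has a generator. But is there actually a need to create a Randn module and then create tensors from that module, as below?
>
> m = flow.Randn(...)
> tensor_1 = m()
> tensor_2 = m()
>
> If not, exposing only a flow.randn interface should be enough, and no module is needed.

If we align fully with PyTorch here, no module is needed. When this interface is called without a generator, there are two ways to handle it: 1. Use the global generator, but then sbp definitely cannot support B (broadcast). 2. Create a new generator directly. Does the new generator's seed change on every call? If it does not, every call produces the same random values. If it does, how do we guarantee that the seed is identical across multi-client processes when sbp is B?

As for the repeated execution of a Randn module that you mention above, that need arises naturally. For example, if I build a network with a Randn module and then iterate many times, won't the module be executed many times?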
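One possible way to reason about the seed question is sketched below. This is not the approach the PR takes: it is an assumed scheme in which every process is given the same base seed and counts its own calls, so the per-call seed changes between calls but stays identical across processes. The names `derive_seed` and `ReplicatedRandn` are hypothetical, and the "ranks" are simulated as plain objects in one process.

```python
import random
from typing import List


def derive_seed(base_seed: int, call_index: int) -> int:
    """Deterministically derive a per-call seed (assumed mixing scheme)."""
    return (base_seed * 1_000_003 + call_index) % (2 ** 63)


class ReplicatedRandn:
    """Hypothetical module: each rank holds the same base seed and counts
    its own calls, so no communication is needed to agree on the seed."""

    def __init__(self, n: int, base_seed: int):
        self.n = n
        self.base_seed = base_seed
        self.calls = 0

    def __call__(self) -> List[float]:
        gen = random.Random(derive_seed(self.base_seed, self.calls))
        self.calls += 1
        return [gen.gauss(0.0, 1.0) for _ in range(self.n)]


# Pitfall from the comment above: a fresh generator with a fixed seed
# returns the same values on every call.
same_a = random.Random(7).gauss(0.0, 1.0)
same_b = random.Random(7).gauss(0.0, 1.0)

# Two simulated ranks with the same base seed stay in step across iterations.
rank0 = ReplicatedRandn(n=3, base_seed=42)
rank1 = ReplicatedRandn(n=3, base_seed=42)
rank0_steps = [rank0() for _ in range(3)]
rank1_steps = [rank1() for _ in range(3)]
```

The sketch relies on every rank calling the module the same number of times in the same order; if that assumption does not hold, the ranks would drift apart.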

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 9, 2021 09:39
@oneflow-ci-bot oneflow-ci-bot self-requested a review August 9, 2021 09:39
github-actions bot commented Aug 9, 2021

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 136.9ms (= 6845.3ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 125.8ms (= 6288.5ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.09 (= 136.9ms / 125.8ms)

PyTorch resnet50 time: 81.1ms (= 4055.2ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 73.0ms (= 3649.4ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 81.1ms / 73.0ms)

PyTorch resnet50 time: 58.7ms (= 2935.1ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 50.7ms (= 2534.9ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.16 (= 58.7ms / 50.7ms)

PyTorch resnet50 time: 49.7ms (= 2484.8ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 40.5ms (= 2027.0ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.23 (= 49.7ms / 40.5ms)

PyTorch resnet50 time: 43.8ms (= 2187.8ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.6ms (= 2078.2ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.05 (= 43.8ms / 41.6ms)
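The relative-speed figures can be recomputed from the per-iteration timings reported above. The sketch below assumes the ratio is PyTorch time divided by OneFlow time, rounded to two decimals, as the "(= ... / ...)" annotations indicate; the variable and function names are illustrative only.

```python
# (PyTorch ms, OneFlow ms) per batch size, copied from the stats above.
timings = {
    16: (136.9, 125.8),
    8: (81.1, 73.0),
    4: (58.7, 50.7),
    2: (49.7, 40.5),
    1: (43.8, 41.6),
}


def relative_speed(pytorch_ms: float, oneflow_ms: float) -> float:
    """Ratio above 1.0 means OneFlow took less time per iteration."""
    return round(pytorch_ms / oneflow_ms, 2)


speeds = {batch: relative_speed(*pair) for batch, pair in timings.items()}

for batch, speed in speeds.items():
    print(f"batch {batch}: relative speed {speed}")
```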

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 9, 2021 12:11
@oneflow-ci-bot oneflow-ci-bot merged commit 8eefb6c into master Aug 9, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the dev_flow_randn branch August 9, 2021 12:12