Dev fused functional #5954

Merged
merged 18 commits into from
Aug 24, 2021

Conversation

Contributor

@MARD1NO MARD1NO commented Aug 19, 2021

Add fused_bias_add_gelu and fused_bias_add_dropout.

Eager mode and static graph mode are handled differently. The OptionalInput backward optimization for fused_bias_add_dropout is not supported yet.
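
For readers unfamiliar with these ops, here is a minimal pure-Python sketch of the unfused computations that fused_bias_add_gelu and fused_bias_add_dropout are presumably meant to match. It is not the code from this PR. The function names, the 2-D `[N, C]` input layout, adding the bias along the last axis, the erf-based GELU, and inverted-dropout scaling are all assumptions made for illustration.

```python
import math
import random


def gelu(x):
    """Exact (erf-based) GELU: x * Phi(x). Assumed variant, not confirmed by the PR."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def bias_add_gelu_reference(rows, bias):
    """Hypothetical unfused reference: add bias per column, then apply GELU."""
    return [[gelu(v + b) for v, b in zip(row, bias)] for row in rows]


def bias_add_dropout_reference(rows, bias, p, rng):
    """Hypothetical unfused reference: add bias per column, then inverted dropout.

    Each element is zeroed with probability p; survivors are scaled by 1 / (1 - p).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [
        [(v + b) * scale if rng.random() >= p else 0.0 for v, b in zip(row, bias)]
        for row in rows
    ]


if __name__ == "__main__":
    x = [[0.0, 1.0], [2.0, -1.0]]
    bias = [0.5, -0.5]
    print(bias_add_gelu_reference(x, bias))
    print(bias_add_dropout_reference(x, bias, 0.5, random.Random(0)))
```

A fused kernel would compute the same result in a single pass over the data instead of materializing the intermediate `x + bias` tensor, which is the usual motivation for fusing elementwise ops like these.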

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 23, 2021 09:50
@oneflow-ci-bot oneflow-ci-bot self-requested a review August 23, 2021 09:50
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 23, 2021 10:01
@github-actions
Contributor

CI failed, removing label automerge

@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 140.2ms (= 7009.6ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 128.3ms (= 6416.8ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.09 (= 140.2ms / 128.3ms)

PyTorch resnet50 time: 82.9ms (= 4145.0ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 76.4ms (= 3819.1ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.09 (= 82.9ms / 76.4ms)

PyTorch resnet50 time: 60.4ms (= 3018.3ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.5ms (= 2425.2ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.24 (= 60.4ms / 48.5ms)

PyTorch resnet50 time: 50.3ms (= 2515.4ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 43.6ms (= 2182.4ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.15 (= 50.3ms / 43.6ms)

PyTorch resnet50 time: 42.9ms (= 2146.5ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 37.0ms (= 1852.1ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.16 (= 42.9ms / 37.0ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 23, 2021 11:32
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 141.2ms (= 7059.9ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 128.9ms (= 6443.6ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 141.2ms / 128.9ms)

PyTorch resnet50 time: 85.2ms (= 4262.0ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 75.2ms (= 3761.5ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.13 (= 85.2ms / 75.2ms)

PyTorch resnet50 time: 58.5ms (= 2926.5ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.1ms (= 2406.8ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.22 (= 58.5ms / 48.1ms)

PyTorch resnet50 time: 49.7ms (= 2486.6ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.2ms (= 2059.1ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.21 (= 49.7ms / 41.2ms)

PyTorch resnet50 time: 43.8ms (= 2190.0ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 38.6ms (= 1931.8ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.13 (= 43.8ms / 38.6ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 24, 2021 01:00
@oneflow-ci-bot oneflow-ci-bot merged commit a4532c6 into master Aug 24, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the dev_fused_functional branch August 24, 2021 01:03
5 participants