scalar_reverse_pow op/kernel #7729

Merged
merged 33 commits into from
Mar 11, 2022

Flowingsun007
Contributor

@Flowingsun007 Flowingsun007 commented Mar 8, 2022

Fixes the bug reported in issue #7722. This PR adds:

  • implementation of "scalar_reverse_pow" op/kernel
  • test case
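As a rough illustration of what such an op computes, here is a minimal pure-Python sketch. It assumes, from the op name only, that "scalar_reverse_pow" means `scalar ** x` applied elementwise (the reverse of `x ** scalar`). The function names and the list-based interface are hypothetical and are not OneFlow's actual kernel API.

```python
import math


def scalar_reverse_pow(scalar, xs):
    """Forward pass (assumed semantics): y_i = scalar ** x_i."""
    return [scalar ** x for x in xs]


def scalar_reverse_pow_grad(scalar, xs, dys):
    """Backward pass: d(scalar ** x)/dx = scalar ** x * ln(scalar).

    Only valid for scalar > 0, since ln(scalar) is otherwise undefined.
    """
    log_s = math.log(scalar)
    return [dy * (scalar ** x) * log_s for x, dy in zip(xs, dys)]


if __name__ == "__main__":
    xs = [0.0, 1.0, 3.0]
    print(scalar_reverse_pow(2.0, xs))
    print(scalar_reverse_pow_grad(2.0, xs, [1.0, 1.0, 1.0]))
```

A real kernel would also have to decide how to handle a zero or negative scalar, which this sketch deliberately leaves out.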

@Flowingsun007 Flowingsun007 marked this pull request as ready for review March 9, 2022 09:01
@Flowingsun007 Flowingsun007 changed the title from "scalar_tensor_pow forward" to "scalar_tensor_pow op/kernel" Mar 9, 2022
@Flowingsun007
Contributor Author

This fixes the problem described in #7722. @BBuf @simonJJJ, could you help review it?

@github-actions
Contributor

github-actions bot commented Mar 9, 2022

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot March 9, 2022 09:31
@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot March 11, 2022 03:07
@github-actions
Contributor

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot March 11, 2022 05:31
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

✔️ OneFlow resnet50 time: 128.6ms (= 12855.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 141.9ms (= 14191.5ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.10 (= 141.9ms / 128.6ms)

❌ OneFlow resnet50 time: 79.7ms (= 7972.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 87.6ms (= 8757.5ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.10 (= 87.6ms / 79.7ms)

OneFlow resnet50 time: 53.5ms (= 10701.8ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.7ms (= 12147.2ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.14 (= 60.7ms / 53.5ms)

OneFlow resnet50 time: 44.9ms (= 8971.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 52.4ms (= 10479.8ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.17 (= 52.4ms / 44.9ms)

OneFlow resnet50 time: 38.9ms (= 7782.1ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 40.9ms (= 8178.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.05 (= 40.9ms / 38.9ms)

✔️ OneFlow resnet50 time: 142.6ms (= 14258.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.6ms (= 16159.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 161.6ms / 142.6ms)

OneFlow resnet50 time: 86.6ms (= 8657.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.0ms (= 10402.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 104.0ms / 86.6ms)

OneFlow resnet50 time: 62.3ms (= 12458.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.2ms (= 15449.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.24 (= 77.2ms / 62.3ms)

OneFlow resnet50 time: 52.4ms (= 10478.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.2ms (= 13833.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 69.2ms / 52.4ms)

OneFlow resnet50 time: 50.7ms (= 10133.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 62.5ms (= 12499.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.23 (= 62.5ms / 50.7ms)
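The figures in the report above follow a simple pattern: each per-iteration time is the total time divided by the iteration count, and the relative speed is the PyTorch per-iteration time divided by OneFlow's. The sketch below reproduces that arithmetic for the first entry. The helper names are hypothetical and are not taken from the CI script.

```python
def per_iter_ms(total_ms, iterations):
    """Average time per iteration, in milliseconds."""
    return total_ms / iterations


def relative_speed(pytorch_ms, oneflow_ms):
    """Values above 1.0 mean OneFlow was faster than PyTorch."""
    return pytorch_ms / oneflow_ms


if __name__ == "__main__":
    # First entry of the report: input_shape=[16, 3, 224, 224]
    oneflow = round(per_iter_ms(12855.3, 100), 1)
    pytorch = round(per_iter_ms(14191.5, 100), 1)
    print(f"OneFlow {oneflow}ms, PyTorch {pytorch}ms, "
          f"relative speed {relative_speed(pytorch, oneflow):.2f}")
```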

@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot March 11, 2022 14:36
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

✔️ OneFlow resnet50 time: 128.7ms (= 12874.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 146.9ms (= 14690.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.14 (= 146.9ms / 128.7ms)

❌ OneFlow resnet50 time: 80.4ms (= 8035.0ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 85.2ms (= 8518.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 85.2ms / 80.4ms)

OneFlow resnet50 time: 53.3ms (= 10661.9ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 61.1ms (= 12212.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.15 (= 61.1ms / 53.3ms)

OneFlow resnet50 time: 43.1ms (= 8612.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 47.9ms (= 9579.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.11 (= 47.9ms / 43.1ms)

OneFlow resnet50 time: 39.6ms (= 7926.6ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 41.6ms (= 8320.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.05 (= 41.6ms / 39.6ms)

✔️ OneFlow resnet50 time: 142.7ms (= 14265.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 164.5ms (= 16449.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 164.5ms / 142.7ms)

OneFlow resnet50 time: 90.8ms (= 9075.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.8ms (= 10284.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 102.8ms / 90.8ms)

OneFlow resnet50 time: 62.9ms (= 12571.6ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.2ms (= 15441.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.23 (= 77.2ms / 62.9ms)

OneFlow resnet50 time: 53.0ms (= 10600.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.7ms (= 13531.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.28 (= 67.7ms / 53.0ms)

OneFlow resnet50 time: 47.9ms (= 9583.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 61.9ms (= 12385.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 61.9ms / 47.9ms)

@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot March 11, 2022 15:54
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

✔️ OneFlow resnet50 time: 128.8ms (= 12875.6ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 140.2ms (= 14022.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.09 (= 140.2ms / 128.8ms)

✔️ OneFlow resnet50 time: 78.3ms (= 7828.7ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 88.6ms (= 8860.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 88.6ms / 78.3ms)

OneFlow resnet50 time: 53.8ms (= 10764.5ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.9ms (= 12181.2ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.13 (= 60.9ms / 53.8ms)

OneFlow resnet50 time: 44.7ms (= 8949.0ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 47.8ms (= 9553.0ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.07 (= 47.8ms / 44.7ms)

OneFlow resnet50 time: 36.0ms (= 7197.7ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 41.3ms (= 8266.7ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.15 (= 41.3ms / 36.0ms)

✔️ OneFlow resnet50 time: 143.1ms (= 14309.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 157.0ms (= 15696.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.10 (= 157.0ms / 143.1ms)

OneFlow resnet50 time: 87.8ms (= 8780.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.9ms (= 10293.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 102.9ms / 87.8ms)

OneFlow resnet50 time: 61.7ms (= 12336.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.7ms (= 15348.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.24 (= 76.7ms / 61.7ms)

OneFlow resnet50 time: 52.1ms (= 10419.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.4ms (= 13487.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 67.4ms / 52.1ms)

OneFlow resnet50 time: 48.0ms (= 9599.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 61.1ms (= 12224.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 61.1ms / 48.0ms)

@Flowingsun007 Flowingsun007 merged commit bada333 into master Mar 11, 2022
@Flowingsun007 Flowingsun007 deleted the fix_scalar_pow branch March 11, 2022 16:51
marigoold pushed a commit that referenced this pull request Mar 15, 2022
* scalar_tensor_pow forward

* add test case

* scalar_tensor_pow backward

* refine
wyg1997 pushed a commit that referenced this pull request Mar 17, 2022
* scalar_tensor_pow forward

* add test case

* scalar_tensor_pow backward

* refine
4 participants