
flow._C.frac #9979

Merged
merged 28 commits into from
Mar 30, 2023

Conversation

ccssu
Contributor

@ccssu ccssu commented Mar 13, 2023

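  • Define the Op
  • Implement the Kernel computation logic
  • Implement the corresponding gradient function
  • Export the functional interface
  • Export _C.api at the Python layer
  • Write the documentation
  • Add the documentation to the rst files in docs
  • Add tests (accuracy and global tests)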
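
For readers unfamiliar with the op, here is a minimal sketch of what a frac operator and its gradient compute. It assumes the op follows the torch.frac convention, where the fractional part keeps the sign of the input. This is plain Python for illustration only, not the OneFlow kernel from this PR; the names `frac` and `frac_grad` are hypothetical.

```python
import math


def frac(x: float) -> float:
    """Fractional part of x, keeping the sign of x (assumed torch.frac convention)."""
    # math.trunc rounds toward zero, so the remainder has the same sign as x.
    return x - math.trunc(x)


def frac_grad(dy: float) -> float:
    """Gradient of frac with respect to x, given the upstream gradient dy.

    trunc(x) is piecewise constant, so d(frac)/dx is 1 away from integer
    points and the upstream gradient passes through unchanged.
    """
    return dy


if __name__ == "__main__":
    for value in (2.5, -2.5, 3.0):
        print(value, frac(value), frac_grad(1.0))
```

An accuracy test for the real op would compare its output against a reference like this over random inputs, including negative values.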

Reference documentation:

@CLAassistant

CLAassistant commented Mar 13, 2023

CLA assistant check
All committers have signed the CLA.

Contributor

@BBuf BBuf left a comment


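You can follow this PR and implement the op directly with the element_wise template, which is more efficient: #9958 (just ignore the higher-order derivative part of that PR).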

@ccssu
Contributor Author

ccssu commented Mar 13, 2023

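You can follow this PR and implement the op directly with the element_wise template, which is more efficient: #9958 (just ignore the higher-order derivative part of that PR).

OK, will do.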

@ccssu ccssu requested a review from daquexian as a code owner March 14, 2023 01:22
@ccssu ccssu requested a review from doombeaker as a code owner March 14, 2023 01:42
@ccssu
Contributor Author

ccssu commented Mar 16, 2023

(two images attached; not preserved)

Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
@BBuf BBuf removed the request for review from oneflow-ci-bot March 20, 2023 02:33
@github-actions
Contributor

Speed stats:

@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 140.9ms (= 14089.8ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.9ms (= 14387.3ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.02 (= 143.9ms / 140.9ms)

OneFlow resnet50 time: 80.5ms (= 8054.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.7ms (= 8369.8ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.04 (= 83.7ms / 80.5ms)

OneFlow resnet50 time: 49.1ms (= 9816.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.6ms (= 12119.7ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.23 (= 60.6ms / 49.1ms)

OneFlow resnet50 time: 33.5ms (= 6697.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 54.0ms (= 10807.4ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.61 (= 54.0ms / 33.5ms)

OneFlow resnet50 time: 25.0ms (= 5009.3ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.1ms (= 8413.8ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.68 (= 42.1ms / 25.0ms)

OneFlow swin dataloader time: 0.244s (= 48.861s / 200, num_workers=1)
PyTorch swin dataloader time: 0.148s (= 29.657s / 200, num_workers=1)
Relative speed: 0.607 (= 0.148s / 0.244s)

OneFlow swin dataloader time: 0.070s (= 13.933s / 200, num_workers=4)
PyTorch swin dataloader time: 0.046s (= 9.136s / 200, num_workers=4)
Relative speed: 0.656 (= 0.046s / 0.070s)

OneFlow swin dataloader time: 0.043s (= 8.604s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.340s / 200, num_workers=8)
Relative speed: 0.504 (= 0.022s / 0.043s)

❌ OneFlow resnet50 time: 152.4ms (= 15242.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.2ms (= 16123.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.06 (= 161.2ms / 152.4ms)

OneFlow resnet50 time: 90.9ms (= 9091.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.6ms (= 10257.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 102.6ms / 90.9ms)

OneFlow resnet50 time: 58.8ms (= 11760.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 80.4ms (= 16075.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 80.4ms / 58.8ms)

OneFlow resnet50 time: 41.9ms (= 8376.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 74.7ms (= 14937.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.78 (= 74.7ms / 41.9ms)

OneFlow resnet50 time: 36.5ms (= 7299.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.2ms (= 14045.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.92 (= 70.2ms / 36.5ms)
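
The figures in these reports can be reproduced from the raw totals: each per-iteration time is the total time divided by the iteration count, and the relative speed is the PyTorch time divided by the OneFlow time, so values above 1 mean OneFlow was faster. A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not the CI script itself.

```python
def per_iteration_ms(total_ms: float, iterations: int) -> float:
    """Average time per iteration in milliseconds."""
    return total_ms / iterations


def relative_speed(oneflow_total_ms: float, pytorch_total_ms: float, iterations: int) -> float:
    """PyTorch time divided by OneFlow time; > 1 means OneFlow is faster."""
    oneflow_ms = per_iteration_ms(oneflow_total_ms, iterations)
    pytorch_ms = per_iteration_ms(pytorch_total_ms, iterations)
    return pytorch_ms / oneflow_ms


if __name__ == "__main__":
    # Totals taken from the first resnet50 row above (batch size 16, 100 iterations).
    print(round(per_iteration_ms(14089.8, 100), 1))
    print(round(relative_speed(14089.8, 14387.3, 100), 2))
```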

@github-actions
Contributor

CI failed when running job: cpu-misc. The PR label automerge has been removed.

@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.1ms (= 14105.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.5ms (= 14351.3ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.02 (= 143.5ms / 141.1ms)

OneFlow resnet50 time: 80.6ms (= 8060.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.3ms (= 8330.2ms / 100, input_shape=[8, 3, 224, 224])
❌ Relative speed: 1.03 (= 83.3ms / 80.6ms)

OneFlow resnet50 time: 49.2ms (= 9847.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 55.6ms (= 11118.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.13 (= 55.6ms / 49.2ms)

OneFlow resnet50 time: 32.1ms (= 6426.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 45.7ms (= 9142.7ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.42 (= 45.7ms / 32.1ms)

OneFlow resnet50 time: 25.5ms (= 5092.6ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 37.0ms (= 7400.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.45 (= 37.0ms / 25.5ms)

OneFlow swin dataloader time: 0.235s (= 47.096s / 200, num_workers=1)
PyTorch swin dataloader time: 0.153s (= 30.647s / 200, num_workers=1)
Relative speed: 0.651 (= 0.153s / 0.235s)

OneFlow swin dataloader time: 0.066s (= 13.146s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.223s / 200, num_workers=4)
Relative speed: 0.626 (= 0.041s / 0.066s)

OneFlow swin dataloader time: 0.038s (= 7.615s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.350s / 200, num_workers=8)
Relative speed: 0.571 (= 0.022s / 0.038s)

❌ OneFlow resnet50 time: 152.3ms (= 15230.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.5ms (= 16154.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.06 (= 161.5ms / 152.3ms)

OneFlow resnet50 time: 91.0ms (= 9097.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.6ms (= 10262.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 102.6ms / 91.0ms)

OneFlow resnet50 time: 59.3ms (= 11850.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15642.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 78.2ms / 59.3ms)

OneFlow resnet50 time: 42.4ms (= 8474.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.5ms (= 14095.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.66 (= 70.5ms / 42.4ms)

OneFlow resnet50 time: 36.5ms (= 7307.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.9ms (= 13772.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.88 (= 68.9ms / 36.5ms)

@ccssu ccssu requested review from oneflow-ci-bot and removed request for oneflow-ci-bot March 23, 2023 09:16
@github-actions
Contributor

Speed stats:

@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.0ms (= 14101.9ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.2ms (= 14516.8ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.03 (= 145.2ms / 141.0ms)

OneFlow resnet50 time: 80.7ms (= 8066.0ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.7ms (= 8368.0ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.04 (= 83.7ms / 80.7ms)

OneFlow resnet50 time: 49.6ms (= 9927.5ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.7ms (= 11536.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.16 (= 57.7ms / 49.6ms)

OneFlow resnet50 time: 32.8ms (= 6566.2ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 41.9ms (= 8384.7ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.28 (= 41.9ms / 32.8ms)

OneFlow resnet50 time: 24.8ms (= 4961.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 35.8ms (= 7167.6ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.44 (= 35.8ms / 24.8ms)

OneFlow swin dataloader time: 0.236s (= 47.187s / 200, num_workers=1)
PyTorch swin dataloader time: 0.149s (= 29.726s / 200, num_workers=1)
Relative speed: 0.630 (= 0.149s / 0.236s)

OneFlow swin dataloader time: 0.069s (= 13.839s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.367s / 200, num_workers=4)
Relative speed: 0.605 (= 0.042s / 0.069s)

OneFlow swin dataloader time: 0.039s (= 7.890s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.580s / 200, num_workers=8)
Relative speed: 0.580 (= 0.023s / 0.039s)

❌ OneFlow resnet50 time: 152.8ms (= 15284.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.2ms (= 16115.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.05 (= 161.2ms / 152.8ms)

OneFlow resnet50 time: 91.2ms (= 9116.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.8ms (= 10284.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 102.8ms / 91.2ms)

OneFlow resnet50 time: 59.0ms (= 11798.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.9ms (= 15784.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 78.9ms / 59.0ms)

OneFlow resnet50 time: 42.5ms (= 8506.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.8ms (= 15355.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.81 (= 76.8ms / 42.5ms)

OneFlow resnet50 time: 35.8ms (= 7167.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.6ms (= 14525.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 2.03 (= 72.6ms / 35.8ms)

@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.0ms (= 14096.6ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.8ms (= 14378.2ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.02 (= 143.8ms / 141.0ms)

OneFlow resnet50 time: 80.4ms (= 8043.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.4ms (= 8440.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.05 (= 84.4ms / 80.4ms)

OneFlow resnet50 time: 48.8ms (= 9766.8ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 56.4ms (= 11273.7ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.15 (= 56.4ms / 48.8ms)

OneFlow resnet50 time: 32.2ms (= 6436.3ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 47.6ms (= 9520.1ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.48 (= 47.6ms / 32.2ms)

OneFlow resnet50 time: 24.5ms (= 4906.4ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 40.6ms (= 8110.7ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.65 (= 40.6ms / 24.5ms)

OneFlow swin dataloader time: 0.257s (= 51.313s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.160s / 200, num_workers=1)
Relative speed: 0.588 (= 0.151s / 0.257s)

OneFlow swin dataloader time: 0.072s (= 14.334s / 200, num_workers=4)
PyTorch swin dataloader time: 0.044s (= 8.899s / 200, num_workers=4)
Relative speed: 0.621 (= 0.044s / 0.072s)

OneFlow swin dataloader time: 0.040s (= 8.056s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.547s / 200, num_workers=8)
Relative speed: 0.564 (= 0.023s / 0.040s)

❌ OneFlow resnet50 time: 152.8ms (= 15280.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.0ms (= 16099.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.05 (= 161.0ms / 152.8ms)

OneFlow resnet50 time: 91.0ms (= 9102.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.3ms (= 10329.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 103.3ms / 91.0ms)

OneFlow resnet50 time: 59.3ms (= 11857.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.1ms (= 15613.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 78.1ms / 59.3ms)

OneFlow resnet50 time: 41.9ms (= 8381.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 73.6ms (= 14712.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.76 (= 73.6ms / 41.9ms)

OneFlow resnet50 time: 36.6ms (= 7326.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.9ms (= 13770.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.88 (= 68.9ms / 36.6ms)

@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.2ms (= 14124.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 142.9ms (= 14293.2ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.01 (= 142.9ms / 141.2ms)

OneFlow resnet50 time: 81.8ms (= 8176.0ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 86.0ms (= 8600.9ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.05 (= 86.0ms / 81.8ms)

OneFlow resnet50 time: 50.8ms (= 10164.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 58.0ms (= 11594.6ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.14 (= 58.0ms / 50.8ms)

OneFlow resnet50 time: 33.8ms (= 6767.9ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 42.5ms (= 8500.4ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.26 (= 42.5ms / 33.8ms)

OneFlow resnet50 time: 27.4ms (= 5483.9ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 39.9ms (= 7973.7ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.45 (= 39.9ms / 27.4ms)

OneFlow swin dataloader time: 0.239s (= 47.883s / 200, num_workers=1)
PyTorch swin dataloader time: 0.147s (= 29.334s / 200, num_workers=1)
Relative speed: 0.613 (= 0.147s / 0.239s)

OneFlow swin dataloader time: 0.071s (= 14.218s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.149s / 200, num_workers=4)
Relative speed: 0.573 (= 0.041s / 0.071s)

OneFlow swin dataloader time: 0.043s (= 8.694s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.522s / 200, num_workers=8)
Relative speed: 0.520 (= 0.023s / 0.043s)

❌ OneFlow resnet50 time: 153.1ms (= 15313.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.7ms (= 16373.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 163.7ms / 153.1ms)

OneFlow resnet50 time: 92.6ms (= 9263.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.8ms (= 10382.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 103.8ms / 92.6ms)

OneFlow resnet50 time: 61.1ms (= 12227.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15646.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.28 (= 78.2ms / 61.1ms)

OneFlow resnet50 time: 43.0ms (= 8595.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.2ms (= 14038.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 70.2ms / 43.0ms)

OneFlow resnet50 time: 36.6ms (= 7323.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.7ms (= 13549.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.85 (= 67.7ms / 36.6ms)

@ccssu ccssu enabled auto-merge (squash) March 30, 2023 07:36
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.2ms (= 14124.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.1ms (= 14508.4ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.03 (= 145.1ms / 141.2ms)

OneFlow resnet50 time: 83.3ms (= 8325.5ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 86.4ms (= 8644.0ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.04 (= 86.4ms / 83.3ms)

OneFlow resnet50 time: 51.6ms (= 10320.9ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 58.0ms (= 11608.7ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.12 (= 58.0ms / 51.6ms)

OneFlow resnet50 time: 33.9ms (= 6778.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 44.6ms (= 8914.1ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.32 (= 44.6ms / 33.9ms)

OneFlow resnet50 time: 26.3ms (= 5250.3ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 44.5ms (= 8893.8ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.69 (= 44.5ms / 26.3ms)

OneFlow swin dataloader time: 0.239s (= 47.717s / 200, num_workers=1)
PyTorch swin dataloader time: 0.148s (= 29.661s / 200, num_workers=1)
Relative speed: 0.622 (= 0.148s / 0.239s)

OneFlow swin dataloader time: 0.069s (= 13.725s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.235s / 200, num_workers=4)
Relative speed: 0.600 (= 0.041s / 0.069s)

OneFlow swin dataloader time: 0.038s (= 7.521s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.488s / 200, num_workers=8)
Relative speed: 0.597 (= 0.022s / 0.038s)

❌ OneFlow resnet50 time: 154.4ms (= 15435.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 164.9ms (= 16486.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 164.9ms / 154.4ms)

OneFlow resnet50 time: 93.8ms (= 9376.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.6ms (= 10458.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 104.6ms / 93.8ms)

OneFlow resnet50 time: 61.3ms (= 12260.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.1ms (= 15818.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 79.1ms / 61.3ms)

OneFlow resnet50 time: 43.3ms (= 8668.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.8ms (= 14559.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.68 (= 72.8ms / 43.3ms)

OneFlow resnet50 time: 35.9ms (= 7178.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.8ms (= 13369.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.86 (= 66.8ms / 35.9ms)

@github-actions
Contributor

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9979/

@ccssu ccssu merged commit a0f0c1c into master Mar 30, 2023
@ccssu ccssu deleted the dev_add_frac_op branch March 30, 2023 11:30

4 participants