Dev flow.utils.data part2 #5500

Merged
Flowingsun007 merged 144 commits into master from dev_flow.utils.data_part2
Jul 27, 2021


@Flowingsun007 Flowingsun007 commented Jul 15, 2021

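Minimal DataLoader implementation (part 2)


Main changes in this PR:

1. Export the basic DataLoader and Dataset interfaces

The exported interfaces, such as flow.utils.data.Dataset and flow.utils.data.DataLoader, can be used directly by external code: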

dataset = flow.utils.data.TensorDataset(features, labels)
data_iter = flow.utils.data.DataLoader(dataset, batch_size, shuffle=True, num_workers=0)
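
To make the roles of the exported interfaces concrete, here is a minimal pure-Python sketch of the dataset/loader contract that the two lines above rely on: a dataset answers `__getitem__` and `__len__`, and a loader groups indices into batches, optionally shuffled. It is not OneFlow's implementation. `PairDataset` and `iterate_batches` are hypothetical names, and plain lists stand in for tensors.

```python
import random


class PairDataset:
    """Hypothetical stand-in for a tensor dataset: pairs features[i] with labels[i]."""

    def __init__(self, features, labels):
        assert len(features) == len(labels), "features and labels must align"
        self.features = features
        self.labels = labels

    def __getitem__(self, index):
        return self.features[index], self.labels[index]

    def __len__(self):
        return len(self.features)


def iterate_batches(dataset, batch_size, shuffle=False, seed=None):
    """Yield (features, labels) batches; the last batch may be smaller."""
    indices = list(range(len(dataset)))
    if shuffle:
        # A seeded generator keeps the shuffle reproducible in this sketch.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        samples = [dataset[i] for i in indices[start:start + batch_size]]
        xs, ys = zip(*samples)
        yield list(xs), list(ys)


features = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]]
labels = [0, 1, 0, 1, 0]
dataset = PairDataset(features, labels)
batches = list(iterate_batches(dataset, batch_size=2, shuffle=True, seed=0))

for xs, ys in batches:
    print(len(xs), ys)
```

With five samples and a batch size of 2, the sketch yields three batches, the last holding a single sample. Shuffling changes only the order of indices, so each feature stays paired with its label.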

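2. Bring in some of the datasets and transforms utilities from the torchvision package

This covers some commonly used functionality from torchvision's transforms and datasets folders, for example:

  • datasets brings in the demo datasets commonly used in CV (mnist, fashion-mnist, cifar10, cifar100) and supports automatic download, extraction, and format processing for them;
  • transforms mainly applies operations such as totensor, resize, and normalize to the data loaded by the dataloader.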
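
As an illustration of what a totensor-then-normalize pipeline does to each image, the following pure-Python sketch scales 8-bit pixel values to [0, 1] and then standardizes them with a mean and standard deviation. It is a conceptual sketch only, not the code added by this PR. The function names (`to_tensor`, `make_normalize`, `compose`) are hypothetical, nested lists stand in for image tensors, and resize is omitted.

```python
def to_tensor(image):
    """Scale 8-bit pixel values (0..255) to floats in [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def make_normalize(mean, std):
    """Return a transform that computes (pixel - mean) / std per pixel."""
    def normalize(image):
        return [[(pixel - mean) / std for pixel in row] for row in image]
    return normalize


def compose(*transforms):
    """Chain transforms so they are applied in the order given."""
    def apply(image):
        for transform in transforms:
            image = transform(image)
        return image
    return apply


# A 2x2 single-channel "image" holding the extreme pixel values.
image = [[0, 255], [255, 0]]
pipeline = compose(to_tensor, make_normalize(mean=0.5, std=0.5))
result = pipeline(image)
print(result)  # [[-1.0, 1.0], [1.0, -1.0]]
```

With mean 0.5 and standard deviation 0.5, the pipeline maps the pixel range 0..255 onto [-1, 1], a common input range for small image classifiers.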

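Tested on several networks and datasets: the data loads correctly and the training loss decreases as expected. The test cases are in oneflow/python/utils/vision/

  • test_mlp_flow.py # tests fashion-mnist
  • test_lenet_flow.py # tests fashion-mnist
  • test_cifar_flow.py # tests cifar10
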
(oneflow-dev) [luyang@oneflow-15 vision]$ python test_mlp_flow.py
epoch 1, loss 0.0034, train acc 0.714, test acc 0.784, cost >>>>>>> 132.84298586845398(s)
epoch 2, loss 0.0022, train acc 0.809, test acc 0.818, cost >>>>>>> 130.11517643928528(s)
epoch 3, loss 0.0019, train acc 0.829, test acc 0.793, cost >>>>>>> 131.28324055671692(s)
epoch 4, loss 0.0018, train acc 0.841, test acc 0.796, cost >>>>>>> 130.2774519920349(s)

(oneflow-dev) [luyang@oneflow-15 vision]$ python test_lenet_flow.py
training on  cuda:0
epoch 1, loss 1.3111, train acc 0.515, test acc 0.684, time 129.4 sec
epoch 2, loss 0.6790, train acc 0.741, test acc 0.745, time 141.3 sec
epoch 3, loss 0.5768, train acc 0.780, test acc 0.782, time 167.0 sec
epoch 4, loss 0.5070, train acc 0.813, test acc 0.800, time 132.8 sec
epoch 5, loss 0.4620, train acc 0.829, test acc 0.826, time 134.6 sec

Note: because training is slow, these test cases have not yet been moved into test/dataloader as CI test cases. Whether they need to run in CI will be decided later.

Flowingsun007 and others added 30 commits June 10, 2021 11:15
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 26, 2021 06:30
@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 26, 2021 08:09
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 26, 2021 09:50
@Flowingsun007 Flowingsun007 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 26, 2021 16:59
@github-actions

CI failed, removing label automerge

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 26, 2021 18:06
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 27, 2021 02:44
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 139.6ms (= 6980.0ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 124.6ms (= 6230.2ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.12 (= 139.6ms / 124.6ms)

PyTorch resnet50 time: 82.7ms (= 4137.5ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.3ms (= 3714.1ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 82.7ms / 74.3ms)

PyTorch resnet50 time: 58.4ms (= 2922.5ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 57.2ms (= 2857.6ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.02 (= 58.4ms / 57.2ms)

PyTorch resnet50 time: 48.2ms (= 2411.1ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 51.7ms (= 2586.7ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 0.93 (= 48.2ms / 51.7ms)

PyTorch resnet50 time: 43.0ms (= 2148.6ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 53.4ms (= 2670.1ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 0.80 (= 43.0ms / 53.4ms)
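
The arithmetic behind these figures can be reproduced with a short sketch: each per-iteration time is the total time divided by 50 iterations, and the relative speed is the PyTorch time divided by the OneFlow time, so values above 1 mean OneFlow was faster. The totals below are copied from the report. The function and variable names are hypothetical, and the assumption that the bot divides the unrounded times is inferred from the batch-size-1 row, where the rounded values 43.0 / 53.4 would round to 0.81 rather than the reported 0.80.

```python
ITERATIONS = 50

# (batch size, PyTorch total ms, OneFlow total ms), copied from the report above.
runs = [
    (16, 6980.0, 6230.2),
    (8, 4137.5, 3714.1),
    (4, 2922.5, 2857.6),
    (2, 2411.1, 2586.7),
    (1, 2148.6, 2670.1),
]


def relative_speed(pytorch_total_ms, oneflow_total_ms, iterations=ITERATIONS):
    """Return PyTorch time per iteration divided by OneFlow time per iteration."""
    pytorch_ms = pytorch_total_ms / iterations
    oneflow_ms = oneflow_total_ms / iterations
    return pytorch_ms / oneflow_ms


speeds = {batch: round(relative_speed(pt, of), 2) for batch, pt, of in runs}

for batch, speed in speeds.items():
    faster = "OneFlow" if speed > 1 else "PyTorch"
    print(f"batch {batch:2d}: relative speed {speed:.2f} ({faster} faster)")
```

On this run OneFlow leads at batch sizes 4 to 16, while PyTorch is faster at batch sizes 1 and 2.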

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 27, 2021 06:04
@Flowingsun007 Flowingsun007 merged commit 3d341a7 into master Jul 27, 2021
@Flowingsun007 Flowingsun007 deleted the dev_flow.utils.data_part2 branch July 27, 2021 06:38
5 participants