Dev flow.utils.data part3 #5644

Merged
merged 98 commits into from Aug 13, 2021

@Flowingsun007 Flowingsun007 commented Jul 28, 2021

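Minimal dataloader implementation (part 3)


Main changes in this PR:

1. Support flow.utils.data.DistributedSampler

With DistributedSampler, datasets for DDP (distributed data parallel) training can be built. For usage, refer to the PyTorch examples: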

train_dataset = datasets.ImageFolder(
        traindir,
        transforms.Compose([
            transforms.RandomResizedCrop(224),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            normalize,
        ]))

if args.distributed:
    train_sampler = torch.utils.data.distributed.DistributedSampler(train_dataset)
else:
    train_sampler = None

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=args.batch_size, shuffle=(train_sampler is None),
    num_workers=args.workers, pin_memory=True, sampler=train_sampler)
  • TODO: add a test case for DDP + DistributedSampler
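
For readers unfamiliar with what the sampler does, here is a minimal pure-Python sketch of the index partitioning that PyTorch's DistributedSampler performs when shuffling is off and `drop_last=False`. The function name `shard_indices` is hypothetical and is not part of this PR or of any library; it only illustrates the idea that each rank sees an equally sized, interleaved slice of the dataset.

```python
import math


def shard_indices(dataset_len, num_replicas, rank):
    """Hypothetical helper: indices one rank would see (no shuffle, drop_last=False)."""
    # Every rank gets the same number of samples, rounded up.
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas

    indices = list(range(dataset_len))
    # Pad by repeating indices from the start so the list divides evenly.
    padding = total_size - len(indices)
    if padding > 0 and indices:
        repeats = math.ceil(padding / len(indices))
        indices += (indices * repeats)[:padding]

    # Rank r takes every num_replicas-th index, starting at offset r.
    return indices[rank:total_size:num_replicas]


if __name__ == "__main__":
    for rank in range(3):
        print(rank, shard_indices(10, 3, rank))
```

With 10 samples and 3 replicas, two indices are repeated so that every rank receives 4 samples. A DDP test case could check this property: the shards have equal length and together cover the whole dataset.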

2. Support more commonly used CV datasets

__all__ = [
    "MNIST", 
    "FashionMNIST", 
    "CIFAR10", 
    "CIFAR100",
    "CocoCaptions",
    "CocoDetection",
    "ImageNet",
    "VOCDetection",
    "VOCSegmentation",
    "DatasetFolder",
    "ImageFolder"
]
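
As a rough illustration of how folder-based datasets such as DatasetFolder and ImageFolder derive labels, the sketch below follows the torchvision convention: each immediate subdirectory of the root is a class, and classes are numbered in sorted order. The helper names `find_classes` and `demo` are used here for illustration only; this is a sketch under that assumption, not the code added in this PR.

```python
import os
import tempfile


def find_classes(root):
    """Illustrative helper: map each immediate subdirectory of `root` to an integer label."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    if not classes:
        raise FileNotFoundError(f"no class folders found in {root}")
    # Labels follow the sorted order of the folder names.
    class_to_idx = {name: i for i, name in enumerate(classes)}
    return classes, class_to_idx


def demo():
    """Build a throwaway directory tree and return its class mapping."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("dog", "cat", "bird"):
            os.makedirs(os.path.join(root, name))
        return find_classes(root)


if __name__ == "__main__":
    print(demo())
```

Because the labels depend on sorted folder names, adding or renaming a class folder can shift the indices of the other classes.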

3. Support more transforms aligned with torchvision:

__all__ = [
    "Compose", 
    "ToTensor", 
    "PILToTensor",
    "ConvertImageDtype",
    "ToPILImage",
    "Normalize", 
    "Resize",
    "Scale",
    "CenterCrop",
    "Pad",
    "Lambda",
    "RandomTransforms",
    "RandomApply",
    "RandomOrder",
    "RandomChoice",
    "RandomCrop",
    "RandomHorizontalFlip",
    "RandomVerticalFlip",
    "RandomResizedCrop",
    "RandomSizedCrop",
    "FiveCrop",
    "TenCrop",
    "InterpolationMode"
]
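
To show how the container-style transforms fit together, here is a minimal pure-Python sketch of the Compose and Lambda pattern: Compose applies a list of callables in order, and Lambda wraps a user function as a transform. These two classes are simplified stand-ins written for this example, not the implementations added in this PR.

```python
class Compose:
    """Simplified stand-in: apply a sequence of callables one after another."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x


class Lambda:
    """Simplified stand-in: wrap an arbitrary function as a transform."""

    def __init__(self, fn):
        if not callable(fn):
            raise TypeError("fn must be callable")
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)


# Order matters: add 1 first, then double the result.
pipeline = Compose([Lambda(lambda x: x + 1), Lambda(lambda x: x * 2)])

if __name__ == "__main__":
    print(pipeline(3))
```

The same pattern is what lets a transform list such as RandomResizedCrop, RandomHorizontalFlip, ToTensor, and Normalize be passed to a dataset as a single callable.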

@Flowingsun007 Flowingsun007 marked this pull request as ready for review August 3, 2021 12:36
@github-actions

CI failed, removing label automerge

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 13, 2021 07:04
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 13, 2021 10:59
@oneflow-ci-bot oneflow-ci-bot self-requested a review August 13, 2021 12:07
@github-actions

CI failed, removing label automerge

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 13, 2021 13:48
@oneflow-ci-bot oneflow-ci-bot removed their request for review August 13, 2021 15:56
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 13, 2021 15:56
@oneflow-ci-bot oneflow-ci-bot self-requested a review August 13, 2021 17:16
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 141.6ms (= 7081.5ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 128.3ms (= 6413.8ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 141.6ms / 128.3ms)

PyTorch resnet50 time: 83.4ms (= 4171.9ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.6ms (= 3731.2ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.12 (= 83.4ms / 74.6ms)

PyTorch resnet50 time: 59.4ms (= 2971.5ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.4ms (= 2419.8ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.23 (= 59.4ms / 48.4ms)

PyTorch resnet50 time: 50.2ms (= 2510.5ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 39.2ms (= 1957.7ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.28 (= 50.2ms / 39.2ms)

PyTorch resnet50 time: 39.8ms (= 1989.5ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 37.7ms (= 1886.7ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.05 (= 39.8ms / 37.7ms)

@oneflow-ci-bot oneflow-ci-bot merged commit 2ad6321 into master Aug 13, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the dev_flow.utils.data-part3 branch August 13, 2021 18:41

6 participants