[Nano] Add a generalized how-to guide for accelerate PyTorch cv data process pipeline #7125

Merged


@Oscilloscope98 Oscilloscope98 commented Dec 29, 2022

Description

Add a generalized how-to guide for accelerating the PyTorch CV data processing pipeline.

Related python scripts:

1. Why the change?

The CV data processing pipeline acceleration is exactly the same for PyTorch and PyTorch Lightning applications, so there is no need to add separate how-to guides to the PyTorch and PyTorch Lightning Training sections.

2. Summary of the change

  • Add a Nano how-to guide section "Preprocessing"

  • Add the how-to guide “How to accelerate a computer vision data processing pipeline” for PyTorch

  • Restyled quote blocks for better note/warning/related reading box styles
    Before:

    After:

3. How to test?


rnwang04 commented Jan 3, 2023

  1. Maybe we can split the code below into two parts to make it clearer:

xxxx

# from torchvision import transforms
# from torchvision.datasets import OxfordIIITPet
from bigdl.nano.pytorch.vision import transforms
from bigdl.nano.pytorch.vision.datasets import OxfordIIITPet

# Data processing steps are the same as using torchvision
train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=.5, hue=.3),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225]),
])
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225]),
])

train_dataset = OxfordIIITPet(root="/tmp/data", transform=train_transform, download=True)
val_dataset = OxfordIIITPet(root="/tmp/data", transform=val_transform)

xxx

# obtain training indices that will be used for validation
import torch

indices = torch.randperm(len(train_dataset))
val_size = len(train_dataset) // 4
train_dataset = torch.utils.data.Subset(train_dataset, indices[:-val_size])
val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])

# create dataloaders
from torch.utils.data.dataloader import DataLoader

train_dataloader = DataLoader(train_dataset, batch_size=32)
val_dataloader = DataLoader(val_dataset, batch_size=32)
  2. What preprocessing acceleration does bigdl.nano.pytorch.vision provide? Should we explain this a bit?
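For reference, the index split used in the second snippet above (shuffle all indices, then hold out the last quarter for validation) can be checked in isolation. This is only a minimal sketch: it uses the standard library instead of `torch.randperm` and `torch.utils.data.Subset`, and the helper name `split_indices` and its parameters are hypothetical, not part of the PR or of BigDL Nano.

```python
import random


def split_indices(n, denominator=4, seed=0):
    """Shuffle range(n) and hold out the last n // denominator indices.

    Hypothetical helper for illustration; it mirrors the slicing in the
    snippet above (indices[:-val_size] / indices[-val_size:]).
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    val_size = n // denominator
    if val_size == 0:
        # With val_size == 0, indices[:-0] is empty and indices[-0:] is the
        # whole list, so very small datasets need an explicit guard.
        return indices, []
    return indices[:-val_size], indices[-val_size:]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))
```

Because both subsets are sliced from the same permutation, the training and validation indices are disjoint even though they are applied to two dataset objects with different transforms.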

@rnwang04 rnwang04 left a comment

LGTM

@Oscilloscope98 Oscilloscope98 merged commit bcb7649 into intel-analytics:main Jan 5, 2023
liu-shaojun pushed a commit that referenced this pull request Mar 25, 2024
…process pipeline (#7125)

* Restyle blockquote elements in web

* Add a generalized how-to section for preprocessing, including the data process acceleration for PyTorch

* Small fix

* Update based on comments and small typo fixes

* Small fixes