[Nano] Add a generalized how-to guide for accelerate PyTorch cv data process pipeline #7125

Oscilloscope98 · 2022-12-29T10:08:27Z

Description

Add a generalized how-to guide for accelerating a PyTorch CV data processing pipeline.

Related python scripts:

1. Why the change?

The CV data processing pipeline acceleration is exactly the same for PyTorch and PyTorch Lightning applications, so there is no need to add separate how-to guides to the PyTorch and PyTorch Lightning Training sections.
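As a rough illustration of why one guide is enough: judging by the imports quoted in the review comment in this conversation, the change is limited to importing `transforms` and the dataset classes from `bigdl.nano.pytorch.vision` instead of `torchvision`, which does not depend on the training loop. The sketch below only reports which of the two packages is importable. `first_available` is a hypothetical helper written for this example and is not part of BigDL or torchvision.

```python
import importlib.util


def first_available(candidates):
    """Return the first importable module name in `candidates`, or None."""
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except (ImportError, ValueError):
            # A missing parent package raises ModuleNotFoundError,
            # which is a subclass of ImportError.
            continue
    return None


# Prefer the Nano package, fall back to plain torchvision.
backend = first_available(["bigdl.nano.pytorch.vision", "torchvision"])
print("vision backend:", backend)
```

If neither package is installed the helper returns `None`, so the caller can decide how to report the missing dependency.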

2. Summary of the change

Add a Nano how-to guide section, "Preprocessing"
Add the how-to guide “How to accelerate a computer vision data processing pipeline” for PyTorch
Restyle quote blocks for better note/warning/related reading box styles
Before:

After:

3. How to test?

Document test: https://yuwentestdocs.readthedocs.io/en/nano-pytorch-cv-pipeline/doc/Nano/Howto/index.html
Github Notebook preview: https://github.com/Oscilloscope98/BigDL/blob/nano-pytorch-cv-pipeline/python/nano/tutorial/notebook/preprocessing/pytorch/accelerate_pytorch_cv_data_pipeline.ipynb
Notebook tested locally (in an empty conda environment created with python=3.7)

rnwang04 · 2023-01-03T09:36:48Z

Maybe we can split the code below into two parts to make it clearer:

xxxx

# from torchvision import transforms
# from torchvision.datasets import OxfordIIITPet
from bigdl.nano.pytorch.vision import transforms
from bigdl.nano.pytorch.vision.datasets import OxfordIIITPet

# Data processing steps are the same as using torchvision
train_transform = transforms.Compose([transforms.Resize(256),
                                      transforms.RandomCrop(224),
                                      transforms.RandomHorizontalFlip(),
                                      transforms.ColorJitter(brightness=.5, hue=.3),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406],
                                                           [0.229, 0.224, 0.225])])
val_transform = transforms.Compose([transforms.Resize(256),
                                    transforms.CenterCrop(224),
                                    transforms.ToTensor(),
                                    transforms.Normalize([0.485, 0.456, 0.406],
                                                         [0.229, 0.224, 0.225])])

train_dataset = OxfordIIITPet(root="/tmp/data", transform=train_transform, download=True)
val_dataset = OxfordIIITPet(root="/tmp/data", transform=val_transform)

xxx

# obtain training indices that will be used for validation
import torch

indices = torch.randperm(len(train_dataset))
val_size = len(train_dataset) // 4
train_dataset = torch.utils.data.Subset(train_dataset, indices[:-val_size])
val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])

# create dataloaders
from torch.utils.data.dataloader import DataLoader

train_dataloader = DataLoader(train_dataset, batch_size=32)
val_dataloader = DataLoader(val_dataset, batch_size=32)
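The second part relies on one shared permutation: both dataset objects read the same images with different transforms, and complementary slices of the same index list keep the training and validation subsets disjoint. A minimal pure-Python sketch of that index logic follows. `split_indices` is a hypothetical helper, and it uses the standard `random` module in place of `torch.randperm` so that it runs without PyTorch.

```python
import random


def split_indices(n, val_divisor=4, seed=0):
    """Shuffle range(n) and hold out the last n // val_divisor indices."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    val_size = n // val_divisor
    # Slicing at an explicit cut point matches indices[:-val_size] and
    # indices[-val_size:] whenever val_size > 0, and also behaves
    # sensibly when val_size == 0.
    cut = n - val_size
    return indices[:cut], indices[cut:]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))
```

Each index lands in exactly one of the two lists, so no image is used for both training and validation.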

What preprocessing acceleration does bigdl.nano.pytorch.vision provide? Shall we explain this a bit?
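Whatever the explanation turns out to be, the effect can be checked empirically by timing one pass over the dataloader with each set of imports. The sketch below is only a measuring aid under that assumption. `time_one_pass` is a hypothetical helper, and a plain `range` stands in for a `DataLoader` so the example runs without PyTorch.

```python
import time


def time_one_pass(batches):
    """Iterate once over `batches`; return (batch count, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    for _ in batches:
        count += 1
    return count, time.perf_counter() - start


# Stand-in iterable; in practice pass train_dataloader here.
count, seconds = time_one_pass(range(10))
print(f"{count} batches in {seconds:.6f}s")
```

Running it once with the torchvision imports and once with the bigdl.nano.pytorch.vision imports, on the same machine and data, gives a like-for-like comparison of the data loading time.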


rnwang04

LGTM


Oscilloscope98 added document Nano labels Dec 29, 2022

Oscilloscope98 marked this pull request as ready for review January 3, 2023 01:36

Oscilloscope98 requested review from TheaperDeng and rnwang04 January 3, 2023 01:43

Oscilloscope98 added 5 commits January 5, 2023 15:04

Restyle blockquote elements in web

1809357

Add a generalized how-to section for preprocessing, including the data process accelerastion for PyTorch

39b3ec8

Small fix

bb03003

Update based on comments and small typo fixes

2c228fc

Small fixes

0e56995

Oscilloscope98 force-pushed the nano-pytorch-cv-pipeline branch from 9d0c391 to 2c228fc Compare January 5, 2023 07:38

rnwang04 approved these changes Jan 5, 2023


Oscilloscope98 merged commit bcb7649 into intel-analytics:main Jan 5, 2023
