# PVInspect integration with PyTorch

PVInspect integrates directly with PyTorch: a PVInspect `ImageSequence` can be converted into a PyTorch dataset with a single call. To demonstrate this, let's first import everything we need:

In [1]:
import pvinspect as pv
from pvinspect import datasets
from pvinspect.data.image import DType
from pvinspect.integration.pytorch.dataset import ClassificationDataset
from torchvision.transforms import Compose, RandomHorizontalFlip, RandomVerticalFlip, ToTensor, ToPILImage

Now, we load our sample dataset and only use the training data:

In [2]:
seq = datasets.elpv().pandas.query("testset == False")
len(seq)

2324

Note that this sequence comes with annotations in the form of boolean meta attributes, such as `crack` and `inactive`:

In [3]:
seq[:3].meta

| | original_filename | modality | defect_probability | wafer | crack | inactive | blob | finger | testset |
|---|---|---|---|---|---|---|---|---|---|
| 0 | cell0382.png | EL_IMAGE | 1.0 | mono | True | False | False | True | False |
| 1 | cell2581.png | EL_IMAGE | 0.333333 | poly | False | False | False | False | False |
| 2 | cell0396.png | EL_IMAGE | 1.0 | mono | True | True | False | True | False |


For training, we usually want a data augmentation pipeline. Let's set up a simple one that applies random horizontal and vertical flips:

In [4]:
tfms = Compose([ToPILImage(), RandomHorizontalFlip(), RandomVerticalFlip(), ToTensor()])
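To make the behaviour of such a pipeline concrete, here is a minimal pure-Python sketch of how a `Compose`-style chain applies its steps in order. The names `compose`, `random_hflip` and `random_vflip` are hypothetical stand-ins written for illustration; they are not the torchvision implementations, and images are plain nested lists here rather than PIL images or tensors.

```python
import random


def random_hflip(img, p=0.5, rng=random):
    """Mirror each row (left-right) with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return img


def random_vflip(img, p=0.5, rng=random):
    """Reverse the row order (top-bottom) with probability p."""
    if rng.random() < p:
        return img[::-1]
    return img


def compose(steps):
    """Return a function that applies the given steps one after another."""
    def apply(img):
        for step in steps:
            img = step(img)
        return img
    return apply


# A tiny 2x3 "image" stored as nested lists (illustrative data only).
image = [[1, 2, 3],
         [4, 5, 6]]

# With p=1.0 both flips always fire, so the result is deterministic.
always_flip = compose([
    lambda im: random_hflip(im, p=1.0),
    lambda im: random_vflip(im, p=1.0),
])
flipped = always_flip(image)
print(flipped)  # [[6, 5, 4], [3, 2, 1]]
```

With the default probability of 0.5 for each flip, a sample can come back in any of four orientations, so repeated reads of the same index will generally differ during training.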

Now, converting `seq` into a PyTorch classification dataset using the transforms is as simple as:

In [5]:
ds = ClassificationDataset(seq, meta_classes=["crack", "inactive"], data_transform=tfms)
ds

<pvinspect.integration.pytorch.dataset.ClassificationDataset at 0x2888a2830>

With the `meta_classes` parameter, we specify which meta attributes are converted into class labels; each sample's label is a vector with one entry per listed attribute. We can now access individual elements of the dataset:

In [6]:
ds[0]

(tensor([[[0.1137, 0.1137, 0.1137,  ..., 0.0980, 0.1020, 0.1020],
          [0.1176, 0.1176, 0.1176,  ..., 0.1059, 0.1059, 0.1059],
          [0.1216, 0.1216, 0.1216,  ..., 0.1059, 0.1059, 0.1059],
          ...,
          [0.2157, 0.2157, 0.2157,  ..., 0.2235, 0.2196, 0.2157],
          [0.2118, 0.2157, 0.2157,  ..., 0.2235, 0.2196, 0.2157],
          [0.2118, 0.2157, 0.2157,  ..., 0.2196, 0.2157, 0.2157]]]),
 tensor([1., 0.]))

Here, we see that

1. Image data is correctly converted into PyTorch tensors, which shows that the transforms are applied
2. The specified classes are converted into a float vector (here `tensor([1., 0.])` for `crack=True`, `inactive=False`) and returned as the second element of the tuple
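The two observations above can be illustrated with a minimal sketch of the map-style dataset protocol (`__len__` and `__getitem__`) that PyTorch datasets follow. This is not PVInspect's implementation: `ToyClassificationDataset`, its constructor arguments, and the sample records are hypothetical, and plain Python lists stand in for tensors.

```python
class ToyClassificationDataset:
    """Hypothetical map-style dataset: items are (data, label_vector) pairs."""

    def __init__(self, records, meta_classes, data_transform=None):
        self.records = records            # list of dicts: "data" plus boolean attributes
        self.meta_classes = meta_classes  # attribute names to turn into labels
        self.data_transform = data_transform

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        record = self.records[idx]
        data = record["data"]
        if self.data_transform is not None:
            data = self.data_transform(data)
        # One float entry per requested attribute, in the order given.
        labels = [float(record[name]) for name in self.meta_classes]
        return data, labels


# Made-up pixel data; the boolean flags mirror the three rows shown earlier.
records = [
    {"data": [0, 64, 255], "crack": True, "inactive": False},
    {"data": [255, 128, 0], "crack": False, "inactive": False},
    {"data": [12, 34, 56], "crack": True, "inactive": True},
]

toy = ToyClassificationDataset(
    records,
    meta_classes=["crack", "inactive"],
    data_transform=lambda px: [v / 255 for v in px],  # scale 0..255 to [0, 1]
)

data, labels = toy[0]
print(len(toy), labels)  # 3 [1.0, 0.0]
```

In this sketch, a record with both attributes set (the third one) yields `[1.0, 1.0]`. Since the real `ds` is a PyTorch dataset, it should also be possible to wrap it in `torch.utils.data.DataLoader` (for example with `batch_size` and `shuffle=True`) to obtain training batches, although that step is not shown in the cells above.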

