# Dataset Preprocessing

In [1]:
%cd ../..
import torch
from torch import nn, Tensor, optim
import torchvision.transforms as T

from pathlib import Path

from dlbpgm.datasets.preprocessing import TrayPreprocessor, RGBVIFilter, ImageToLatentConverter
from dlbpgm.datasets.metadata import Metadata

## Crop Plants
In the original dataset, the plant images are provided as four high-resolution images of trays.
We crop each individual plant out of these trays as a $512 \times 512$ px region.
The coordinates of the individual plants can be imported from the `dlbpgm` package:

In [4]:
from dlbpgm.datasets.preprocessing import tray_grids
tray_grids
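
As a rough illustration of the cropping step described above, the sketch below cuts a square window out of an image array around a given coordinate. It assumes that each grid entry is an `(x, y)` pixel centre, which the notebook does not confirm, and `crop_square` is a hypothetical helper rather than part of `dlbpgm`. NumPy is used only to keep the example free of further dependencies.

```python
import numpy as np


def crop_square(image: np.ndarray, center_xy: tuple[int, int], size: int = 512) -> np.ndarray:
    """Cut a size x size window centred on center_xy out of an (H, W, C) image.

    The window is shifted inwards where it would cross the image border,
    so the result always has the requested shape.
    """
    height, width = image.shape[:2]
    if size > height or size > width:
        raise ValueError("crop size exceeds image dimensions")
    cx, cy = center_xy
    left = min(max(cx - size // 2, 0), width - size)
    top = min(max(cy - size // 2, 0), height - size)
    return image[top:top + size, left:left + size]


# Hypothetical tray image and plant centres, for illustration only.
tray = np.zeros((2048, 3072, 3), dtype=np.uint8)
centers = [(300, 300), (3000, 2000)]
crops = [crop_square(tray, c) for c in centers]
```

Shifting the window inwards at the border is one possible design choice; padding the image instead would keep the plant centred but introduce artificial pixels.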

Next, we list the source and target directories of the images:

In [5]:
trays_dirs = [
    Path("Arabidopsis/images_and_annotations/PSI_Tray031/tv/"),
    Path("Arabidopsis/images_and_annotations/PSI_Tray032/tv/"),
    Path("Arabidopsis/images_and_annotations/PSI_Tray033/tv/"),
    Path("Arabidopsis/images_and_annotations/PSI_Tray034/tv/"),
]

target_dirs = [
    Path("Arabidopsis/plants/1"),
    Path("Arabidopsis/plants/2"),
    Path("Arabidopsis/plants/3"),
    Path("Arabidopsis/plants/4"),
]

We configure a `TrayPreprocessor` object from the `dlbpgm` package for each tray in turn, identifying the tray by its index `i`.

The `TrayPreprocessor` takes a source and a target directory, as well as the crop size and the plant positions from the imported coordinates.
The tray number is only used to name the output images appropriately.
As the last argument, we pass an `RGBVIFilter`, which is responsible for separating foreground and background in the images.

In [11]:
i = 2

preprocessor = TrayPreprocessor(
    source_dir=trays_dirs[i],
    target_dir=target_dirs[i],
    tray_number=i+1, 
    crop_size=512, 
    grid=tray_grids[i], 
    rgbvi_filter=RGBVIFilter()
)
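
To give an idea of how an RGBVI-based foreground/background separation can work, the sketch below computes the red-green-blue vegetation index, commonly defined as $(G^2 - R \cdot B) / (G^2 + R \cdot B)$, and thresholds it. Whether `RGBVIFilter` uses exactly this formula, and the threshold of 0, are assumptions; `rgbvi` and `foreground_mask` are hypothetical helpers, not the `dlbpgm` implementation.

```python
import numpy as np


def rgbvi(rgb: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Compute (G^2 - R*B) / (G^2 + R*B) for a float image of shape (..., 3)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # eps avoids a division by zero on pure black pixels.
    return (g**2 - r * b) / (g**2 + r * b + eps)


def foreground_mask(rgb: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Mark pixels whose RGBVI exceeds the (assumed) threshold as plant."""
    return rgbvi(rgb) > threshold


# Two hypothetical pixels: a green leaf and a magenta-tinted background.
pixels = np.array([[[0.2, 0.8, 0.1], [0.5, 0.4, 0.5]]])
mask = foreground_mask(pixels)
```

Green-dominated pixels yield a positive index, while pixels where red and blue outweigh green yield a negative one, so the mask separates the two example pixels.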

In [12]:
preprocessor.process_files()
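
The cells above process a single tray selected by `i`. One way to cover all trays without editing `i` by hand is to pair the directory lists up front, as in the sketch below. `tray_jobs` and the directory names are hypothetical; the actual `TrayPreprocessor` call is only indicated in a comment, since it depends on the `dlbpgm` package.

```python
from pathlib import Path


def tray_jobs(source_dirs, target_dirs):
    """Pair each source directory with its target and a 1-based tray number."""
    if len(source_dirs) != len(target_dirs):
        raise ValueError("source and target lists differ in length")
    return [
        {"index": i, "source_dir": src, "target_dir": dst, "tray_number": i + 1}
        for i, (src, dst) in enumerate(zip(source_dirs, target_dirs))
    ]


# Hypothetical directories, for illustration only.
jobs = tray_jobs(
    [Path("trays/a"), Path("trays/b")],
    [Path("plants/1"), Path("plants/2")],
)
for job in jobs:
    # In the notebook, this is where a TrayPreprocessor would be configured with
    # job["source_dir"], job["target_dir"], job["tray_number"] and
    # tray_grids[job["index"]], followed by a call to process_files().
    print(job["tray_number"], job["source_dir"], "->", job["target_dir"])
```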

## Pre-calculate Latent Representations
In order to improve training speed, we pre-calculate the latent representations of the images.
For this, we use the `ImageToLatentConverter` from the `dlbpgm` package.

To speed up the conversion, we first select a GPU device if one is available:

In [2]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
device

Next, we load the metadata of the dataset and specify the location where the latent variables should be stored.

In [3]:
metadata = Metadata.load("Arabidopsis/metadata.csv")
latents_dir = Path("Arabidopsis/latents128")
metadata['latent_path'] = metadata.latent_path.apply(lambda p: latents_dir/p)
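
Because the cell above rewrites `latent_path` in place, running it a second time in the same session would prefix the directory twice. The sketch below shows one way to make the prefixing step safe to repeat, using only `pathlib`. `with_latents_dir` and the file name are hypothetical, and the sketch assumes that the stored paths are relative.

```python
from pathlib import Path

latents_dir = Path("Arabidopsis/latents128")


def with_latents_dir(p, root=latents_dir):
    """Prefix a relative path with root, leaving already-prefixed paths unchanged."""
    p = Path(p)
    return p if root in p.parents else root / p


relative = "1/plant_001.pt"  # hypothetical file name
once = with_latents_dir(relative)
twice = with_latents_dir(once)  # second application is a no-op
```

Such a helper could be passed to `apply` in place of the lambda, assuming `latent_path` behaves like a pandas column as the cell above suggests.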

The `ImageToLatentConverter` takes the metadata, the batch size, and the number of workers as input parameters.
The optional `transforms` parameter specifies transformations that are applied to the images before conversion.
These can include resizing the images to the desired size or applying augmentation techniques if needed.

In [44]:
image_size = 128

converter = ImageToLatentConverter(
    metadata, 
    batch_size=16, 
    num_workers=4,
    transforms=T.Compose([
        T.Lambda(lambda t: t[:, :3] * t[:, 3:]), # RGBA -> RGB * A
        # T.RandomVerticalFlip(p=1.0),
        T.Resize((image_size, image_size), antialias=True),
    ])
).to(device)
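
The first transform in the cell above multiplies the RGB channels by the alpha channel, so that masked-out background pixels become black. The sketch below reproduces that indexing on a small hypothetical batch. NumPy is used only to keep the example light; slicing and broadcasting behave the same way for torch tensors. `premultiply_alpha` is an illustrative name, not part of `dlbpgm`.

```python
import numpy as np


def premultiply_alpha(batch: np.ndarray) -> np.ndarray:
    """Map a (B, 4, H, W) RGBA batch to (B, 3, H, W) by scaling RGB with alpha."""
    # Slicing with 3: keeps the channel axis, so alpha has shape (B, 1, H, W).
    rgb, alpha = batch[:, :3], batch[:, 3:]
    # The single alpha channel broadcasts over the three colour channels.
    return rgb * alpha


# Hypothetical batch: white images whose top half is fully transparent.
batch = np.ones((2, 4, 8, 8), dtype=np.float32)
batch[:, 3, :4] = 0.0
out = premultiply_alpha(batch)
```

Writing `t[:, 3]` instead of `t[:, 3:]` would drop the channel axis, and the product would no longer broadcast as intended.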

In [None]:
converter.convert_images()