![image](https://drive.google.com/u/0/uc?id=15DUc09hFGqR8qcpYiN1OajRNaASmiL6d&export=download)

# **Extra Workshop - ISIS4825**

## **Data Generation and Preprocessing**
## **Contents**
1. [**Objectives**](#id1)
2. [**Problem**](#id2)
3. [**Importing the libraries needed for the lab**](#id3)
4. [**Visualization and Exploratory Analysis**](#id4)
5. [**Preprocessing and Data Generation**](#id5)

## **Objectives**<a name="id1"></a>

- Become familiar with data generation using native Python code, TensorFlow, and PyTorch.
- Apply `scikit-image` to image preprocessing.
- Generate and store data in the cloud for later modeling and validation.

## **Problem**<a name="id2"></a>
- The goal is to clean and save the data produced by processing CT scans of livers with cancer.

## **Notebook Configuration**

In [None]:
!shred -u setup_colab.py
!shred -u setup_colab_general.py
!wget -q "https://github.com/jpcano1/python_utils/raw/main/setup_colab_general.py" -O setup_colab_general.py
!wget -q "https://github.com/jpcano1/python_utils/raw/main/ISIS_4825/setup_colab.py" -O setup_colab.py
import setup_colab as setup
setup.setup_extra_workshop()

## **Importing the libraries needed for the lab**<a name="id3"></a>

In [None]:
from utils import general as gen
from utils import extra_utils as extra

from tensorflow import keras
from tensorflow.keras.utils import Sequence

from torch.utils.data import Dataset

from skimage import filters, morphology, exposure
import cv2

from tqdm.auto import tqdm

import nibabel as nib

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
# Matplotlib 3.6+ renamed this style to "seaborn-v0_8-deep"; use that name on newer versions.
plt.style.use("seaborn-deep")
import seaborn as sns

In [None]:
data_dir = gen.create_and_verify("data", "media", "nas", "01_Datasets", 
                                 "CT", "LITS", "Training Batch 1")
data_list = gen.read_listdir(data_dir)

## **Preparing the Data with Generators**
- We will use the data from the Liver Tumor Segmentation (LiTS) competition, which consists of axial CT scans of the liver.

![image](https://i.imgur.com/eDN20ck.png)

In [None]:
class ClassicDataGenerator:
    """Plain-Python dataset that pairs each volume with its segmentation."""

    def __init__(self, data_dirs, as_array=False, transform=None,
                 *args, **kwargs):
        # Use the paths passed in (not the global `data_list`) and sort both
        # lists so that volume i and segmentation i follow the same order.
        self.volume_dirs = sorted(p for p in data_dirs if "segmentation" not in p)
        self.label_dirs = sorted(p for p in data_dirs if "segmentation" in p)
        self.as_array = as_array
        self.transform = transform  # stored only; not applied in this class

    def __len__(self):
        return len(self.volume_dirs)

    def __getitem__(self, idx):
        vol = nib.load(self.volume_dirs[idx])
        lab = nib.load(self.label_dirs[idx])

        if self.as_array:
            # get_fdata() reads the voxel data into a floating-point array
            return vol.get_fdata(), lab.get_fdata()
        return vol, lab

    def __iter__(self):
        self.idx = 0
        return self

    def __next__(self):
        if self.idx >= len(self):
            raise StopIteration
        # Reuse __getitem__ so loading logic lives in one place
        item = self[self.idx]
        self.idx += 1
        return item
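
Filtering on the substring `"segmentation"` builds two lists that are matched only by position. A slightly more defensive variant pairs each volume with its mask through the numeric case ID in the file name. The sketch below assumes the LiTS naming scheme `volume-N.nii` / `segmentation-N.nii`; `pair_volumes_and_labels` is a hypothetical helper written for illustration and is not part of `utils`.

```python
import os
import re

# Assumed naming scheme: "volume-<id>.nii" and "segmentation-<id>.nii"
CASE_PATTERN = re.compile(r"^(volume|segmentation)-(\d+)\.nii(\.gz)?$")


def pair_volumes_and_labels(paths):
    """Return (volume_path, label_path) tuples sorted by numeric case ID."""
    volumes, labels = {}, {}
    for path in paths:
        match = CASE_PATTERN.match(os.path.basename(path))
        if match is None:
            continue  # skip files that do not follow the assumed scheme
        kind, case_id = match.group(1), int(match.group(2))
        (volumes if kind == "volume" else labels)[case_id] = path

    # Keep only the cases that have both a volume and a segmentation
    common_ids = sorted(volumes.keys() & labels.keys())
    return [(volumes[i], labels[i]) for i in common_ids]


example_paths = [
    "data/volume-10.nii",
    "data/segmentation-2.nii",
    "data/volume-2.nii",
    "data/segmentation-10.nii",
    "data/volume-7.nii",  # no matching segmentation, so it is dropped
]
pairs = pair_volumes_and_labels(example_paths)
print(pairs)
```

Sorting by the integer ID also keeps case 2 ahead of case 10, which a plain string sort would not.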

In [None]:
lits_dataset = ClassicDataGenerator(data_list)

In [None]:
for X, y in lits_dataset:
    print(f"Volume shape: {X.shape}")
    print(f"Label shape: {y.shape}")

In [None]:
class KerasDataGenerator(Sequence):
    """Keras `Sequence` that returns one (volume, label) pair per index."""

    def __init__(self, data_dirs, as_array=False):
        super().__init__()
        # Use the `data_dirs` argument instead of the global `data_list`;
        # sorting keeps volumes and segmentations in the same order.
        self.volume_dirs = sorted(p for p in data_dirs if "segmentation" not in p)
        self.label_dirs = sorted(p for p in data_dirs if "segmentation" in p)
        self.as_array = as_array

    def __len__(self):
        return len(self.volume_dirs)

    def __getitem__(self, idx):
        vol = nib.load(self.volume_dirs[idx])
        lab = nib.load(self.label_dirs[idx])
        if self.as_array:
            return vol.get_fdata(), lab.get_fdata()
        return vol, lab

In [None]:
lits_dataset = KerasDataGenerator(data_list)

In [None]:
for X, y in lits_dataset:
    print(f"Volume shape: {X.shape}")
    print(f"Label shape: {y.shape}")
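
In `KerasDataGenerator`, each index returns one whole case. Keras conventionally treats a `Sequence` as a source of batches: `__len__` is the number of batches per epoch and `__getitem__(i)` returns batch `i`. The framework-free sketch below shows only that index arithmetic; `BatchIndexer` and `batch_size` are illustrative names and are not part of the notebook or of Keras.

```python
import math


class BatchIndexer:
    """Maps a batch index to the sample indices it covers."""

    def __init__(self, n_samples, batch_size):
        self.n_samples = n_samples
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting a final partial batch
        return math.ceil(self.n_samples / self.batch_size)

    def __getitem__(self, batch_idx):
        if not 0 <= batch_idx < len(self):
            raise IndexError("batch index out of range")
        start = batch_idx * self.batch_size
        stop = min(start + self.batch_size, self.n_samples)
        return list(range(start, stop))


indexer = BatchIndexer(n_samples=10, batch_size=4)
batches = [indexer[i] for i in range(len(indexer))]
print(batches)
```

A batched generator could use these sample indices inside `__getitem__` to load and stack the corresponding slices.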

In [None]:
class TorchDataGenerator(Dataset):
    """PyTorch map-style `Dataset` that returns one (volume, label) pair per index."""

    def __init__(self, data_dirs, as_array=False, transform=None):
        super().__init__()
        # Use the `data_dirs` argument instead of the global `data_list`;
        # sorting keeps volumes and segmentations in the same order.
        self.volume_dirs = sorted(p for p in data_dirs if "segmentation" not in p)
        self.label_dirs = sorted(p for p in data_dirs if "segmentation" in p)

        self.as_array = as_array
        self.transform = transform  # stored only; not applied in this class

    def __len__(self):
        return len(self.volume_dirs)

    def __getitem__(self, idx):
        vol = nib.load(self.volume_dirs[idx])
        lab = nib.load(self.label_dirs[idx])
        if self.as_array:
            return vol.get_fdata(), lab.get_fdata()
        return vol, lab

In [None]:
lits_dataset = TorchDataGenerator(data_list)

In [None]:
for X, y in lits_dataset:
    print(f"Volume shape: {X.shape}")
    print(f"Label shape: {y.shape}")

## **Visualization and Exploratory Analysis**<a name="id4"></a>

In [None]:
lits_dataset = ClassicDataGenerator(data_list, as_array=True)

In [None]:
np.random.seed(1234)
random_sample = np.random.randint(0, len(lits_dataset))

In [None]:
vol, lab = lits_dataset[random_sample]

In [None]:
vol.shape, lab.shape

In [None]:
vol_slice, lab_slice = extra.get_vol_slice(vol, lab, 130)
vol_slice.shape, lab_slice.shape
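
`extra.get_vol_slice` is a course helper whose implementation is not shown here. The following is a minimal NumPy sketch of what such a helper might do, assuming volumes are stored as `(height, width, depth)` arrays and slices are taken along the last axis; `take_axial_slice` is a hypothetical name.

```python
import numpy as np


def take_axial_slice(volume, labels, index):
    """Return the 2D image and mask at position `index` along the last axis."""
    if volume.shape != labels.shape:
        raise ValueError("volume and labels must have the same shape")
    if not 0 <= index < volume.shape[-1]:
        raise IndexError(f"slice {index} is outside depth {volume.shape[-1]}")
    return volume[..., index], labels[..., index]


# Synthetic stand-ins for a CT volume and its segmentation
rng = np.random.default_rng(0)
fake_volume = rng.normal(size=(64, 64, 20))
fake_labels = rng.integers(0, 3, size=(64, 64, 20))

img_slice, mask_slice = take_axial_slice(fake_volume, fake_labels, 5)
print(img_slice.shape, mask_slice.shape)
```

The explicit bounds check matters because the number of slices can differ between volumes, so a fixed index is not guaranteed to exist in every one.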

In [None]:
gen.visualize_subplot([vol_slice, lab_slice], ["Liver", "Tumor Mask"], (1, 2), 
                      cmap="bone", figsize=(12, 6))

In [None]:
labeled_image = extra.get_labeled_image(vol_slice, lab_slice, 3)

In [None]:
plt.figure(figsize=(6, 6))
gen.imshow(labeled_image, title="Labeled Image")

## **Preprocessing and Data Generation**<a name="id5"></a>

In [None]:
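def pipeline(img):
    # Histogram equalization to improve contrast (output in [0, 1])
    img = exposure.equalize_hist(img)
    # Rescale intensities to the 0-255 range
    img = gen.scale(img, 0, 255)
    # Median filter to reduce impulse noise
    img = filters.median(img)
    # Edge-preserving smoothing over a 10x10 square neighborhood
    selem = morphology.square(10)
    img = filters.rank.mean_bilateral(img, selem)
    return img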
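
The first step of `pipeline`, histogram equalization, maps each intensity through the image's empirical cumulative distribution so that the output values spread roughly uniformly over [0, 1]. The NumPy sketch below illustrates that idea on synthetic data. It is a simplified stand-in written for illustration, not the `scikit-image` implementation, and `equalize_sketch` is a hypothetical name.

```python
import numpy as np


def equalize_sketch(img, nbins=256):
    """Map intensities through the normalized cumulative histogram."""
    hist, bin_edges = np.histogram(img.ravel(), bins=nbins)
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]  # normalize so the CDF ends at 1
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Interpolate each pixel's intensity on the CDF curve
    return np.interp(img.ravel(), bin_centers, cdf).reshape(img.shape)


# Low-contrast synthetic image: values squeezed into [0.4, 0.6)
rng = np.random.default_rng(42)
low_contrast = 0.4 + 0.2 * rng.random((128, 128))

equalized = equalize_sketch(low_contrast)
print(f"before: {low_contrast.min():.2f} to {low_contrast.max():.2f}")
print(f"after:  {equalized.min():.2f} to {equalized.max():.2f}")
```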

In [None]:
preprocessed = pipeline(vol_slice)

In [None]:
gen.visualize_subplot([vol_slice, preprocessed],
                      ["Original", "Processed"], (1, 2), (12, 6))

In [None]:
extra.create_data(lits_dataset, sanity_check=True)

In [None]:
X = np.load("train_data/data/X_0.npy")
y = np.load("train_data/labels/y_0.npy")

In [None]:
X.shape

In [None]:
y.shape

In [None]:
gen.visualize_subplot([X, y[..., 0]], ["Image", "Label"], (1, 2), 
                      cmap="bone")

In [None]:
gen.visualize_subplot([X, y[..., 1]], ["Image", "Label"], (1, 2), 
                      cmap="bone")
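
The last two cells index `y[..., 0]` and `y[..., 1]`, so the saved labels carry one mask per channel. How `extra.create_data` builds them is not shown. One plausible construction is sketched below with the hypothetical helper `to_channel_masks`, assuming the usual LiTS encoding (0 = background, 1 = liver, 2 = tumor) and assuming the two channels hold the liver and tumor masks.

```python
import numpy as np


def to_channel_masks(label_map):
    """Stack a liver mask and a tumor mask along a new last axis."""
    # Assumption: tumor voxels are counted as part of the liver mask
    liver = label_map >= 1
    tumor = label_map == 2
    return np.stack([liver, tumor], axis=-1).astype(np.uint8)


# Tiny synthetic label map using the assumed 0/1/2 encoding
label_map = np.array([
    [0, 0, 1, 1],
    [0, 1, 2, 1],
    [0, 1, 1, 0],
])

masks = to_channel_masks(label_map)
print(masks.shape)
print(masks[..., 0])
print(masks[..., 1])
```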