### Training the Tumor Patch Detector

The `train_tumor_patch_detector` function trains a deep learning model that detects tumor regions in patches of Whole Slide Images (WSIs) of angiosarcoma. It is part of the APEDIA project, and the resulting model identifies regions of interest (tumor patches) based on PD-L1 expression and other cellular characteristics.

#### Parameters
- **data_df_path** (*str*): Path to the file containing the dataset DataFrame. The DataFrame must include the paths to the image data and the corresponding masks, plus a `cv_split` column that assigns each row an integer (e.g., 0 to 4) identifying its cross-validation split group. A target label column is optional, because the label only affects validation.
- **output_dir** (*str*): The directory path where the training outputs, including the trained model weights and training logs, will be saved.
- **column_path_data** (*str*, default: 'pdl1'): The column name in the DataFrame that contains paths to the image data.
- **column_path_mask** (*str*, default: 'mask'): The column name in the DataFrame that contains paths to the corresponding masks.
- **column_scalar_label** (*str*, default: 'dummy'): The column name for scalar labels. Use 'dummy' if there are no scalar labels and dummy targets are to be used.
- **lr** (*float*, default: 0.001): Learning rate for the training process.
- **epochs** (*int*, default: 50): The number of training epochs.
- **bs** (*int*, default: 8): Batch size for training.
- **aug_mult** (*float*, default: 1): Multiplier for data augmentation; adjusts the intensity and variety of augmentations applied.
- **label_smoothing** (*float*, default: 0.0): The amount of label smoothing to apply, aiding in regularizing the model.
- **encoder_name** (*str*, default: 'timm-efficientnet-b5'): The name of the model encoder to be used.
- **cv_split** (*int*, default: 0): Index of the cross-validation split to use for training and validation data separation.
- **num_workers** (*int*, default: 4): The number of worker processes to use for data loading.
- **do_cosine_annealing** (*bool*, default: False): Flag to determine whether to apply cosine annealing to the learning rate schedule.
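
The DataFrame layout described for `data_df_path` can be sketched as follows. This is a minimal illustration, not APEDIA's own data preparation: the helper `add_cv_split`, the file names, and the random patch-level assignment are assumptions made for the example (in practice, splitting by slide or patient may be preferable to avoid leakage). How the `cv_split` parameter selects the validation rows is also an assumption here.

```python
import numpy as np
import pandas as pd


def add_cv_split(df: pd.DataFrame, n_splits: int = 5, seed: int = 0) -> pd.DataFrame:
    """Return a copy of df with an integer cv_split column in [0, n_splits)."""
    rng = np.random.default_rng(seed)
    # Shuffle the row positions, then deal them round-robin into the split
    # groups so that group sizes differ by at most one.
    order = rng.permutation(len(df))
    splits = np.empty(len(df), dtype=int)
    splits[order] = np.arange(len(df)) % n_splits
    out = df.copy()
    out["cv_split"] = splits
    return out


# Hypothetical patch and mask file names, for illustration only.
patch_df = pd.DataFrame(
    {
        "pdl1": [f"patches/patch_{i:03d}.png" for i in range(10)],
        "mask": [f"masks/patch_{i:03d}.png" for i in range(10)],
    }
)
patch_df = add_cv_split(patch_df, n_splits=5)

# Assumption: rows whose cv_split equals the chosen index form the validation
# set, and all remaining rows are used for training.
cv_split = 0
val_df = patch_df[patch_df["cv_split"] == cv_split]
train_df = patch_df[patch_df["cv_split"] != cv_split]

# Writing the Feather file requires pyarrow:
# patch_df.to_feather("example_patch_df.feather")
```

The column names `pdl1` and `mask` match the defaults of `column_path_data` and `column_path_mask`; with other names, pass them through those two parameters.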

#### Usage
To train the tumor patch detector, define the parameters in the `train_tumor_patch_detector_params` dictionary as shown below, then call `train_tumor_patch_detector` with this dictionary. The mandatory parameters `data_df_path` and `output_dir` must be set to valid paths before training starts.
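
One way to catch an invalid path before a long training run is a small pre-flight check. The helper below, `check_paths`, is hypothetical and not part of APEDIA; it assumes that creating the output directory in advance is acceptable.

```python
from pathlib import Path


def check_paths(params: dict) -> None:
    """Fail early if the dataset file is missing and create the output directory."""
    data_path = Path(params["data_df_path"])
    if not data_path.is_file():
        raise FileNotFoundError(f"data_df_path does not exist: {data_path}")
    # Assumption: it is fine to create the output directory up front.
    Path(params["output_dir"]).mkdir(parents=True, exist_ok=True)
```

Calling `check_paths` on the parameter dictionary right before training surfaces a mistyped path immediately instead of after the data loaders have been set up.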

#### CLI
The tumor patch detector can also be trained from the command line, for example:
```
apedia train_tumor_patch_detector \
    --data_df_path "/home/fabian/projects/phd/APEDIA/data/example_patch_df.feather" \
    --output_dir "/home/fabian/projects/phd/APEDIA/data/outputs" \
    --epochs 1 \
    --column_path_data "path_patch_pdl1" \
    --column_path_mask "path_patch_mask" \
    --column_scalar_label "tumors"
```
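
When switching between the notebook and the shell, the command can be rendered from a parameter dictionary. The sketch below uses a hypothetical helper, `to_cli_command`, and assumes that each key maps to a `--key value` pair, which is what the command above shows for these six parameters; whether the other parameters (boolean flags in particular) follow the same pattern is not confirmed here.

```python
import shlex


def to_cli_command(params: dict) -> str:
    """Render a parameter dictionary as an `apedia train_tumor_patch_detector` call."""
    args = ["apedia", "train_tumor_patch_detector"]
    for key, value in params.items():
        # Assumption: every key maps to a `--key value` pair.
        args += [f"--{key}", str(value)]
    # shlex.join quotes arguments only where the shell requires it.
    return shlex.join(args)


cli_params = {
    "data_df_path": "/home/fabian/projects/phd/APEDIA/data/example_patch_df.feather",
    "output_dir": "/home/fabian/projects/phd/APEDIA/data/outputs",
    "epochs": 1,
    "column_path_data": "path_patch_pdl1",
    "column_path_mask": "path_patch_mask",
    "column_scalar_label": "tumors",
}
print(to_cli_command(cli_params))
```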

## Import the `train_tumor_patch_detector` function

In [None]:
from apedia.train_tumor_patch_detector import train_tumor_patch_detector

## Set the parameters

In [None]:
# Parameters for training the tumor patch detector

train_tumor_patch_detector_params = {
    "data_df_path": "/path/to/your/dataset/example_patch_df.feather",
    "output_dir": "/path/to/your/output/directory",
    # Optional parameters, shown here with their default values
    "column_path_data": "pdl1",
    "column_path_mask": "mask",
    "column_scalar_label": "dummy",
    "lr": 0.001,
    "epochs": 50,
    "bs": 8,
    "aug_mult": 1,
    "label_smoothing": 0.0,
    "encoder_name": "timm-efficientnet-b5",
    "cv_split": 0,
    "num_workers": 4,
    "do_cosine_annealing": False,
}
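
To get a feel for what `do_cosine_annealing` changes, the snippet below evaluates the standard cosine annealing formula for the default `lr` and `epochs`. It is a sketch of the general technique; the function `cosine_annealing_lr`, the `min_lr` of 0, and the per-epoch update are assumptions, and the schedule APEDIA actually applies may differ.

```python
import math


def cosine_annealing_lr(epoch: int, total_epochs: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Learning rate at `epoch` for a cosine schedule from base_lr down to min_lr."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start of each epoch, plus the value after the last one.
schedule = [cosine_annealing_lr(e, total_epochs=50, base_lr=0.001) for e in range(51)]
print(schedule[0], schedule[25], schedule[50])
```

The rate starts at `lr`, passes through half of it at the midpoint, and decays to the minimum by the final epoch, so late epochs make much smaller updates than with a constant rate.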

## Run the training

In [None]:
# Override the placeholder paths and default column names for the example run.
# Adjust the paths to your own environment.
train_tumor_patch_detector_params['data_df_path'] = '/home/fabian/projects/phd/APEDIA/data/example_patch_df.feather'
train_tumor_patch_detector_params['output_dir'] = '/home/fabian/projects/phd/APEDIA/data/outputs'
# A single epoch keeps the example run short.
train_tumor_patch_detector_params['epochs'] = 1
train_tumor_patch_detector_params['column_path_data'] = 'path_patch_pdl1'
train_tumor_patch_detector_params['column_path_mask'] = 'path_patch_mask'
train_tumor_patch_detector_params['column_scalar_label'] = 'tumors'

In [None]:
train_tumor_patch_detector(train_tumor_patch_detector_params)