# Python Script: Prepare Data for Evaluating the Model(s)

```text
prepare_data_model_evaluation.py
```
This script prepares the original MRI scans from the Kaggle competition 'UW-Madison GI Tract Image Segmentation' so that they can be used to evaluate the segmentation model (ResNetClassifier + SegFormer).

The script's main inputs are the path to the image database and the `train.csv` file, which holds all the information needed to generate the target masks. It processes them as follows:
* The images are min-max normalized and saved as RGB PNG files in the specified target folder. A grayscale image is copied into each of the three channels.
* The target masks are created from the run-length encoding (RLE) provided in the `segmentation` column of `train.csv`.
* Optionally, the images can be saved in a 2.5D format, where the first (R) channel stores image `i`, the second (G) channel stores image `i + stride`, and the third (B) channel stores image `i + 2*stride`.
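
The three steps above can be sketched with NumPy as follows. This is a minimal illustration, not the script's actual code. The function names (`rle_decode`, `min_max_to_uint8`, `gray_to_rgb`, `stack_2_5d`) are hypothetical, the RLE layout (space-separated `start length` pairs with 1-based starts in row-major order) is an assumption, and so is the clamping of out-of-range slice indices, since the description does not say how the last slices of a scan are handled.

```python
import numpy as np


def rle_decode(rle, shape):
    """Decode a 'start length start length ...' string into a binary mask.

    Assumes 1-based start positions and row-major pixel order.
    An empty or missing RLE string yields an all-zero mask.
    """
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    if not isinstance(rle, str) or not rle.strip():
        return mask.reshape(shape)
    values = np.array(rle.split(), dtype=int)
    starts, lengths = values[0::2] - 1, values[1::2]
    for start, length in zip(starts, lengths):
        mask[start:start + length] = 1
    return mask.reshape(shape)


def min_max_to_uint8(image):
    """Rescale an image to the full 0..255 range (min-max normalization)."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: avoid dividing by zero.
        return np.zeros(image.shape, dtype=np.uint8)
    return ((image - lo) / (hi - lo) * 255).astype(np.uint8)


def gray_to_rgb(image):
    """Copy a single-channel image into all three RGB channels."""
    return np.stack([image] * 3, axis=-1)


def stack_2_5d(slices, i, stride=1):
    """Build an (H, W, 3) image from slices i, i+stride and i+2*stride.

    Indices past the end of the scan are clamped to the last slice
    (an assumption made for this sketch).
    """
    last = len(slices) - 1
    indices = [min(i + k * stride, last) for k in range(3)]
    return np.stack([slices[j] for j in indices], axis=-1)
```

In this sketch a 2D image would go through `gray_to_rgb(min_max_to_uint8(img))`, while a 2.5D image would be built by normalizing each slice first and then calling `stack_2_5d`.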

A sample invocation is shown below:

```bash
python .\prepare_data_model_evaluation.py -dimension 2d -csv data/train.csv -input_dir images/train -output_dir evaluation_data

```
More information about the input parameters is provided with the help parameter as follows:

```bash
python .\prepare_data_model_evaluation.py --help

usage: prepare_data_model_evaluation.py [-h] [-dimension {2d,2.5d}]
                                        [-stride STRIDE] [-csv CSV]
                                        [-input_dir INPUT_DIR]
                                        [-output_dir OUTPUT_DIR]
                                        [-test_patients TEST_PATIENTS]

options:
-h, --help                     Show this help message and exit
-dimension {2d,2.5d}           Choose either '2d' or '2.5d'
-stride STRIDE                 Specify the stride as an integer (default 1) for 2.5d
-csv CSV                       Path and file name of the csv file with RLE data (default 'data/train.csv')
-input_dir INPUT_DIR           Specify the directory where the input images reside (default 'images/train')
-output_dir OUTPUT_DIR         Specify the directory where the images will be stored (default 'evaluation_data')
-test_patients TEST_PATIENTS   Specify the list of test images for inference (default "['2', '6', '7', '9', '11', '15', '16', '140', '145', '146', '147', '148', '149', '154', '156']")
```
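
For readers who want to reproduce this interface, the options listed above could be declared with `argparse` roughly as follows. This sketch is reconstructed from the help text only and may differ from the script's real parser. The `2d` default for `-dimension` and the helper `parse_test_patients` are assumptions introduced for illustration.

```python
import argparse
import ast

DEFAULT_TEST_PATIENTS = (
    "['2', '6', '7', '9', '11', '15', '16', '140', "
    "'145', '146', '147', '148', '149', '154', '156']"
)


def build_parser():
    """Declare the command-line options shown in the help output."""
    parser = argparse.ArgumentParser(prog="prepare_data_model_evaluation.py")
    # The default for -dimension is not stated in the help text; '2d' is assumed.
    parser.add_argument("-dimension", choices=["2d", "2.5d"], default="2d",
                        help="Choose either '2d' or '2.5d'")
    parser.add_argument("-stride", type=int, default=1,
                        help="Stride as an integer, used for 2.5d")
    parser.add_argument("-csv", default="data/train.csv",
                        help="Path and file name of the csv file with RLE data")
    parser.add_argument("-input_dir", default="images/train",
                        help="Directory where the input images reside")
    parser.add_argument("-output_dir", default="evaluation_data",
                        help="Directory where the images will be stored")
    parser.add_argument("-test_patients", default=DEFAULT_TEST_PATIENTS,
                        help="List of test patients, written as a Python list")
    return parser


def parse_test_patients(text):
    """Turn the string form of the patient list into a list of strings."""
    patients = ast.literal_eval(text)  # safe evaluation of a list literal
    return [str(p) for p in patients]


# Parse an explicit argument list instead of sys.argv to keep the demo deterministic.
args = build_parser().parse_args(["-dimension", "2.5d", "-stride", "2"])
print(args.dimension, args.stride, args.csv)
print(parse_test_patients(args.test_patients)[:3])
```

Passing the patient list as a single string and decoding it with `ast.literal_eval` matches the quoted default shown in the help text while avoiding `eval` on user input.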

# Usage Sample

!python .\prepare_data_model_evaluation.py -dimension 2d -csv data/train.csv -input_dir images/train -output_dir evaluation_data

Dimension: 2d
Stride: 1
CSV: data/train.csv
Input Dir: images/train
Output Dir: evaluation_data
Test Patients: ['2', '6', '7', '9', '11', '15', '16', '140', '145', '146', '147', '148', '149', '154', '156']



Test :: case11:   0%|          | 0/432 [00:00<?, ?it/s]
Test :: case11:   0%|          | 2/432 [00:00<00:27, 15.82it/s]
Test :: case11:   1%|          | 4/432 [00:00<00:26, 16.03it/s]
Test :: case11:   1%|1         | 6/432 [00:00<00:27, 15.56it/s]
Test :: case11:   2%|1         | 8/432 [00:00<00:27, 15.69it/s]
Test :: case11:   2%|2         | 10/432 [00:00<00:28, 15.03it/s]
Test :: case11:   3%|2         | 12/432 [00:00<00:26, 15.64it/s]
Test :: case11:   3%|3         | 14/432 [00:00<00:28, 14.79it/s]
Test :: case11:   4%|3         | 17/432 [00:01<00:24, 16.90it/s]
Test :: case11:   5%|4         | 20/432 [00:01<00:21, 19.12it/s]
Test :: case11:   5%|5         | 22/432 [00:01<00:21, 19.33it/s]
Test :: case11:   6%|5         | 25/432 [00:01<00:18, 22.15it/s]
Test :: case11:   6%|6         | 28/432 [00:01<00:17, 22.50it/s]
Test :: case11:   7%|7         | 31/432 [00:01<00:16, 23.75it/s]
Test :: case11:   8%|8         | 35/432 [00:01<00:14, 27.33it/s]
Test :: case11:   9%|9         | 40/4