# <font color = teal> Cheatsheet </font>

The cheatsheet lists the steps in the order in which they should be executed. More detailed information about each step can be found in the notebook linked under it.

Check the listed attributes before running each script.

-----

## <font color = teal>1) Downloading data</font>

[Introduction to data handling](1_introduction_data_handling.ipynb)

----

## <font color = teal>2) Preprocessing data (optional)</font>
[Introduction to data handling](1_introduction_data_handling.ipynb)

Check before use:
```
from_directory = <original data location>
new_directory = <processed data location>

# ------------------------------
# --- PREPROCESS TRANSFORMS ----

<wanted transforms for preprocessing>

# ------------------------------
# ------------------------------

```
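As a rough illustration of what the transforms placeholder could hold, the sketch below chains two simple signal transforms. All function names here (`zero_mean`, `take_every`, `compose`) are hypothetical and are not taken from `preprocess_data.py`; the script may define its transforms quite differently.

```python
def zero_mean(signal):
    """Subtract the mean so the signal is centred on zero."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def take_every(step):
    """Build a transform that keeps every `step`-th sample.

    Naive decimation without filtering, for illustration only.
    """
    def transform(signal):
        return signal[::step]
    return transform


def compose(transforms):
    """Apply the given transforms one after another, in list order."""
    def run(signal):
        for transform in transforms:
            signal = transform(signal)
        return signal
    return run


# Hypothetical pipeline: thin out the samples first, then centre them.
preprocess = compose([take_every(2), zero_mean])
result = preprocess([1.0, 9.0, 3.0, 9.0, 5.0, 9.0])
print(result)  # [-2.0, 0.0, 2.0]
```

The order of the list matters: centring before thinning would subtract the mean of all six samples instead of the three that are kept.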

Run:
```
python preprocess_data.py
```

----


## <font color = teal>3) Splitting data into csv files</font>
[Introduction to data handling](1_introduction_data_handling.ipynb)

Check before use:
```
stratified = <boolean: True for a stratified split, False for a database-wise (DBwise) split>
data_dir = <data location>
csv_dir = <csv location>
labels = <list of labels (SNOMED CT codes as strings)>
```
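To make the difference between the two split types concrete, here is a minimal sketch under simplifying assumptions: each record carries a single label and a database name, and the record layout, database names, and function names are all hypothetical. It is not the logic of `create_data_csvs.py`, which has to handle multi-label recordings.

```python
import random
from collections import defaultdict


def dbwise_split(records, test_dbs):
    """Hold out every record that comes from one of the databases in test_dbs."""
    train = [r for r in records if r["db"] not in test_dbs]
    test = [r for r in records if r["db"] in test_dbs]
    return train, test


def stratified_split(records, test_fraction=0.25, seed=0):
    """Split per label so each label keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for record in records:
        by_label[record["label"]].append(record)

    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy data: two hypothetical databases, two example SNOMED CT codes as strings.
records = [
    {"path": f"{db}/rec{i}.mat", "db": db, "label": label}
    for db in ("db_a", "db_b")
    for i, label in enumerate(["426783006", "164889003"] * 4)
]

db_train, db_test = dbwise_split(records, {"db_b"})
st_train, st_test = stratified_split(records)
print(len(db_train), len(db_test))  # 8 8
print(len(st_train), len(st_test))  # 12 4
```

A database-wise split tests generalization to an unseen data source, while a stratified split keeps the label distribution comparable between the parts.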

Run:
```
python create_data_csvs.py
```

----

## <font color = teal>4) Creating yaml files</font>

[Yaml files of database-wise split for training and testing](2_physionet_DBwise_yaml_files.ipynb)

[Yaml files of stratified split for training and testing](2_physionet_stratified_yaml_files.ipynb)

---

## <font color = teal>5) Training a model</font>
[Introduction to training models](3_introduction_training.ipynb)

Check before use:
```
    csv_root = <csv location>
```

- Yaml files in `/configs/training/`
    - train_file: csv file (.csv) used in training
    - val_file: csv file (.csv) used in validation
    - batch_size: batch size for DataLoader
    - num_workers: number of workers for DataLoader
    - epochs: number of epochs in training loop
    - lr: learning rate
    - weight_decay: weight decay coefficient (regularization strength)
    - device_count: number of devices used (GPUs)
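
A quick way to check a training configuration before a long run is to compare its keys with the list above. The sketch below does this on a plain dictionary of the kind a YAML parser would return; the helper name `missing_keys` and every value in `example_config` are made up for illustration, and the real files may contain more keys than the ones listed here.

```python
REQUIRED_TRAINING_KEYS = {
    "train_file", "val_file", "batch_size", "num_workers",
    "epochs", "lr", "weight_decay", "device_count",
}


def missing_keys(config, required=REQUIRED_TRAINING_KEYS):
    """Return the required keys that are absent from config, sorted by name."""
    return sorted(required - set(config))


# Illustrative values only; none of them come from the repository.
example_config = {
    "train_file": "train_split.csv",
    "val_file": "val_split.csv",
    "batch_size": 64,
    "num_workers": 4,
    "epochs": 50,
    "lr": 0.001,
    "weight_decay": 0.00001,
    "device_count": 1,
}

complete = missing_keys(example_config)
incomplete = missing_keys({"train_file": "train_split.csv", "lr": 0.001})
print(complete)    # []
print(incomplete)  # the six keys that were left out
```
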

Run:
```
python train_model.py <yaml file OR directory>
```
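
The argument can be a single yaml file or a directory. One plausible reading, sketched below, is that a directory means "every yaml file inside it". The helper `collect_yaml_files` is hypothetical, and whether `train_model.py` also accepts `.yml` or searches subdirectories is an assumption not confirmed here.

```python
import tempfile
from pathlib import Path


def collect_yaml_files(target):
    """Return [target] for a file, or the sorted YAML files directly inside a directory."""
    path = Path(target)
    if path.is_dir():
        return sorted(p for p in path.iterdir() if p.suffix in {".yaml", ".yml"})
    return [path]


# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.yaml", "a.yaml", "notes.txt"):
        (Path(tmp) / name).write_text("")
    found_names = [p.name for p in collect_yaml_files(tmp)]
    single_name = collect_yaml_files(Path(tmp) / "a.yaml")[0].name

print(found_names)  # ['a.yaml', 'b.yaml']
```
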

---

## <font color = teal>6) Testing a model</font>

[Introduction to testing and evaluating models](4_introduction_testing_evaluation.ipynb)

Check before use:
```
    csv_root = <csv location>
```

- Yaml files in `/configs/predicting/`
    - test_file: csv file (.csv) used in testing
    - model: model file (.pth) used in testing
    - threshold: decision threshold for predictions
    - device_count: number of devices used (GPUs)
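
The effect of `threshold` can be illustrated with a small sketch. It assumes a multi-label setup in which the model emits one logit per label and a label is predicted when its sigmoid probability reaches the threshold; that setup and the function names are assumptions, not a description of `run_model.py`.

```python
import math


def sigmoid(x):
    """Map a logit to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_threshold(logits, threshold=0.5):
    """Return 1 for each label whose probability is at least threshold, else 0."""
    return [int(sigmoid(z) >= threshold) for z in logits]


logits = [2.0, -1.0, 0.0]  # made-up model outputs for three labels
preds_default = apply_threshold(logits)
preds_strict = apply_threshold(logits, threshold=0.9)
print(preds_default)  # [1, 0, 1]
print(preds_strict)   # [0, 0, 0]
```

Raising the threshold trades recall for precision: fewer labels are predicted, but those that remain are the ones the model is most confident about.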

Run:
```
python run_model.py <yaml file OR directory>
```