# Finetune: train on random init CNN and original prepared dataset Version 2!

This is a copy of `jupyter_notebooks/finetune_train_random_original.ipynb`, but now we add a macro F1 metric to the model metrics that are evaluated on the validation set at the end of each epoch. The goal is to get macro F1 scores into the output `history.csv` so that we can plot the validation score per epoch. This is our attempt to reproduce Figure 3 in the paper.
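Once `history.csv` contains the validation score, pulling out the per-epoch curve could look roughly like the sketch below. The column name `val_f1` and the optional `epoch` column are assumptions about how the trainer logs the metric; we have not confirmed the actual header names, so adjust them to whatever the file contains.

```python
import csv

def read_val_curve(history_path, column="val_f1"):
    """Return a list of (epoch, score) pairs from a per-epoch history CSV.

    `column` is a hypothetical name; check the header of the real file.
    """
    curve = []
    with open(history_path, newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            # Fall back to the row index if there is no explicit epoch column.
            epoch = int(row.get("epoch", i))
            curve.append((epoch, float(row[column])))
    return curve

def best_epoch(curve):
    """Return the (epoch, score) pair with the highest score."""
    return max(curve, key=lambda pair: pair[1])
```

The resulting list of pairs can be handed to any plotting library to draw the validation score by epoch.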

## Prereqs

Local git repo checked out to the `dhxu2-f1-metric` branch (temporary; we will remove this prereq once we validate the F1 change).

---

The original prepared dataset is the one produced by the code in `finetuning/readme.md`, which samples at 250 Hz with a length of 65 seconds.

Now we try training on an uninitialized CNN.

We do this by *not* passing in the `--weights-file` parameter to `finetuning.trainer`.

It is not clear to us whether we need to explicitly fill the CNN with random weights, or whether the network weights are already randomly initialized when no saved weights are loaded into it.
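For context, standard Keras convolution layers default to the Glorot uniform initializer for kernels, so a freshly built model normally starts with random (not zero) weights. Whether that applies here depends on the repo's ResNet being built from standard Keras layers without custom initializers, which we have not verified. The sketch below is a plain-Python illustration of how Glorot uniform draws a Conv1D-shaped kernel; the function name is ours and it is not the repo's code.

```python
import math
import random

def glorot_uniform_kernel(kernel_size, in_channels, out_channels, seed=None):
    """Draw a Conv1D-shaped kernel using Glorot (Xavier) uniform initialization.

    Values are sampled from U(-limit, limit), where
    limit = sqrt(6 / (fan_in + fan_out)).
    """
    rng = random.Random(seed)
    fan_in = kernel_size * in_channels
    fan_out = kernel_size * out_channels
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    kernel = [
        [[rng.uniform(-limit, limit) for _ in range(out_channels)]
         for _ in range(in_channels)]
        for _ in range(kernel_size)
    ]
    return kernel, limit
```

If the assumption holds, omitting `--weights-file` should already give a randomly initialized network, with no explicit re-randomization step needed.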

In [1]:
# You may also manually mount drive by clicking on folder icon in left sidebar
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
PROJECT_ROOT = '/content/drive/MyDrive/DLHProject'

If you have Colab Pro, clone our repo to the `/root` directory.

In [4]:
# REPO = PROJECT_ROOT + '/Danielgitrepo'
# Below is if you have colab pro
REPO = '/root/DLH_TransferLearning/'

In [5]:
%cd $REPO

/root/DLH_TransferLearning


In [7]:
%%capture
! pip install -r requirements.txt

In [8]:
DATA_DIR = PROJECT_ROOT + '/data'

In [9]:
! ls $DATA_DIR

icentia11k		     icentia11k_subset_unzipped		 physionet_finetune
icentia11k_corrupted	     physionet				 physionet_preread
icentia11k_subset	     physionet_250hz_15000pad_norm_True  session_checkpoint.dat
icentia11k_subset_corrupted  physionet_data.zip			 temp.torrent


Before we run the finetuning trainer, we should prepare the job output directory. We make a `jobs` directory as a sibling of the data directory.

Note that the finetune trainer should also create the job directory for you, at line 49:

```python
os.makedirs(str(args.job_dir), exist_ok=True)
```

But I think it is a good practice to just set up the directory structure yourself instead of just assuming the code will do it all for you.
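Setting up the directory ourselves could be sketched as follows. The helper name `ensure_job_dir` is ours, not part of the repo; it simply mirrors the `exist_ok=True` behaviour of the trainer's `os.makedirs` call, so running it twice is harmless.

```python
from pathlib import Path

def ensure_job_dir(project_root, name="jobs"):
    """Create <project_root>/<name> if it does not exist and return its path."""
    job_dir = Path(project_root) / name
    # parents=True creates missing parent directories;
    # exist_ok=True makes the call safe to repeat.
    job_dir.mkdir(parents=True, exist_ok=True)
    return job_dir
```

In this notebook the equivalent shell command is the commented-out `mkdir -p $JOB_DIR` in the next cell.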

In [11]:
JOB_DIR = PROJECT_ROOT + '/jobs'
# already created in the precursor notebook
# ! mkdir -p $JOB_DIR

In [10]:
! ls $PROJECT_ROOT

 Archive			 Mylesgitrepo
 Danielgitrepo			'Notes from last year’s Class.gdoc'
 data				 proposal
 data_partial_download		 Sched:Daniel:M-F1700+,Sa-SuAllday:Vacay:None.gdoc
 DL4H_Team_1			 Sched:Myles:M-Su0300-1900:VacayMar01-Mar17.gdoc
 DownloadData.ipynb		 Sched:Ted:M-F1700+,Sa-Su1500+:Vacay:Mar31-Apr06.gdoc
 ECG_TransferLearningPaper.pdf	 Scheduling.gdoc
 ExampleNotebook.ipynb		'Spring24: Project Grading rubrik.gdoc'
 jobs				 Tedgitrepo
'Meeting Notes.gdoc'


In [12]:
! ls $JOB_DIR

finetune_random_cnn_original_data


**Revision**

We just create a new job subdirectory for this notebook.

> `jobs/finetune_random_cnn_original_data_with_f1`

---
Note that while we did create the `jobs/` directory, we defer the creation of the run-specific output directory to the trainer code.

So for this experiment, we write the fine-tuning results out to `jobs/finetune_random_cnn_original_data_with_f1`. The base name indicates two things:

1. A randomly initialized CNN is used (no pretrained weights are loaded).
2. We use the same preprocessing steps the authors suggest in their README, which again are **not** aligned with what they say in the paper.

## Other discrepancies

This is the exact code that the authors say we should run finetuning with:

```shell script
python -m finetuning.trainer \
--job-dir "jobs/af_classification" \
--train "data/physionet_train.pkl" \
--test "data/physionet_test.pkl" \
--weights-file "jobs/beat_classification/resnet18.weights" \
--val-size 0.0625 \
--arch "resnet18" \
--batch-size 64 \
--epochs 200
```

The discrepancies:

1. `--val-metric` is NOT specified. The default value is `loss`. The help message describes this parameter as

  > Validation metric used to find the best model at each epoch.

  However, in the paper, the authors say that they use the macro F1 score to evaluate on the validation set and also to select the best model.
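For reference, macro F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many samples it has. The sketch below is a minimal plain-Python version for integer or string class labels; it is our own illustration, not the metric implementation used by the trainer.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because a rare class weighs as much as a common one, a model that looks good by validation loss can still rank poorly by macro F1, which is why the choice of `--val-metric` can change which epoch is selected as best.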

## Reproducibility

The trainer code also has a `--seed` parameter, which is not provided in the above code snippet.

For our own benefit, we shall set `--seed 2024` for reproducibility in *our* own work.
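To illustrate why a fixed seed matters, here is a minimal sketch of a seeded train/validation split using the same `0.0625` fraction as `--val-size`. This is only an illustration under our own assumptions; we have not checked how the trainer actually uses `--seed` internally, and the function name is hypothetical.

```python
import random

def split_indices(n_samples, val_size=0.0625, seed=2024):
    """Return (train_indices, val_indices) from a seeded shuffle."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_val = int(round(n_samples * val_size))
    return indices[n_val:], indices[:n_val]
```

Calling it twice with the same seed yields the same split, so validation scores from different runs are computed on the same records.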

In [13]:
job_dir = JOB_DIR + '/finetune_random_cnn_original_data_with_f1'
train = DATA_DIR + '/physionet_finetune/physionet_train.pkl'
test = DATA_DIR + '/physionet_finetune/physionet_test.pkl'

print(f"job_dir: {job_dir}")
print(f"train: {train}")
print(f"test: {test}")

job_dir: /content/drive/MyDrive/DLHProject/jobs/finetune_random_cnn_original_data_with_f1
train: /content/drive/MyDrive/DLHProject/data/physionet_finetune/physionet_train.pkl
test: /content/drive/MyDrive/DLHProject/data/physionet_finetune/physionet_test.pkl


Now we use the same settings Myles ran when he tried the precursor notebook, namely:

1. V100 GPU (16 GB)
2. Batch size 128


In [15]:
%%time
# We've removed --weights-file parameter.
# We've set --val-metric to f1
# We've set --seed to 2024
# We've set --verbose to see what's going on
! python -m finetuning.trainer \
--job-dir $job_dir \
--train $train \
--test $test \
--val-size 0.0625 \
--val-metric "f1" \
--arch "resnet18" \
--batch-size 128 \
--epochs 200 \
--seed 2024 \
--verbose

2024-04-10 07:50:46.753615: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-10 07:50:46.753671: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-10 07:50:46.755060: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-10 07:50:46.762402: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Creating working directory in /content/drive/