Fine-tuning
-----------
A common paradigm in deep learning is to pretrain a model on relevant data and then *fine-tune* it on smaller
application-focused examples.

Here we apply this idea to a large dataset of many people, treating each person as a fine-tuning opportunity.
This is very similar to the unaligned/DA/DG fine-tuning case from
Kostas and Rudzicz 2020 (https://doi.org/10.1088/1741-2552/abb7a7).

To keep things simple, we reuse as much of the configuration and code from the `Basics` example as possible.
Return to that example if anything here is confusing.

```yaml
Configuratron:
  preload: True

use_gpu: False
test_fraction: 0.5

mmidb:
  name: "Physionet MMIDB"
  toplevel: /path/to/eegmmidb
  tmin: 0
  tlen: 6
  data_max: 0.001
  data_min: -0.001
  events:
    - T1
    - T2
  exclude_sessions:
    - "*R0[!48].edf"  # equivalently "*R0[1235679].edf"
    - "*R1[!2].edf"   # equivalently "*R1[0134].edf"
  exclude_people:
    - S088
    - S090
    - S092
    - S100
  train_params:
    epochs: 7
    batch_size: 4
  lr: 0.0001
  fine_lr: 0.00001
  folds: 5
```
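
To see which recordings the `exclude_sessions` patterns leave behind, here is a minimal sketch. It assumes the patterns follow shell-style glob rules as implemented by Python's `fnmatch` module, and that each person's 14 MMIDB runs are named `R01` to `R14`. The helper `is_excluded` and the example filenames are hypothetical and are not part of DN3.

```python
from fnmatch import fnmatchcase

# Patterns copied from the configuration above.
EXCLUDE_SESSIONS = ["*R0[!48].edf", "*R1[!2].edf"]


def is_excluded(filename, patterns=EXCLUDE_SESSIONS):
    """Return True if the filename matches any exclusion pattern.

    Hypothetical helper: assumes glob semantics as in Python's fnmatch.
    """
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# Illustrative filenames for one person, runs R01 through R14.
runs = ["S001R{:02d}.edf".format(i) for i in range(1, 15)]
kept = [run for run in runs if not is_excluded(run)]
print(kept)  # ['S001R04.edf', 'S001R08.edf', 'S001R12.edf']
```

Under these assumptions only runs 4, 8 and 12 survive. Note that `[!2]` also excludes `R10`.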

Below, we start with the same code as before to load our dataset and get ready to build a TIDNet model for classification.

In [1]:
from dn3.configuratron import ExperimentConfig
from dn3.trainable.processes import StandardClassification
from dn3.trainable.models import TIDNet

from torch.optim.lr_scheduler import CosineAnnealingLR

# We load a lot of files, so suppress MNE's verbose logging
import mne
mne.set_log_level(False)

config_filename = 'my_config.yml'
experiment = ExperimentConfig(config_filename)
ds_config = experiment.datasets['mmidb']

dataset = ds_config.auto_construct_dataset()


Scanning /ais/clspace5/spoclab/BrainData/MNE-eegbci-data/physiobank/database/eegmmidb. If there are a lot of files, this may take a while...: 100%|██████████| 4/4 [00:03<00:00,  1.17it/s, extension=.bdf]

Found 1 datasets.

Creating dataset of 420 Epoched recordings from 105 people.

Loading Physionet MMIDB: 100%|██████████| 105/105 [01:06<00:00,  1.57person/s]


This time, we also create two functions that demonstrate two different (though not mutually exclusive) ways
to adapt a model from one domain to a slightly different one: freezing and fine-tuning.

In [2]:
def frozen_tuning(training_data, testing_data, model):
    model.freeze_features()
    # The configuration defines `lr` and `fine_lr`; there is no `rate` entry
    tune_process = StandardClassification(model, learning_rate=ds_config.lr, cuda=experiment.use_gpu)
    tune_process.fit(training_data, **ds_config.train_params)
    # We unfreeze so that the model can be subsequently trained again
    model.freeze_features(unfreeze=True)
    return tune_process.evaluate(testing_data)['Accuracy']

def fine_tuning(training_data, testing_data, model):
    tune_process = StandardClassification(model, learning_rate=ds_config.fine_lr, cuda=experiment.use_gpu)
    tune_process.fit(training_data, **ds_config.train_params)
    return tune_process.evaluate(testing_data)['Accuracy']
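
To make the difference between the two functions concrete, here is a toy sketch in plain Python. It is not DN3's or PyTorch's implementation. The two-parameter "network" `pred = head * (feature * x)`, the function `sgd_step` and all numbers are assumptions chosen for illustration. The point is only that a frozen parameter is skipped during the update, while fine-tuning updates everything.

```python
def sgd_step(params, x, y, lr, frozen=()):
    """One squared-error gradient step on pred = head * (feature * x).

    Parameters named in `frozen` are returned unchanged, mimicking
    frozen feature layers. Purely illustrative.
    """
    hidden = params["feature"] * x
    error = params["head"] * hidden - y
    grads = {
        "feature": 2 * error * params["head"] * x,
        "head": 2 * error * hidden,
    }
    return {name: value if name in frozen else value - lr * grads[name]
            for name, value in params.items()}


# Hypothetical "pretrained" weights.
pretrained = {"feature": 1.0, "head": 0.5}

# Frozen tuning: only the classifier head moves.
frozen_result = sgd_step(pretrained, x=1.0, y=2.0, lr=0.1, frozen=("feature",))

# Fine-tuning: every weight moves.
tuned_result = sgd_step(pretrained, x=1.0, y=2.0, lr=0.1)

print(frozen_result)
print(tuned_result)
```

In both cases the head moves toward the target, but only the fine-tuned run changes the feature weight. This is also why fine-tuning gets a smaller learning rate (`fine_lr`) in the configuration.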


Now we'll make some helpers to compare the tuned performance for three possible scenarios:

    1. Freeze features with a new classifier
    2. The same as the above, but then fine-tune *all weights* including the new final layer
    3. Just fine-tuning all the general weights from the start

In [3]:
PERFORMANCE_COLUMNS = ['Tuned', 'Frozen', 'Frozen then Tuned']
def tuning_performance_comparison(training_data, testing_data, model):
    just_tune_model = model.clone()
    just_tune_performance = fine_tuning(training_data, testing_data, just_tune_model)

    freeze_performance = frozen_tuning(training_data, testing_data, model)
    freeze_then_tune = fine_tuning(training_data, testing_data, model)

    return dict(zip(PERFORMANCE_COLUMNS, (just_tune_performance, freeze_performance, freeze_then_tune)))
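
The order of operations and the single `clone()` call in this helper matter, because training modifies a model in place. The sketch below uses a hypothetical `ToyModel` that records which training stages it has seen; it stands in for a real network and is not a DN3 class. It shows that the "just tune" branch starts from the general weights, while the "frozen then tuned" branch deliberately continues from the frozen-tuned model.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a network: it only records its training history."""

    def __init__(self):
        self.history = ["general"]

    def clone(self):
        return copy.deepcopy(self)


def tune(model, stage):
    # Training mutates the model in place, as with real weights.
    model.history.append(stage)
    return list(model.history)


def compare(model):
    just_tuned = tune(model.clone(), "fine-tune")  # independent copy
    frozen = tune(model, "frozen")                 # mutates `model`
    frozen_then_tuned = tune(model, "fine-tune")   # continues from frozen
    return just_tuned, frozen, frozen_then_tuned


general = ToyModel()
just_tuned, frozen, frozen_then_tuned = compare(general.clone())

print(just_tuned)         # ['general', 'fine-tune']
print(frozen)             # ['general', 'frozen']
print(frozen_then_tuned)  # ['general', 'frozen', 'fine-tune']
print(general.history)    # ['general']
```

Passing a clone into the comparison keeps the generally trained model untouched, so the next person starts from the same weights.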

Now everything runs much the same as in our basic process, with person-specific performance reporting, except that
this time we compare the different tuning techniques instead of just evaluating the model on our test person.

In [4]:
import tqdm
results = list()
for training, validation, test in tqdm.tqdm(dataset.lmso(ds_config.folds), total=ds_config.folds,
                                            desc="LMSO", unit='fold'):
    tidnet = TIDNet.from_dataset(dataset)
    # `use_gpu` is a top-level experiment option, not part of the dataset configuration
    general_process = StandardClassification(tidnet, cuda=experiment.use_gpu, learning_rate=ds_config.lr)

    # General training
    tqdm.tqdm.write("General training...")
    general_process.fit(training_dataset=training, validation_dataset=validation, **ds_config.train_params)

    # Tuning
    tqdm.tqdm.write("Fine tuning...")
    for _, _, test_thinker in test.loso():
        # Now split the test_thinker further for training and testing (the middle return value would be validation)
        tune_train, _, tune_test = test_thinker.split(test_frac=experiment.test_fraction, validation_frac=0)

        performance = tuning_performance_comparison(tune_train, tune_test, tidnet.clone())
        best_perf = max(performance.values())
        tqdm.tqdm.write("Evaluated person {}, Best performance: {:.2%}".format(test_thinker.person_id, best_perf))

        summary = {'Person':test_thinker.person_id,
                   "Before Tuning": general_process.evaluate(test_thinker)['Accuracy'],
                   'Best Result': best_perf}
        summary.update(performance)
        results.append(summary)
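
The loop above nests three levels of splitting: folds of people, then each held-out person, then that person's own trials. Here is a plain-Python sketch of the same structure. The functions `lmso_folds` and `split_trials`, the way people are assigned to folds, and the counts of people and trials are all assumptions for illustration; DN3's `lmso()`, `loso()` and `split()` may divide the data differently.

```python
def lmso_folds(people, folds):
    """Yield (train, validation, test) lists of people, one triple per fold.

    Hypothetical scheme: people are dealt round-robin into `folds` chunks,
    one chunk is the test set and the next chunk is the validation set.
    """
    chunks = [people[i::folds] for i in range(folds)]
    for i in range(folds):
        j = (i + 1) % folds
        train = [p for k, chunk in enumerate(chunks) if k not in (i, j) for p in chunk]
        yield train, chunks[j], chunks[i]


def split_trials(trials, test_fraction):
    """Split one person's trials into a tuning set and a held-out test set."""
    cut = round(len(trials) * (1 - test_fraction))
    return trials[:cut], trials[cut:]


# Illustrative data: 10 people with 10 trials each.
people = ["S{:03d}".format(i) for i in range(1, 11)]
trials_per_person = {person: list(range(10)) for person in people}

plan = []
for train, validation, test in lmso_folds(people, folds=5):
    # General training would use `train` and `validation` here.
    for person in test:
        tune_train, tune_test = split_trials(trials_per_person[person], test_fraction=0.5)
        plan.append((person, len(train), len(tune_train), len(tune_test)))

print(len(plan))  # 10
```

Every person appears in exactly one test fold, so each person is tuned and evaluated exactly once, always with a model that never saw that person during general training.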


Let's use a `DataFrame` this time to compare the performances a little more effectively.

In [5]:
from pandas import DataFrame
results = DataFrame(results)
for tune_option in PERFORMANCE_COLUMNS:
    results[tune_option + ' Improvement'] = results[tune_option] - results['Before Tuning']
print(results.describe())

     before_tuning  freeze_then_tune    frozen person     tuned
0         0.523810          0.666667  0.523810   S087  0.761905
1         0.714286          0.809524  0.809524   S109  0.714286
2         0.738095          0.904762  0.952381   S011  0.857143
3         0.833333          0.809524  0.904762   S081  1.000000
4         0.523810          0.952381  0.714286   S042  0.809524
5         0.809524          0.809524  0.904762   S034  0.904762
6         0.523810          0.666667  0.619048   S024  0.761905
7         0.690476          0.523810  0.428571   S066  0.761905
8         0.714286          0.666667  0.523810   S080  0.714286
9         0.952381          0.952381  0.904762   S020  0.761905
10        0.690476          0.952381  0.904762   S006  0.952381
11        1.000000          1.000000  1.000000   S105  1.000000
12        0.833333          0.952381  0.952381   S041  0.952381
13        1.000000          1.000000  1.000000   S040  1.000000
14        0.952381          0.952381  1.