Copyright (c) 2020-2021 Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# HPO for Fine-Tuning Pre-trained Language Models


## 1. Introduction


In this notebook, we demonstrate a procedure for troubleshooting hyperparameter optimization (HPO) failures when fine-tuning pre-trained language models. The procedure was introduced in the following paper:

*An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models. Xueqing Liu, Chi Wang. To appear in ACL-IJCNLP 2021*

FLAML requires `Python>=3.6`. To run this notebook example, please install flaml with the `notebook` option:
```bash
pip install flaml[notebook]
```

In [None]:
!pip install flaml[notebook];


## 2. Initial Experimental Study (Section 4)

### Installing dependencies

Install the package dependencies that are not covered by `flaml[notebook]`:

In [None]:
!pip install ray transformers datasets torch


### Load dataset

Load the dataset using `AutoTransformers.prepare_data`. In this notebook, we use the Recognizing Textual Entailment (RTE) task from the GLUE benchmark (`glue:rte`) with the ELECTRA base discriminator as an example:

In [None]:
from flaml.nlp.autotransformers import AutoTransformers
autohf = AutoTransformers()
preparedata_setting = {
    "dataset_subdataset_name": "glue:rte",
    "pretrained_model_size": "electra-base-discriminator:base",
    "data_root_path": "data/",
    "max_seq_length": 128,
}
autohf.prepare_data(**preparedata_setting)
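
The two string settings above follow a `name:qualifier` pattern. As a minimal sketch of how such a value splits into its parts, the snippet below uses a hypothetical helper, `split_spec`, which is not part of FLAML; how `prepare_data` parses these strings internally is an assumption here.

```python
from typing import Tuple


def split_spec(spec: str) -> Tuple[str, str]:
    """Split a 'name:qualifier' string into its two parts (hypothetical helper, not a FLAML API)."""
    head, sep, tail = spec.partition(":")
    if not sep or not head or not tail:
        raise ValueError(f"expected 'name:qualifier', got {spec!r}")
    return head, tail


# The values used in this notebook
dataset_name, subdataset_name = split_spec("glue:rte")
model_name, model_size = split_spec("electra-base-discriminator:base")
print(dataset_name, subdataset_name, model_name, model_size)
```

A value without a colon raises a `ValueError`, which makes a typo in the setting visible before any download or training starts.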

Set the time budget (in seconds) for running HPO. In the paper, the time budget is set equal to the time taken by grid search:

In [None]:
time_budget = 420
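
To make the relationship between grid search time and the HPO budget concrete, here is a minimal sketch. The helper names, the example grid, and the per-trial time are illustrative assumptions; they are not taken from the paper and do not explain how the value 420 was obtained.

```python
from itertools import product
from typing import Dict, Sequence


def grid_search_seconds(grid: Dict[str, Sequence], seconds_per_trial: float) -> float:
    """Estimate sequential grid search time: one trial per grid point."""
    num_trials = len(list(product(*grid.values())))
    return num_trials * seconds_per_trial


def hpo_time_budget(grid_search_time: float, multiple: float = 1.0) -> float:
    """Express the HPO budget as a multiple of the grid search time."""
    return multiple * grid_search_time


# Illustrative numbers only (assumed, not from the paper)
example_grid = {
    "learning_rate": [1e-5, 2e-5, 3e-5],
    "per_device_train_batch_size": [16, 32],
}
gst = grid_search_seconds(example_grid, seconds_per_trial=60.0)
print(hpo_time_budget(gst, multiple=1))
```

In practice the grid search time would be measured from an actual run rather than estimated, since the time per trial varies with batch size and number of epochs.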

Run HPO with a time budget of 1GST (one times the grid search time):

In [None]:
autohf_settings = {
    "resources_per_trial": {"gpu": 1, "cpu": 1},
    "num_samples": -1,  # unlimited sample size
    "time_budget": time_budget,
    "ckpt_per_epoch": 5,
    "fp16": True,
}
validation_metric, analysis = autohf.fit(**autohf_settings)
print(validation_metric)
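
With `num_samples` set to -1, the number of trials is uncapped and the time budget is what ends the search. The toy random search below sketches that interaction. It is not FLAML's search algorithm: the function `time_budgeted_search`, the search space, and the objective are all assumptions made for illustration.

```python
import math
import random
import time
from typing import Callable, Dict, Optional, Tuple


def time_budgeted_search(
    objective: Callable[[Dict[str, float]], float],
    time_budget: float,
    num_samples: int = -1,
    seed: int = 0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[Dict[str, float]], float, int]:
    """Toy random search: sample configs until the time budget or the sample cap is hit."""
    rng = random.Random(seed)
    start = clock()
    best_config, best_score, trials = None, float("-inf"), 0
    # num_samples < 0 means no cap on trials; only the time budget stops the loop
    while num_samples < 0 or trials < num_samples:
        if clock() - start >= time_budget:
            break
        config = {
            "learning_rate": 10 ** rng.uniform(-6, -3),  # log-uniform sample
            "num_train_epochs": rng.choice([1, 2, 3, 4, 5]),
        }
        score = objective(config)
        trials += 1
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, trials


def toy_objective(config: Dict[str, float]) -> float:
    """Made-up score that peaks near a learning rate of about 3e-5 (illustration only)."""
    return -abs(math.log10(config["learning_rate"]) + 4.5)


best, score, n_trials = time_budgeted_search(toy_objective, time_budget=1.0, num_samples=50)
print(best, score, n_trials)
```

The `clock` parameter exists only so the stopping rule can be checked with a fake clock; a real run would use the default.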
