# Training a "Robust" Adapter with AdapterDrop

This notebook extends our quickstart adapter training notebook to illustrate how we can use AdapterDrop
to train a robust adapter, i.e., an adapter that allows us to dynamically drop layers for faster multi-task inference.
Please have a look at the original adapter training notebook for more details on the setup.

## Installation

First, let's install the required libraries:

In [1]:
!pip install git+https://github.com/Adapter-Hub/adapter-transformers.git
!pip install datasets

Collecting git+https://github.com/Adapter-Hub/adapter-transformers.git
  Cloning https://github.com/Adapter-Hub/adapter-transformers.git to /private/var/folders/0y/ybdccp1j41142qkg4cs84snm0000gn/T/pip-req-build-k2vvz3_x
  Running command git clone -q https://github.com/Adapter-Hub/adapter-transformers.git /private/var/folders/0y/ybdccp1j41142qkg4cs84snm0000gn/T/pip-req-build-k2vvz3_x
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Collecting sentencepiece==0.1.91
  Downloading sentencepiece-0.1.91-cp37-cp37m-macosx_10_6_x86_64.whl (1.1 MB)
Collecting tokenizers==0.9.3
  Downloading tokenizers-0.9.3-cp37-cp37m-macosx_10_11_x86_64.whl (2.0 MB)
Building wheels for collected packages: adapter-transf

In [2]:
from datasets import load_dataset
from transformers import RobertaTokenizer

dataset = load_dataset("rotten_tomatoes")
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

def encode_batch(batch):
  """Encodes a batch of input data using the model tokenizer."""
  return tokenizer(batch["text"], max_length=80, truncation=True, padding="max_length")

# Encode the input data
dataset = dataset.map(encode_batch, batched=True)
# The transformers model expects the target class column to be named "labels"
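# (rename_column returns a new dataset; the in-place rename_column_ is not available in recent datasets releases)
dataset = dataset.rename_column("label", "labels")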
# Transform to pytorch tensors and only output the required columns
dataset.set_format(type="torch", columns=["input_ids", "attention_mask", "labels"])

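To make the effect of `max_length=80`, `truncation=True` and `padding="max_length"` in `encode_batch` more concrete, here is a minimal pure-Python sketch of fixed-length encoding. It does not call the real tokenizer: `pad_or_truncate` is a hypothetical helper, the token ids are made up, and the padding id of 1 is assumed to match the RoBERTa vocabulary. The real tokenizer additionally preserves special tokens when truncating.

```python
def pad_or_truncate(token_ids, max_length=80, pad_id=1):
    """Toy stand-in for fixed-length encoding (hypothetical, not the tokenizer API)."""
    # Truncation: keep at most max_length ids
    ids = list(token_ids)[:max_length]
    # The attention mask marks real tokens with 1 and padding with 0
    attention_mask = [1] * len(ids)
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": attention_mask + [0] * n_pad,
    }

# A short (made-up) sequence is padded, a long one is truncated
short = pad_or_truncate([0, 713, 16, 2])
long = pad_or_truncate(range(200))
print(len(short["input_ids"]), sum(short["attention_mask"]))
print(len(long["input_ids"]), sum(long["attention_mask"]))
```

Every example ends up with exactly 80 ids, which is what lets `set_format` return equally shaped tensors for batching.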

## Training

In [None]:
from transformers import RobertaConfig, RobertaModelWithHeads, AdapterType

config = RobertaConfig.from_pretrained(
    "roberta-base",
    num_labels=2,
    id2label={ 0: "👎", 1: "👍"},
)
model = RobertaModelWithHeads.from_pretrained(
    "roberta-base",
    config=config,
)

# Add a new adapter
model.add_adapter("rotten_tomatoes", AdapterType.text_task)
# Add a matching classification head
model.add_classification_head("rotten_tomatoes", num_labels=2)
# Activate the adapter
model.train_adapter("rotten_tomatoes")

To dynamically drop adapter layers during training, we make use of Hugging Face's `TrainerCallback`.

In [None]:
import numpy as np
from transformers import TrainingArguments, Trainer, EvalPrediction, TrainerCallback

class AdapterDropTrainerCallback(TrainerCallback):
  def on_step_begin(self, args, state, control, **kwargs):
    # Sample n in [0, 10] and skip the adapters in the first n transformer layers
    skip_layers = list(range(np.random.randint(0, 11)))
    # The adapter name must match the one added above ("rotten_tomatoes")
    kwargs['model'].set_active_adapters("rotten_tomatoes", skip_layers=skip_layers)

  def on_evaluate(self, args, state, control, **kwargs):
    # Deactivate skipping layers during evaluation (otherwise it would use the
    # previous randomly chosen skip_layers and thus yield results not comparable
    # across different epochs)
    kwargs['model'].set_active_adapters("rotten_tomatoes", skip_layers=None)


training_args = TrainingArguments(
    learning_rate=1e-4,
    num_train_epochs=6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    logging_steps=200,
    output_dir="./training_output",
    overwrite_output_dir=True,
)

def compute_accuracy(p: EvalPrediction):
  preds = np.argmax(p.predictions, axis=1)
  return {"acc": (preds == p.label_ids).mean()}

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_accuracy,
)

trainer.add_callback(AdapterDropTrainerCallback())
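The sampling step inside the callback can be inspected in isolation. The sketch below reproduces only the `list(range(np.random.randint(0, 11)))` logic; `sample_skip_layers` is a hypothetical helper name, and the seeded generator is used purely to make the sketch reproducible. Since `roberta-base` has 12 transformer layers (indices 0 to 11) and at most the first 10 are skipped, the adapters in the last two layers always stay active under this scheme.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility of this sketch

def sample_skip_layers(upper=11):
    """Drop the adapters in the first n layers, with n drawn uniformly from [0, upper)."""
    n = int(rng.integers(0, upper))  # upper bound is exclusive, as in np.random.randint
    return list(range(n))

samples = [sample_skip_layers() for _ in range(1000)]
lengths = [len(s) for s in samples]
print("fewest layers dropped:", min(lengths))
print("most layers dropped:", max(lengths))
print("one sampled configuration:", samples[0])
```

Each sampled configuration is a contiguous prefix of layer indices, so the adapter is trained under every drop depth it may later be used with at inference time.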

We can now train and evaluate our robustly trained adapter!

In [None]:
trainer.train()
trainer.evaluate()
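The value reported under `acc` comes from `compute_accuracy`, which can be sanity-checked on toy inputs without running the model. In this sketch, `ToyPrediction` is a hypothetical stand-in that exposes only the two fields the function reads (`predictions` and `label_ids`), and the logits and labels are made up.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for EvalPrediction with the two fields used below
ToyPrediction = namedtuple("ToyPrediction", ["predictions", "label_ids"])

def compute_accuracy(p):
    # Pick the class with the highest logit for each example
    preds = np.argmax(p.predictions, axis=1)
    # Fraction of examples where the predicted class equals the label
    return {"acc": (preds == p.label_ids).mean()}

# Made-up logits for four examples and two classes
logits = np.array([[2.0, -1.0], [0.1, 0.3], [1.5, 0.2], [-0.5, 0.4]])
labels = np.array([0, 1, 1, 1])

result = compute_accuracy(ToyPrediction(logits, labels))
print(result)
```

Three of the four toy predictions match their labels, so the reported accuracy is 0.75.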