# Test Adapters

In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
import torch
from transformers import BertTokenizer, BertForSequenceClassification, AutoModel, AutoConfig

In [13]:


# log at INFO level so that model and adapter loading steps are printed
import logging
logging.basicConfig(level=logging.INFO)

# load pre-trained BERT tokenizer from Huggingface
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# example inputs: one positive and one negative sentence
sentence = "It's also, clearly, great fun."
sentence_negative = "this is really the worst day in history. it sucks"

# tokenize each sentence, map the tokens to vocabulary indices (encode() also adds
# the [CLS] and [SEP] special tokens), and wrap the result in a batch dimension
input_tensor = torch.tensor([tokenizer.encode(sentence)])
input_tensor_negative = torch.tensor([tokenizer.encode(sentence_negative)])

# load pre-trained BERT model from Huggingface
# the `BertForSequenceClassification` class includes a prediction head for sequence classification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

INFO:transformers.tokenization_utils:loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /Users/jonathanhilgart/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
INFO:transformers.configuration_utils:loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /Users/jonathanhilgart/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
INFO:transformers.configuration_utils:Model config BertConfig {
  "adapters": {
    "adapters": {},
    "config_map": {}
  },
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_siz

In [10]:
input_tensor

tensor([[ 101, 2009, 1005, 1055, 2036, 1010, 4415, 1010, 2307, 4569, 1012,  102]])
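The tensor above holds one id per word piece plus two special tokens that `tokenizer.encode` adds by default. The following is a small sketch in plain Python (no model download needed) that separates the two. It assumes that 101 and 102 are the `[CLS]` and `[SEP]` ids of the `bert-base-uncased` vocabulary; the names `CLS_ID`, `SEP_ID`, and `content_ids` are illustrative only.

```python
# Token ids copied from the `input_tensor` output above
ids = [101, 2009, 1005, 1055, 2036, 1010, 4415, 1010, 2307, 4569, 1012, 102]

# Assumption: 101 and 102 are the [CLS] and [SEP] ids in bert-base-uncased
CLS_ID, SEP_ID = 101, 102

# The sequence starts with [CLS] and ends with [SEP]
starts_with_cls = ids[0] == CLS_ID
ends_with_sep = ids[-1] == SEP_ID

# Drop the special tokens to keep only the ids of the sentence's word pieces
content_ids = [i for i in ids if i not in (CLS_ID, SEP_ID)]

print(len(ids), len(content_ids))  # prints: 12 10
```

The ten remaining ids correspond to the ten word pieces of the sentence, with punctuation and the apostrophe counted as separate pieces.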

# Add a pre-trained task adapter from Adapter Hub
- Because we’re doing sentiment classification, we use an adapter trained on the SST-2 dataset.

In [14]:
# load pre-trained task adapter from Adapter Hub
# this method call will also load a pre-trained classification head for the adapter task
adapter_name = model.load_adapter('sst-2@ukp', config='pfeiffer')

# activate the adapter we just loaded, so that it is used in every forward pass
model.set_active_adapters(adapter_name)

# predict output tensor
outputs = model(input_tensor)
ouputs_negative_sentiment = model(input_tensor_negative)

# retrieve the predicted class label (for SST-2, 1 = positive and 0 = negative)
predicted = torch.argmax(outputs[0]).item()
assert predicted == 1

INFO:transformers.adapter_utils:Found matching adapter at: adapters/ukp/bert-base-uncased_sentiment_sst-2_pfeiffer.json
INFO:transformers.adapter_utils:Resolved adapter files at https://public.ukp.informatik.tu-darmstadt.de/AdapterHub/text_task/sst/bert-base-uncased/pfeiffer/bert-base-uncased_sentiment_sst-2_pfeiffer.zip.
INFO:transformers.adapter_model_mixin:Loading module configuration from /Users/jonathanhilgart/.cache/torch/adapters/f23c9704bc526e1a5c605a1f1c76e7225da0fff90086a7e9483da11de926d624-04066537e8abe7c5ee72d7804a94afca7d5ff566b6731d82c77951cc3493ed8a-extracted/adapter_config.json
INFO:transformers.adapter_config:Adding adapter 'sst-2' of type 'text_task'.
INFO:transformers.adapter_model_mixin:Loading module weights from /Users/jonathanhilgart/.cache/torch/adapters/f23c9704bc526e1a5c605a1f1c76e7225da0fff90086a7e9483da11de926d624-04066537e8abe7c5ee72d7804a94afca7d5ff566b6731d82c77951cc3493ed8a-extracted/pytorch_adapter.bin
INFO:transformers.adapter_model_mixin:Loading modul

In [12]:
outputs

(tensor([[-4.2206,  3.9714]], grad_fn=<AddmmBackward>),)

In [15]:
ouputs_negative_sentiment

(tensor([[ 3.8247, -3.5260]], grad_fn=<AddmmBackward>),)

In [16]:
torch.argmax(ouputs_negative_sentiment[0]).item()

0
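The values printed above are raw logits, so `argmax` gives the predicted class but not how confident the model is. As a minimal sketch, the logits can be turned into probabilities with a softmax. The example below uses plain Python on the numbers copied from the outputs above, so it runs without loading the model; the `softmax` helper and the `LABELS` mapping (0 = negative, 1 = positive, as in SST-2) are illustrative assumptions. With tensors, `torch.softmax(outputs[0], dim=-1)` performs the same computation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits copied from the notebook outputs above
logits_positive = [-4.2206, 3.9714]
logits_negative = [3.8247, -3.5260]

# Assumed label order for the SST-2 head: 0 = negative, 1 = positive
LABELS = {0: "negative", 1: "positive"}

for logits in (logits_positive, logits_negative):
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)  # argmax
    print(LABELS[label], [round(p, 4) for p in probs])
```

Both sentences are classified with a probability above 0.99 for the predicted class, which matches the large gap between the two logits in each output.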

# Adapter Types
- Task adapter: Task adapters are fine-tuned to learn representations for a specific downstream task, such as sentiment analysis or question answering. Task adapters for NLP were first introduced by Houlsby et al., 2019.

- Language adapter: Language adapters are used to learn language-specific transformations. After being trained on a language modeling task, a language adapter can be stacked before a task adapter for training on a downstream task. To perform zero-shot cross-lingual transfer, one language adapter can simply be replaced by another. In terms of architecture, language adapters are largely similar to task adapters, except for an additional invertible adapter layer after the embedding layer. This setup was introduced and is further explained by Pfeiffer et al., 2020.
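
The stacking described above can be sketched with a toy example. This is not the adapter-transformers API: `BottleneckAdapter`, the `adapters` dictionary, and `forward` are hypothetical names, the weights are random rather than trained, and the invertible adapter layer after the embeddings is left out. Each adapter here is a bottleneck layer (down-projection, nonlinearity, up-projection, residual connection) in the style of Houlsby et al., 2019, applied to hidden states of size 768 as in the BERT config logged earlier.

```python
import numpy as np

class BottleneckAdapter:
    """Toy adapter: down-project, ReLU, up-project, then add the residual."""

    def __init__(self, hidden_size, bottleneck_size, rng):
        self.w_down = rng.normal(scale=0.02, size=(hidden_size, bottleneck_size))
        self.w_up = rng.normal(scale=0.02, size=(bottleneck_size, hidden_size))

    def __call__(self, hidden):
        z = np.maximum(hidden @ self.w_down, 0.0)  # ReLU in the bottleneck
        return hidden + z @ self.w_up              # residual connection

rng = np.random.default_rng(0)
hidden_size, bottleneck_size = 768, 48

# Hypothetical registry: two language adapters and one task adapter
adapters = {
    "lang_en": BottleneckAdapter(hidden_size, bottleneck_size, rng),
    "lang_de": BottleneckAdapter(hidden_size, bottleneck_size, rng),
    "task_sentiment": BottleneckAdapter(hidden_size, bottleneck_size, rng),
}

def forward(hidden, language, task="task_sentiment"):
    """Apply the language adapter first, then the task adapter stacked on top."""
    return adapters[task](adapters[language](hidden))

# Stand-in for the hidden states of a 12-token sentence
hidden = rng.normal(size=(12, hidden_size))

out_en = forward(hidden, "lang_en")
out_de = forward(hidden, "lang_de")  # zero-shot transfer: swap only the language adapter
print(out_en.shape, out_de.shape)
```

The task adapter is shared between both calls; only the language adapter changes, which is the idea behind replacing one language adapter with another for cross-lingual transfer.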