# 2️⃣ Using Adapters from AdapterHub

In [the previous notebook](https://colab.research.google.com/github/Adapter-Hub/adapter-transformers/blob/master/notebooks/01_Adapter_Training.ipynb), we saw how to train our own adapter for a downstream task. In this notebook, we'll go through the steps to use adapters that others have trained and shared on _AdapterHub_ for **inference**.

We will use an adapter for BERT [trained on the SQuAD task](https://adapterhub.ml/explore/qa/squad1/bert/) for **extractive question answering**. This adapter achieves an F1 score of 87.75 on the dev set of SQuAD 1.1, nearly on par with full finetuning.
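
To give the F1 figure some meaning, here is a minimal sketch of a token-overlap F1 between a predicted answer and a reference answer. It is a simplification and not the official SQuAD evaluation script: the helper name `token_f1` is made up for this illustration, and it only lowercases and splits on whitespace, whereas the official script also strips punctuation and articles. The dataset-level score is, roughly, this value averaged over all questions and scaled to 0-100.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Simplified for illustration: lowercases and splits on whitespace only.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; exactly one empty is a miss.
        return float(pred_tokens == ref_tokens)

    # Multiset intersection: a shared token counts at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# One extra token in the prediction lowers precision to 5/6; recall stays 1.
print(round(token_f1("full fine-tuning of the entire model",
                     "fine-tuning of the entire model"), 3))  # prints 0.909
```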

As you will see, most of the code is identical to using fully finetuned models with `transformers`.

## Installation

Let's install the `adapter-transformers` library first, for example with `pip install -U adapter-transformers`:

## Usage

Before loading the adapter, we instantiate the model we want to use, a pre-trained `bert-base-uncased` model from HuggingFace. We use `adapter-transformers`'s `AutoModelWithHeads` class to be able to add a prediction head flexibly.

In [1]:
from transformers import AutoTokenizer, AutoModelWithHeads

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelWithHeads.from_pretrained("bert-base-uncased")

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModelWithHeads: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertModelWithHeads from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModelWithHeads from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


And now to the interesting part: using `load_adapter()`, we download and add a pre-trained adapter from the Hub. The first parameter specifies the identifier of the adapter, whereas the `config` parameter selects the [adapter architecture](https://docs.adapterhub.ml/adapters.html#adapter-architectures) to search for.

Also note that most adapters come with a prediction head included. Thus, this method will also load the question answering head trained together with the adapter.

In [4]:
adapter_name = model.load_adapter("qa/squad1@ukp", config="houlsby")

Overwriting existing adapter 'squad'.
Overwriting existing head 'squad'
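
The identifier `qa/squad1@ukp` can be read as task, subtask, and the organization that shared the adapter. The sketch below only illustrates that naming scheme: `AdapterId` and `parse_adapter_id` are hypothetical helpers written for this notebook, not part of `adapter-transformers`, and the library's own resolution logic may differ.

```python
from typing import NamedTuple, Optional


class AdapterId(NamedTuple):
    """Parts of a Hub identifier of the assumed form '<task>/<subtask>@<org>'."""
    task: str
    subtask: Optional[str]
    org: Optional[str]


def parse_adapter_id(identifier: str) -> AdapterId:
    """Split an identifier such as 'qa/squad1@ukp' into its parts (illustrative)."""
    # Everything after '@' names the organization, if present.
    name, _, org = identifier.partition("@")
    # Everything before '/' is the task, the rest is the subtask.
    task, _, subtask = name.partition("/")
    if not task:
        raise ValueError(f"missing task in identifier: {identifier!r}")
    return AdapterId(task=task, subtask=subtask or None, org=org or None)


print(parse_adapter_id("qa/squad1@ukp"))
# prints AdapterId(task='qa', subtask='squad1', org='ukp')
```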


With `set_active_adapters()` we tell our model to use the adapter we just loaded in every forward pass.

In [5]:
model.set_active_adapters(adapter_name)
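
To build some intuition for what an active adapter contributes to a forward pass, here is a toy sketch of a bottleneck adapter in plain Python. It is not the library's implementation: it omits biases and layer normalization, uses ReLU as an assumed nonlinearity, and ignores where exactly the `houlsby` configuration places adapters inside each Transformer layer. All function names are made up for this illustration.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector: Vector) -> Vector:
    return [max(0.0, x) for x in vector]


def bottleneck_adapter(hidden: Vector, down: Matrix, up: Matrix) -> Vector:
    """Down-project, apply a nonlinearity, up-project, add the residual."""
    bottleneck = relu(matvec(down, hidden))  # hidden_size -> bottleneck_size
    update = matvec(up, bottleneck)          # bottleneck_size -> hidden_size
    return [h + u for h, u in zip(hidden, update)]


hidden_size, bottleneck_size = 8, 2  # toy sizes; BERT-base has hidden size 768
rng = random.Random(0)
down = [[rng.uniform(-1, 1) for _ in range(hidden_size)]
        for _ in range(bottleneck_size)]
up_zero = [[0.0] * bottleneck_size for _ in range(hidden_size)]
hidden = [rng.uniform(-1, 1) for _ in range(hidden_size)]

# With an all-zero up-projection the adapter leaves the hidden state unchanged.
assert bottleneck_adapter(hidden, down, up_zero) == hidden

# The two projections hold 2 * hidden_size * bottleneck_size weights.
print(2 * hidden_size * bottleneck_size)  # prints 32
```

The residual connection is the key design choice in this sketch: the adapter only adds a small learned update on top of the frozen model's hidden state, which is why it needs far fewer parameters than the layer it sits in.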

In [18]:
model

BertModelWithHeads(
  (bert): BertModel(
    (invertible_adapters): ModuleDict()
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNo

Now let's see our adapter in action! We create a question answering pipeline using our model and define some context text:

In [14]:
from transformers import QuestionAnsweringPipeline

qa = QuestionAnsweringPipeline(model=model, tokenizer=tokenizer)

context = """
Author is EasonC13, The current modus operandi in NLP involves downloading and fine-tuning pre-trained models consisting of millions or billions of parameters.
Storing and sharing such large trained models is expensive, slow, and time-consuming, which impedes progress towards more general and versatile NLP methods that learn from and for many tasks.
Adapters -- small learnt bottleneck layers inserted within each layer of a pre-trained model -- ameliorate this issue by avoiding full fine-tuning of the entire model.
However, sharing and integrating adapter layers is not straightforward.
We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
The framework, built on top of the popular HuggingFace Transformers library, enables extremely easy and quick adaptations of state-of-the-art pre-trained models (e.g., BERT, RoBERTa, XLM-R) across tasks and languages.
Downloading, sharing, and training adapters is as seamless as possible using minimal changes to the training scripts and a specialized infrastructure.
Our framework enables scalable and easy access to sharing of task-specific models, particularly in low-resource scenarios.
AdapterHub includes all recent adapter architectures and can be found at AdapterHub.ml.
"""

In [15]:
# ignore all FutureWarnings
from warnings import simplefilter
simplefilter(action='ignore', category=FutureWarning)

Finally, we can ask our model some questions about AdapterHub:

In [17]:
def answer_questions(questions):
  for question in questions:
    result = qa(question=question, context=context)
    print("❔", question)
    print("💡", result["answer"])
    print()

answer_questions([
  "What are Adapters?",
  "What do Adapters avoid?",
  "What is proposed?",
  "What does AdapterHub allow?",
  "Where can I find AdapterHub?",
  "Who is the author?",
])

❔ What are Adapters?
💡 small learnt bottleneck layers inserted within each layer of a pre-trained model

❔ What do Adapters avoid?
💡 full fine-tuning of the entire model

❔ What is proposed?
💡 AdapterHub

❔ What does AdapterHub allow?
💡 dynamic "stitching-in"

❔ Where can I find AdapterHub?
💡 AdapterHub.ml

❔ Who is the author?
💡 EasonC13
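
Every answer above is a span copied verbatim from the context, which is what makes the task extractive. A QA head of this kind scores each token as a possible answer start and as a possible answer end, and the answer is the best-scoring valid span. The sketch below shows that selection step under simplifying assumptions: `best_span` is a hypothetical helper, the scores are made up, and the real pipeline additionally handles tokenization, long contexts, and probability normalization.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int = 15) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing start score + end score.

    Only spans with start <= end and at most max_answer_len tokens are valid.
    """
    if len(start_scores) != len(end_scores) or not start_scores:
        raise ValueError("score lists must be non-empty and equally long")
    best, best_score = (0, 0), float("-inf")
    for start, s_score in enumerate(start_scores):
        # Restrict the end index to the allowed answer length.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = s_score + end_scores[end]
            if score > best_score:
                best, best_score = (start, end), score
    return best


tokens = ["adapterhub", "can", "be", "found", "at", "adapterhub", ".", "ml"]
start_scores = [0.1, 0.0, 0.0, 0.2, 0.3, 4.0, 0.1, 0.5]  # made-up scores
end_scores = [0.9, 0.0, 0.0, 0.1, 0.2, 1.0, 0.3, 3.5]    # made-up scores

start, end = best_span(start_scores, end_scores)
print(" ".join(tokens[start:end + 1]))  # prints adapterhub . ml
```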



That's it! Of course, there are many more adapters available on _AdapterHub_ beyond QA adapters. Click through [our Explore page](https://adapterhub.ml/explore/) to discover all of them.

➡️ Also, the possibilities of using adapters don't stop here! Check out [the next notebook](https://colab.research.google.com/github/Adapter-Hub/adapter-transformers/blob/master/notebooks/03_Adapter_Fusion.ipynb) to see how multiple adapters can be combined for transfer learning.