# Parallel Adapter Inference

The [`Parallel` adapter composition block](https://docs.adapterhub.ml/adapter_composition.html#parallel) forwards the same input through multiple adapters in parallel within a single forward pass. This is useful for multi-task inference, where several tasks are performed on the same input sequence. The `Parallel` block can also be used to _train_ multiple adapters in parallel.

In this example, we use `Parallel` to simultaneously run named entity recognition (NER) and sentiment classification on some input sentences.
We leverage two adapters trained independently on the _CoNLL2003_ task for NER and the _SST-2_ task for sentiment analysis.
Both adapters are freely available via [HuggingFace's Model Hub](https://huggingface.co/models?library=adapters&sort=downloads).


## Installation

Let's install the `adapters` library first:

In [4]:
!pip install -Uq adapters

## Usage

Before loading the adapters, we instantiate the model we want to use, a pre-trained `roberta-base` model from HuggingFace. We use the `AutoAdapterModel` class from `adapters` so that prediction heads can be added flexibly.

In [None]:
from adapters import AutoAdapterModel
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoAdapterModel.from_pretrained("roberta-base")

Using `load_adapter()`, we download and add pre-trained adapters. In our example, the adapters are hosted on the HuggingFace Model Hub, so we pass `source="hf"` to the loading method.

Also note that most adapters come with a prediction head included. Thus, this method will also load the heads trained together with each adapter.

In [None]:
ner_adapter = model.load_adapter("AdapterHub/roberta-base-pf-conll2003", source="hf")
sentiment_adapter = model.load_adapter("AdapterHub/roberta-base-pf-sst2", source="hf")

Now's when the `Parallel` block comes into play: With `set_active_adapters()`, we specify an adapter setup that uses the two adapters we just loaded in parallel.

In [9]:
import adapters.composition as ac

model.set_active_adapters(ac.Parallel(ner_adapter, sentiment_adapter))

With everything set up, the only thing left to do is to run the model. For this purpose, we use a small helper function that calls the model's forward pass and processes the outputs of the two prediction heads.

In [11]:
import torch

def analyze_sentence(sentence):
  # Split the sentence into RoBERTa subword tokens and map them to vocabulary ids
  tokens = tokenizer.tokenize(sentence)
  input_ids = torch.tensor(tokenizer.convert_tokens_to_ids(tokens))
  # Inference only, so no gradients are needed
  with torch.no_grad():
    outputs = model(input_ids)

  # Post-process NER output: outputs[0] belongs to the first adapter in the Parallel block
  ner_labels_map = model.get_labels_dict(ner_adapter)
  ner_label_ids = torch.argmax(outputs[0].logits, dim=2).numpy().squeeze().tolist()
  annotated = []
  for token, label_id in zip(tokens, ner_label_ids):
    # Strip the "\u0120" marker RoBERTa puts in front of tokens that follow a space
    token = token.replace('\u0120', '')
    label = ner_labels_map[label_id]
    annotated.append(f"{token}<{label}>")
  print("NER: " + " ".join(annotated))

  # Post-process sentiment output: outputs[1] belongs to the second adapter
  sentiment_labels = model.get_labels_dict(sentiment_adapter)
  label_id = torch.argmax(outputs[1].logits).item()
  print("Sentiment: " + sentiment_labels[label_id])
  print()
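
The helper above prints one tag per token. If entity spans are more convenient than token-level tags, the BIO labels can be merged in a further post-processing step. The following is a minimal sketch and not part of the `adapters` library: `group_entities` is a hypothetical helper, it assumes labels of the form `B-XXX`, `I-XXX` and `O`, and it assumes that tokens starting a new word carry RoBERTa's `\u0120` marker. The example tokens and labels are hand-written for illustration and are not model output.

```python
def group_entities(tokens, labels):
    """Merge token-level BIO labels into (entity_text, entity_type) spans."""
    entities = []
    current_words, current_type = [], None

    def flush():
        # Close the entity that is currently being built, if any
        nonlocal current_words, current_type
        if current_words:
            entities.append((" ".join(current_words), current_type))
        current_words, current_type = [], None

    for token, label in zip(tokens, labels):
        starts_word = token.startswith("\u0120")
        text = token.replace("\u0120", "")
        if label == "O":
            flush()
            continue
        prefix, _, ent_type = label.partition("-")
        if prefix == "B" or ent_type != current_type:
            # A B- tag or a change of entity type starts a new span
            flush()
            current_type = ent_type
            current_words = [text]
        elif starts_word:
            # Same entity, new word
            current_words.append(text)
        else:
            # Same entity, subword continuation of the previous word
            current_words[-1] += text
    flush()
    return entities


# Hand-written example input (not produced by the model)
example_tokens = ["The", "\u0120Met", "\u0120Office", "\u0120warns", "\u0120W", "ales"]
example_labels = ["O", "B-ORG", "I-ORG", "O", "B-LOC", "I-LOC"]
print(group_entities(example_tokens, example_labels))
# prints [('Met Office', 'ORG'), ('Wales', 'LOC')]
```

An `I-` tag whose type differs from the running entity is treated as the start of a new span here, which is one of several reasonable conventions for inconsistent tag sequences.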

Let's test our pipeline with some example sentences (taken from the XSum training set):

In [12]:
sentences = [
  "A man in central Germany tried to leave his house by the front door only to find a brick wall there.",
  "The Met Office has issued a yellow weather warning for ice across most of Wales.",
  "A vibrant animation telling stories of indigenous Australia will be projected on to the Sydney Opera House every night at sunset."
]

for sentence in sentences:
  analyze_sentence(sentence)

NER: A<O> man<O> in<O> central<O> Germany<B-LOC> tried<O> to<O> leave<O> his<O> house<O> by<O> the<O> front<O> door<O> only<O> to<O> find<O> a<O> brick<O> wall<O> there<O> .<O>
Sentiment: negative

Sentiment: negative

NER: A<O> vibrant<O> animation<O> telling<O> stories<O> of<O> indigenous<O> Australia<B-LOC> will<O> be<O> projected<O> on<O> to<O> the<O> Sydney<B-LOC> Opera<I-ORG> House<I-LOC> every<O> night<O> at<O> sunset<O> .<O>
Sentiment: positive



Voilà! Each sentence is annotated with NER tags and classified by its sentiment.

**How does it work?** At the first adapter layer it encounters, `adapters` automatically replicates the input once per adapter in the `Parallel` block. Because the input is replicated as late as possible, this mechanism is especially useful if only later Transformer layers include adapters: all layers before that point process the input just once.
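
To make the replication idea concrete, here is a framework-free toy sketch. It is not the library's implementation: it assumes the copies are stacked along the batch dimension, uses nested lists in place of tensors, and all names (`shared_layer`, `replicate`, `split_channels`, `parallel_forward`) as well as the two stand-in "adapters" are hypothetical.

```python
def shared_layer(batch):
    # Stand-in for a Transformer layer without adapters
    return [[x + 1 for x in row] for row in batch]


def replicate(batch, n_adapters):
    # Stack n_adapters copies of the batch along the batch dimension
    return [list(row) for _ in range(n_adapters) for row in batch]


def split_channels(batch, n_adapters):
    # Split a replicated batch back into one chunk per adapter
    size = len(batch) // n_adapters
    return [batch[i * size:(i + 1) * size] for i in range(n_adapters)]


def parallel_forward(batch, n_shared_layers, adapters):
    hidden = batch
    # Layers before the first adapter layer see the un-replicated input
    for _ in range(n_shared_layers):
        hidden = shared_layer(hidden)
    # Replicate only once the first adapter layer is reached
    hidden = replicate(hidden, len(adapters))
    chunks = split_channels(hidden, len(adapters))
    # Each adapter processes its own copy of the hidden states
    return [[adapter(row) for row in chunk]
            for adapter, chunk in zip(adapters, chunks)]


# Two toy "adapters" that transform their copy differently
toy_adapters = [
    lambda row: [x * 2 for x in row],
    lambda row: [x * 10 for x in row],
]

outputs = parallel_forward([[1, 2, 3]], n_shared_layers=2, adapters=toy_adapters)
print(outputs)
# prints [[[6, 8, 10]], [[30, 40, 50]]]
```

The result holds one entry per adapter, mirroring how `outputs[0]` and `outputs[1]` in the example above belong to the NER and sentiment heads respectively.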

**Where to go from here?**

➡️ Make sure to check out the [corresponding chapter in the documentation](https://docs.adapterhub.ml/adapter_composition.html) to learn more about adapter composition and the `Parallel` block.

➡️ See also [Rücklé et al., 2021](https://arxiv.org/pdf/2010.11918.pdf), who use parallel inference in their analysis of adapter efficiency.

➡️ To see more `adapters` features in action, visit our [notebooks folder on GitHub](https://github.com/Adapter-Hub/adapters/tree/master/notebooks).