
# Environment setup

In [None]:
%%capture
!pip install transformers
!pip install datasets
!pip install memory_profiler
%load_ext memory_profiler

# Inference using original code

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("amandakonet/climatebert-fact-checking")
tokenizer = AutoTokenizer.from_pretrained("amandakonet/climatebert-fact-checking")

In [None]:
sample_claim = ['Beginning in 2005, however, polar ice modestly receded for several years']
sample_evidence = ['Polar Discovery "Continued Sea Ice Decline in 2005']

In [None]:
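def predict_using_sample_code(claims, evidences):
  # Tokenize all claim/evidence pairs at once; every pair is padded to 512 tokens
  features = tokenizer(claims,
                    evidences,
                    padding='max_length', truncation=True, return_tensors="pt", max_length=512)

  model.eval()  # switch off dropout for inference
  with torch.no_grad():  # gradients are not needed, which saves memory
    # One forward pass over the whole input as a single batch
    scores = model(**features).logits
    label_mapping = ['entailment', 'contradiction', 'neutral']
    # Pick the highest-scoring class for each pair
    labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
    return labels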
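
The function above returns only labels. As a hedged aside, the sketch below shows in plain Python how one row of logits maps to a label plus a softmax probability. The logit values are made up for illustration, and `softmax` and `label_with_probability` are hypothetical helper names that are not part of the original notebook.

```python
import math

# Same label order as in the notebook's sample code.
LABEL_MAPPING = ['entailment', 'contradiction', 'neutral']

def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def label_with_probability(logits):
    """Return the best label and its softmax probability for one row of logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_MAPPING[best], probs[best]

example_logits = [2.0, 0.5, -1.0]  # illustrative values only, not model output
label, prob = label_with_probability(example_logits)
print(label, round(prob, 3))
```

On real model output, `torch.softmax(scores, dim=1)` performs the same computation for the whole batch.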

## Run inference on a provided sample

In [None]:
%%time
%memit predict_using_sample_code(sample_claim, sample_evidence)

peak memory: 1074.25 MiB, increment: 10.64 MiB
CPU times: user 1.3 s, sys: 81.8 ms, total: 1.38 s
Wall time: 1.92 s


## Load more samples

In [None]:
from datasets import load_dataset
cf_df = load_dataset("amandakonet/climate_fever_adopted", split='test').to_pandas()
input_claims = cf_df['claim'].values.tolist()
input_evidences = cf_df['evidence'].values.tolist()


In [None]:
len(input_claims), len(input_evidences)

(1535, 1535)

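The following cell normally crashes the notebook with an out-of-memory error, because it pads all 1,535 pairs to 512 tokens and pushes them through the model as a single batch.  
On the one run recorded here, the environment happened to have far more RAM than usual, so the cell completed.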

In [None]:
%%time
%memit labels = predict_using_sample_code(input_claims, input_evidences)

peak memory: 70648.12 MiB, increment: 52950.69 MiB
CPU times: user 12min 46s, sys: 6min 37s, total: 19min 23s
Wall time: 3min 48s
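
A rough back-of-the-envelope estimate helps explain the peak above. The sketch below computes the size of the attention-score tensor for a single transformer layer when all pairs are padded to 512 tokens and processed as one batch. The number of attention heads (12) and float32 activations are assumptions, not values read from the model config, and the estimate ignores every other activation, so it only indicates the order of magnitude.

```python
# Rough size of ONE layer's attention-score tensor.
batch_size = 1535      # number of claim/evidence pairs (from the notebook)
seq_len = 512          # max_length used with padding='max_length'
num_heads = 12         # ASSUMPTION: typical for a RoBERTa-base-sized model
bytes_per_float = 4    # ASSUMPTION: float32 activations

def attention_scores_bytes(batch, heads, length, width=bytes_per_float):
    # Attention scores have shape (batch, heads, length, length).
    return batch * heads * length * length * width

full_batch = attention_scores_bytes(batch_size, num_heads, seq_len)
single = attention_scores_bytes(1, num_heads, seq_len)

print(f"full batch : {full_batch / 2**30:.1f} GiB per layer")
print(f"one example: {single / 2**20:.1f} MiB per layer")
```

Under these assumptions, memory grows linearly with the batch size, which is why feeding the examples in small batches keeps the footprint bounded.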


# Use `transformers.pipeline` instead

In [None]:
from transformers import pipeline

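def predict_using_pipelines(claims: list[str], evidences: list[str]) -> tuple[list[str], list[float]]:
    # Generator: yields one claim/evidence pair at a time, so the fully
    # tokenized dataset is never held in memory at once
    def claim_evidence_pair_data():
        for claim, evidence in zip(claims, evidences):
            yield {"text": claim, "text_pair": evidence}

    # device=-1 runs on CPU; padding=True pads only to the longest sequence in a batch
    pipe = pipeline("text-classification", model=model,
                    tokenizer=tokenizer, device=-1,
                    truncation=True, padding=True)
    labels = []
    probs = []
    # The pipeline streams over the generator and yields one result per pair
    for out in pipe(claim_evidence_pair_data(), batch_size=1):
        labels.append(out['label'])
        probs.append(out['score'])
    return labels, probs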
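
Two things in the pipeline version matter for memory: the inputs are streamed in batches of one, and `padding=True` pads only to the longest sequence in each batch instead of to a fixed 512 tokens. The toy sketch below counts padded token positions under both strategies. Whitespace splitting is a stand-in for the real tokenizer, the texts are made up, and the helper names are hypothetical.

```python
def token_count(text):
    # Stand-in for a real tokenizer: one "token" per whitespace-separated word.
    return len(text.split())

def padded_positions_max_length(texts, max_length=512):
    # padding='max_length': every sequence occupies max_length positions.
    return len(texts) * max_length

def padded_positions_longest(texts, batch_size=1):
    # padding=True: each batch is padded only to its own longest sequence.
    total = 0
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        total += len(batch) * max(token_count(t) for t in batch)
    return total

toy_texts = [
    "polar ice modestly receded for several years",
    "sea ice decline continued",
    "global temperatures rose",
]

print(padded_positions_max_length(toy_texts))             # fixed 512 per text
print(padded_positions_longest(toy_texts, batch_size=1))  # no padding at all
print(padded_positions_longest(toy_texts, batch_size=3))  # padded to longest text
```

With `batch_size=1` no padding is added at all, at the cost of one forward pass per example.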

First, we run inference using the pipeline without `xformers` installed.

In [None]:
%%time
%memit pred_labels, pred_probs = predict_using_pipelines(input_claims, input_evidences)

Xformers is not installed correctly. If you want to use memorry_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.


peak memory: 1552.07 MiB, increment: 11.77 MiB
CPU times: user 4min 18s, sys: 2.92 s, total: 4min 21s
Wall time: 44.1 s


## Install `xformers` 

In [None]:
%%capture
!pip install xformers

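With `xformers` installed, the memory increment reported by `%memit` drops from 11.77 MiB to 0.25 MiB, while peak memory and wall time stay essentially the same.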



In [None]:
%%time
%memit pred_labels, pred_probs = predict_using_pipelines(input_claims, input_evidences)

peak memory: 1552.35 MiB, increment: 0.25 MiB
CPU times: user 4min 18s, sys: 2.93 s, total: 4min 21s
Wall time: 43.9 s
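
To put the three runs side by side, the sketch below computes ratios from the numbers printed in the cell outputs above. The single-batch run was recorded on an environment with unusually large RAM, so the ratios are indicative only. `ratio_vs_baseline` is a hypothetical helper name.

```python
# Numbers copied from the cell outputs in this notebook.
runs = {
    "sample code, single batch": {"peak_mib": 70648.12, "wall_s": 3 * 60 + 48},
    "pipeline, no xformers":     {"peak_mib": 1552.07,  "wall_s": 44.1},
    "pipeline, with xformers":   {"peak_mib": 1552.35,  "wall_s": 43.9},
}

BASELINE = "sample code, single batch"

def ratio_vs_baseline(name):
    """Return (peak-memory ratio, wall-time ratio) of the baseline over run `name`."""
    base, run = runs[BASELINE], runs[name]
    return base["peak_mib"] / run["peak_mib"], base["wall_s"] / run["wall_s"]

for name in runs:
    mem_ratio, time_ratio = ratio_vs_baseline(name)
    print(f"{name:26s} peak memory {mem_ratio:5.1f}x lower, wall time {time_ratio:4.1f}x shorter")
```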
