# POC - Transformers

## Setup

In [2]:
from transformers import pipeline

In [3]:
from transformers import AutoTokenizer

In [4]:
from transformers import set_seed

In [19]:
from transformers import ViltProcessor, ViltForQuestionAnswering
import requests
from PIL import Image

## Sentiment Analysis 

In [8]:
classifier = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [9]:
classifier("We are very happy to show you the 🤗 Transformers library.")

[{'label': 'POSITIVE', 'score': 0.9997795224189758}]
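
The `score` is a probability, not a raw model output. As far as I understand the text-classification pipeline, it applies a softmax over the class logits and reports the winning class. A minimal sketch in plain Python; the logits below are hypothetical and were not taken from the model:

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for (NEGATIVE, POSITIVE); not real model output.
labels = ["NEGATIVE", "POSITIVE"]
logits = [-4.0, 4.4]

probs = softmax(logits)
best = max(range(len(probs)), key=lambda i: probs[i])  # index of the most likely class
result = {"label": labels[best], "score": probs[best]}
print(result)
```

A wide gap between the two logits is what pushes the score close to 1, as in the output above.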

## Speech Recognition 

In [10]:
generator = pipeline(task="automatic-speech-recognition")

No model was supplied, defaulted to facebook/wav2vec2-base-960h and revision 55bb623 (https://huggingface.co/facebook/wav2vec2-base-960h).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [11]:
generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")

{'text': 'I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP LIVE UP THE TRUE MEANING OF ITS TREES'}
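
The transcript is close but not exact ("BUT" for "that", "TREES" for "creed"). One way to quantify this is word error rate (WER): the word-level edit distance divided by the number of reference words. A rough sketch, assuming the reference sentence below, which is my own transcription and not part of the notebook output:

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]

# Assumed reference transcript (not produced by the notebook).
reference = "i have a dream that one day this nation will rise up and live out the true meaning of its creed"
hypothesis = "I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP LIVE UP THE TRUE MEANING OF ITS TREES"

ref_words = reference.split()
hyp_words = hypothesis.lower().split()  # the model emits upper case, so normalise first
errors = edit_distance(ref_words, hyp_words)
wer = errors / len(ref_words)
print(f"{errors} errors over {len(ref_words)} words, WER = {wer:.3f}")
```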

## Image Classification 

In [12]:
image_classifier = pipeline(model="google/vit-base-patch16-224")

In [13]:
preds = image_classifier(
    images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)

preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]

In [14]:
preds

[{'score': 0.4403, 'label': 'lynx, catamount'},
 {'score': 0.0343,
  'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'},
 {'score': 0.0321, 'label': 'snow leopard, ounce, Panthera uncia'},
 {'score': 0.0235, 'label': 'Egyptian cat'},
 {'score': 0.023, 'label': 'tiger cat'}]
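
By default the pipeline returns only the five highest-scoring labels, so the scores above do not sum to 1. The remaining probability is spread over the model's other classes. A quick check on the rounded values printed above:

```python
# Rounded predictions copied from the cell output above.
preds = [
    {"score": 0.4403, "label": "lynx, catamount"},
    {"score": 0.0343, "label": "cougar, puma, catamount, mountain lion, painter, panther, Felis concolor"},
    {"score": 0.0321, "label": "snow leopard, ounce, Panthera uncia"},
    {"score": 0.0235, "label": "Egyptian cat"},
    {"score": 0.023, "label": "tiger cat"},
]

top = max(preds, key=lambda p: p["score"])      # most likely label
covered = sum(p["score"] for p in preds)        # probability mass held by the top five
print(top["label"], round(covered, 4), round(1 - covered, 4))
```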

## Tokenization

In [15]:
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

In [17]:
sequence = "In a hole in the ground there lived a hobbit."
tokenizer(sequence)

{'input_ids': [101, 1999, 1037, 4920, 1999, 1996, 2598, 2045, 2973, 1037, 7570, 10322, 4183, 1012, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
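
The 15 ids cover 10 words plus punctuation: `101` and `102` are the `[CLS]` and `[SEP]` markers, and one word is split into three subword pieces. The sketch below shows the greedy longest-match-first idea behind WordPiece with a hypothetical toy vocabulary. The split of "hobbit" into `ho`, `##bb`, `##it` is an assumption here, not something read from the real vocabulary file.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # nothing matches: the whole word maps to the unknown token
        pieces.append(piece)
        start = end
    return pieces

# Hypothetical toy vocabulary; the real bert-base-uncased vocabulary is far larger.
vocab = {"in", "a", "hole", "the", "ground", "there", "lived", "ho", "##bb", "##it", "."}

sequence = "In a hole in the ground there lived a hobbit."
words = sequence.lower().replace(".", " .").split()  # crude pre-tokenization for this sketch
tokens = ["[CLS]"] + [p for w in words for p in wordpiece(w, vocab)] + ["[SEP]"]
print(tokens)
```

The token count matches the length of `input_ids` above.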

## Question Answering - NLP

https://huggingface.co/deepset/roberta-base-squad2

In [20]:
model_name = "deepset/roberta-base-squad2"

In [21]:
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)

In [22]:
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}
nlp(QA_input)

{'score': 0.21171464025974274,
 'start': 59,
 'end': 84,
 'answer': 'gives freedom to the user'}
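
`start` and `end` are character offsets into `context`, so the answer can be recovered by slicing. A small check using the values printed above:

```python
context = (
    "The option to convert models between FARM and transformers gives freedom "
    "to the user and let people easily switch between frameworks."
)

# Result copied from the cell output above.
result = {
    "score": 0.21171464025974274,
    "start": 59,
    "end": 84,
    "answer": "gives freedom to the user",
}

span = context[result["start"]:result["end"]]  # end offset is exclusive
print(span)
```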

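In [None]:
# Alternative (not run here): load the model and tokenizer directly instead of via pipeline()
# from transformers import AutoModelForQuestionAnswering
# model = AutoModelForQuestionAnswering.from_pretrained(model_name)
# tokenizer = AutoTokenizer.from_pretrained(model_name)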

## GPT-2

https://huggingface.co/gpt2

In [24]:
generator = pipeline('text-generation', model='gpt2')

In [31]:
set_seed(42)
generator("India is great country but", max_length=30, num_return_sequences=5)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'India is great country but it has to be in the middle. He also mentioned in his speech that this is not a question of the national sovereignty and'},
 {'generated_text': 'India is great country but sometimes, India is bad and sometimes, the best is always the worst. India is also a nation of intolerance and there are'},
 {'generated_text': 'India is great country but it\'s very bad, if you say that, well, we\'re not like that, so why don\'t you come?"'},
 {'generated_text': "India is great country but not one of the world's well developed, which, frankly, the people have been waiting to grow, they might not live"},
 {'generated_text': 'India is great country but we have all to go in and try to show India is good and the other side is evil."\n\nThe president also'}]
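
The five continuations differ because generation samples from the model's next-token distribution, and `set_seed(42)` is what makes the run repeatable. The toy sketch below illustrates only the seeding idea, using Python's `random` module and a hypothetical four-word distribution. It is not how GPT-2 computes its probabilities.

```python
import random

# Hypothetical next-token distribution; a real model recomputes this at every step.
vocab = ["it", "we", "not", "sometimes"]
weights = [0.4, 0.3, 0.2, 0.1]

def sample_tokens(seed, n=5):
    """Draw n tokens from the toy distribution with a generator fixed by `seed`."""
    rng = random.Random(seed)  # private generator, so the seed fully determines the draws
    return rng.choices(vocab, weights=weights, k=n)

first = sample_tokens(42)
second = sample_tokens(42)  # same seed, same draws
print(first)
print(first == second)
```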

## Document Question Answering 

In [8]:
model = pipeline(model="impira/layoutlm-document-qa", task="document-question-answering")

In [13]:
model("ssw.png", "List all notable people")

[{'score': 0.03532538190484047,
  'answer': 'Ghulam Mustafa Khan',
  'start': 180,
  'end': 182}]

## Visual QA

In [18]:
modelVqa = pipeline(model="dandelin/vilt-b32-finetuned-vqa", task="visual-question-answering")

Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.


In [34]:
url = "https://th.bing.com/th/id/OIP.sfgBr1qnwi4joiZT7Sk0IgHaJs?w=155&h=203&c=7&r=0&o=5&pid=1.7"
image = Image.open(requests.get(url, stream=True).raw)
text = "Name of radio?"

In [35]:
processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

In [36]:
encoding = processor(image, text, return_tensors="pt")

In [37]:
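outputs = model(**encoding)  # forward pass over the image-question pair
logits = outputs.logits  # one score per candidate answer
idx = logits.argmax(-1).item()  # index of the highest-scoring answer
print("Predicted answer:", model.config.id2label[idx])  # map the index to its answer string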

Predicted answer: 0
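
The printed `0` is an answer string looked up in `id2label`, not the index itself. This checkpoint treats VQA as classification over a fixed answer vocabulary, so it cannot read an arbitrary name off the image. It can only pick the closest entry it knows. A sketch of the lookup with hypothetical logits and a hypothetical four-entry label map:

```python
# Hypothetical label map and scores; the real answer vocabulary is much larger.
id2label = {0: "yes", 1: "no", 2: "0", 3: "radio"}
logits = [0.3, -1.2, 2.5, 1.9]

idx = max(range(len(logits)), key=lambda i: logits[i])  # plain-Python argmax
answer = id2label[idx]
print("Predicted answer:", answer)
```

Here the winning index is 2 while the answer string is "0", which is why the two should not be confused.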
