# **Implementing Transformers**

Text classification: the code below predicts a star rating (a form of sentiment analysis) for a sentence.

In [1]:
pip install transformers

Collecting transformers
  Downloading transformers-4.15.0-py3-none-any.whl (3.4 MB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting sacremoses
  Downloading sacremoses-0.0.46-py3-none-any.whl (895 kB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.2.1-py3-none-any.whl (61 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Installing collected packages: pyyaml, tokenizers, sacremoses, huggingface-hub, transformers

In [2]:
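from transformers import pipeline
# A plain string is enough here; nothing is interpolated, so no f-string is needed.
st = "I do not like mobile games"
# Load a multilingual BERT model fine-tuned to predict 1 to 5 star review ratings.
seq = pipeline(task="text-classification", model='nlptown/bert-base-multilingual-uncased-sentiment')
print(f"Result: { seq(st) }")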

Result: [{'label': '2 stars', 'score': 0.43159592151641846}]
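
The label in the result above is a string, so it has to be parsed before it can be used as a number. The following is a minimal sketch of that step, assuming every label follows the "N star(s)" pattern seen in the output; `label_to_stars` is a hypothetical helper, not part of `transformers`.

```python
import re

def label_to_stars(prediction):
    """Turn a label such as '2 stars' into the integer 2 (hypothetical helper)."""
    match = re.match(r"(\d+)\s+stars?", prediction["label"])
    if match is None:
        raise ValueError(f"unexpected label: {prediction['label']!r}")
    return int(match.group(1))

# The prediction printed by the cell above.
result = [{'label': '2 stars', 'score': 0.43159592151641846}]
stars = label_to_stars(result[0])
print(f"Rating: {stars}, confidence: {result[0]['score']:.2f}")
```

A score of about 0.43 means the model is not strongly committed to this rating, so it can be worth keeping the score next to the parsed value.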


# **Question-Answering**

Extractive question answering finds the answer to a question by extracting a span of text from a given context.

In [4]:
from transformers import pipeline
sentence = r"""
Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations 
in it, “and what is the use of a book,” thought Alice “without pictures or conversations?” So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and 
stupid), whether the pleasure of making a daisy chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.
"""
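# Build the question-answering pipeline with a RoBERTa model fine-tuned on SQuAD v1.
qa = pipeline("question-answering", model="csarron/roberta-base-squad-v1")
# The pipeline returns a dict; the extracted span is stored under the 'answer' key.
result = qa(question="Who was very tired?", context=sentence)
print(f"Answer: {result['answer']}")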

Answer: Alice
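
Besides `answer`, the question-answering pipeline's result dict also carries `score`, `start`, and `end`, where `start` and `end` are character offsets into the context. The sketch below shows one way to use those offsets to display the answer in its surrounding text. The `result` dict here is hand-built for illustration (its score is a placeholder, not a model output), and `answer_in_context` is a hypothetical helper.

```python
def answer_in_context(context, result, window=20):
    """Return the answer span in brackets with some surrounding characters."""
    start, end = result["start"], result["end"]
    left = context[max(0, start - window):start]
    right = context[end:end + window]
    return f"{left}[{context[start:end]}]{right}"

context = "Alice was beginning to get very tired of sitting by her sister on the bank."
# Hand-built stand-in for a pipeline result; the score is a placeholder value.
result = {"answer": "Alice", "start": 0, "end": 5, "score": 1.0}
snippet = answer_in_context(context, result)
print(snippet)
```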


# **Masked Language Modeling**

Masked language modeling replaces tokens in a sentence with a mask token, and the model is asked to fill each mask with a suitable token.

In [6]:
from transformers import pipeline
nlp = pipeline("fill-mask")
nlp(f"{nlp.tokenizer.mask_token} yogi will win the election")

No model was supplied, defaulted to distilroberta-base (https://huggingface.co/distilroberta-base)


[{'score': 0.1074429452419281,
  'sequence': 'How yogi will win the election',
  'token': 6179,
  'token_str': 'How'},
 {'score': 0.10092117637395859,
  'sequence': 'Indian yogi will win the election',
  'token': 25767,
  'token_str': 'Indian'},
 {'score': 0.07127007842063904,
  'sequence': 'Why yogi will win the election',
  'token': 7608,
  'token_str': 'Why'},
 {'score': 0.030875442549586296,
  'sequence': 'If yogi will win the election',
  'token': 1106,
  'token_str': 'If'},
 {'score': 0.014264990575611591,
  'sequence': 'Only yogi will win the election',
  'token': 19933,
  'token_str': 'Only'}]
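
The pipeline returns a list of candidate fills, each with a score. As a minimal sketch, the candidates can be filtered by a score threshold before use; `top_fills` and the 0.05 cutoff are illustrative assumptions, and the data is copied from the output above (reduced to the two fields used).

```python
def top_fills(predictions, min_score=0.05):
    """Keep candidates at or above min_score, highest score first (hypothetical helper)."""
    kept = [p for p in predictions if p["score"] >= min_score]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 3)) for p in kept]

# Scores and tokens copied from the fill-mask output above.
predictions = [
    {"score": 0.1074429452419281, "token_str": "How"},
    {"score": 0.10092117637395859, "token_str": "Indian"},
    {"score": 0.07127007842063904, "token_str": "Why"},
    {"score": 0.030875442549586296, "token_str": "If"},
    {"score": 0.014264990575611591, "token_str": "Only"},
]
fills = top_fills(predictions)
print(fills)
```

Even the best candidate scores only about 0.11 here, which suggests the model has no strong preference for the masked position.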

# **Named Entity Recognition**

Named entity recognition assigns a class label, such as person, location, or organization, to each token of a text sequence.

In [10]:
from transformers import pipeline
seq = r"""
I am Fernando, and I live in Mexico. I am a Machine Learning Engineer, and I work at Hitch.
"""
nlp = pipeline(task='ner')
for item in nlp(seq):
    print(f"{item['word'], item['entity']}")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english)


('Fernando', 'I-PER')
('Mexico', 'I-LOC')
('Learning', 'I-ORG')
('Engineer', 'I-MISC')
('Hit', 'I-ORG')
('##ch', 'I-ORG')
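
In the output above, "Hitch" is split into the word pieces `Hit` and `##ch`, because the model labels sub-word tokens rather than whole words. The sketch below glues `##` continuation pieces back onto the preceding token. `merge_wordpieces` is a hypothetical helper, and it assumes a continuation piece should inherit the label of the token it extends.

```python
def merge_wordpieces(tokens):
    """Join '##' continuation pieces onto the previous token (hypothetical helper)."""
    merged = []
    for word, entity in tokens:
        if word.startswith("##") and merged:
            prev_word, prev_entity = merged[-1]
            # Drop the '##' marker and keep the label of the first piece.
            merged[-1] = (prev_word + word[2:], prev_entity)
        else:
            merged.append((word, entity))
    return merged

# (word, entity) pairs copied from the NER output above.
tokens = [
    ("Fernando", "I-PER"),
    ("Mexico", "I-LOC"),
    ("Learning", "I-ORG"),
    ("Engineer", "I-MISC"),
    ("Hit", "I-ORG"),
    ("##ch", "I-ORG"),
]
merged = merge_wordpieces(tokens)
print(merged)
```

The token-classification pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that groups pieces into entities; whether it is available depends on the installed `transformers` version.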


# **Text Summarization**

Text summarization produces a shorter summary of a given text. The pipeline is initialized with the task identifier `summarization`.

In [11]:
from transformers import pipeline
txt = r'''
Machine learning is the study of computer algorithms that improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning is an important component of the growing field of data 
science . Machine learning, deep learning, and neural networks are all sub-fields of artificial intelligence . As big data continues to grow, the market demand for data scientists will increase, requiring them to assist in the identification of 
the most relevant business questions. Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make 
decisions with minimal human intervention.
'''
nlp = pipeline(task='summarization')
nlp(txt, max_length=130, min_length=30)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 (https://huggingface.co/sshleifer/distilbart-cnn-12-6)


[{'summary_text': ' Machine learning is the study of computer algorithms that improve automatically through experience and by the use of data . Machine learning, deep learning, and neural networks are all sub-fields of artificial intelligence . As big data continues to grow, the market demand for data scientists will increase .'}]
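
The summary text comes back with a leading space and with spaces before the periods. The following is a small sketch of a cleanup step, assuming only whitespace before punctuation needs fixing; `tidy_summary` is a hypothetical helper, and the sample string is the first two sentences of the output above.

```python
import re

def tidy_summary(text):
    """Remove whitespace before punctuation and trim the ends (hypothetical helper)."""
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    return text.strip()

# First two sentences of the summary printed above.
raw = (
    " Machine learning is the study of computer algorithms that improve "
    "automatically through experience and by the use of data . Machine learning, "
    "deep learning, and neural networks are all sub-fields of artificial intelligence ."
)
clean = tidy_summary(raw)
print(clean)
```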