# Objective
The objective of this exercise is to use pretrained BERT models and pipelines for tasks such as extractive question answering and summarization.

# Transformers 
Transformers (formerly PyTorch-Transformers) is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP). The library contains PyTorch implementations, pre-trained model weights, usage scripts, and conversion utilities for models such as BERT, GPT, XLM, RoBERTa, and DistilBERT.

# Pipelines
Pipelines are objects that abstract away most of the library's complex code, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction, and Question Answering.

[Library Documentation](https://huggingface.co/docs/transformers/v4.23.1/en/main_classes/pipelines#transformers.pipeline)

In [1]:
%%bash
pip install tqdm boto3 requests regex sentencepiece sacremoses transformers



## Extractive Question Answering
In Extractive Question Answering, a context is provided so that the model can refer to it and make predictions on where the answer lies within the passage.
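
To make "where the answer lies within the passage" concrete, the following is a minimal sketch of span selection, assuming the model has already produced one start score and one end score per token. The tokens and scores are made up for illustration; a real model works on subword tokens and the pipeline applies further post-processing, so this is a simplification rather than the library's implementation.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 15) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # The end token cannot precede the start token, and the span
        # is capped at max_len tokens.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Hypothetical tokens and scores, chosen by hand for this example.
tokens = ["models", "are", "based", "on", "sample", "data"]
start_scores = [0.1, 0.0, 0.2, 0.1, 2.5, 0.3]
end_scores = [0.0, 0.1, 0.1, 0.2, 0.4, 2.8]

s, e = best_span(start_scores, end_scores)
answer = " ".join(tokens[s:e + 1])
print(answer)
```

The answer is always a contiguous slice of the context, which is what makes the task extractive.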



In [2]:
from transformers import pipeline

question_answering = pipeline('question-answering', model='distilbert-base-uncased-distilled-squad')

In [3]:
context1 = """
Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.
"""

question1 = "What are machine learning models based on?"


question2 = "what are the applications of machine learning models"

response1 = question_answering(question1, context1)
response2 = question_answering(question2, context1)

print(response1['answer'])

sample data
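
Besides `answer`, each response is a dictionary that also carries `score`, `start`, and `end`, where `start` and `end` are character offsets into the context. The sketch below formats such a response and checks that the offsets point at the answer text. The `describe_answer` helper and the hand-built `demo_response` (including its score) are hypothetical and used only so the example runs without downloading a model.

```python
def describe_answer(response: dict, context: str) -> str:
    """Summarise a question-answering response and verify its character span."""
    start, end = response["start"], response["end"]
    span = context[start:end]
    return (
        f"{response['answer']!r} "
        f"(score={response['score']:.2f}, chars {start}-{end}, "
        f"span matches: {span == response['answer']})"
    )


# Hand-built stand-in for a pipeline response; the score is a placeholder.
demo_context = "Machine learning algorithms build a model based on sample data."
start = demo_context.index("sample data")
demo_response = {
    "score": 0.5,
    "start": start,
    "end": start + len("sample data"),
    "answer": "sample data",
}

print(describe_answer(demo_response, demo_context))
```

With the real pipeline, `describe_answer(response1, context1)` would be the equivalent call.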


# Exercise 1

Load the `deepset/bert-base-uncased-squad2` model using the `pipeline` function, and use `context2`, provided below, to answer the following questions:
1. What is the address of security services?
2. How can you contact the security office?
3. When is the control centre open?
4. What is the purpose of the security team?

In [4]:
# CODE FOR ONLY LOADING THE PIPELINE
question_answering = pipeline('question-answering', model='deepset/bert-base-uncased-squad2')



In [5]:
context2="""
The Security team can assist you with access control, bicycle security, building locations, problem reporting, personal security, traffic management and key distribution
Security Services is located at No. 14 Distillery Road (150 metres from the AIB Bank Newcastle) in the centre of the campus. 
The control centre is open 24/7 with security on patrol round the clock.
You can contact the Security Office by email at securityo@universityofgalway.ie
"""


In [8]:
## QUESTION ANSWER CODE GOES HERE

question1 = "What is the address of security services?"
response1 = question_answering(question1, context2)
print(response1['answer'])

question2 = "How can you contact the security office?"
response2 = question_answering(question2, context2)
print(response2['answer'])

question3 = "When is the control centre open?"
response3 = question_answering(question3, context2)
print(response3['answer'])

question4 = "what is the purpose of security team?"
response4 = question_answering(question4, context2)
print(response4['answer'])


No. 14 Distillery Road
by email
24/7
assist you with access control, bicycle security
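
The four near-identical blocks above could also be written as a loop. This is one possible sketch, assuming the pipeline is called with the `question` and `context` keyword arguments; `answer_all` and the `fake_qa` stand-in are hypothetical names introduced here so the example runs without the model.

```python
from typing import Callable, Dict, List


def answer_all(qa: Callable[..., dict], questions: List[str],
               context: str) -> Dict[str, str]:
    """Ask every question against the same context and collect the answers."""
    return {q: qa(question=q, context=context)["answer"] for q in questions}


def fake_qa(question: str, context: str) -> dict:
    """Stand-in for a real pipeline: always answers with the first context line."""
    first_line = context.strip().splitlines()[0]
    return {"score": 0.0, "start": 0, "end": len(first_line), "answer": first_line}


questions = [
    "What is the address of security services?",
    "When is the control centre open?",
]
demo_context = "Security Services is located at No. 14 Distillery Road.\nOpen 24/7."

for question, answer in answer_all(fake_qa, questions, demo_context).items():
    print(f"{question} -> {answer}")
```

Passing `question_answering` in place of `fake_qa`, together with `context2`, would run the same loop against the loaded model.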


# Summarization

Summarization is the task of condensing a document or an article into a shorter text.

[LINK TO DOCUMENTATION](https://huggingface.co/docs/transformers/v4.23.1/en/task_summary#summarization)

In [7]:
# Alternative model: 'mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization'
summarizer = pipeline('summarization', model='sshleifer/distilbart-cnn-12-6')

In [9]:
summarizer(context1, max_length=64, min_length=32)

[{'summary_text': ' Machine learning (ML) is the study of computer algorithms that improve automatically through experience . It is seen as a part of artificial intelligence . Machine learning algorithms are used in a wide variety of applications such as email filtering and computer vision .'}]
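
The summarizer returns a list with one dictionary per input, and the text in `summary_text` has a leading space and spaces before punctuation. The sketch below tidies that up and counts whitespace-separated words. `tidy_summary` is a hypothetical helper, and the sample input reuses the first two sentences of the output above. Note that `max_length` and `min_length` are measured in model tokens, so the word count is only a rough proxy for them.

```python
import re


def tidy_summary(result: list) -> str:
    """Extract summary_text from a summarizer result and fix punctuation spacing."""
    text = result[0]["summary_text"]
    # Remove whitespace that precedes punctuation, then trim both ends.
    return re.sub(r"\s+([.,!?;:])", r"\1", text).strip()


# Excerpt of the summarizer output shown above.
sample_result = [{
    "summary_text": " Machine learning (ML) is the study of computer algorithms "
                    "that improve automatically through experience . It is seen "
                    "as a part of artificial intelligence ."
}]

clean = tidy_summary(sample_result)
print(clean)
print(len(clean.split()), "words")
```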

# Exercise 2
1. Use the summarizer pipeline to get the summary of `context2`.
2. Get a summary of the following movie review, with a length between 32 and 64 tokens:


Starting "Final Space", I found the lead characters oddly annoying and obnoxious, even though i generally like stupid silly, yet oddly witty humor, it was just too much too soon. But the more the main plot-line proceeded, the better the characters became. The character development in this show is brilliant. Towards the end, one cant help but really feel part of the squad and understand the relations and struggles the characters have towards another.

The plot of "Final Space" is a little bit all over the place. However moving on through the episodes, one couldn't help but feel like it some how all made sense, the different cliche's and scenarios were all the right ingredients which in turn are what gave it a true Science Fiction feel.

THE MUSIC IS SPOT ON. Oh in my opinion by far the greatest aspect of the show. Firstly congratulate who ever selected the sound track and audio. The music hit the scene perfectly almost every time and really added that final touch to make it truly feel like a sci-fi show rather than just another animation.

If you can get past the initial 3-4 episodes, I assure you, you will want to know what will happen next.



In [10]:
## YOUR CODE GOES HERE
print(summarizer(context2, max_length=64, min_length=32))

[{'summary_text': ' Security Services is located at No. 14 Distillery Road (150 metres from the AIB Bank Newcastle) in the centre of the campus . The control centre is open 24/7 with security on patrol round the clock .'}]


In [11]:
movie_review = """
Starting "Final Space", I found the lead characters oddly annoying and obnoxious, even though i generally like stupid silly, yet oddly witty humor, it was just too much too soon. But the more the main plot-line proceeded, the better the characters became. The character development in this show is brilliant. Towards the end, one cant help but really feel part of the squad and understand the relations and struggles the characters have towards another.

The plot of "Final Space" is a little bit all over the place. However moving on through the episodes, one couldn't help but feel like it some how all made sense, the different cliche's and scenarios were all the right ingredients which in turn are what gave it a true Science Fiction feel.

THE MUSIC IS SPOT ON. Oh in my opinion by far the greatest aspect of the show. Firstly congratulate who ever selected the sound track and audio. The music hit the scene perfectly almost every time and really added that final touch to make it truly feel like a sci-fi show rather than just another animation.

If you can get past the initial 3-4 episodes, I assure you, you will want to know what will happen next.
"""

In [12]:
print(summarizer(movie_review, max_length=64, min_length=32))

[{'summary_text': ' The plot of "Final Space" is a little bit all over the place . The music hit the scene perfectly almost every time and really added that final touch to make it truly feel like a sci-fi show .'}]
