# Text Summarization
Evaluate several text summarization approaches in action. In particular, do the following:
- Define a string variable that contains a piece of text as your document.
- Run unsupervised extractive text summarization algorithms to extract the key sentences of your document, using a library such as `Sumy`.
- Run pre-trained extractive/abstractive text summarization models to generate an abstract of your document, using libraries such as `Transformers` and `bert-extractive-summarizer`.
- Compare and evaluate the different summaries, and analyze the effect of hyperparameters on the quality of the final result.
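
The comparison step in the last bullet needs some way to put a number on how close two summaries are. One simple option, sketched here under assumptions, is a ROUGE-1-style unigram overlap between a reference summary and a candidate. The helper names `tokenize` and `unigram_overlap` and the two sample sentences are hypothetical and not part of any library used in this notebook; a dedicated ROUGE package would normally be preferable for real evaluation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_overlap(reference, candidate):
    """Return ROUGE-1-style precision, recall and F1 between two texts."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    # Clipped overlap: a word counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up reference and candidate summaries, for illustration only.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
scores = unigram_overlap(reference, candidate)
print(scores)
```

Any of the `summary` strings produced later in this notebook could be passed as `candidate`, with a hand-written summary as `reference`. Unigram overlap ignores word order and meaning, so it is only a rough signal.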

## Defining a Text

In [3]:
text = """
Alec Baldwin says he "didn't pull the trigger" of the gun that fatally wounded cinematographer Halyna Hutchins on the set of his film, Rust.

The star made the claim in his first sit-down interview since the incident in October.

"I would never point a gun at anyone and pull a trigger at them. Never," he told George Stephanopoulos of ABC News.

The interview was recorded on Tuesday, and is due to be broadcast in the US on Thursday evening.

Mr Stephanopoulos described their 80-minute discussion as "raw" and "intense.

The journalist described Mr Baldwin, 63, as "devastated" yet "very candid" and "forthcoming", while previewing the interview on Wednesday's Good Morning America.

"I've done thousands of interviews in the last 20 years at ABC," he said. "This was the most intense I've ever experienced."

Mr Baldwin is best-known for his performances in films like Glengarry Glen Ross and The Hunt For Red October, as well as his impersonation of Donald Trump on US sketch show Saturday Night Live.

The interview marks the first time Mr Baldwin has spoken about the incident on camera, except for a brief interview he gave to TMZ in October, in a bid to stop the paparazzi from following him and his family.

In that appearance, he described the incident as a "one in a trillion episode" and said accidents of this nature very rarely happened on film sets.

Ms Hutchins was shot and killed as Mr Baldwin rehearsed with what he believed to be a "cold" - or safe - gun on the set of Rust in New Mexico.

It is believed to have discharged when he removed it from a holster during rehearsals for a forthcoming scene.

Ms Hutchins was flown to hospital by helicopter after the shooting but later died of her injuries. Director Joel Souza, 48, was also injured.

According to court records, Mr Baldwin was handed the weapon by the film's assistant director, Dave Halls, who did not know it contained live ammunition and indicated it was unloaded by shouting "cold gun".

Mr Halls had been given the gun by Hannah Gutierrez-Reed, the 24-year-old armorer on the film.

Asked by Mr Stephanopoulos how a live bullet had made its way on to the set, Mr Baldwin replied, "I have no idea.

"Someone put a live bullet into a gun. A bullet that wasn't even supposed to be on the property."

Lawyers for Ms Gutierrez-Reed have said she did not know where "the live rounds came from". That question is now at the centre of a police investigation in the US.

Earlier this week, the investigators obtained a warrant to search the premises of an arms supplier in the US.

An affidavit with the warrant said police were told ammunition for the film had come from several sources, including PDQ Arm & Prop.

The affidavit said the ammunition supplier's owner, Seth Kenney, had told investigators the live round might have been from some "reloaded ammunition".

He said the ammunition he supplied for the film consisted of dummy rounds and blanks, according to the affidavit.
"""
print(text)



## Extractive Text Summarization

### Using Sumy

In [5]:
import sumy.utils
import sumy.nlp.stemmers
import sumy.nlp.tokenizers
import sumy.parsers.plaintext
import sumy.summarizers.text_rank

LANGUAGE = "english"
SENTENCE_COUNT = 2  # number of sentences to extract

# Split the document into sentences and words (relies on NLTK tokenizer data).
tokenizer = sumy.nlp.tokenizers.Tokenizer(LANGUAGE)
parser = sumy.parsers.plaintext.PlaintextParser.from_string(text, tokenizer)

# TextRank ranks sentences by their centrality in a sentence-similarity graph;
# stemming and stop-word removal keep the similarity focused on content words.
stemmer = sumy.nlp.stemmers.Stemmer(LANGUAGE)
summarizer = sumy.summarizers.text_rank.TextRankSummarizer(stemmer)
summarizer.stop_words = sumy.utils.get_stop_words(LANGUAGE)

# Extract the top-ranked sentences and print them one per line.
summary_list = [str(s) for s in summarizer(parser.document, SENTENCE_COUNT)]
summary = "\n".join(summary_list)
print(summary)

According to court records, Mr Baldwin was handed the weapon by the film's assistant director, Dave Halls, who did not know it contained live ammunition and indicated it was unloaded by shouting "cold gun".
Asked by Mr Stephanopoulos how a live bullet had made its way on to the set, Mr Baldwin replied, "I have no idea.
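
To make the ranking behind the output above less of a black box, the following is a minimal pure-Python sketch of the TextRank idea: build a graph whose nodes are sentences, weight the edges by word overlap, and score the nodes with a PageRank-style iteration. It is not Sumy's implementation. The sentence splitter, the function names, the damping factor of 0.85, the iteration count, and the demo text are all assumptions made for illustration, and the sketch does no stemming or stop-word removal.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared-word count, normalised by the log of each sentence's vocabulary size."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by power iteration over the weighted similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [[similarity(a, b) if i != j else 0.0 for j, b in enumerate(sentences)]
               for i, a in enumerate(sentences)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(weights[j][i] / out_sums[j] * scores[j]
                            for j in range(n) if out_sums[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, sentence_count=2):
    """Return the highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:sentence_count])]


# Made-up demo text, for illustration only.
demo = ("The gun fired on the film set. "
        "Police are investigating how live rounds reached the film set. "
        "The weather was sunny.")
print(summarize(demo, sentence_count=2))
```

Sentences that share vocabulary with many others accumulate score, which is why long, entity-heavy sentences tend to win in graph-based extractive summaries.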


### Using bert-extractive-summarizer

In [7]:
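import summarizer

# Summarizer() loads a pre-trained BERT checkpoint (bert-large-uncased in this
# run); the weights are downloaded the first time the cell is executed.
model = summarizer.Summarizer()

# The summary length can be given as a fraction of the document,
# e.g. model(text, ratio=0.2), or as a fixed number of sentences as done here.
SENTENCE_COUNT = 2
summary = model(text, num_sentences=SENTENCE_COUNT)
print(summary)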


Alec Baldwin says he "didn't pull the trigger" of the gun that fatally wounded cinematographer Halyna Hutchins on the set of his film, Rust. Never," he told George Stephanopoulos of ABC News.


## Abstractive Text Summarization

### Using Transformers: the `pipeline` API

In [9]:
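import transformers
transformers.logging.set_verbosity_error()

# Bounds on the length of the generated summary, measured in tokens.
MAX_TOKENS = 50
MIN_TOKENS = 10

# No model name is given, so the library's default summarization checkpoint is used.
pipeline = transformers.pipeline("summarization")
# truncation=False passes the whole document, so it must fit the model's input limit.
summary = pipeline(text, max_length=MAX_TOKENS, min_length=MIN_TOKENS, truncation=False)[0]["summary_text"]
print(summary)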

 Alec Baldwin made the claim in his first sit-down interview since the incident in October . Halyna Hutchins was shot and killed as he rehearsed with a gun on the set of Rust . It is believed to have discharged when he
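
The summary above stops mid-sentence because generation is capped at `MAX_TOKENS`. To study how `max_length` and `min_length` affect the result, one option is a small sweep helper that calls any summarizer once per setting. The sketch below is an assumption-laden illustration: `sweep_lengths` and `fake_summarize` are hypothetical names, and the stand-in summarizer simply keeps the first few words so the example runs without downloading a model.

```python
def sweep_lengths(summarize, text, settings):
    """Run `summarize` once per (max_length, min_length) pair and collect the results."""
    results = {}
    for max_length, min_length in settings:
        if min_length > max_length:
            raise ValueError(f"min_length {min_length} exceeds max_length {max_length}")
        results[(max_length, min_length)] = summarize(
            text, max_length=max_length, min_length=min_length
        )
    return results


def fake_summarize(text, max_length, min_length):
    """Stand-in summarizer: keep the first `max_length` whitespace-separated words."""
    return " ".join(text.split()[:max_length])


# Made-up sample text, for illustration only.
sample = "one two three four five six seven eight nine ten"
results = sweep_lengths(fake_summarize, sample, [(3, 1), (5, 2), (8, 4)])
for (max_length, min_length), summary_text in results.items():
    word_count = len(summary_text.split())
    print(f"max={max_length:>3} min={min_length:>3} words={word_count:>3} | {summary_text}")
```

With the `pipeline` object from the cell above, a wrapper such as `lambda t, max_length, min_length: pipeline(t, max_length=max_length, min_length=min_length)[0]["summary_text"]` could be passed in place of `fake_summarize`. Note that the real model counts tokens rather than words.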


### Using Transformers: BART Model and Tokenizer

In [10]:
import transformers
transformers.logging.set_verbosity_error()

# Bounds on the length of the generated summary, measured in tokens.
MAX_TOKENS = 50
MIN_TOKENS = 10
MODEL_NAME = "facebook/bart-large-cnn"

# Load the pre-trained BART model and its matching tokenizer explicitly.
model = transformers.BartForConditionalGeneration.from_pretrained(MODEL_NAME)
tokenizer = transformers.BartTokenizer.from_pretrained(MODEL_NAME)

# Encode the document as a batch of one; truncation=False keeps the full text.
inputs = tokenizer([text], return_tensors="pt", truncation=False)
# early_stopping ends beam search once enough complete candidates are found.
summary_ids = model.generate(inputs["input_ids"], early_stopping=True, max_length=MAX_TOKENS, min_length=MIN_TOKENS)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
print(summary)


Alec Baldwin says he 'didn't pull the trigger' of the gun that fatally wounded cinematographer Halyna Hutchins on the set of his film, Rust. The star made the claim in his first sit-down interview since
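
Both abstractive outputs end in a dangling fragment because generation is cut off at the token limit. If a clean ending matters more than keeping every generated word, one possible post-processing step is to drop everything after the last sentence-ending punctuation mark. The helper below is a hypothetical sketch, not part of Transformers, and its regular expression is an assumption that will misjudge abbreviations such as "U.S.".

```python
import re


def trim_to_last_sentence(summary):
    """Drop any trailing fragment after the last sentence-ending punctuation mark."""
    summary = summary.strip()
    # A sentence end is '.', '!' or '?', optionally followed by a closing quote
    # or bracket, and then whitespace or the end of the string.
    matches = list(re.finditer(r"[.!?][\"')\]]*(?=\s|$)", summary))
    if not matches:
        return summary  # no complete sentence found; leave the text unchanged
    return summary[: matches[-1].end()]


# Made-up truncated summary, for illustration only.
truncated = "The star spoke on Thursday. He made the claim in his first sit-down interview since"
print(trim_to_last_sentence(truncated))
```

Raising `MAX_TOKENS` is the alternative when the missing content is needed, at the cost of a longer summary.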
