# Text Summarization with T5 Text-to-Text Transformer


In [1]:
!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/colab_setup.sh -O - | bash

setup Colab for PySpark 3.0.2 and Spark NLP 3.1.0

In [2]:
import sparknlp

# Start a Spark session with Spark NLP loaded.
spark = sparknlp.start()

In [3]:
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import T5Transformer
from pyspark.ml import Pipeline

In [22]:
text= ''' Atletico Madrid manager Diego Simeone's decision to head down the tunnel without shaking the hand of Liverpool counterpart Jurgen Klopp was perhaps not the most dignified ending to Tuesday's match, but it took nothing away from what was a frantic and ferocious Champions League encounter.

In the BT studio, former Liverpool striker Peter Crouch said Simeone had "let himself down" and ex-Manchester City defender Joleon Lescott called it "cowardly".

But both managers were quick to play down the absence of the traditional post-match show of respect before anything more could be made of it.

Klopp described it as "nothing", adding to BT Sport: "His reaction was like mine (when Atletico knocked Liverpool out at Anfield in 2020) not too good. He was obviously angry, not with me but the game, the world."

Simeone concurred. "I don't normally greet after the game," he said. "I do not like it and I think it is not healthy because there will always be someone who is not satisfied with the game. They have a different culture, which I do not share."

Both said they would shake hands at the return leg at Anfield in two weeks' time.
 '''


In [23]:
df= spark.createDataFrame([[text]]).toDF("text")

Create a pipeline to summarize the given text.

In [26]:
# Wrap the raw text column in the document annotation that Spark NLP expects.
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Pretrained T5 model; the "summarize:" task prefix selects summarization.
t5_summary = T5Transformer.pretrained(name="t5_base", lang="en")\
    .setInputCols(["document"])\
    .setTask("summarize:")\
    .setMaxOutputLength(150)\
    .setOutputCol("summary")

pipeline = Pipeline(stages=[
    documentAssembler,
    t5_summary
])

t5_base download started this may take some time.
Approximate size to download 446 MB
[OK!]
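
T5 treats every task as text-to-text, so the string passed to `setTask` acts as a prefix placed in front of the document text. The plain-Python sketch below mimics that idea. It assumes the prefix and the text are joined by a single space, which may differ from what Spark NLP does internally, and `build_t5_input` is a hypothetical helper for illustration, not part of the Spark NLP API.

```python
def build_t5_input(task: str, text: str) -> str:
    """Prepend a T5 task prefix to the input text (illustrative only)."""
    # Collapse runs of whitespace so a multi-line article becomes one line.
    cleaned = " ".join(text.split())
    # Assumption: prefix and text are separated by exactly one space.
    return f"{task.strip()} {cleaned}"


example = build_t5_input(
    "summarize:",
    " It's official.\nNewcastle United are hunting for a new manager. ",
)
print(example)
```

Changing the prefix is what switches the task, which is why the same pretrained model can be reused by changing only the value given to `setTask`.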


Fit the pipeline on the DataFrame, transform it, and display the generated summary.

In [27]:
model= pipeline.fit(df)
result= model.transform(df)

In [28]:
result.select("summary.result").show(truncate=False)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                                                                                                                                                                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[atletico madrid manager Diego Simeone opted to head down the tunnel without shaking the hand of Jurgen Klopp . the deci

Another example:

In [29]:
text= ''' It's official. Newcastle United are hunting for a new manager.

In what has to be one of the most inevitable partings of the Premier League era, Steve Bruce lasted just 13 days after the Saudi Arabian-backed £305m takeover of the club.

With the club 19th in the table and without a win after seven games this season, fans had continued to call for Bruce's head.

But who might replace him?

Financier Amanda Staveley, who fronted the consortium, says the new owners are making a "long-term investment" to ensure Newcastle are "regularly competing for major trophies".

And former Magpies winger Chris Waddle has warned the next manager appointment will prove crucial in achieving these aims.

He told BBC Radio 5 Live: "I think Steve Bruce has done a good job - that may be unpopular, but I think he has.

"Manchester City made a real statement when this happened to them, and eventually got a high-profile manager in Pep Guardiola, who is the best.

"That's what Newcastle need to do if they're going to take this club to the next level. They've got to have somebody at the top who is running the team who is a top, top manager from Europe. It needs to be someone who makes you go: 'Wow, what a manager he is.'"

Spanish football expert Guillem Balague, speaking on BBC Radio 5 Live's Euro Leagues podcast, said: "Right now everyone wants to come to the Premier League and the top four manager-wise are sorted, so here is an opportunity for managers to come to the Premier League to a club with a lot of money.

"It's public knowledge they wanted Rafa Benitez to take over, and if that is the profile then well done to them. They are looking for that type of manager.

"You look around for someone to create the foundations for a team who can win the league in five, six, seven years. You have to start now with elite decisions and an elite manager."

BBC Sport takes a look at some of the names being discussed by fans, pundits and bookmakers.

'''

In [30]:
df_1= spark.createDataFrame([[text]]).toDF("text")
model= pipeline.fit(df_1)
result= model.transform(df_1)

In [31]:
result.select("summary.result").show(truncate=False)

+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                                                |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[the new owners of the club are making a "long-term investment" in the club . former winger Chris Waddle has warned the next manager appointment will prove crucial .]|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
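
The summary above comes back with lowercase sentence starts and a space before each full stop. A minimal post-processing sketch in plain Python is shown below. `tidy_summary` is a hypothetical helper, not part of Spark NLP, and it assumes sentences end in `.`, `!` or `?`; the input string is copied from the output above.

```python
import re


def tidy_summary(raw: str) -> str:
    """Fix spacing before punctuation and capitalize each sentence start."""
    # Remove the whitespace the model leaves before ., !, ? and commas.
    text = re.sub(r"\s+([.,!?])", r"\1", raw.strip())
    # Split after sentence-ending punctuation, keeping the punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Uppercase only the first character so names keep their casing.
    return " ".join(s[:1].upper() + s[1:] for s in sentences if s)


raw = ('the new owners of the club are making a "long-term investment" in the club . '
       'former winger Chris Waddle has warned the next manager appointment will prove crucial .')
print(tidy_summary(raw))
```

In the notebook, the raw string would be taken from the `summary.result` column after collecting the rows to the driver.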

