# Summarizing text using pre-trained models

The `transformers` library provides many pre-trained models that we can use to generate text summaries for training. Using several different models would help ensure that our search engine does not merely learn the behaviour of a single summarizer.
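
As a rough sketch of the multi-model idea, the hypothetical helper below (`summarize_with_all` is an illustrative name, not part of `transformers`) runs several summarizers over the same text and collects one summary per model. It assumes each summarizer is a callable that returns a list of dicts with a `summary_text` key, which is the output shape printed by the cells in this notebook.

```python
from typing import Callable, Dict


def summarize_with_all(
    text: str,
    summarizers: Dict[str, Callable],
    **generate_kwargs,
) -> Dict[str, str]:
    """Run every summarizer on the same text (hypothetical helper).

    Assumes each summarizer returns [{"summary_text": "..."}].
    """
    results = {}
    for name, summarizer in summarizers.items():
        output = summarizer(text, **generate_kwargs)
        # Keep only the text of the first (and usually only) summary.
        results[name] = output[0]["summary_text"].strip()
    return results


# Possible usage with pipelines such as the ones created in this notebook:
# summaries = summarize_with_all(
#     plot_fragment1,
#     {"bart": blc_summarizer, "pegasus": pegasus_summarizer},
#     max_length=100, min_length=10, do_sample=False,
# )
```

Keeping the results keyed by model name makes it easy to record which summarizer produced each training example.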

In [5]:
import transformers as tr

Here is a [list](https://huggingface.co/models?pipeline_tag=summarization&sort=downloads) of models that can summarize text.

## facebook/bart-large-cnn

In [8]:
blc_summarizer = tr.pipeline("summarization", model="facebook/bart-large-cnn")

plot_fragment1 = """
Amid a galactic civil war, Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star, a massive space station capable of destroying entire planets. Imperial Senator Princess Leia Organa of Alderaan, secretly one of the Rebellion's leaders, has obtained its schematics, but her starship is intercepted by an Imperial Star Destroyer under the command of the ruthless Darth Vader. Before she is captured, Leia hides the plans in the memory system of astromech droid R2-D2, who flees in an escape pod to the nearby desert planet Tatooine alongside his companion, protocol droid C-3PO.

The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker. While Luke is cleaning R2-D2, he discovers a holographic recording of Leia requesting help from an Obi-Wan Kenobi. Later, after Luke finds R2-D2 missing, he is attacked by scavenging Sand People while searching for him, but is rescued by elderly hermit "Old Ben" Kenobi, an acquaintance of Luke's, who reveals that "Obi-Wan" is his true name. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire. Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to the dark side of the Force and murdered him. Obi-Wan offers Luke his father's old lightsaber, the signature weapon of Jedi Knights. 
"""

summary = blc_summarizer(plot_fragment1, max_length=100, min_length=10, do_sample=False)

print(summary)

[{'summary_text': "Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star. Princess Leia Organa hides the plans in the memory system of astromech droid R2-D2. The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker."}]


In [12]:
summarized_plot = "Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star. Princess Leia Organa hides the plans in the memory system of astromech droid R2-D2. The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker."
summary_square = blc_summarizer(summarized_plot, max_length=20, min_length=10, do_sample=False)
print(summary_square)

[{'summary_text': 'Princess Leia Organa hides the plans in the memory system of astromech droid'}]
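
The cell above pastes the first summary back in by hand. A minimal sketch of chaining the two passes programmatically is shown below; `summary_text` and `summarize_twice` are hypothetical helper names, and the only assumption about the summarizer is that it returns a list of dicts with a `summary_text` key, as in the printed outputs.

```python
def summary_text(pipeline_output):
    """Return the text of the first summary in a pipeline-style result."""
    return pipeline_output[0]["summary_text"].strip()


def summarize_twice(summarizer, text, first_kwargs, second_kwargs):
    """Summarize the text, then summarize the resulting summary (hypothetical helper)."""
    first = summary_text(summarizer(text, **first_kwargs))
    second = summary_text(summarizer(first, **second_kwargs))
    return first, second


# Possible usage with the pipeline defined above:
# first, second = summarize_twice(
#     blc_summarizer,
#     plot_fragment1,
#     dict(max_length=100, min_length=10, do_sample=False),
#     dict(max_length=20, min_length=10, do_sample=False),
# )
```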


## sshleifer/distilbart-cnn-12-6

In [15]:
bart_summarizer = tr.pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

plot_fragment1 = """
Amid a galactic civil war, Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star, a massive space station capable of destroying entire planets. Imperial Senator Princess Leia Organa of Alderaan, secretly one of the Rebellion's leaders, has obtained its schematics, but her starship is intercepted by an Imperial Star Destroyer under the command of the ruthless Darth Vader. Before she is captured, Leia hides the plans in the memory system of astromech droid R2-D2, who flees in an escape pod to the nearby desert planet Tatooine alongside his companion, protocol droid C-3PO.

The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker. While Luke is cleaning R2-D2, he discovers a holographic recording of Leia requesting help from an Obi-Wan Kenobi. Later, after Luke finds R2-D2 missing, he is attacked by scavenging Sand People while searching for him, but is rescued by elderly hermit "Old Ben" Kenobi, an acquaintance of Luke's, who reveals that "Obi-Wan" is his true name. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire. Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to the dark side of the Force and murdered him. Obi-Wan offers Luke his father's old lightsaber, the signature weapon of Jedi Knights. 
"""

summary = bart_summarizer(plot_fragment1, max_length=100, min_length=10, do_sample=False)

print(summary)

[{'summary_text': " Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star . Princess Leia Organa hides the plans in the memory system of astromech droid R2-D2 . The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker . Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars ."}]


## google/pegasus-large

In [12]:
pegasus_summarizer = tr.pipeline("summarization", model="google/pegasus-large")

plot_fragment1 = """
Amid a galactic civil war, Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star, a massive space station capable of destroying entire planets. Imperial Senator Princess Leia Organa of Alderaan, secretly one of the Rebellion's leaders, has obtained its schematics, but her starship is intercepted by an Imperial Star Destroyer under the command of the ruthless Darth Vader. Before she is captured, Leia hides the plans in the memory system of astromech droid R2-D2, who flees in an escape pod to the nearby desert planet Tatooine alongside his companion, protocol droid C-3PO.

The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker. While Luke is cleaning R2-D2, he discovers a holographic recording of Leia requesting help from an Obi-Wan Kenobi. Later, after Luke finds R2-D2 missing, he is attacked by scavenging Sand People while searching for him, but is rescued by elderly hermit "Old Ben" Kenobi, an acquaintance of Luke's, who reveals that "Obi-Wan" is his true name. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire. Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to the dark side of the Force and murdered him. Obi-Wan offers Luke his father's old lightsaber, the signature weapon of Jedi Knights. 
"""

summary = pegasus_summarizer(plot_fragment1, max_length=200, min_length=10, do_sample=False)

print(summary)

[{'summary_text': "Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to galactic the dark side of the Force and murdered him."}]


In [7]:
plot_fragment_2 = """
R2-D2 plays Leia's full message, in which she begs Obi-Wan to take the Death Star plans to her home planet of Alderaan and give them to her father, a fellow veteran, for analysis. Although Luke initially declines Obi-Wan's offer to accompany him to Alderaan and learn the ways of the Force, he is left with no choice after discovering that Imperial stormtroopers have killed his aunt and uncle and destroyed their farm in their search for the droids. Traveling to a cantina in Mos Eisley to search for transport, Luke and Obi-Wan hire Han Solo, a smuggler with a price on his head due to his debt to local mobster Jabba the Hutt. Pursued by stormtroopers, Obi-Wan, Luke, R2-D2, and C-3PO flee Tatooine with Han and his Wookiee co-pilot Chewbacca on their ship the Millennium Falcon.

Before the Falcon can reach Alderaan, Death Star commander Grand Moff Tarkin destroys the planet in a show of force after interrogating Leia for the location of the Rebel Alliance's base. Upon arrival, the Falcon is captured by the Death Star's tractor beam, but the group manages to evade capture by hiding in the ship's smuggling compartments. As Obi-Wan leaves to disable the tractor beam, Luke persuades Han and Chewbacca to help him rescue Leia after discovering that she is scheduled to be executed. After disabling the tractor beam, Obi-Wan sacrifices himself in a lightsaber duel against Vader, allowing the rest of the group to escape the Death Star with Leia. Using a tracking device, the Empire tracks the Falcon to the hidden Rebel base on Yavin IV.

The schematics reveal a hidden weakness in the Death Star's thermal exhaust port, which could allow the Rebels to trigger a chain reaction in its main reactor with a precise proton torpedo strike. While Han abandons the Rebels after collecting his reward for rescuing Leia, Luke joins their X-wing starfighter squadron in a desperate attack against the approaching Death Star. In the ensuing battle, the Rebels suffer heavy losses as Vader leads a squadron of TIE fighters against them. Han and Chewbacca unexpectedly return to aid them in the Falcon, and knock Vader's ship off course before he can shoot down Luke. Guided by the disembodied voice of Obi-Wan's spirit, Luke uses the Force to aim his torpedoes into the exhaust port, destroying the Death Star moments before it fires on the Rebel base. In a triumphant ceremony at the base, Leia awards Luke and Han medals for their heroism. 
"""

summary2 = pegasus_summarizer(plot_fragment_2, max_length=200, min_length=10, do_sample=False)
print(summary2)

[{'summary_text': "Pursued by stormtroopers, Obi-Wan, Luke, R2-D2, and C-3PO flee Tatooine with Han and his Wookiee co-pilot Chewbacca on their ship the Millennium Falcon. Guided by the disembodied voice of Obi-Wan's spirit, Luke uses the Force to aim his awards torpedoes into the exhaust port, destroying the Death Star moments before it fires on the Rebel base."}]


In [13]:
plot_fragment_3 = """
Before the Falcon can reach Alderaan, Death Star commander Grand Moff Tarkin destroys the planet in a show of force after interrogating Leia for the location of the Rebel Alliance's base. Upon arrival, the Falcon is captured by the Death Star's tractor beam, but the group manages to evade capture by hiding in the ship's smuggling compartments. As Obi-Wan leaves to disable the tractor beam, Luke persuades Han and Chewbacca to help him rescue Leia after discovering that she is scheduled to be executed. After disabling the tractor beam, Obi-Wan sacrifices himself in a lightsaber duel against Vader, allowing the rest of the group to escape the Death Star with Leia. Using a tracking device, the Empire tracks the Falcon to the hidden Rebel base on Yavin IV.

The schematics reveal a hidden weakness in the Death Star's thermal exhaust port, which could allow the Rebels to trigger a chain reaction in its main reactor with a precise proton torpedo strike. While Han abandons the Rebels after collecting his reward for rescuing Leia, Luke joins their X-wing starfighter squadron in a desperate attack against the approaching Death Star. In the ensuing battle, the Rebels suffer heavy losses as Vader leads a squadron of TIE fighters against them. Han and Chewbacca unexpectedly return to aid them in the Falcon, and knock Vader's ship off course before he can shoot down Luke. Guided by the disembodied voice of Obi-Wan's spirit, Luke uses the Force to aim his torpedoes into the exhaust port, destroying the Death Star moments before it fires on the Rebel base. In a triumphant ceremony at the base, Leia awards Luke and Han medals for their heroism. 
"""

summary3 = pegasus_summarizer(plot_fragment_3, max_length=200, min_length=10, do_sample=False)
print(summary3)

[{'summary_text': "Guided by the disembodied voice of Obi-Wan's spirit, Luke uses the Force to aim his torpedoes into the exhaust port, destroying the Death Star moments before it fires on the Rebel base."}]


In [3]:
full_plot = """
Amid a galactic civil war, Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star, a massive space station capable of destroying entire planets. Imperial Senator Princess Leia Organa of Alderaan, secretly one of the Rebellion's leaders, has obtained its schematics, but her starship is intercepted by an Imperial Star Destroyer under the command of the ruthless Darth Vader. Before she is captured, Leia hides the plans in the memory system of astromech droid R2-D2, who flees in an escape pod to the nearby desert planet Tatooine alongside his companion, protocol droid C-3PO.

The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker. While Luke is cleaning R2-D2, he discovers a holographic recording of Leia requesting help from an Obi-Wan Kenobi. Later, after Luke finds R2-D2 missing, he is attacked by scavenging Sand People while searching for him, but is rescued by elderly hermit "Old Ben" Kenobi, an acquaintance of Luke's, who reveals that "Obi-Wan" is his true name. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire. Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to the dark side of the Force and murdered him. Obi-Wan offers Luke his father's old lightsaber, the signature weapon of Jedi Knights.

R2-D2 plays Leia's full message, in which she begs Obi-Wan to take the Death Star plans to her home planet of Alderaan and give them to her father, a fellow veteran, for analysis. Although Luke initially declines Obi-Wan's offer to accompany him to Alderaan and learn the ways of the Force, he is left with no choice after discovering that Imperial stormtroopers have killed his aunt and uncle and destroyed their farm in their search for the droids. Traveling to a cantina in Mos Eisley to search for transport, Luke and Obi-Wan hire Han Solo, a smuggler with a price on his head due to his debt to local mobster Jabba the Hutt. Pursued by stormtroopers, Obi-Wan, Luke, R2-D2, and C-3PO flee Tatooine with Han and his Wookiee co-pilot Chewbacca on their ship the Millennium Falcon.

Before the Falcon can reach Alderaan, Death Star commander Grand Moff Tarkin destroys the planet in a show of force after interrogating Leia for the location of the Rebel Alliance's base. Upon arrival, the Falcon is captured by the Death Star's tractor beam, but the group manages to evade capture by hiding in the ship's smuggling compartments. As Obi-Wan leaves to disable the tractor beam, Luke persuades Han and Chewbacca to help him rescue Leia after discovering that she is scheduled to be executed. After disabling the tractor beam, Obi-Wan sacrifices himself in a lightsaber duel against Vader, allowing the rest of the group to escape the Death Star with Leia. Using a tracking device, the Empire tracks the Falcon to the hidden Rebel base on Yavin IV.

The schematics reveal a hidden weakness in the Death Star's thermal exhaust port, which could allow the Rebels to trigger a chain reaction in its main reactor with a precise proton torpedo strike. While Han abandons the Rebels after collecting his reward for rescuing Leia, Luke joins their X-wing starfighter squadron in a desperate attack against the approaching Death Star. In the ensuing battle, the Rebels suffer heavy losses as Vader leads a squadron of TIE fighters against them. Han and Chewbacca unexpectedly return to aid them in the Falcon, and knock Vader's ship off course before he can shoot down Luke. Guided by the disembodied voice of Obi-Wan's spirit, Luke uses the Force to aim his torpedoes into the exhaust port, destroying the Death Star moments before it fires on the Rebel base. In a triumphant ceremony at the base, Leia awards Luke and Han medals for their heroism. 
"""

longer_summary = pegasus_summarizer(full_plot, max_length=600, min_length=10, do_sample=False)
print(longer_summary)

[{'summary_text': 'Before she is captured, Leia hides the plans in the memory system of astromech droid R2-D2, who flees in an escape pod to the nearby desert planet Tatooine alongside his companion, protocol droid C-3PO. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire.'}]
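
The full-plot summary above only draws on the first two paragraphs of the plot. One option, mirroring the manual fragments used earlier, is to summarize the text paragraph by paragraph. The sketch below is a hypothetical helper (`summarize_by_paragraph` is not a `transformers` function); it assumes paragraphs are separated by blank lines and that the summarizer returns a list of dicts with a `summary_text` key.

```python
import re


def summarize_by_paragraph(text, summarizer, **generate_kwargs):
    """Summarize each blank-line-separated paragraph on its own (hypothetical helper)."""
    # Split on one or more blank lines and drop empty pieces.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    summaries = []
    for paragraph in paragraphs:
        output = summarizer(paragraph, **generate_kwargs)
        summaries.append(output[0]["summary_text"].strip())
    return summaries


# Possible usage with the pipeline defined above:
# per_paragraph = summarize_by_paragraph(
#     full_plot, pegasus_summarizer, max_length=200, min_length=10, do_sample=False
# )
```

This yields one summary per paragraph, so later parts of the plot are represented as well.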


## t5-small

In [6]:
t5_summarizer = tr.pipeline("summarization", model="t5-small")

plot_fragment1 = """
The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker. While Luke is cleaning R2-D2, he discovers a holographic recording of Leia requesting help from an Obi-Wan Kenobi. Later, after Luke finds R2-D2 missing, he is attacked by scavenging Sand People while searching for him, but is rescued by elderly hermit "Old Ben" Kenobi, an acquaintance of Luke's, who reveals that "Obi-Wan" is his true name. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire. Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to the dark side of the Force and murdered him. Obi-Wan offers Luke his father's old lightsaber, the signature weapon of Jedi Knights. 
"""

summary = t5_summarizer(plot_fragment1, max_length=200, min_length=40, do_sample=False)
print(summary)

[{'summary_text': 'droids are captured by Jawa traders, who sell them to moisture farmers . holographic recording of leia requesting help from an Obi-Wan Kenobi . after he finds R2-D2, he is attacked by scavenging Sand People while searching for him .'}]


In [8]:
plot_fragment2 = """
The schematics reveal a hidden weakness in the Death Star's thermal exhaust port, which could allow the Rebels to trigger a chain reaction in its main reactor with a precise proton torpedo strike. While Han abandons the Rebels after collecting his reward for rescuing Leia, Luke joins their X-wing starfighter squadron in a desperate attack against the approaching Death Star. In the ensuing battle, the Rebels suffer heavy losses as Vader leads a squadron of TIE fighters against them. Han and Chewbacca unexpectedly return to aid them in the Falcon, and knock Vader's ship off course before he can shoot down Luke. Guided by the disembodied voice of Obi-Wan's spirit, Luke uses the Force to aim his torpedoes into the exhaust port, destroying the Death Star moments before it fires on the Rebel base. In a triumphant ceremony at the base, Leia awards Luke and Han medals for their heroism. 
"""

summary2 = t5_summarizer(plot_fragment2, max_length=200, min_length=10, do_sample=True)
print(summary2)

[{'summary_text': 'the thermal exhaust port could trigger a chain reaction in its main reactor with a proton torpedo strike . the rebels suffer heavy losses as Vader leads a squadron of fighters against them .'}]


In [11]:
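# Word counts: the whole fragment, then only its first three sentences.
# A notebook cell displays only the value of its last expression (the second count).
len(plot_fragment2.split())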
len("The schematics reveal a hidden weakness in the Death Star's thermal exhaust port, which could allow the Rebels to trigger a chain reaction in its main reactor with a precise proton torpedo strike. While Han abandons the Rebels after collecting his reward for rescuing Leia, Luke joins their X-wing starfighter squadron in a desperate attack against the approaching Death Star. In the ensuing battle, the Rebels suffer heavy losses as Vader leads a squadron of TIE fighters against them.".split())

79

## google/pegasus-xsum

In [3]:
xsum_summarizer = tr.pipeline("summarization", model="google/pegasus-xsum")

plot_fragment1 = """
Amid a galactic civil war, Rebel Alliance spies have stolen plans to the Galactic Empire's Death Star, a massive space station capable of destroying entire planets. Imperial Senator Princess Leia Organa of Alderaan, secretly one of the Rebellion's leaders, has obtained its schematics, but her starship is intercepted by an Imperial Star Destroyer under the command of the ruthless Darth Vader. Before she is captured, Leia hides the plans in the memory system of astromech droid R2-D2, who flees in an escape pod to the nearby desert planet Tatooine alongside his companion, protocol droid C-3PO.

The droids are captured by Jawa traders, who sell them to moisture farmers Owen and Beru Lars and their nephew Luke Skywalker. While Luke is cleaning R2-D2, he discovers a holographic recording of Leia requesting help from an Obi-Wan Kenobi. Later, after Luke finds R2-D2 missing, he is attacked by scavenging Sand People while searching for him, but is rescued by elderly hermit "Old Ben" Kenobi, an acquaintance of Luke's, who reveals that "Obi-Wan" is his true name. Obi-Wan tells Luke of his days as one of the Jedi Knights, the former peacekeepers of the Galactic Republic who drew mystical abilities from a metaphysical energy field known as "the Force", but were ultimately hunted to near-extinction by the Empire. Luke learns that his father fought alongside Obi-Wan as a Jedi Knight during the Clone Wars until Vader, Obi-Wan's former pupil, turned to the dark side of the Force and murdered him. Obi-Wan offers Luke his father's old lightsaber, the signature weapon of Jedi Knights. 
"""

summary = xsum_summarizer(plot_fragment1, max_length=200, min_length=10, do_sample=False)

print(summary)

[{'summary_text': 'Star Wars: The Force Awakens is released in cinemas on 25 December 2015.'}]


## Pruning the summaries

To generate the training queries, we could randomly pick a sentence out of the summary.
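
A minimal sketch of that sampling step follows. The helper names `split_sentences` and `pick_query` are hypothetical, and the sentence splitter is deliberately naive: it treats `.`, `!` or `?` followed by whitespace as a sentence boundary, so abbreviations would be split incorrectly. It also removes the space that some of the models above emit before a full stop.

```python
import random
import re


def split_sentences(summary):
    """Naively split a summary into sentences (hypothetical helper)."""
    # Turn "word ." into "word." since some summarizers put a space before punctuation.
    cleaned = re.sub(r"\s+([.!?])", r"\1", summary.strip())
    # Split after sentence-ending punctuation that is followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", cleaned)
    return [part for part in parts if part]


def pick_query(summary, rng=None):
    """Pick one sentence of the summary at random to use as a training query."""
    rng = rng if rng is not None else random
    sentences = split_sentences(summary)
    if not sentences:
        raise ValueError("summary contains no sentences")
    return rng.choice(sentences)


# Possible usage with a pipeline result from this notebook:
# query = pick_query(summary[0]["summary_text"])
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible.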