## Using Libraries for Text Summarization - ML Based

In [22]:
import pandas as pd

In [23]:
from sumy.summarizers.text_rank import TextRankSummarizer
from sumy.summarizers.lsa import LsaSummarizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer

In [24]:
df = pd.read_csv('~/Work/self/DataVault/02_lazy_programmer_nlp/bbc_text_cls.csv')

In [25]:
df.head()

Unnamed: 0,text,labels
0,Ad sales boost Time Warner profit\n\nQuarterly...,business
1,Dollar gains on Greenspan speech\n\nThe dollar...,business
2,Yukos unit buyer faces loan claim\n\nThe owner...,business
3,High fuel prices hit BA's profits\n\nBritish A...,business
4,Pernod takeover talk lifts Domecq\n\nShares in...,business


In [26]:
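# Pick one business article at random (sample() defaults to n=1);
# the fixed random_state makes the choice reproducible.
doc = df[df.labels == 'business']['text'].sample(random_state=42)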

In [27]:
doc

480    Christmas sales worst since 1981\n\nUK retail ...
Name: text, dtype: object

In [28]:
doc.iloc[0]

'Christmas sales worst since 1981\n\nUK retail sales fell in December, failing to meet expectations and making it by some counts the worst Christmas since 1981.\n\nRetail sales dropped by 1% on the month in December, after a 0.6% rise in November, the Office for National Statistics (ONS) said. The ONS revised the annual 2004 rate of growth down from the 5.9% estimated in November to 3.2%. A number of retailers have already reported poor figures for December. Clothing retailers and non-specialist stores were the worst hit with only internet retailers showing any significant growth, according to the ONS.\n\nThe last time retailers endured a tougher Christmas was 23 years previously, when sales plunged 1.7%.\n\nThe ONS echoed an earlier caution from Bank of England governor Mervyn King not to read too much into the poor December figures. Some analysts put a positive gloss on the figures, pointing out that the non-seasonally-adjusted figures showed a performance comparable with 2003. The N

### TextRank

In [29]:
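summarizer = TextRankSummarizer()
# Drop the headline (everything up to the first newline) and parse the body.
# Tokenizer("english") relies on NLTK's sentence tokenizer data being installed.
parser = PlaintextParser.from_string(
    doc.iloc[0].split("\n", 1)[1],
    Tokenizer("english"))
# Keep the 5 highest-ranked sentences.
summary = summarizer(parser.document, sentences_count=5)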

In [30]:
summary

(<Sentence: Retail sales dropped by 1% on the month in December, after a 0.6% rise in November, the Office for National Statistics (ONS) said.>,
 <Sentence: The ONS revised the annual 2004 rate of growth down from the 5.9% estimated in November to 3.2%.>,
 <Sentence: The ONS echoed an earlier caution from Bank of England governor Mervyn King not to read too much into the poor December figures.>,
 <Sentence: Some analysts put a positive gloss on the figures, pointing out that the non-seasonally-adjusted figures showed a performance comparable with 2003.>,
 <Sentence: The November-December jump last year was roughly comparable with recent averages, although some way below the serious booms seen in the 1990s.>)

In [31]:
for s in summary:
    print(str(s))

Retail sales dropped by 1% on the month in December, after a 0.6% rise in November, the Office for National Statistics (ONS) said.
The ONS revised the annual 2004 rate of growth down from the 5.9% estimated in November to 3.2%.
The ONS echoed an earlier caution from Bank of England governor Mervyn King not to read too much into the poor December figures.
Some analysts put a positive gloss on the figures, pointing out that the non-seasonally-adjusted figures showed a performance comparable with 2003.
The November-December jump last year was roughly comparable with recent averages, although some way below the serious booms seen in the 1990s.
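
`TextRankSummarizer` is used above as a black box. As a rough illustration of the idea behind TextRank (a minimal sketch, not sumy's actual implementation), the code below builds a sentence graph weighted by word overlap and runs a PageRank-style iteration over it. The helper names `split_sentences`, `similarity`, `textrank`, and `summarize`, the naive regex splitter, and the damping factor of 0.85 are all assumptions made for this example.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared-word count, normalized by the log of each sentence's vocabulary size."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids log(1) == 0 in the denominator
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration on the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # weights[i][j] is the edge weight between sentence i and sentence j.
    weights = [[similarity(a, b) if i != j else 0.0
                for j, b in enumerate(sentences)]
               for i, a in enumerate(sentences)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n) if out_sums[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, sentences_count=2):
    """Return the top-scoring sentences, restored to document order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)),
                 key=lambda i: scores[i], reverse=True)[:sentences_count]
    return [sentences[i] for i in sorted(top)]


# Toy input (made up for this sketch, not taken from the dataset).
toy_text = ("Retail sales fell in December. "
            "Sales fell sharply for clothing retailers in December. "
            "The weather was mild.")
for sentence in summarize(toy_text, sentences_count=2):
    print(sentence)
```

Sentences that share vocabulary with many other sentences reinforce each other's scores, while a sentence with no overlap only receives the small baseline term `(1 - damping) / n`.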


### Latent Semantic Analysis - LSA

In [32]:
# Reuse the parser built for TextRank; only the ranking algorithm changes.
summarizer = LsaSummarizer()
summary = summarizer(parser.document, sentences_count=5)
for s in summary:
    print(str(s))

UK retail sales fell in December, failing to meet expectations and making it by some counts the worst Christmas since 1981.
Morrisons, Woolworths, House of Fraser, Marks & Spencer and Big Food all said that the festive period was disappointing.
And a British Retail Consortium survey found that Christmas 2004 was the worst for 10 years.
Yet, other retailers - including HMV, Monsoon, Jessops, Body Shop and Tesco - reported that festive sales were well up on last year.
Investec chief economist Philip Shaw said he did not expect the poor retail figures to have any immediate effect on interest rates.
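
The TextRank and LSA summaries printed above have no sentence in common. A small helper can make that kind of comparison explicit. This is a sketch only: `shared_sentences` and `jaccard` are hypothetical helpers, not part of sumy, and they assume each summary is kept in its own variable (the notebook reuses `summary` for both).

```python
def shared_sentences(summary_a, summary_b):
    """Return the sentences of summary_a that also appear in summary_b."""
    # str() mirrors how the notebook prints sumy sentences; it also accepts plain strings.
    seen = {str(s).strip() for s in summary_b}
    return [str(s) for s in summary_a if str(s).strip() in seen]


def jaccard(summary_a, summary_b):
    """Jaccard overlap of two summaries: |A & B| / |A | B|, or 0.0 if both are empty."""
    a = {str(s).strip() for s in summary_a}
    b = {str(s).strip() for s in summary_b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy summaries (made up for this sketch) standing in for two summarizer outputs.
first = ["Sales fell in December.", "Analysts were cautious."]
second = ["Analysts were cautious.", "Some retailers did well."]
print(shared_sentences(first, second))
print(jaccard(first, second))
```

With sumy, the two arguments would be the tuples returned by each summarizer call, for example stored under hypothetical names such as `textrank_summary` and `lsa_summary`.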
