**Making a Summary of a Text**


POS stands for **part of speech**. 
<br>
TextBlob can tell us which part of speech each word in a text corresponds to, that is, whether a word in a sentence is functioning as a noun, an adjective, a verb, etc. For that, we need to install the following nltk component:

* `import nltk`
* `nltk.download('averaged_perceptron_tagger')`

In [1]:
from textblob import TextBlob, Word
import numpy as np

Use the POS tagger

In [2]:
# define the text to tag
txt = TextBlob(' Computers are heavy. Programmers work hard. The sun shines')
# extract the (word, tag) pairs with the ".tags" property
for word,pos in txt.tags:
    print(word,pos)

Computers NNS
are VBP
heavy JJ
Programmers NNS
work VBP
hard RB
The DT
sun NN
shines NNS
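
The tags printed above are Penn Treebank codes. As a reading aid, here is a small lookup covering only the six tags that occur in this output. The dictionary `TAG_NAMES` and the helper `describe` are illustrative names, not part of TextBlob.

```python
# Illustrative legend for the Penn Treebank tags seen in the output above.
# TAG_NAMES and describe() are hypothetical helpers, not TextBlob API.
TAG_NAMES = {
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "VBP": "verb, non-3rd person singular present",
    "JJ": "adjective",
    "RB": "adverb",
    "DT": "determiner",
}

def describe(tag):
    """Return a readable name for a tag, or the tag itself if it is not in the legend."""
    return TAG_NAMES.get(tag, tag)

# A few (word, tag) pairs copied from the output above
tagged = [("Computers", "NNS"), ("are", "VBP"), ("heavy", "JJ"), ("shines", "NNS")]
for word, tag in tagged:
    print(f"{word:<10} {tag:<4} {describe(tag)}")
```

Note that `shines` is tagged `NNS` (plural noun) although it acts as a verb in "The sun shines". The tagger is statistical and can mislabel words, especially in very short sentences.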


**Case Study: Making a summary of a text**

Define the text to summarize

In [11]:
text = 'US stocks rose after President Donald Trump announced tariffs \
that were narrower than some traders had anticipated. Treasuries and the\
dollar gained. The S&P 500 advanced for the fourth time in five days as \
investors found relief in the president’s decision to exclude Canada and Mexico\
while giving other countries wiggle room from levies on imports of steel and \
aluminum that will take effect in 15 days. Technology companies paced gains. \
Ten-year Treasury yields fell to 2.86 percent. \
The dollar rose against the euro after the European Central Bank’s decided to drop \
a pledge to increase asset purchases if necessary, and as President Mario Draghi \
downplayed the change. Crude oil traded near $60 a barrel and gold slipped as a \
Bloomberg gauge of commodities slid for a second day.'
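
Two of the continuation lines above end without a space before the backslash (`the\` and `Mexico\`), so the string contains the fused tokens `thedollar` and `mexicowhile` that show up in the outputs further down. One way to avoid this, sketched here on a shortened excerpt, is to rely on Python's implicit concatenation of adjacent string literals inside parentheses and to keep the trailing space visible at the end of each piece. The variable name `excerpt` is illustrative.

```python
# Adjacent string literals inside parentheses are joined by Python automatically.
# Ending each piece with an explicit space keeps words from fusing across lines.
excerpt = (
    "US stocks rose after President Donald Trump announced tariffs "
    "that were narrower than some traders had anticipated. Treasuries and the "
    "dollar gained."
)

print(excerpt)
print("thedollar" in excerpt)  # False: the space before "dollar" is preserved
```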

Method A: Picking some random nouns

In [12]:
blob = TextBlob(text)

# gather the singular nouns (tag 'NN'), lemmatized
nouns = list()
for word, tag in blob.tags:
    if tag == 'NN':
        nouns.append(word.lemmatize())

print('There are in total',len(blob.tags),'words in the text\n')

# pick 5 nouns at random from the list

print('This text is about...')
for i in np.random.choice(np.arange(0,len(nouns)),5,replace=False):
    word = Word(nouns[i])
    print(word)

There are in total 133 words in the text

This text is about...
asset
steel
thedollar
pledge
oil
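
Because the nouns are drawn at random, every run prints a different list, and `replace=False` raises an error if the text yields fewer than five nouns. Below is a minimal sketch of a reproducible, size-safe draw using the standard library's `random` module, swapped in here for `np.random.choice`. The function name `pick_words`, the seed, and the sample list are assumptions for illustration.

```python
import random

def pick_words(words, k=5, seed=0):
    """Draw up to k entries from distinct positions in words, reproducibly for a given seed."""
    rng = random.Random(seed)   # private generator, does not touch global random state
    k = min(k, len(words))      # avoid an error when fewer than k words are available
    return rng.sample(words, k)

# Sample list taken from the output above (illustrative, not the full noun list)
nouns = ["asset", "steel", "thedollar", "pledge", "oil"]
print(pick_words(nouns, k=3))       # same three words on every run with the same seed
print(pick_words(["sun", "oil"]))   # returns both words instead of failing
```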


Method B: Finding the shortest sentences<br>
Make a list of the sentences that contain no more than 5 words.

In [13]:
short_sentences = list()
for sentence in blob.sentences:
    if len(sentence.words) <= 5:
        short_sentences.append(sentence)

In [14]:
[t for t in short_sentences]

[Sentence("Treasuries and thedollar gained."),
 Sentence("Technology companies paced gains.")]
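
A fixed cut-off of five words can return nothing for texts made of longer sentences. A variant that always returns the `k` shortest sentences is sketched below. To stay dependency-free it works on plain strings and counts words with a simple whitespace split, which is an assumption and only approximates TextBlob's `sentence.words`. The function name `shortest_sentences` is illustrative.

```python
def shortest_sentences(sentences, k=2):
    """Return the k sentences with the fewest whitespace-separated words."""
    # sorted() is stable, so sentences of equal length keep their original order
    return sorted(sentences, key=lambda s: len(s.split()))[:k]

# Three sentences taken from the text above
sentences = [
    "Technology companies paced gains.",
    "Ten-year Treasury yields fell to 2.86 percent.",
    "Treasuries and the dollar gained.",
]
for s in shortest_sentences(sentences):
    print(s)
```

With a `TextBlob` object the same idea could be written as `sorted(blob.sentences, key=lambda s: len(s.words))[:k]`.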

Method C: Retrieve the noun phrases from the text

In [15]:
noun_phrases = blob.noun_phrases

for i in noun_phrases:
    print(i)

us stocks
donald trump
treasuries
president ’ s decision
canada
mexicowhile
countries wiggle room
technology companies
ten-year
treasury yields
european central bank ’ s
mario draghi
crude
bloomberg
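
Entries such as `president ’ s decision` suggest that the typographic apostrophe (`’`) in the source text is being split off as a separate token. A possible preprocessing step, sketched here as an assumption rather than a verified fix, is to replace curly quotes with their ASCII equivalents before building the `TextBlob`. The names `QUOTE_MAP` and `normalize_quotes` are illustrative.

```python
# Map typographic quotes to ASCII ones; str.translate applies all replacements in one pass.
QUOTE_MAP = str.maketrans({
    "\u2019": "'",   # right single quote / apostrophe
    "\u2018": "'",   # left single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})

def normalize_quotes(text):
    """Return text with curly single and double quotes replaced by straight ones."""
    return text.translate(QUOTE_MAP)

sample = "investors found relief in the president\u2019s decision"
print(normalize_quotes(sample))
```

The cleaned string can then be passed to `TextBlob(...)` as before. Whether the extracted noun phrases actually improve depends on the tokenizer, so the result is worth checking by eye.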
