This notebook walks through two existing summarization implementations. The first comes from the Python library sumy, which implements several popular summarization approaches from the literature. The second uses gensim's summarizer.

## Summarization with Sumy

In [5]:
!pip3 install sumy #install sumy



In [6]:
# Summarize a web page with sumy's TextRank implementation.
from sumy.parsers.html import HtmlParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.text_rank import TextRankSummarizer

url = "https://en.wikipedia.org/wiki/Automatic_summarization"
# Download the page and split its text into sentences with an English tokenizer.
parser = HtmlParser.from_url(url, Tokenizer("english"))
summarizer = TextRankSummarizer()
# Request a 5-sentence summary and print each selected sentence.
for sentence in summarizer(parser.document, 5):
    print(sentence)


For text, extraction is analogous to the process of skimming, where the summary (if available), headings and subheadings, figures, the first and last paragraphs of a section, and optionally the first and last sentences in a paragraph are read before one chooses to read the entire document in detail.
Instead of trying to learn explicit features that characterize keyphrases, the TextRank algorithm [7] exploits the structure of the text itself to determine keyphrases that appear "central" to the text in the same way that PageRank selects important Web pages.
Once the graph is constructed, it is used to form a stochastic matrix, combined with a damping factor (as in the "random surfer model"), and the ranking over vertices is obtained by finding the eigenvector corresponding to eigenvalue 1 (i.e., the stationary distribution of the random walk on the graph).
While the goal of a brief summary is to simplify information search and cut the time by pointing to the most relevant source document
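
For intuition about what a TextRank-style summarizer does, here is a minimal pure-Python sketch. It makes no claim to match sumy's implementation: the naive sentence splitter, the word-overlap similarity over distinct lowercase words, and the names `split_sentences`, `similarity`, and `textrank_summary` are all assumptions made for illustration. Sentences are graph vertices, edge weights come from word overlap, and scores are obtained by iterating a damped, PageRank-like update.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(words_a, words_b):
    """Shared-word count, normalised by the log of each sentence's word count."""
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # log(1) == 0 would make the denominator vanish
    overlap = len(words_a & words_b)
    return overlap / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank_summary(text, n_sentences=2, damping=0.85, iterations=50):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Illustrative simplification: each sentence is a set of lowercase words.
    words = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Weighted, undirected sentence graph (no self-loops).
    weights = [
        [similarity(words[i], words[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]

    # Damped power iteration over the graph.
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]


sample = (
    "Cats chase mice in the garden. "
    "Mice hide from cats in the garden. "
    "Bananas are yellow."
)
for sentence in textrank_summary(sample, n_sentences=2):
    print(sentence)
```

In this toy input the third sentence shares no words with the other two, so it receives no score from its neighbours and is left out of the two-sentence summary.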

Sumy offers other summarizers and options beyond TextRank. We leave their exploration as an exercise for you!
#TODO: can add a few more examples. 

## Summarization example with Gensim

In [7]:
!pip3 install "gensim<4" # install gensim; the summarization module used below was removed in gensim 4.0



Gensim does not have an HTML parser like sumy's, so let us use the example text from Chapter 5 (nlphistory.txt) to see what its summarized version looks like.

In [8]:
from gensim.summarization import summarize

# Read the whole file; the context manager closes it afterwards.
with open("nlphistory.txt") as f:
    text = f.read()
print(summarize(text))


Some notably successful natural language processing systems developed in the 1960s were SHRDLU, a natural language system working in restricted "blocks worlds" with restricted vocabularies, and ELIZA, a simulation of a Rogerian psychotherapist, written by Joseph Weizenbaum between 1964 and 1966.
This was due to both the steady increase in computational power (see Moore's law) and the gradual lessening of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing.[3] Some of the earliest-used machine learning algorithms, such as decision trees, produced systems of hard if-then rules similar to existing hand-written rules.
However, part-of-speech tagging introduced the use of hidden Markov models to natural language processing, and increasingly, research has focused on statistical models, which make soft, probabilistic dec

#TODO: Explore other options in gensim's summarizer and its possible shortcomings (e.g., sensitivity to the input's format).
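
One format sensitivity is visible in the gensim output above: the citation marker "[3]" sits between two sentences that were returned as a single line, which suggests the sentence boundary was not detected there. A small cleaning step before summarizing may help. The sketch below uses only the standard library; the helper name `clean_for_summarization` is hypothetical, and the assumptions that citations look like `[3]` and that single line breaks are hard wraps rather than real boundaries may not hold for your data.

```python
import re


def clean_for_summarization(text):
    """Normalize raw text before passing it to an extractive summarizer."""
    # Drop bracketed numeric citation markers such as "[3]" or "[12]".
    text = re.sub(r"\[\d+\]", "", text)
    # Join hard-wrapped lines (assumption: a single newline is a wrap),
    # while keeping blank lines as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


raw = "language processing.[3] Some of the\nearliest-used   algorithms.\n\nNext paragraph."
print(clean_for_summarization(raw))
```

The cleaned string can then be passed to `summarize` in place of the raw `text`. Whether this actually improves the summary depends on the input, so compare both outputs.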