**Stepwise processor pattern from Mahmoud Hashemi's Enterprise Software with Python course.**

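A Python 3 port of Mahmoud Hashemi's Wikipedia topic summarizer. Each step stores its request URL, raw response, and extracted value on the instance, so the intermediate state can be inspected after processing.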

References:

https://github.com/mahmoud/espymetrics/blob/d4754e597a2f483e2e0b1c3efc8694774227f907/notebooks/stepwise_demo.ipynb
https://www.safaribooksonline.com/library/view/enterprise-software-with/9781491943755/video239885.html

In [240]:
import json
from urllib.request import urlopen

from IPython.display import Image

In [241]:
class TopicSummarizer(object):
    """
    Our stepwise processor that uses Wikipedia to summarize topics.
    
    Instantiate with the topic name, call .process(), then call .get_results().
    """
    
    def __init__(self, topic):
        self.topic = topic
        
    def process(self):
        self._fetch_text()
        self._fetch_thumbnail()
        return self
    
    def get_results(self, as_text=False):
        if as_text:
            return self.topic + ' summary:' + self._text
        return TopicSummary(self.topic, self._thumb_url, self._text)
    
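    def _fetch_text(self):
        # Step 1: keep the URL and raw response on the instance for later inspection.
        self._text_api_url = TEXT_URL_TMPL.format(title=self.topic)
        self._text_resp = self._get_url_json(self._text_api_url)
        # Pages are keyed by page ID; a single-title query returns one page.
        self._text = next(iter(self._text_resp['query']['pages'].values()))['extract']

    def _fetch_thumbnail(self):
        # Step 2: same shape as step 1, but for the page image.
        self._thumb_api_url = THUMB_URL_TMPL.format(title=self.topic)
        self._thumb_resp = self._get_url_json(self._thumb_api_url)
        # Raises KeyError if the page has no thumbnail.
        self._thumb_url = next(iter(self._thumb_resp['query']['pages'].values()))['thumbnail']['source']

    def _get_url_json(self, url):
        # The context manager closes the HTTP response even if reading fails.
        with urlopen(url) as resp:
            resp_body = resp.read().decode('utf8')
        return json.loads(resp_body)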

class TopicSummary(object):
    def __init__(self, topic, thumb_url, text):
        self.topic = topic
        self.thumb_url = thumb_url
        self.text = text
        
    def __repr__(self):
        cn = self.__class__.__name__
        return '%s(%r, %r, %r)' % (cn, self.topic, self.thumb_url, self.text)
    
TEXT_URL_TMPL = 'https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exsentences=2&titles={title}&explaintext=1&exintro=1&format=json'
THUMB_URL_TMPL = 'https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles={title}&format=json'
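
The URL templates above insert the topic with `str.format`, so a title containing spaces or `&` is not escaped. A minimal sketch of one alternative, assuming the same endpoint and parameters, builds the query string with `urllib.parse.urlencode`. The helper name `build_text_url` and the `API_URL` constant are illustrative and not part of the original notebook.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: same endpoint as the templates in the notebook.
API_URL = 'https://en.wikipedia.org/w/api.php'


def build_text_url(title, sentences=2):
    """Build the extract query URL with every parameter percent-encoded."""
    params = {
        'action': 'query',
        'prop': 'extracts',
        'exsentences': sentences,
        'titles': title,
        'explaintext': 1,
        'exintro': 1,
        'format': 'json',
    }
    # urlencode escapes reserved characters such as '&' and spaces.
    return API_URL + '?' + urlencode(params)


url = build_text_url('Coffee & tea')
# Round-trip check: the title survives encoding and decoding unchanged.
assert parse_qs(urlsplit(url).query)['titles'] == ['Coffee & tea']
print(url)
```

For a plain title such as `'Coffee'`, this yields the same URL as `TEXT_URL_TMPL.format(title='Coffee')`, so it could replace the template without changing the simple case.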

In [242]:
# Demonstration of the summarizer

summarizer = TopicSummarizer('Coffee')
summarizer.process()
summary = summarizer.get_results()
print(summary)

TopicSummary('Coffee', 'https://upload.wikimedia.org/wikipedia/commons/thumb/4/45/A_small_cup_of_coffee.JPG/50px-A_small_cup_of_coffee.JPG', 'Coffee is a brewed drink prepared from roasted coffee beans, which are the seeds of berries from the Coffea plant. The genus Coffea is native to tropical Africa, and Madagascar, the Comoros, Mauritius and Réunion in the Indian Ocean.')
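
The demo above needs network access. As a minimal offline sketch of the same stepwise idea, the hypothetical `OfflineSummarizer` below takes the fetch function as an argument (an assumption, not part of the original class), so canned responses can stand in for Wikipedia. The `text:` and `thumb:` prefixes are made-up placeholders for real URLs. The sketch also shows what happens when a later step fails: results from the earlier step stay on the instance for inspection.

```python
class OfflineSummarizer:
    """Hypothetical stand-in for TopicSummarizer that takes a fetch function."""

    def __init__(self, topic, fetch_json):
        self.topic = topic
        self._fetch_json = fetch_json  # callable: url -> decoded JSON dict

    def process(self):
        self._fetch_text()
        self._fetch_thumbnail()
        return self

    def _first_page(self, resp):
        # Pages are keyed by page ID, so take the only entry.
        return next(iter(resp['query']['pages'].values()))

    def _fetch_text(self):
        self._text_resp = self._fetch_json('text:' + self.topic)
        self._text = self._first_page(self._text_resp)['extract']

    def _fetch_thumbnail(self):
        self._thumb_resp = self._fetch_json('thumb:' + self.topic)
        self._thumb_url = self._first_page(self._thumb_resp)['thumbnail']['source']


def fake_fetch(url):
    # Canned responses; the thumbnail response deliberately lacks 'thumbnail'.
    page = {'ns': 0, 'pageid': 1, 'title': 'Coffee'}
    if url.startswith('text:'):
        page['extract'] = 'Coffee is a brewed drink.'
    return {'query': {'pages': {'1': page}}}


offline = OfflineSummarizer('Coffee', fake_fetch)
failed_key = None
try:
    offline.process()
except KeyError as exc:
    failed_key = exc.args[0]  # name of the missing key

# The text step finished, so its results are still on the instance.
stored = sorted(k for k in vars(offline) if k.startswith('_t'))
print(failed_key, stored)
```

Keeping every intermediate value as an attribute is what makes this kind of post-mortem possible: the raw thumbnail response is available even though extracting the URL from it failed.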


In [243]:
print(summary.text)
Image(url=summary.thumb_url)

Coffee is a brewed drink prepared from roasted coffee beans, which are the seeds of berries from the Coffea plant. The genus Coffea is native to tropical Africa, and Madagascar, the Comoros, Mauritius and Réunion in the Indian Ocean.


In [244]:
print(summarizer.get_results(as_text=True))

Coffee summary:Coffee is a brewed drink prepared from roasted coffee beans, which are the seeds of berries from the Coffea plant. The genus Coffea is native to tropical Africa, and Madagascar, the Comoros, Mauritius and Réunion in the Indian Ocean.


In [239]:
# Introspection of the summarizer

summarizer.__dict__

{'_text': 'Coffee is a brewed drink prepared from roasted coffee beans, which are the seeds of berries from the Coffea plant. The genus Coffea is native to tropical Africa, and Madagascar, the Comoros, Mauritius and Réunion in the Indian Ocean.',
 '_text_api_url': 'https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exsentences=2&titles=Coffee&explaintext=1&exintro=1&format=json',
 '_text_resp': {'batchcomplete': '',
  'query': {'pages': {'604727': {'extract': 'Coffee is a brewed drink prepared from roasted coffee beans, which are the seeds of berries from the Coffea plant. The genus Coffea is native to tropical Africa, and Madagascar, the Comoros, Mauritius and Réunion in the Indian Ocean.',
     'ns': 0,
     'pageid': 604727,
     'title': 'Coffee'}}}},
 '_thumb_api_url': 'https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=Coffee&format=json',
 '_thumb_resp': {'batchcomplete': '',
  'query': {'pages': {'604727': {'ns': 0,
     'pageid': 604727,
     'pa