In [1]:
## input text article
article_text="Machine learning (ML) and artificial intelligence (AI) are becoming dominant problem-solving techniques in many areas of research and industry, not least because of the recent successes of deep learning (DL). However, the equation AI=ML=DL, as recently suggested in the news, blogs, and media, falls too short. These fields share the same fundamental hypotheses: computation is a useful way to model intelligent behavior in machines. What kind of computation and how to program it? This is not the right question. Computation neither rules out search, logical, and probabilistic techniques, nor (deep) (un)supervised and reinforcement learning methods, among others, as computational models do include all of them. They complement each other, and the next breakthrough lies not only in pushing each of them but also in combining them.Big Data is no fad. The world is growing at an exponential rate and so is the size of the data collected across the globe. Data is becoming more meaningful and contextually relevant, breaking new grounds for machine learning (ML), in particular for deep learning (DL) and artificial intelligence (AI), moving them out of research labs into production (Jordan and Mitchell, 2015). The problem has shifted from collecting massive amounts of data to understanding it—turning it into knowledge, conclusions, and actions. Multiple research disciplines, from cognitive sciences to biology, finance, physics, and social sciences, as well as many companies believe that data-driven and “intelligent” solutions are necessary to solve many of their key problems. High-throughput genomic and proteomic experiments can be used to enable personalized medicine. Large data sets of search queries can be used to improve information retrieval. Historical climate data can be used to understand global warming and to better predict weather. Large amounts of sensor readings and hyperspectral images of plants can be used to identify drought conditions and to gain insights into when and how stress impacts plant growth and development and in turn how to counterattack the problem of world hunger. Game data can turn pixels into actions within video games, while observational data can help enable robots to understand complex and unstructured environments and to learn manipulation skills.However, is AI, ML, and DL really synonymous, as recently suggested in the news, blogs, and media? For example, when AlphaGo (Silver et al., 2016) defeated South Korean Master Lee Se-dol in the board game Go in 2016, the terms AI, ML, and DL were used by the media to describe how AlphaGo won. In addition to this, even Gartner's list (Panetta, 2017) of top 10 Strategic Trends for 2018 places (narrow) AI at the very top, specifying it as “consisting of highly scoped machine-learning solutions that target a specific task.”"

## Import Modules

In [2]:
import re
import nltk

## Data Preprocessing

In [3]:
article_text = article_text.lower()
article_text

"machine learning (ml) and artificial intelligence (ai) are becoming dominant problem-solving techniques in many areas of research and industry, not least because of the recent successes of deep learning (dl). however, the equation ai=ml=dl, as recently suggested in the news, blogs, and media, falls too short. these fields share the same fundamental hypotheses: computation is a useful way to model intelligent behavior in machines. what kind of computation and how to program it? this is not the right question. computation neither rules out search, logical, and probabilistic techniques, nor (deep) (un)supervised and reinforcement learning methods, among others, as computational models do include all of them. they complement each other, and the next breakthrough lies not only in pushing each of them but also in combining them.big data is no fad. the world is growing at an exponential rate and so is the size of the data collected across the globe. data is becoming more meaningful and conte

In [4]:
# remove spaces, punctuations and numbers
clean_text = re.sub('[^a-zA-Z]', ' ', article_text)
clean_text = re.sub('\s+', ' ', clean_text)
clean_text

'machine learning ml and artificial intelligence ai are becoming dominant problem solving techniques in many areas of research and industry not least because of the recent successes of deep learning dl however the equation ai ml dl as recently suggested in the news blogs and media falls too short these fields share the same fundamental hypotheses computation is a useful way to model intelligent behavior in machines what kind of computation and how to program it this is not the right question computation neither rules out search logical and probabilistic techniques nor deep un supervised and reinforcement learning methods among others as computational models do include all of them they complement each other and the next breakthrough lies not only in pushing each of them but also in combining them big data is no fad the world is growing at an exponential rate and so is the size of the data collected across the globe data is becoming more meaningful and contextually relevant breaking new 

In [5]:
# split into sentence list
sentence_list = nltk.sent_tokenize(article_text)
sentence_list

['machine learning (ml) and artificial intelligence (ai) are becoming dominant problem-solving techniques in many areas of research and industry, not least because of the recent successes of deep learning (dl).',
 'however, the equation ai=ml=dl, as recently suggested in the news, blogs, and media, falls too short.',
 'these fields share the same fundamental hypotheses: computation is a useful way to model intelligent behavior in machines.',
 'what kind of computation and how to program it?',
 'this is not the right question.',
 'computation neither rules out search, logical, and probabilistic techniques, nor (deep) (un)supervised and reinforcement learning methods, among others, as computational models do include all of them.',
 'they complement each other, and the next breakthrough lies not only in pushing each of them but also in combining them.big data is no fad.',
 'the world is growing at an exponential rate and so is the size of the data collected across the globe.',
 'data is b

In [6]:
import nltk
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Gjay3\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

## Word Frequencies

In [7]:
stopwords = nltk.corpus.stopwords.words('english')

word_frequencies = {}
for word in nltk.word_tokenize(clean_text):
    if word not in stopwords:
        if word not in word_frequencies:
            word_frequencies[word] = 1
        else:
            word_frequencies[word] += 1

In [8]:
maximum_frequency = max(word_frequencies.values())

for word in word_frequencies:
    word_frequencies[word] = word_frequencies[word] / maximum_frequency

## Calculate Sentence Scores

In [9]:
sentence_scores = {}

for sentence in sentence_list:
    for word in nltk.word_tokenize(sentence):
        if word in word_frequencies and len(sentence.split(' ')) < 30:
            if sentence not in sentence_scores:
                sentence_scores[sentence] = word_frequencies[word]
            else:
                sentence_scores[sentence] += word_frequencies[word]

In [10]:
word_frequencies

{'machine': 0.3333333333333333,
 'learning': 0.6666666666666666,
 'ml': 0.5555555555555556,
 'artificial': 0.2222222222222222,
 'intelligence': 0.2222222222222222,
 'ai': 0.6666666666666666,
 'becoming': 0.2222222222222222,
 'dominant': 0.1111111111111111,
 'problem': 0.3333333333333333,
 'solving': 0.1111111111111111,
 'techniques': 0.2222222222222222,
 'many': 0.3333333333333333,
 'areas': 0.1111111111111111,
 'research': 0.3333333333333333,
 'industry': 0.1111111111111111,
 'least': 0.1111111111111111,
 'recent': 0.1111111111111111,
 'successes': 0.1111111111111111,
 'deep': 0.3333333333333333,
 'dl': 0.5555555555555556,
 'however': 0.2222222222222222,
 'equation': 0.1111111111111111,
 'recently': 0.2222222222222222,
 'suggested': 0.2222222222222222,
 'news': 0.2222222222222222,
 'blogs': 0.2222222222222222,
 'media': 0.3333333333333333,
 'falls': 0.1111111111111111,
 'short': 0.1111111111111111,
 'fields': 0.1111111111111111,
 'share': 0.1111111111111111,
 'fundamental': 0.11111111

In [11]:
sentence_scores

{'however, the equation ai=ml=dl, as recently suggested in the news, blogs, and media, falls too short.': 1.777777777777778,
 'these fields share the same fundamental hypotheses: computation is a useful way to model intelligent behavior in machines.': 1.5555555555555558,
 'what kind of computation and how to program it?': 0.5555555555555556,
 'this is not the right question.': 0.2222222222222222,
 'computation neither rules out search, logical, and probabilistic techniques, nor (deep) (un)supervised and reinforcement learning methods, among others, as computational models do include all of them.': 3.2222222222222228,
 'they complement each other, and the next breakthrough lies not only in pushing each of them but also in combining them.big data is no fad.': 1.888888888888889,
 'the world is growing at an exponential rate and so is the size of the data collected across the globe.': 2.0,
 'the problem has shifted from collecting massive amounts of data to understanding it—turning it into

## Text Summarization

In [12]:
# get top 5 sentences
import heapq
summary = heapq.nlargest(5, sentence_scores, key=sentence_scores.get)

print(" ".join(summary))

computation neither rules out search, logical, and probabilistic techniques, nor (deep) (un)supervised and reinforcement learning methods, among others, as computational models do include all of them. large data sets of search queries can be used to improve information retrieval. historical climate data can be used to understand global warming and to better predict weather. the problem has shifted from collecting massive amounts of data to understanding it—turning it into knowledge, conclusions, and actions. the world is growing at an exponential rate and so is the size of the data collected across the globe.
