## Import libraries and dataset

In [1]:
import nltk
nltk.download('stopwords')
import re
import heapq

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Louis.Teo\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [2]:
# read the article line by line; the context manager closes the file afterwards
with open('Apple Acquires AI Startup.txt', "r") as file:
    filedata = file.readlines()

In [3]:
# view dataset
print(filedata)

['In an attempt to scale up its AI portfolio, Apple has acquired Spain-based AI video startup — Vilynx for approximately $50 million.\n', '\n', 'Reported by Bloomberg, the AI startup — Vilynx is headquartered in Barcelona, which is known to build software using computer vision to analyse a video’s visual, text, and audio content with the goal of “understanding” what’s in the video. This helps it categorising and tagging metadata to the videos, as well as generate automated video previews, and recommend related content to users, according to the company website.\n', '\n', 'Apple told the media that the company typically acquires smaller technology companies from time to time, and with the recent buy, the company could potentially use Vilynx’s technology to help improve a variety of apps. According to the media, Siri, search, Photos, and other apps that rely on Apple are possible candidates as are Apple TV, Music, News, to name a few that are going to be revolutionised with Vilynx’s tech

## Preprocess text

In [4]:
# join the elements in list to become a string of text
text = " ".join(filedata)
print(text)

In an attempt to scale up its AI portfolio, Apple has acquired Spain-based AI video startup — Vilynx for approximately $50 million.
 
 Reported by Bloomberg, the AI startup — Vilynx is headquartered in Barcelona, which is known to build software using computer vision to analyse a video’s visual, text, and audio content with the goal of “understanding” what’s in the video. This helps it categorising and tagging metadata to the videos, as well as generate automated video previews, and recommend related content to users, according to the company website.
 
 Apple told the media that the company typically acquires smaller technology companies from time to time, and with the recent buy, the company could potentially use Vilynx’s technology to help improve a variety of apps. According to the media, Siri, search, Photos, and other apps that rely on Apple are possible candidates as are Apple TV, Music, News, to name a few that are going to be revolutionised with Vilynx’s technology.
 
 With CE

In [5]:
text = re.sub(r'\[[0-9]*\]',' ',text) # replace reference numbers, e.g. [1], [10], [20], with a space, if any
text = re.sub(r'\s+',' ',text) # collapse runs of whitespace (including newlines) into a single space
print(text)

In an attempt to scale up its AI portfolio, Apple has acquired Spain-based AI video startup — Vilynx for approximately $50 million. Reported by Bloomberg, the AI startup — Vilynx is headquartered in Barcelona, which is known to build software using computer vision to analyse a video’s visual, text, and audio content with the goal of “understanding” what’s in the video. This helps it categorising and tagging metadata to the videos, as well as generate automated video previews, and recommend related content to users, according to the company website. Apple told the media that the company typically acquires smaller technology companies from time to time, and with the recent buy, the company could potentially use Vilynx’s technology to help improve a variety of apps. According to the media, Siri, search, Photos, and other apps that rely on Apple are possible candidates as are Apple TV, Music, News, to name a few that are going to be revolutionised with Vilynx’s technology. With CEO Tim Coo

In [6]:
# generate clean text for word histogram
clean_text = text.lower() # convert text to lower case
clean_text = re.sub(r'\W',' ',clean_text) # replace non-word characters (anything other than letters, digits and underscore) with a space
clean_text = re.sub(r'\d',' ',clean_text) # replace digits with a space
clean_text = re.sub(r'\s+',' ',clean_text) # collapse runs of whitespace into a single space

print(clean_text)

in an attempt to scale up its ai portfolio apple has acquired spain based ai video startup vilynx for approximately million reported by bloomberg the ai startup vilynx is headquartered in barcelona which is known to build software using computer vision to analyse a video s visual text and audio content with the goal of understanding what s in the video this helps it categorising and tagging metadata to the videos as well as generate automated video previews and recommend related content to users according to the company website apple told the media that the company typically acquires smaller technology companies from time to time and with the recent buy the company could potentially use vilynx s technology to help improve a variety of apps according to the media siri search photos and other apps that rely on apple are possible candidates as are apple tv music news to name a few that are going to be revolutionised with vilynx s technology with ceo tim cook s vision of the potential of a
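
The cleaning steps can be checked in isolation on a short string. The snippet below is a minimal sketch using only `re`; the `sample` sentence is made up for illustration and is not taken from the article file, and the final `.strip()` is an addition that the notebook cell does not perform.

```python
import re

# Hypothetical sample string, not taken from the article file.
sample = "Apple paid $50 million [1] for Vilynx's AI-based tools."

no_refs = re.sub(r'\[[0-9]*\]', ' ', sample)    # drop bracketed reference numbers
lowered = no_refs.lower()                        # lower-case everything
no_punct = re.sub(r'\W', ' ', lowered)           # non-word characters -> space
no_digits = re.sub(r'\d', ' ', no_punct)         # digits -> space
cleaned = re.sub(r'\s+', ' ', no_digits).strip() # collapse whitespace, trim the ends

print(cleaned)
# apple paid million for vilynx s ai based tools
```

Note the stray `s` left behind by the possessive, the same effect that produces `video s visual` in the output above.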

## Split text into sentences

In [7]:
# split (tokenize) the sentences
sentences = nltk.sent_tokenize(text)
print(sentences)

['In an attempt to scale up its AI portfolio, Apple has acquired Spain-based AI video startup — Vilynx for approximately $50 million.', 'Reported by Bloomberg, the AI startup — Vilynx is headquartered in Barcelona, which is known to build software using computer vision to analyse a video’s visual, text, and audio content with the goal of “understanding” what’s in the video.', 'This helps it categorising and tagging metadata to the videos, as well as generate automated video previews, and recommend related content to users, according to the company website.', 'Apple told the media that the company typically acquires smaller technology companies from time to time, and with the recent buy, the company could potentially use Vilynx’s technology to help improve a variety of apps.', 'According to the media, Siri, search, Photos, and other apps that rely on Apple are possible candidates as are Apple TV, Music, News, to name a few that are going to be revolutionised with Vilynx’s technology.', 

## Remove stop words

In [8]:
# get stop words list
stop_words = nltk.corpus.stopwords.words('english')

## Build word histogram

In [9]:
# create empty dictionary to house the word count
word_count = {}

# loop through the tokenized words of the cleaned (lowercase) text, skip stop words and save word count to dictionary
for word in nltk.word_tokenize(clean_text):
    # skip stop words (the NLTK stop word list is lowercase)
    if word not in stop_words:
        # save word count to dictionary
        word_count[word] = word_count.get(word, 0) + 1

In [10]:
# convert word counts to weights (maximum value = 1.0)
# compute the maximum once, before any count is overwritten by its weight
max_count = max(word_count.values())
for key in word_count.keys():
    word_count[key] = word_count[key]/max_count
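
To make the weighting concrete, here is a small sketch on toy data. It uses `collections.Counter` in place of the hand-rolled dictionary, and both `toy_stop_words` and `toy_tokens` are hypothetical values chosen for illustration. The maximum is taken once before dividing, so every weight is relative to the same count.

```python
from collections import Counter

# Hypothetical inputs: a tiny stop word set and token list, for illustration only.
toy_stop_words = {"the", "a", "of", "to"}
toy_tokens = "the startup sold video software to the video company".split()

# Count every token that is not a stop word.
counts = Counter(t for t in toy_tokens if t not in toy_stop_words)

# Take the maximum once, then scale every count by it.
max_count = max(counts.values())
weights = {word: n / max_count for word, n in counts.items()}

print(weights)
```

Here `video` appears twice and gets weight 1.0, while every other remaining word appears once and gets 0.5.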

## Rank sentences based on scores

In [11]:
# create empty dictionary to house sentence score
sentence_score = {}

# loop through tokenized sentences, only take sentences that have fewer than 25 words, then add word weights to sentence score
for sentence in sentences:
    # skip sentences with 25 or more space-separated words
    if len(sentence.split(' ')) >= 25:
        continue
    # add the weight of every word that is in the word_count dictionary
    for word in nltk.word_tokenize(sentence.lower()):
        if word in word_count:
            sentence_score[sentence] = sentence_score.get(sentence, 0) + word_count[word]
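
The scoring rule can be sketched without NLTK on toy data. In the snippet below, `toy_weights`, `toy_sentences`, `MAX_WORDS` and `score_sentence` are hypothetical names introduced for illustration, and `re.findall(r"\w+", ...)` is only a simple stand-in for `nltk.word_tokenize`, not an equivalent tokenizer.

```python
import re

# Hypothetical word weights and sentences, for illustration only.
toy_weights = {"apple": 1.0, "vilynx": 0.5, "ai": 0.5}
toy_sentences = [
    "Apple acquired Vilynx.",
    "The deal was reported by the media.",
    "Apple is investing in AI and AI tools.",
]

MAX_WORDS = 25  # same cutoff as the notebook cell


def score_sentence(sentence, weights):
    """Sum the weights of the sentence's lowercase word tokens."""
    tokens = re.findall(r"\w+", sentence.lower())  # stand-in for nltk.word_tokenize
    return sum(weights.get(tok, 0.0) for tok in tokens)


toy_scores = {}
for s in toy_sentences:
    # only score sentences below the length cutoff
    if len(s.split(" ")) < MAX_WORDS:
        score = score_sentence(s, toy_weights)
        if score > 0:  # as in the notebook, sentences with no weighted word get no entry
            toy_scores[s] = score

print(toy_scores)
```

Because the score is a plain sum, a repeated high-weight word counts every time it occurs, which is why the third sentence outscores the first.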

## Select top sentences for summary

In [12]:
# select the 3 highest-scoring sentences for the summary
best_sentences = heapq.nlargest(3, sentence_score, key=sentence_score.get)
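
Iterating over a dictionary yields its keys, so `heapq.nlargest` with `key=sentence_score.get` returns the keys (sentences) with the highest values, best first. A minimal sketch with made-up scores; `toy_scores` and its labels are hypothetical:

```python
import heapq

# Hypothetical scores keyed by short labels instead of full sentences.
toy_scores = {"first": 1.5, "second": 0.5, "third": 2.0, "fourth": 1.0}

# Keys with the two largest values, highest score first.
top_two = heapq.nlargest(2, toy_scores, key=toy_scores.get)
print(top_two)  # ['third', 'first']

# Optional: restore the original (insertion) order of the selected keys.
in_document_order = sorted(top_two, key=list(toy_scores).index)
print(in_document_order)  # ['first', 'third']
```

The result is ordered by score rather than by position in the article. If the summary should read in document order, the selected sentences can be re-sorted by their original position, as the last two lines suggest.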

In [13]:
print('SUMMARY')
print('------------------------')
for sentence in best_sentences:
    print(sentence)
    print('\n')

SUMMARY
------------------------
With CEO Tim Cook’s vision of the potential of augmented reality, the company could also make use of AI-based tools like Vilynx.


With its habit of quietly purchasing smaller companies, Apple is making a mark in the AI space.


In an attempt to scale up its AI portfolio, Apple has acquired Spain-based AI video startup — Vilynx for approximately $50 million.


