The code flow to generate the summarized text:

Input article → split into sentences → remove stop words → build a similarity matrix → generate rank based on matrix → pick top N sentences for summary.

# nltk (cosine similarity + PageRank)

In [1]:
import nltk
from nltk.corpus import stopwords
from nltk.cluster.util import cosine_distance 
import numpy as np 
import networkx as nx

In [2]:
text = """In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub
          which has been launched to empower the next generation of students with AI-ready skills.
         Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100
          institutions with AI infrastructure, course content and curriculum, developer support,
          development tools and give students access to cloud and AI services.
          As part of the program, the Redmond giant which wants to expand its reach and is
          planning to build a strong developer ecosystem in India with the program will set up the
          core AI infrastructure and IoT Hub for the selected campuses.
          The company will provide AI development tools and Azure AI services such as
          Microsoft Cognitive Services, Bot Services and Azure Machine Learning.
          According to Manish Prakash, Country General Manager-PS, Health and Education,
          Microsoft India, said, "With AI being the defining technology of our time,
          it is transforming lives and industry and the jobs of tomorrow will
          require a different skillset. This will require more collaborations and
          training and working with AI. That’s why it has become more critical than ever for
          educational institutions to integrate new cloud and AI technologies.
          The program is an attempt to ramp up the institutional set-up and build
          capabilities among the educators to educate the workforce of tomorrow."
          The program aims to build up the cognitive skills and in-depth understanding of
          developing intelligent cloud connected solutions for applications across industry.
          Earlier in April this year, the company announced Microsoft Professional
          Program In AI as a learning track open to the public.
          The program was developed to provide job ready skills to programmers who wanted to hone their
          skills in AI and data science with a series of online courses which featured hands-on labs and expert instructors as well.
          This program also included developer-focused AI school that provided a bunch of assets to help build AI skills."""


In [3]:
# collapse runs of whitespace (including newlines) into single spaces
text = ' '.join(text.split())


In [4]:
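def tokenize(text):
    """Split text into sentences and return each sentence as a list of word tokens."""
    sentences = nltk.sent_tokenize(text)
    tokenized_sentences = [nltk.word_tokenize(sent) for sent in sentences]

    return tokenized_sentences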

In [5]:
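def sentence_similarity(sent1, sent2, stop_words=None):
    """Return the cosine similarity of two tokenized sentences (lists of words)."""
    if stop_words is None:
        stop_words = []
    sent1 = [w.lower() for w in sent1]
    sent2 = [w.lower() for w in sent2]

    all_words = list(set(sent1 + sent2))

    vector1 = [0] * len(all_words)
    vector2 = [0] * len(all_words)

    # build the count vector for the first sentence
    for w in sent1:
        if w not in stop_words:
            vector1[all_words.index(w)] += 1

    # build the count vector for the second sentence
    for w in sent2:
        if w not in stop_words:
            vector2[all_words.index(w)] += 1

    # a sentence made only of stop words gives a zero vector, for which
    # cosine distance is undefined; treat it as having no similarity
    if not any(vector1) or not any(vector2):
        return 0.0

    return 1 - cosine_distance(vector1, vector2)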
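
To make the similarity measure concrete, here is a minimal standard-library sketch of cosine similarity over word-count vectors. The function name `bow_cosine_similarity` and the toy sentences are illustrative assumptions, not part of the notebook; the notebook itself relies on `nltk.cluster.util.cosine_distance`.

```python
import math
from collections import Counter

def bow_cosine_similarity(tokens1, tokens2, stop_words=()):
    """Cosine similarity between two token lists, using raw word counts."""
    stop = {w.lower() for w in stop_words}
    counts1 = Counter(w.lower() for w in tokens1 if w.lower() not in stop)
    counts2 = Counter(w.lower() for w in tokens2 if w.lower() not in stop)

    # Only words present in both sentences contribute to the dot product.
    dot = sum(counts1[w] * counts2[w] for w in counts1.keys() & counts2.keys())
    norm1 = math.sqrt(sum(c * c for c in counts1.values()))
    norm2 = math.sqrt(sum(c * c for c in counts2.values()))

    # A sentence with no remaining words has a zero vector; report no similarity.
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)

# Hypothetical toy sentences, already tokenized.
a = ["The", "cloud", "hub", "supports", "students"]
b = ["Students", "use", "the", "cloud"]
similarity = bow_cosine_similarity(a, b, stop_words=["the"])
print(similarity)
```

Two sentences sharing every non-stop word in the same proportions score 1.0, and sentences with no words in common score 0.0.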
        

In [6]:
def build_similarity_matrix(sentences, stop_words):
    # create an empty similarity matrix
    similarity_matrix = np.zeros((len(sentences), len(sentences)))
    for i in range(len(sentences)):
        for j in range(len(sentences)):
            if i == j:  # skip comparing a sentence with itself
                continue
            similarity_matrix[i][j] = sentence_similarity(sentences[i], sentences[j], stop_words)

    return similarity_matrix
            
    

In [7]:
stop_words = stopwords.words('english')
summarize_text = []

# Step-by-step run of the pipeline (wrapped into a function in the next cell)
sentences = tokenize(text)
sentence_similarity_matrix = build_similarity_matrix(sentences, stop_words)
sentence_similarity_graph = nx.from_numpy_array(sentence_similarity_matrix)
scores = nx.pagerank(sentence_similarity_graph)
ranked_sentence = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)
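
The ranking step may be easier to follow with a small power-iteration sketch of PageRank over a weight matrix. This is a simplified illustration under stated assumptions (plain lists instead of a graph object, a hypothetical `pagerank_scores` helper, a toy 3x3 matrix), not the code that `nx.pagerank` actually runs.

```python
def pagerank_scores(matrix, damping=0.85, iterations=100, tol=1e-6):
    """Power-iteration PageRank over a non-negative weight matrix (list of lists)."""
    n = len(matrix)
    row_sums = [sum(row) for row in matrix]
    scores = [1.0 / n] * n

    for _ in range(iterations):
        # Score held by nodes without outgoing weight is shared equally.
        dangling = sum(scores[i] for i in range(n) if row_sums[i] == 0)
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * matrix[i][j] / row_sums[i]
                for i in range(n)
                if row_sums[i] > 0
            )
            new_scores.append((1 - damping) / n + damping * (incoming + dangling / n))
        converged = sum(abs(a - b) for a, b in zip(new_scores, scores)) < tol
        scores = new_scores
        if converged:
            break
    return scores

# Toy similarity matrix: sentences 0 and 1 are close, sentence 2 is an outlier.
toy = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
toy_scores = pagerank_scores(toy)
print(toy_scores)
```

Sentences that are similar to many other sentences collect more score, which is why the outlier in the toy matrix ends up ranked last.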


In [8]:
def generate_summary(text, top_n=5):
    stop_words = stopwords.words('english')
    summarize_text = []

    # Step 1 - tokenize the text into sentences, each a list of words
    sentences = tokenize(text)

    # Step 2 - generate the similarity matrix across sentences
    sentence_similarity_matrix = build_similarity_matrix(sentences, stop_words)

    # Step 3 - rank sentences with PageRank on the similarity graph
    sentence_similarity_graph = nx.from_numpy_array(sentence_similarity_matrix)
    scores = nx.pagerank(sentence_similarity_graph)

    # Step 4 - sort by rank and pick the top sentences
    ranked_sentence = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)

    # never ask for more sentences than the text contains
    for i in range(min(top_n, len(ranked_sentence))):
        summarize_text.append(" ".join(ranked_sentence[i][1]))

    # Step 5 - output the summarized text
    return "".join(summarize_text)


In [9]:
summarize_text = generate_summary(text)
print("Summarize Text: \n",summarize_text)

Summarize Text: 
 Envisioned as a three-year collaborative program , Intelligent Cloud Hub will support around 100 institutions with AI infrastructure , course content and curriculum , developer support , development tools and give students access to cloud and AI services .This program also included developer-focused AI school that provided a bunch of assets to help build AI skills .Earlier in April this year , the company announced Microsoft Professional Program In AI as a learning track open to the public .As part of the program , the Redmond giant which wants to expand its reach and is planning to build a strong developer ecosystem in India with the program will set up the core AI infrastructure and IoT Hub for the selected campuses .The company will provide AI development tools and Azure AI services such as Microsoft Cognitive Services , Bot Services and Azure Machine Learning .


In [10]:
print("length Before {} \nlength After {}  ".format(len(text) , len(summarize_text)))

length Before 2016 
length After 877  


# spacy (Ranking)

In [11]:
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

In [12]:
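spacy_stopwords = list(STOP_WORDS)  # separate name so the nltk `stopwords` import is not overwritten


In [13]:
nlp = spacy.load('en_core_web_sm')

In [14]:
doc = nlp(text)

In [15]:
tokens = [token.text for token in doc]

In [16]:
punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [17]:
# get the frequency for each word
# note: keys keep the original casing, while the sentence-scoring step looks up
# lowercased text, so a word that only appears capitalized (e.g. "AI") does not
# contribute to the sentence scores
word_frequencies = {}
for word in doc:
    if word.text.lower() not in spacy_stopwords:
        if word.text.lower() not in punctuation:
            if word.text not in word_frequencies.keys():
                word_frequencies[word.text] = 1
            else:
                word_frequencies[word.text] += 1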

In [18]:
max_frequency = max(word_frequencies.values())
max_frequency

14

In [19]:
# normalize frequencies to the range (0, 1]
for word in word_frequencies.keys():
    word_frequencies[word] = word_frequencies[word] / max_frequency
    

In [20]:
sent_tokens = list(doc.sents)


In [21]:
# get score for each sentence
sentence_score = {}
for sent in sent_tokens:
    for word in sent:
        if word.text.lower() in word_frequencies.keys():
            if sent not in sentence_score.keys():
                sentence_score[sent] = word_frequencies[word.text.lower()]
            else:
                sentence_score[sent] += word_frequencies[word.text.lower()] 
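
The frequency-based scoring above can also be sketched with the standard library alone. This version lowercases words both when counting and when looking them up. It is a hedged illustration: the regex tokenizer stands in for spaCy, and `score_sentences` and the demo sentences are hypothetical names and data, so its scores will not match the notebook's.

```python
import re
from collections import Counter

def score_sentences(sentences, stop_words):
    """Score each sentence by the summed normalized frequency of its words."""
    stop = {w.lower() for w in stop_words}
    # A simple regex tokenizer stands in for spaCy here (an assumption).
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]

    freq = Counter(w for words in tokenized for w in words if w not in stop)
    if not freq:
        return {s: 0.0 for s in sentences}
    max_freq = max(freq.values())

    # Keys are lowercase both when counting and when looking up.
    return {
        s: sum(freq[w] / max_freq for w in words if w in freq)
        for s, words in zip(sentences, tokenized)
    }

demo = [
    "AI tools help students.",
    "Students learn AI skills with AI tools.",
    "The weather is nice.",
]
demo_scores = score_sentences(demo, stop_words=["the", "is", "with"])
best = max(demo_scores, key=demo_scores.get)
print(best)
```

Because scores are sums over words, longer sentences tend to score higher; dividing by sentence length is one possible adjustment if that bias is unwanted.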

In [22]:
from heapq import nlargest

In [23]:
# keep 30% of the sentences for the summary
select_length = int(len(sent_tokens)*0.3)
select_length

3

In [24]:
summary = nlargest(select_length , sentence_score, key =sentence_score.get)
summary

[Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services.,
 The program was developed to provide job ready skills to programmers who wanted to hone their skills in AI and data science with a series of online courses which featured hands-on labs and expert instructors as well.,
 As part of the program, the Redmond giant which wants to expand its reach and is planning to build a strong developer ecosystem in India with the program will set up the core AI infrastructure and IoT Hub for the selected campuses.]
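
As the output shows, `nlargest` returns the dictionary keys ordered by score, not by position in the text. A minimal sketch with made-up scores (the names and values below are assumptions for illustration) shows the call and one way to restore document order afterwards:

```python
from heapq import nlargest

# Hypothetical sentence scores, keyed in document order.
toy_scores = {"first": 0.7, "second": 0.4, "third": 0.9, "fourth": 0.1}
position = {sentence: i for i, sentence in enumerate(toy_scores)}

# Iterating a dict yields its keys, so the result is a list of keys.
top = nlargest(2, toy_scores, key=toy_scores.get)
print(top)  # ['third', 'first']

# Optional: present the selected sentences in their original order.
in_document_order = sorted(top, key=position.get)
print(in_document_order)  # ['first', 'third']
```

Restoring the original order is a design choice; it often reads more naturally than a summary ordered by score.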

In [25]:
# each item in `summary` is a sentence span, so join their texts
final = [sent.text for sent in summary]
summary = ' '.join(final)

### review results

In [26]:
print(summary)
print('-'*50)
print("length Before {} \nlength After {}  ".format(len(text) , len(summary)))

Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services. The program was developed to provide job ready skills to programmers who wanted to hone their skills in AI and data science with a series of online courses which featured hands-on labs and expert instructors as well. As part of the program, the Redmond giant which wants to expand its reach and is planning to build a strong developer ecosystem in India with the program will set up the core AI infrastructure and IoT Hub for the selected campuses.
--------------------------------------------------
length Before 2016 
length After 700  


# gensim (built-in function)

In [27]:
# the summarization module was removed in gensim 4.0, so this import needs gensim < 4.0
from gensim.summarization.summarizer import summarize

In [28]:
summary = summarize(text)  # default ratio=0.2
print(summary)
print('-'*50)
print("length Before {} \nlength After {}  ".format(len(text) , len(summary)))

In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills.
Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services.
--------------------------------------------------
length Before 2016 
length After 428  
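
Since `gensim.summarization` is only available in gensim releases before 4.0, a guarded import can keep the rest of a notebook usable when a newer gensim is installed. This is a sketch under that assumption; `summarize_if_available` is a hypothetical helper, not a gensim function.

```python
try:
    # Present in gensim 3.x; the module was removed in gensim 4.0.
    from gensim.summarization.summarizer import summarize
except ImportError:
    summarize = None

def summarize_if_available(text, ratio=0.2):
    """Return a gensim summary, or None when the summarizer is not installed."""
    if summarize is None:
        return None
    return summarize(text, ratio=ratio)
```

Returning `None` lets the caller fall back to one of the other two approaches in this notebook.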


### gensim ratio 40%

In [29]:
summary = summarize(text , ratio=0.4)
print(summary)
print('-'*50)
print("length Before {} \nlength After {}  ".format(len(text) , len(summary)))

In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills.
Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services.
According to Manish Prakash, Country General Manager-PS, Health and Education, Microsoft India, said, "With AI being the defining technology of our time, it is transforming lives and industry and the jobs of tomorrow will require a different skillset.
The program is an attempt to ramp up the institutional set-up and build capabilities among the educators to educate the workforce of tomorrow." The program aims to build up the cognitive skills and in-depth understanding of developing intelligent cloud connected solutions for applications across industry.
------------

### gensim ratio 35%

In [30]:
summary = summarize(text ,ratio=0.35)
print(summary)
print('-'*50)
print("length Before {} \nlength After {}  ".format(len(text) , len(summary)))

In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills.
Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services.
The program is an attempt to ramp up the institutional set-up and build capabilities among the educators to educate the workforce of tomorrow." The program aims to build up the cognitive skills and in-depth understanding of developing intelligent cloud connected solutions for applications across industry.
--------------------------------------------------
length Before 2016 
length After 735  
