# 1. Create the word frequency table
We create a dictionary that maps each word in the text to the number of times it occurs.
Only words that are not in the stopWords set are counted.

In [1]:
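from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

def _create_frequency_table(text_string) -> dict:
    """Map each stemmed token that is not a stop word to its count."""
    stopWords = set(stopwords.words("english"))
    words = word_tokenize(text_string)
    ps = PorterStemmer()

    freqTable = dict()
    for word in words:
        # Stem first, then compare against the stop-word set.
        # Punctuation tokens are not filtered out here.
        word = ps.stem(word)
        if word in stopWords:
            continue
        freqTable[word] = freqTable.get(word, 0) + 1

    return freqTable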

# 2. Tokenize the sentences
Next, we split text_string into a list of sentences. For this, we use the built-in sent_tokenize function from NLTK.

In [None]:
sent_tokenize(text)
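
To get a feel for what sentence splitting involves, here is a naive regex-based splitter. It is only an approximation assumed for illustration (the name `naive_sentence_split` is hypothetical) and is not how `sent_tokenize` works internally. It splits after `.`, `!` or `?` only when whitespace follows, so a missing space after a period, as in `Learning.According` in the sample text used later, keeps two sentences joined. The sample string is a shortened adaptation of that text.

```python
import re

def naive_sentence_split(text: str) -> list:
    """Split after ., ! or ? when followed by whitespace (rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

sample = (
    "The company will provide Azure Machine Learning.According to Manish "
    "Prakash, AI is transforming lives. This will require more training."
)
sentences = naive_sentence_split(sample)
for s in sentences:
    print(repr(s))
```

The sample yields two sentences rather than three, because there is no whitespace between "Learning." and "According".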

# 3. Score the sentences: Term frequency
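We use the term frequency method to score each sentence. A sentence's score is the sum of the frequencies of the non-stop words it contains, divided by the number of words in the sentence so that long sentences do not gain an advantage from length alone.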

In [2]:
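def _score_sentences(sentences, freqTable) -> dict:
    """Score each sentence by the summed frequency of the words it contains."""
    sentenceValue = dict()

    for sentence in sentences:
        # The first 10 characters serve as the key, so sentences that
        # share them end up in the same entry.
        key = sentence[:10]
        word_count_in_sentence = len(word_tokenize(sentence))
        for wordValue in freqTable:
            # Substring match against the lower-cased sentence.
            if wordValue in sentence.lower():
                sentenceValue[key] = sentenceValue.get(key, 0) + freqTable[wordValue]

        # Skip sentences in which no table word was found (avoids a KeyError).
        if key in sentenceValue:
            # Normalize by sentence length; // keeps the score an integer.
            sentenceValue[key] = sentenceValue[key] // word_count_in_sentence

    return sentenceValue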
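
One side effect of keying scores by `sentence[:10]` is that sentences starting with the same ten characters share a single entry. The sketch below shows this with three shortened sentences modeled on the sample text, together with one possible alternative that keys by the sentence's position instead. `score_by_index` is a hypothetical helper written for illustration: it matches whole words rather than substrings, uses a simple regex tokenizer instead of `word_tokenize`, and keeps fractional scores. The frequency counts are toy values.

```python
import re

def score_by_index(sentences, freq_table) -> dict:
    """Average table frequency per word, keyed by sentence position."""
    scores = {}
    for i, sentence in enumerate(sentences):
        # Assumed tokenizer: lower-cased runs of letters, digits, ' and -.
        words = re.findall(r"[a-z0-9'-]+", sentence.lower())
        if not words:
            continue  # nothing to score; avoids division by zero
        total = sum(freq_table.get(w, 0) for w in words)
        scores[i] = total / len(words)
    return scores

sentences = [
    "The program is an attempt to build capabilities.",
    "The program aims to build cognitive skills.",
    "The program was developed to provide skills.",
]
# All three openings collapse to a single ten-character key.
prefixes = {s[:10] for s in sentences}
print(prefixes)

freq_table = {"program": 3, "build": 2, "skills": 2}  # toy counts
scores = score_by_index(sentences, freq_table)
print(scores)
```

With positional keys, each of the three sentences keeps its own score.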

# 4. Find the threshold
Here, we are considering the average score of the sentences as a threshold.

In [3]:
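def _find_average_score(sentenceValue) -> int:
    """Return the average sentence score, truncated to an integer."""
    if not sentenceValue:
        return 0  # no scored sentences; avoids division by zero

    sumValues = 0
    for entry in sentenceValue:
        sumValues += sentenceValue[entry]

    # Average value of a sentence from original text
    average = int(sumValues / len(sentenceValue))

    return average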
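
As a worked example, the nine values in the sentence_scores output recorded further down add up to 13, and 13 / 9 is about 1.44, which `int()` truncates to 1. The snippet below repeats that arithmetic; the list is copied from that output.

```python
# Scores copied, in order, from the recorded sentence_scores output.
scores = [2, 2, 1, 1, 1, 1, 1, 2, 2]

total = sum(scores)
mean = total / len(scores)
threshold = int(mean)  # int() truncates toward zero

print(total, round(mean, 2), threshold)  # 13 1.44 1
```

This matches the threshold of 1 printed by the notebook.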

# 5. Generate the summary
A sentence is selected for the summary if its score is greater than the threshold.

In [4]:
def _generate_summary(sentences, sentenceValue, threshold):
    sentence_count = 0
    summary = ''

    for sentence in sentences:
        if sentence[:10] in sentenceValue and sentenceValue[sentence[:10]] > (threshold):
            summary += " " + sentence
            sentence_count += 1

    return summary
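
The effect of the cut-off can be checked against the scores recorded further down in this notebook. The sketch below uses a hypothetical helper, `keys_above`, that applies the same strict `>` comparison to the score dictionary only. Because the scores are whole numbers (1 or 2), multiplying the threshold by 1.0 or by 1.5 selects the same four entries here, while a factor of 2.0 selects none.

```python
# Scores copied from the sentence_scores output recorded in this notebook.
sentence_scores = {
    "In an atte": 2, "Envisioned": 2, "As part of": 1,
    "The compan": 1, "This will ": 1, "That’s why": 1,
    "The progra": 1, "Earlier in": 2, "This progr": 2,
}
threshold = 1  # the recorded average

def keys_above(scores: dict, cutoff: float) -> list:
    """Return the keys whose score is strictly greater than cutoff."""
    return [key for key, score in scores.items() if score > cutoff]

for factor in (1.0, 1.5, 2.0):
    selected = keys_above(sentence_scores, factor * threshold)
    print(factor, len(selected), selected)
```

The four keys selected at a factor of 1.5 correspond to the four sentences in the printed summary.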

# Let’s summarize(!) the entire algorithm

In [5]:
#text='The generative network generates candidates while the discriminative network evaluates them. The contest operates in terms of data distributions. Typically, the generative network learns to map from a latent space to a data distribution of interest, while the discriminative network distinguishes candidates produced by the generator from the true data distribution. The generative networks training objective is to increase the error rate of the discriminative network (i.e., "fool" the discriminator network by producing novel candidates that the discriminator thinks are not synthesized (are part of the true data distribution)).[1][5]A known dataset serves as the initial training data for the discriminator. Training it involves presenting it with samples from the training dataset, until it achieves acceptable accuracy. The generator trains based on whether it succeeds in fooling the discriminator. Typically the generator is seeded with randomized input that is sampled from a predefined latent space (e.g. a multivariate normal distribution). Thereafter, candidates synthesized by the generator are evaluated by the discriminator. Backpropagation is applied in both networks so that the generator produces better images, while the discriminator becomes more skilled at flagging synthetic images.[6] The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.GANs often suffer from a "mode collapse" where they fail to generalize properly, missing entire modes from the input data. For example, a GAN trained on the MNIST dataset containing many samples of each digit, might nevertheless timidly omit a subset of the digits from its output. Some researchers perceive the root problem to be a weak discriminative network that fails to notice the pattern of omission, while others assign blame to a bad choice of objective function. Many solutions have been proposed.[7]'
#text = text.apply(lambda x: " ".join([stemmer.stem(i) for i in re.sub("[^a-zA-Z]", " ", x).split() if i not in words]).lower())
text = 'In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills. Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services. As part of the program, the Redmond giant which wants to expand its reach and is planning to build a strong developer ecosystem in India with the program will set up the core AI infrastructure and IoT Hub for the selected campuses. The company will provide AI development tools and Azure AI services such as Microsoft Cognitive Services, Bot Services and Azure Machine Learning.According to Manish Prakash, Country General Manager-PS, Health and Education, Microsoft India, said, "With AI being the defining technology of our time, it is transforming lives and industry and the jobs of tomorrow will require a different skillset. This will require more collaborations and training and working with AI. That’s why it has become more critical than ever for educational institutions to integrate new cloud and AI technologies. The program is an attempt to ramp up the institutional set-up and build capabilities among the educators to educate the workforce of tomorrow." The program aims to build up the cognitive skills and in-depth understanding of developing intelligent cloud connected solutions for applications across industry. Earlier in April this year, the company announced Microsoft Professional Program In AI as a learning track open to the public. The program was developed to provide job ready skills to programmers who wanted to hone their skills in AI and data science with a series of online courses which featured hands-on labs and expert instructors as well. This program also included developer-focused AI school that provided a bunch of assets to help build AI skills.'



In [11]:
text

'In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills. Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services. As part of the program, the Redmond giant which wants to expand its reach and is planning to build a strong developer ecosystem in India with the program will set up the core AI infrastructure and IoT Hub for the selected campuses. The company will provide AI development tools and Azure AI services such as Microsoft Cognitive Services, Bot Services and Azure Machine Learning.According to Manish Prakash, Country General Manager-PS, Health and Education, Microsoft India, said, "With AI being the defining technology of our time, it is transforming lives and industry 

In [6]:
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize, sent_tokenize

# 1 Create the word frequency table
#text="the theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages."

freq_table = _create_frequency_table(text)
# 2 Tokenize the sentences
# NLTK already provides a sentence tokenizer, so we only need to call
# sent_tokenize() to get the list of sentences.
sentences = sent_tokenize(text)


# 3 Important Algorithm: score the sentences
sentence_scores = _score_sentences(sentences, freq_table)


# 4 Find the threshold
threshold = _find_average_score(sentence_scores)
print(threshold)


# 5 Important Algorithm: Generate the summary
# The cut-off passed here is 1.5 times the average score rather than
# the plain average.
summary = _generate_summary(sentences, sentence_scores, 1.5 * threshold)

print(summary)

1
 In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills. Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services. Earlier in April this year, the company announced Microsoft Professional Program In AI as a learning track open to the public. This program also included developer-focused AI school that provided a bunch of assets to help build AI skills.


In [7]:
threshold

1

In [14]:
sentence_scores

{'In an atte': 2,
 'Envisioned': 2,
 'As part of': 1,
 'The compan': 1,
 'This will ': 1,
 'That’s why': 1,
 'The progra': 1,
 'Earlier in': 2,
 'This progr': 2}

In [16]:
freq_table

{'In': 2,
 'attempt': 2,
 'build': 5,
 'ai-readi': 2,
 'workforc': 2,
 ',': 14,
 'microsoft': 4,
 'announc': 2,
 'intellig': 3,
 'cloud': 5,
 'hub': 3,
 'ha': 2,
 'launch': 1,
 'empow': 1,
 'next': 1,
 'gener': 2,
 'student': 2,
 'skill': 5,
 '.': 11,
 'envis': 1,
 'three-year': 1,
 'collabor': 2,
 'program': 8,
 'support': 2,
 'around': 1,
 '100': 1,
 'institut': 3,
 'AI': 12,
 'infrastructur': 2,
 'cours': 2,
 'content': 1,
 'curriculum': 1,
 'develop': 6,
 'tool': 2,
 'give': 1,
 'access': 1,
 'servic': 4,
 'As': 1,
 'part': 1,
 'redmond': 1,
 'giant': 1,
 'want': 2,
 'expand': 1,
 'reach': 1,
 'plan': 1,
 'strong': 1,
 'ecosystem': 1,
 'india': 2,
 'set': 1,
 'core': 1,
 'iot': 1,
 'select': 1,
 'campus': 1,
 'compani': 2,
 'provid': 3,
 'azur': 2,
 'cognit': 2,
 'bot': 1,
 'machin': 1,
 'learning.accord': 1,
 'manish': 1,
 'prakash': 1,
 'countri': 1,
 'manager-p': 1,
 'health': 1,
 'educ': 4,
 'said': 1,
 '``': 1,
 'defin': 1,
 'technolog': 2,
 'time': 1,
 'transform': 1,
 'live'