Steps for summarizing the biological strategies described in a research paper (extraction-based approach):
$\newline$ Using: https://becominghuman.ai/text-summarization-in-5-steps-using-nltk-65b21e352b65
$\newline$ 1. Import libraries needed.
$\newline$ 2. The string library helps with string operations, and heapq provides a quick function for finding the n largest items in a collection. The text variable stores the actual text to be summarized. In this notebook, it holds the abstract of a paper on harbor seal whiskers.
$\newline$ 3. The next lines introduce the variable length, which defines the length of the summary in number of sentences. If the length of the text isn’t known ahead of time, choosing a dynamic approach provides a lot of flexibility. To count the number of sentences in the text, the number of periods followed by a space is counted.
$\newline$ 4. The text is then processed to focus on the key words. The first two lines remove any punctuation in the text. The next line removes any stop words, which are short words that don’t provide much meaning to a sentence, such as “the” or “then.”
$\newline$ 5. Next, a dictionary is created. Every word in the text is stored in the dictionary as a key, with its number of occurrences as the value. This step effectively provides a list of the words in the text and the number of times they appear.
$\newline$ 6. To make the number of occurrences unitless, all occurrences are divided by the maximum number of occurrences. This gives all the words a score between 0 and 1, which ranks the importance of the word within the text. Words which appear more frequently are given a higher score and are presumed more important.
$\newline$ 7. The first line simply creates a list of all sentences in the text.
The next step is to create an empty dictionary that will assign each sentence a score. This is accomplished by going through each sentence word by word. For every word in the sentence, the word’s previously defined importance is looked up and added to the sentence’s score.
The more important words a sentence has, the higher the score that’s assigned to that sentence.
Note that this necessarily creates a bias toward longer sentences, which isn’t inherently a bad thing. In fact, returning longer sentences usually provides a little more context than shorter sentences, making the overall summary more comprehensible.
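
The length bias noted above can be seen on a toy example. The sketch below is illustrative only: the word scores are made-up values, and `sum_score` and `mean_score` are hypothetical helper names that do not appear in the notebook. `mean_score` shows one possible length-normalized alternative; it is not something the source tutorial uses.

```python
# Made-up word scores, in the 0-1 range produced by the normalization step
word_scores = {"whiskers": 1.0, "seal": 0.5, "vortex": 0.5}


def sum_score(sentence, scores):
    """Add up the scores of all scored words in the sentence."""
    return sum(scores.get(word, 0.0) for word in sentence.lower().split())


def mean_score(sentence, scores):
    """Hypothetical alternative: average score per word in the sentence."""
    words = sentence.lower().split()
    return sum_score(sentence, scores) / len(words) if words else 0.0


short_sentence = "seal whiskers"
long_sentence = "the seal keeps its whiskers still and the whiskers reduce vortex noise"

# Summing favors the long sentence; averaging favors the short one
print(sum_score(short_sentence, word_scores), sum_score(long_sentence, word_scores))
print(mean_score(short_sentence, word_scores), mean_score(long_sentence, word_scores))
```

Which behavior is preferable depends on the text; for a short abstract, the summed score used in this notebook tends to return the most information-dense sentence.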

In [1]:
import nltk
import string
from heapq import nlargest

In [2]:
text = "While scanning the water for these hydrodynamic signals at a swimming speed in the order of meters per second, the seal keeps its long and flexible whiskers in an abducted position, largely perpendicular to the swimming direction. Remarkably, the whiskers of harbor seals possess a specialized undulated surface structure, the function of which was, up to now, unknown. Here, we show that this structure effectively changes the vortex street behind the whiskers and reduces the vibrations that would otherwise be induced by the shedding of vortices from the whiskers (vortex-induced vibrations). Using force measurements, flow measurements and numerical simulations, we find that the dynamic forces on harbor seal whiskers are, by at least an order of magnitude, lower than those on sea lion (Zalophus californianus) whiskers, which do not share the undulated structure. The results are discussed in the light of pinniped sensory biology and potential biomimetic applications."
# For long texts (more than 150 sentences), use about one tenth of the sentences
if text.count(". ") > 150:
    length = int(round(text.count(". ")/10, 0))
# Otherwise return a single sentence
else:
    length = 1
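
A minimal sketch of the same length rule wrapped in a function, assuming the values used in the cell above (a cutoff of 150 sentences and a one-in-ten ratio). The name `summary_length` and its parameters are hypothetical and only make the rule easier to reuse and test.

```python
def summary_length(text, threshold=150, ratio=10, default=1):
    """Return the number of sentences to keep in the summary.

    Sentences are approximated by counting periods followed by a space.
    """
    sentence_count = text.count(". ")
    if sentence_count > threshold:
        # Long text: keep roughly one sentence in every `ratio`
        return int(round(sentence_count / ratio))
    # Short text: fall back to a fixed number of sentences
    return default


print(summary_length("One. Two. Three."))  # short text, falls back to the default
print(summary_length("A sentence. " * 200))  # long text, scales with the count
```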

In [4]:
# Remove punctuation and stopwords:
rmvp = [char for char in text if char not in string.punctuation]
rmvp = ''.join(rmvp)
# Build the stop-word set once instead of reloading the list for every word
stop_words = set(nltk.corpus.stopwords.words('english'))
new_text = [word for word in rmvp.split() if word.lower() not in stop_words]
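
The cleaning step can also be tried without downloading the NLTK corpus. The sketch below is an assumption-laden stand-in: `TINY_STOP_WORDS` is a small illustrative subset, not NLTK's English stop-word list, and `clean_words` is a hypothetical helper name.

```python
import string

# Illustrative subset only; NLTK's English stop-word list is much longer
TINY_STOP_WORDS = {"the", "of", "a", "and", "in", "to", "is"}


def clean_words(text, stop_words=TINY_STOP_WORDS):
    """Strip punctuation, then drop stop words (compared case-insensitively)."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return [word for word in no_punct.split() if word.lower() not in stop_words]


print(clean_words("The whiskers of the seal, remarkably, reduce vibrations."))
# → ['whiskers', 'seal', 'remarkably', 'reduce', 'vibrations']
```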

In [5]:
word_frequency = {}

for word in new_text:
    if word not in word_frequency:
        word_frequency[word] = 1
    else:
        word_frequency[word] = word_frequency[word] + 1

In [7]:
maxfreq = max(word_frequency.values())
for word in word_frequency.keys():
    word_frequency[word] = (word_frequency[word]/maxfreq)
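
The counting and normalization cells can be written more compactly with `collections.Counter` from the standard library. This is a sketch on a made-up word list, not a replacement for the cells above; the variable names are chosen for illustration.

```python
from collections import Counter

# Made-up word list standing in for the cleaned text
words = ["whiskers", "seal", "whiskers", "vortex", "whiskers", "seal"]

# Count occurrences of each word
counts = Counter(words)

# Divide every count by the largest count so scores fall between 0 and 1
max_count = max(counts.values())
scores = {word: count / max_count for word, count in counts.items()}

print(counts.most_common(1))  # the most frequent word and its raw count
print(scores["whiskers"])     # the most frequent word always scores 1.0
```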

In [10]:
# Create a list of the sentences in the text
slist = nltk.sent_tokenize(text)
# Create an empty dictionary to store sentence scores
sscore = {}
for sent in slist:
    for word in nltk.word_tokenize(sent.lower()):
        if word in word_frequency.keys():
            if sent not in sscore.keys():
                sscore[sent] = word_frequency[word]
            else:
                sscore[sent] = sscore[sent] + word_frequency[word]

In [11]:
summary = nlargest(length, sscore, key = sscore.get)
summ = ' '.join(summary)
print(summ)

Using force measurements, flow measurements and numerical simulations, we find that the dynamic forces on harbor seal whiskers are, by at least an order of magnitude, lower than those on sea lion (Zalophus californianus) whiskers, which do not share the undulated structure.
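
How `nlargest` picks the summary sentences is easiest to see on a small dictionary. In the sketch below the sentence scores are made-up values; passing the dictionary iterates over its keys, and `key=sentence_scores.get` ranks those keys by their stored score.

```python
from heapq import nlargest

# Made-up sentence scores for illustration
sentence_scores = {
    "Sentence A.": 0.4,
    "Sentence B.": 1.2,
    "Sentence C.": 0.9,
}

# Keys with the two highest scores, highest first
top_two = nlargest(2, sentence_scores, key=sentence_scores.get)
print(top_two)  # → ['Sentence B.', 'Sentence C.']
```

Note that the result is ordered by score, not by position in the text, so a multi-sentence summary may not follow the original sentence order.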


In [21]:
######### TRYING ANOTHER METHOD #############
#https://towardsdatascience.com/understand-text-summarization-and-create-your-own-summarizer-in-python-b26a9f09fc70

In [16]:
from nltk.corpus import stopwords
from nltk.cluster.util import cosine_distance
import numpy as np
import networkx as nx

In [17]:
def read_article(file_name):
    # Read the first line of the file and split it into sentences
    with open(file_name, "r") as file:
        filedata = file.readlines()
    article = filedata[0].split(". ")
    sentences = []

    for sentence in article:
        print(sentence)
        # Tokenize the sentence on spaces (punctuation stays attached to words)
        sentences.append(sentence.split(" "))
    # Drop the last fragment produced by the split
    sentences.pop()

    return sentences

In [18]:
def sentence_similarity(sent1, sent2, stopwords=None):
    if stopwords is None:
        stopwords = []
 
    sent1 = [w.lower() for w in sent1]
    sent2 = [w.lower() for w in sent2]
 
    all_words = list(set(sent1 + sent2))
 
    vector1 = [0] * len(all_words)
    vector2 = [0] * len(all_words)
 
    # build the vector for the first sentence
    for w in sent1:
        if w in stopwords:
            continue
        vector1[all_words.index(w)] += 1
 
    # build the vector for the second sentence
    for w in sent2:
        if w in stopwords:
            continue
        vector2[all_words.index(w)] += 1
 
    return 1 - cosine_distance(vector1, vector2)
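
Since `sentence_similarity` returns one minus the cosine distance, its result is the cosine similarity of the two word-count vectors. The standard-library sketch below works through that calculation on two made-up token lists; `cosine_similarity` is a hypothetical helper and does not handle stop words.

```python
import math


def cosine_similarity(tokens1, tokens2):
    """Cosine similarity between the word-count vectors of two token lists."""
    vocabulary = sorted(set(tokens1) | set(tokens2))
    vector1 = [tokens1.count(word) for word in vocabulary]
    vector2 = [tokens2.count(word) for word in vocabulary]

    dot = sum(a * b for a, b in zip(vector1, vector2))
    norm1 = math.sqrt(sum(a * a for a in vector1))
    norm2 = math.sqrt(sum(b * b for b in vector2))
    if norm1 == 0 or norm2 == 0:
        # An empty sentence has no direction; treat it as dissimilar
        return 0.0
    return dot / (norm1 * norm2)


first = ["seal", "whiskers", "reduce", "vibrations"]
second = ["seal", "whiskers", "sense", "flow"]

# Two of four words are shared: dot = 2, both norms = 2, so 2 / (2 * 2)
print(cosine_similarity(first, second))  # → 0.5
```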

In [19]:
def build_similarity_matrix(sentences, stop_words):
    # Create an empty similarity matrix
    similarity_matrix = np.zeros((len(sentences), len(sentences)))
 
    for idx1 in range(len(sentences)):
        for idx2 in range(len(sentences)):
            if idx1 == idx2:  # skip comparing a sentence with itself
                continue 
            similarity_matrix[idx1][idx2] = sentence_similarity(sentences[idx1], sentences[idx2], stop_words)

    return similarity_matrix
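
`generate_summary` later turns this matrix into a graph and ranks the sentences with `nx.pagerank`. To give a feel for what that ranking does, here is a simplified power-iteration sketch in plain Python. It is an illustration under stated assumptions (fixed iteration count, damping of 0.85, made-up similarity values), not NetworkX's implementation, and `simple_pagerank` is a hypothetical name.

```python
def simple_pagerank(matrix, damping=0.85, iterations=100):
    """Rank nodes of a weighted graph given as a square similarity matrix."""
    n = len(matrix)
    ranks = [1.0 / n] * n
    # Row sums turn similarities into outgoing transition probabilities
    row_sums = [sum(row) for row in matrix]

    for _ in range(iterations):
        new_ranks = []
        for j in range(n):
            incoming = 0.0
            for i in range(n):
                if row_sums[i] > 0:
                    incoming += ranks[i] * matrix[i][j] / row_sums[i]
                else:
                    # A sentence similar to nothing spreads its rank evenly
                    incoming += ranks[i] / n
            new_ranks.append((1 - damping) / n + damping * incoming)
        ranks = new_ranks
    return ranks


# Made-up similarities: sentences 0 and 1 are close, sentence 2 is an outlier
similarities = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
ranks = simple_pagerank(similarities)
print(ranks)
```

Sentences that are similar to many other sentences accumulate rank, which is why the two closely related sentences end up above the outlier.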

In [20]:
def generate_summary(file_name, top_n=5):
    stop_words = stopwords.words('english')
    summarize_text = []

    # Step 1 - Read text and split it
    sentences = read_article(file_name)

    # Step 2 - Generate similarity matrix across sentences
    sentence_similarity_matrix = build_similarity_matrix(sentences, stop_words)

    # Step 3 - Rank sentences in similarity matrix
    sentence_similarity_graph = nx.from_numpy_array(sentence_similarity_matrix)
    scores = nx.pagerank(sentence_similarity_graph)

    # Step 4 - Sort the rank and pick top sentences
    ranked_sentence = sorted(((scores[i],s) for i,s in enumerate(sentences)), reverse=True)
    print("Indexes of top ranked_sentence order are ", ranked_sentence)

    for i in range(top_n):
        summarize_text.append(" ".join(ranked_sentence[i][1]))

    # Step 5 - Output the summarized text
    print("Summarize Text: \n", ". ".join(summarize_text))

# let's begin
generate_summary( "Abstract_textextraction.txt", 2)

While scanning the water for these hydrodynamic signals at a swimming speed in the order of meters per second, the seal keeps its long and flexible whiskers in an abducted position, largely perpendicular to the swimming direction
Remarkably, the whiskers of harbor seals possess a specialized undulated surface structure, the function of which was, up to now, unknown
Here, we show that this structure effectively changes the vortex street behind the whiskers and reduces the vibrations that would otherwise be induced by the shedding of vortices from the whiskers (vortex-induced vibrations)
Using force measurements, flow measurements and numerical simulations, we find that the dynamic forces on harbor seal whiskers are, by at least an order of magnitude, lower than those on sea lion (Zalophus californianus) whiskers, which do not share the undulated structure
The results are discussed in the light of pinniped sensory biology and potential biomimetic applications.
Indexes of top ranked_sente