## Importing Packages
### Dependencies
#### 1. Requires nltk, pandas, numpy, networkx, and scikit-learn
#### 2. Requires the GloVe vectors: run !wget http://nlp.stanford.edu/data/glove.6B.zip, then !unzip glove*.zip
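
Before running the cells, it can help to confirm that the dependencies listed above are present. The following is a minimal sketch, not part of the original notebook: the helper name `missing_packages` is hypothetical, and the GloVe file is assumed to sit in the working directory after unzipping.

```python
import importlib.util
import os


def missing_packages(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by this notebook (scikit-learn is imported as "sklearn").
required = ["nltk", "pandas", "numpy", "networkx", "sklearn"]
print("Missing packages:", missing_packages(required))

# Assumed location: the unzipped GloVe file sits in the working directory.
print("GloVe file present:", os.path.exists("glove.6B.100d.txt"))
```

An empty list and `True` suggest the environment is ready; anything else points at the package or download step to repeat.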

In [1]:
import nltk
nltk.download('punkt') # one time execution
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.stem.porter import PorterStemmer
from nltk.tokenize import sent_tokenize
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
import pandas as pd
import networkx as nx

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\vishw\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\vishw\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


## Extraction

In [2]:
df = pd.read_pickle("./df.pkl")

## Word Embeddings

In [3]:
# Extract word vectors: each line holds a token followed by its 100 coefficients
word_embeddings = {}
with open("glove.6B.100d.txt", encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        word_embeddings[word] = coefs
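
The loading loop above relies on the GloVe text format: one token per line, followed by its space-separated coefficients. As a rough illustration of that format, the sketch below parses a made-up three-dimensional sample and skips lines with the wrong number of fields. The function name `parse_glove_lines` and the sample data are assumptions for illustration only, and plain Python lists stand in for the NumPy arrays used in the notebook.

```python
import io


def parse_glove_lines(lines, dim):
    """Map each token to a list of `dim` floats, skipping malformed lines."""
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # wrong number of fields for the expected dimension
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


# Made-up 3-dimensional sample standing in for glove.6B.100d.txt.
sample = io.StringIO("virus 0.1 -0.2 0.3\nvaccine 0.4 0.5 -0.6\nbroken 1.0\n")
vectors = parse_glove_lines(sample, dim=3)
print(sorted(vectors))
```

Checking the field count up front makes a truncated or corrupted download fail visibly instead of producing vectors of mixed length.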

## Preprocessing, Similarity Matrix, TextRank

In [4]:
def rank_sentence(sentences):
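    # replace punctuation, digits and special characters with spaces
    # (regex=True is required: pandas >= 2.0 treats the pattern as a literal string by default)
    clean_sentences = pd.Series(sentences).str.replace("[^a-zA-Z]", " ", regex=True)
    
    # convert to lowercase
    clean_sentences = [s.lower() for s in clean_sentences]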
    
    # build the stop-word set once so membership tests are fast
    stop_words = set(stopwords.words('english'))
    
    # function to remove stopwords from a tokenized sentence
    def remove_stopwords(sen):
        sen_new = " ".join([i for i in sen if i not in stop_words])
        return sen_new
    
    # remove stopwords from the sentences
    clean_sentences = [remove_stopwords(r.split()) for r in clean_sentences]
    
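    # sentence vector = average of its word vectors (zeros for out-of-vocabulary words);
    # the 0.001 in the denominator guards against division by zero
    sentence_vectors = []
    for i in clean_sentences:
        if len(i) != 0:
            v = sum([word_embeddings.get(w, np.zeros((100,))) for w in i.split()])/(len(i.split())+0.001)
        else:
            v = np.zeros((100,))
        sentence_vectors.append(v)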
        
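    # similarity matrix: pairwise cosine similarity between sentence vectors
    sim_mat = np.zeros([len(sentences), len(sentences)])
    
    for i in range(len(sentences)):
        for j in range(len(sentences)):
            if i != j:
                sim_mat[i][j] = cosine_similarity(sentence_vectors[i].reshape(1,100), sentence_vectors[j].reshape(1,100))[0,0]
    
    # build a graph from the similarity matrix and score each sentence by centrality
    # note: weight defaults to None here, so the similarity values are ignored and only
    # the presence of an edge counts; pass weight='weight' to take them into account
    nx_graph = nx.from_numpy_array(sim_mat)
    scores = nx.eigenvector_centrality(nx_graph, max_iter = 100)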
    
    return scores
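
The function above scores sentences with `nx.eigenvector_centrality`. TextRank is usually described as PageRank run on the weighted similarity graph, so the sketch below shows that variant as a plain-Python power iteration on a toy matrix. It is a swapped-in illustration rather than the notebook's method: the name `textrank_scores`, the damping factor of 0.85 and the toy similarities are all assumptions, and the similarities are assumed to be non-negative.

```python
def textrank_scores(sim, damping=0.85, iterations=100, tol=1e-8):
    """Weighted PageRank by power iteration on a non-negative similarity matrix."""
    n = len(sim)
    if n == 0:
        return []
    # Total outgoing weight per sentence, used to normalise each row.
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Score mass of sentences with no similar neighbour is spread uniformly.
        dangling = sum(scores[j] for j in range(n) if row_sums[j] == 0)
        new_scores = []
        for i in range(n):
            incoming = sum(
                scores[j] * sim[j][i] / row_sums[j]
                for j in range(n)
                if row_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * (incoming + dangling / n))
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy similarity matrix: sentences 0 and 1 are close, sentence 2 is an outlier.
toy_sim = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]
toy_scores = textrank_scores(toy_sim)
best = max(range(len(toy_scores)), key=toy_scores.__getitem__)
print(best, toy_scores)
```

Because the edge weights enter the update, the sentence most similar to the others ends up with the highest score. When scores are tied or nearly tied, sorting `(score, sentence)` tuples falls back to comparing the sentence text, so a weighted score is what makes the ranking reflect similarity rather than alphabetical order.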

## One Line Summary

In [5]:
for cnt, text in enumerate(df['text'], start=1):
    sentences = sent_tokenize(text)
    scores = rank_sentence(sentences)

    # sort (score, sentence) pairs so the highest-scoring sentence comes first
    ranked_sentences = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)
    print(cnt, " : ", ranked_sentences[0][1])

1  :  This has implications for the estimated transmissibility of the coronavirus and as such, these potential scenarios should be explored.
2  :  We designed and screened a group of CRISPR RNAs (crRNAs) targeting conserved viral regions and identified functional crRNAs for cleaving SARS-CoV-2.
3  :  There is no vaccine or approved treatment for this emerging infection; therefore, the objective of this paper is to design a multi epitope peptide vaccine against 2019-nCoV using immunoinformatics approach.
4  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
5  :  There is an imminent need to better understand this new virus and to develop ways to control its spread.
6  :  There is an imminent need to better understand this new virus and to develop ways to control its spread.
7  :  Yes

I understand that all clinical trials and any other prospective interventional s

54  :  was supported by NKFIH KKP 129877.
55  :  about $1.95$ billion people, taking approximately $85$ million lives.
56  :  The number of infections and the number of fatalities in the 2019 novel coronavirus epidemics follows a remarkably regular trend.
57  :  We postulate that the versatility of cell receptor binding strategies has immediate implications on therapeutic strategies.
58  :  Where quarantine is deemed necessary, officials should quarantine for no longer than necessary; provide clear rationale for quarantine and information about protocols; and ensure sufficient supplies are provided.
59  :  Yes

The protocol has been released as a white paper online.
60  :  Yes

The metereological data, models, or code generated or used during the study are available from the corresponding author (Jing Chen) by request.
61  :  We compared and contrasted the Mpro for COVID-19 with a highly similar SARS protein.
62  :  Yes

The data that support the findings of this study are openly avail

107  :  Yes

The datasets used and analysed during the current study are available from the corresponding author on reasonable request
108  :  The travel quarantine of Wuhan delayed the overall epidemic progression by only 3 to 5 days in Mainland China, but has a more marked effect at the international scale, where case importations were reduced by nearly 80% until mid February.
109  :  Yes The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.
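110  :  Novel coronavirus pneumonia is epidemic and spreading in the Wuhan area of China, and some cases require extracorporeal membrane oxygenation (ECMO) support. To guide the use of ECMO in the treatment of novel coronavirus pneumonia, the Extracorporeal Life Support Professional Committee of the Chinese Medical Doctor Association organized domestic experts to draw up recommendations on the timing of ECMO and the choice of support mode. These recommendations are based on previous ECMO clinical research and on the guidance of the international Extracorporeal Life Support Organization, and take the characteristics of novel coronavirus pneumonia into account when recommending an ECMO mode.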
111  :  When the transmission period decreases from 4 days to 2 days, the outbreak finished early, but the peak of the epidemic has increased, and the total number of patients has not changed much.
112  :  instantaneous reproduction numbers <1) wa

161  :  Wuhan’s migrants have a large proportion of middle-aged and high-risk individuals.
162  :  Whether GX_P2X uses angiotensin-converting enzyme 2 (ACE2) as the cell receptor was investigated by using small interfering RNA (siRNA) -mediated silencing of ACE2.
163  :  To determine the epidemiology of 2019 novel coronavirus disease (COVID-19) in a remote region of China, far from Wuhan, we analyzed the epidemiology of COVID-19 in Gansu Province.
164  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
165  :  The model places the peak in Italy around March 20$^{\rm th}$ 2020, with a maximum number of confirmed infected individuals of about 16,000.
166  :  Up to March 3, 2020, SARS-CoV-2 has infected more than 89,000 people in China and other 66 countries across six continents.
167  :  Yes I understand that all clinical trials and any other prospective interventio

223  :  To facilitate rapid determination of outbreak risk, we propose a reformulation of a classic result from random network theory that relies on contact tracing data to simultaneously determine the first moment ($R_0$) and the higher moments (representing the heterogeneity) in the distribution of secondary infections.
224  :  Yes

No data are used.
225  :  Yes

The line lists used in this study are freely available online at the website of the Laboratory for the Modeling of Biological + Socio-technical Systems of Northeastern University and Google Drive.
226  :  We used a simple logistic growth model that fitted very well with all data reported until the time of writing .
227  :  Yes None
228  :  Yes

The work only publically available data.
229  :  The emergence of a novel, highly pathogenic coronavirus, 2019-nCoV, in China, and its rapid national and international spread pose a global health emergency.
230  :  Yes

The data used to support the findings of this study were provided

285  :  Utilizing two complementary sequencing techniques, we here present a high-resolution map of the SARS-CoV-2 transcriptome and epitranscriptome.
286  :  We report the epidemiological and clinical features of the first patient with 2019-nCoV pneumonia imported into Korea from Wuhan.
287  :  We compare this structure with previously reported models of Nsp15 from SARS and MERS coronaviruses.
288  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
289  :  Yes

The manuscript includes no data.
290  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
291  :  Together, our results identify potential therapeutic options for treatment of MERS-CoV infections and could provide a basis for a wider range of coronaviruses, including the currently emer

337  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
338  :  The transmission potential of 2019-nCoV has been modelled and studied in several recent research works.
339  :  We identified four B cell epitopes, two MHC class-I and nine MHC class-II binding T-cell epitopes, which showed highly antigenic features.
340  :  Yes

The data used to support the findings of this study are included within the article.
341  :  ZYLX201806), National Key R&D Program of China (No.2017YFA0103000), Medical Science Research Project Support by Bethune Charitable Foundation ### Author Declarations All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.
342  :  Yes

The data used to support the findings of this study are available from the correspo

389  :  Yes I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
390  :  We applaud the rapid release to the public of the genome sequence of the new virus by Chinese virologists, but we also believe that increased transparency on disease reporting and data sharing with international colleagues are crucial for curbing the spread of this newly emerging virus to other parts of the world.
391  :  We estimated the doubling time, basic reproduction number (R0) and time-varying reproduction number (Rt) of NCP and SARS.
392  :  Yes I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
393  :  Yet, a strong positive correlation between divergence in genic and nongenic markers, and their association with environmental factors suggests that adaptive divergence is reducing 

442  :  cumulative infections) estimated at 983006 (95%CrI: 759475-1296258) in Wuhan City, raising the proportion of infected individuals to 9.8% (95%CrI: 7.6-13.0%).
443  :  cumulative infections) estimated at 983006 (95%CrI: 759475-1296258) in Wuhan City, raising the proportion of infected individuals to 9.8% (95%CrI: 7.6-13.0%).
444  :  We conducted statistical modelling to derive the delay-adjusted asymptomatic proportion of infections, along with the infections’ timeline.
445  :  Yes The present study relies on published data and access information to essential components of the data are available from the corresponding author.
446  :  received funding from the Japan Agency for Medical Research and Development (AMED) [grant number: JP18fk0108050]; the Japan Society for the Promotion of Science (JSPS) KAKENHI [grant numbers, H.N.
447  :  is a Wellcome Trust clinical career development fellow, supported by grant number 205228/Z/16/Z.
448  :  Yes

This was a review.
449  :  We will a

503  :  Till date, no vaccine or completely effective drug is available for the cure of COVID-19.
504  :  Yes I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
505  :  We found that macrophages frequently communicate with the CoVs targets through chemokine and phagocytosis signaling, highlighting the importance of tissue macrophages in immune defense and immune pathogenesis.
506  :  Yes

No additional data available.
507  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
508  :  Yes

The imaging or algorithm data used in this study are available upon request.
509  :  Yes None
510  :  Yes

The data used to support the findings of this study are available from the corresponding author upon request.
511  :  has received research funding from Sanofi Pa

556  :  Yes

The data used to support the findings of this study are available from the corresponding author upon request.
557  :  Yes

all original data were saved in Renmin Hospital of Wuhan University, Wuhan, China
558  :  were noted in patients who died with 3 days while PaO2 (54.75 vs 67.45mmHg), CD3% (51.57 vs 60.43%) and CD8% (16.42 vs 23.42%) were significantly depressed.
559  :  Yes

The author promised that all data are availability in the manuscript.
560  :  Yes

The data are available upon request to the authors.
561  :  To examine the growth rate of the outbreak, we aimed to present the first study to report the reproduction number of COVID-19 in South Korea.
562  :  Yes We obtained the daily series of confirmed cases of COVID-19 in South Korea from January 20, 2020 to February, 26, 2020 that are publicly available from the Korea Centers for Disease Control and Prevention (KCDC) <https://www.cdc.go.kr/board/board.es?mid=a30402000000&bid=0030>
563  :  Up to March 3, 2020, S

604  :  is supported by NSF grants 1610429 and 1633381.
605  :  This is the second epidemiological report for coronavirus disease (COVID-19), previously known as novel coronavirus (2019-nCoV), reported in Australia as at 19:00 Australian Eastern Daylight Time [AEDT] 8 February 2020.
606  :  This impedes global research cooperation, which is essential for tackling public health emergencies, and requires unimpeded access to data, analysis tools, and computational infrastructure.
607  :  With the appropriate measures, this number can be brought down to ~7,000-13,000 people.
608  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
609  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
610  :  Viral subgenomic fragments were generated using viral 

661  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
662  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
663  :  Yes I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
664  :  Yes

N/A
665  :  Yes The clinical data of the patients used in the study came from the central hospital of Wuhan and were approved by relevant departments of the hospital.Although the data included in the study included basic information about the patients, laboratory tests and imaging results, and had no privacy implications, the data could not be released at this time, as required by the hospital.
666  :  Yes

The data used to support the findings

712  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
713  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
714  :  There are 22, 4, 2 variations in P, S, and N at the level of amino acid residues.
715  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
716  :  We also proposed the drug combination of DAA and HTA was a promising strategy for anti-virus treatment and proved that S312 showed more advantageous than Oseltamivir to treat advanced influenza diseases in severely infected animals.
717  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an IC

770  :  Yes

The data that support the findings of this study are available from the corresponding author on reasonable request.
771  :  Yes

The data that support the findings of this study are available from the corresponding author on reasonable request.
772  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
773  :  Yes

Individual participant data that underlie the results reported in this article are available after deidentification for investigational purpose.
774  :  We employed a Bayesian framework to infer the time-calibrated phylogeny and the epidemic dynamics represented by the effective reproductive number (Re) changing over time from 33 genomic sequences available from GISAID.
775  :  Yes The data can be obtained directly from the CDCC reports.
776  :  Yes I understand that all clinical trials and any other prospective interventional studies must be 

826  :  Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
827  :  Together, these data suggest that HCoV-19 originated from multiple naturally occurring recombination events among those viruses present in bats and other wildlife species.
828  :  The novel coronavirus has the characteristics of rapid transmission, atypical clinical symptoms, and easy to affect both lungs, leading to missed diagnosis and misdiagnosis, as well as difficult to detection and assessment at early stage.
829  :  They are almost identical to each other and share 79.5% sequence identify to SARS-CoV.
830  :  They are almost identical to each other and share 79.5% sequence identify to SARS-CoV.
831  :  The extended TM7 in each BAT1 clamps CLD of ACE2.
832  :  Using a newly reported epidemiological determinants for early 2019-nCoV, the estimated basic reproduction number is in the range [2.2,3.0