In [35]:
'''
Automatic summarization using the Gensim module.
This summarizer is based on an improved version of the "TextRank" algorithm
and uses the BM25 ranking function.

To install:
conda install -c anaconda gensim=0.12.4

Articles are extracted from one of the trusted websites:
http://www.psychiatrictimes.com/

'''

from gensim.summarization import summarize, keywords
import lxml.html as html

# NLTK is used for stop-word removal, stemming, and lemmatization
from nltk.corpus import stopwords
from nltk.stem.lancaster import LancasterStemmer
from nltk.stem import WordNetLemmatizer

In [36]:
base_url = 'http://www.psychiatrictimes.com'
# only the schizophrenia topic is of interest
schizophrenia_path = base_url + '/schizophrenia'
    
main_page = html.parse(schizophrenia_path)
url_xpath = '//div[contains(@class, "pane-content-arguments-panel-pane-")]//div[contains(@class, "field-name-title")]//a/@href'
# extract the URLs of the articles posted on the main page (i.e. the most recent ones)
articles_urls = main_page.getroot().xpath(url_xpath)


In [37]:
articles = []
titles = []
authors = []
publication_dates = []

for url in articles_urls:
    page = html.parse(base_url + url)
    
    article_title = page.getroot().xpath('//div[contains(@class, "pane-page-title")]//h1/text()')
    titles.append(article_title)
    
    article_author = page.getroot().xpath('//div[@class="article-author"]//a/text()')
    authors.append(article_author)
    
    publication_date = page.getroot().xpath('//div[contains(@class, "article-info")]//div[@class="pane-content"]/text()')
    publication_dates.append(publication_date)
    
    full_article_text = page.getroot().xpath('//div[contains(@class, "field-name-body")]//p/text()')
    full_article_text = ''.join(full_article_text)
    articles.append(full_article_text)
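
Note on the join step above: the XPath returns one string per text node, and joining them with an empty separator fuses neighbouring sentences (the printed article contains "studies.By" and "gain.Topiramate"). A minimal sketch of a more forgiving join follows. The helper name join_paragraphs is hypothetical and not part of the notebook, and a single space is only an assumption about the right separator, since text nodes split around inline tags may then gain a stray space.

```python
def join_paragraphs(fragments, separator=' '):
    """Join text fragments, trimming whitespace and dropping empty pieces."""
    cleaned = (fragment.strip() for fragment in fragments)
    return separator.join(fragment for fragment in cleaned if fragment)


# Sample fragments made up for illustration (not scraped data).
sample_fragments = ['needs replication in larger studies.',
                    '   ',
                    'By contrast, several strategies have been successful.']
print(join_paragraphs(sample_fragments))
```

Inside the loop this would replace the ''.join(...) call; the outputs shown in the rest of the notebook were produced with the original empty separator.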

In [38]:
print('********** Example of one of the articles: *********')
print('The title is: {}'.format(titles[0][0]))
print('The authors are: {}'.format(authors[0][0]))
print('The date of publication is: {}'.format(publication_dates[0][0]))
print(articles[0])

********** Example of one of the articles: *********
The title is: Adjunctive Topiramate in People With Schizophrenia
The authors are: Brian Miller, MD, PhD, MPH
The date of publication is: 
    September 29, 2016

Many patients with schizophrenia experience residual symptoms despite currently available treatments that affect quality of life and overall function. Treatment with a variety of different agents—as adjuncts to antipsychotics—has either failed to show consistent, robust effects on psychopathology, or needs replication in larger studies.By contrast, several pharmacologic strategies, including adjunctive topiramate, have been successful in reducing antipsychotic-induced weight gain.Topiramate is approved by the US FDA as an anti-epileptic and anti-migraine treatment. In patients with epilepsy and obesity and/or type 2 diabetes mellitus, topiramate has been associated with weight loss and improved glucose homeostasis, potentially through appetite reduction.Previous quantitative 
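
The printed date above starts with a newline and indentation because the raw XPath text node is used as-is, and indexing with [0][0] raises an IndexError whenever an XPath matches nothing. A small sketch of a defensive accessor is given below. The name first_text and the empty-string default are assumptions made for illustration, not something the notebook defines.

```python
def first_text(matches, default=''):
    """Return the first non-blank string from an XPath result list, stripped."""
    for match in matches:
        text = str(match).strip()
        if text:
            return text
    return default


# Values shaped like the XPath results above.
print(first_text(['\n    September 29, 2016\n']))
print(repr(first_text([])))
```

With such a helper, titles, authors and publication_dates could store plain strings instead of lists.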

In [39]:
import nltk

# download the required corpora
nltk.download('stopwords')
nltk.download('wordnet')

list_words_from_article = articles[0].split()

# remove stop words (build the set once instead of reloading the list for every word)
english_stop_words = set(stopwords.words('english'))
filtered_article_words = [word for word in list_words_from_article if word not in english_stop_words]

wnl = WordNetLemmatizer()

# lemmatize each remaining word
lemmatization_words = []
for word in filtered_article_words:
    lemm_word = wnl.lemmatize(word)
    lemmatization_words.append(lemm_word)
    print(lemm_word)






[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Ksenia\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\Ksenia\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
Many
patient
schizophrenia
experience
residual
symptom
despite
currently
available
treatment
affect
quality
life
overall
function.
Treatment
variety
different
agents—as
adjunct
antipsychotics—has
either
failed
show
consistent,
robust
effect
psychopathology,
need
replication
larger
studies.By
contrast,
several
pharmacologic
strategies,
including
adjunctive
topiramate,
successful
reducing
antipsychotic-induced
weight
gain.Topiramate
approved
US
FDA
anti-epileptic
anti-migraine
treatment.
In
patient
epilepsy
obesity
and/or
type
2
diabetes
mellitus,
topiramate
associated
weight
loss
improved
glucose
homeostasis,
potentially
appetite
reduction.Previous
quantitative
review
topira
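
The output above still contains capitalized stop words ("Many", "In") and tokens with attached punctuation ("function.", "strategies,"), because the words are compared to the lowercase stop-word list exactly as split() produced them. The sketch below shows one way to normalize tokens first. SAMPLE_STOP_WORDS is a tiny hypothetical set used only to keep the example self-contained; in the notebook it would be set(stopwords.words('english')).

```python
import string

# Hypothetical miniature stop-word set, for illustration only.
SAMPLE_STOP_WORDS = {'in', 'with', 'the', 'of', 'and', 'many'}


def normalize_tokens(text, stop_words=SAMPLE_STOP_WORDS):
    """Lowercase tokens, strip surrounding ASCII punctuation, drop stop words."""
    tokens = []
    for raw in text.split():
        token = raw.strip(string.punctuation).lower()
        if token and token not in stop_words:
            tokens.append(token)
    return tokens


print(normalize_tokens('In patients with epilepsy and obesity, topiramate helps.'))
# -> ['patients', 'epilepsy', 'obesity', 'topiramate', 'helps']
```

Lowercasing also discards the capitalization of acronyms such as FDA, so whether this is desirable depends on how the keywords are used afterwards.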

In [40]:
modified_article = ' '.join(lemmatization_words)
print(modified_article)

Many patient schizophrenia experience residual symptom despite currently available treatment affect quality life overall function. Treatment variety different agents—as adjunct antipsychotics—has either failed show consistent, robust effect psychopathology, need replication larger studies.By contrast, several pharmacologic strategies, including adjunctive topiramate, successful reducing antipsychotic-induced weight gain.Topiramate approved US FDA anti-epileptic anti-migraine treatment. In patient epilepsy obesity and/or type 2 diabetes mellitus, topiramate associated weight loss improved glucose homeostasis, potentially appetite reduction.Previous quantitative review topiramate’s effect weight antipsychotic-treated patient included schizophrenia, review effect psychopathology focused patient treated clozapine. conducted meta-analysis randomized trial topiramate, given adjunct antipsychotics, patient schizophrenia. In systematic search PubMed/MEDLINE, researcher looked published study a

In [41]:
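print(' ******* Extracted Keywords ******* ')
# store the result under a new name so the imported keywords() function is not shadowed
extracted_keywords = keywords(modified_article)
print(extracted_keywords)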

 ******* Extracted Keywords ******* 
topiramate
trials
patients
antipsychotics
effect
effects
antipsychotic weight
patient schizophrenia
randomized trial
mean
panss
adjunct
including adjunctive
treatment
search
glucose
scores
included
different
difference
total score
placebo
reduction
psychopathology
anti
bprs
control
paresthesia
paresthesias
outcome
studies
study
smd
version retrieved
scale
discontinuation
review
treated


In [42]:
# ratio (default = 0.2): the fraction of sentences in the original text to return
# word_count: the maximum number of words in the summary
# split: set to True to get a list of sentences instead of a single string

print(' ******* Summary ******* ')
summary = summarize(articles[0], word_count=100)
print(summary)

 ******* Summary ******* 
In a systematic search of PubMed/MEDLINE, the researchers looked for all published studies of antipsychotic augmentation with topiramate in patients with schizophrenia-spectrum disorders (both randomized, placebo-controlled trials or open-label trials with an untreated control group).The primary outcome was change in total score on either the Positive and Negative Syndrome Scale (PANSS) or the Brief Psychiatric Rating Scale (BPRS).
There was a trend for more paresthesias with topiramate use (relative risk = 2.0), but otherwise no difference in adverse effects reported in at least 3 trials.The authors found evidence that adjunctive topiramate was associated with significantly greater reductions in psychopathology (particularly in clozapine-treated patients) and body weight.Other than an increase in paresthesias, there were no differences in adverse effects or all-cause discontinuation between topiramate and placebo.
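
To give an intuition for how a TextRank-style summarizer with BM25 picks sentences such as the ones above, here is a minimal pure-Python sketch. It is not Gensim's implementation: the function names, the whitespace tokenization, the BM25 constants (k1=1.5, b=0.75) and the damping factor are all assumptions chosen for illustration. Each sentence is scored against every other sentence with BM25, and a PageRank-style iteration over those weights ranks the sentences.

```python
import math
from collections import Counter


def bm25_score(query_tokens, doc_tokens, doc_freq, n_docs, avg_len, k1=1.5, b=0.75):
    """BM25 relevance of one sentence (the document) to another (the query)."""
    term_counts = Counter(doc_tokens)
    score = 0.0
    for term in set(query_tokens):
        tf = term_counts.get(term, 0)
        if tf == 0:
            continue
        # The "1 +" inside the log keeps the IDF non-negative for common terms.
        idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
        length_norm = k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += idf * tf * (k1 + 1) / (tf + length_norm)
    return score


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Return one PageRank-style score per sentence."""
    n = len(sentences)
    if n == 0:
        return []
    tokenized = [sentence.lower().split() for sentence in sentences]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    # weights[i][j]: how relevant sentence j is to sentence i.
    weights = [[0.0 if i == j else
                bm25_score(tokenized[i], tokenized[j], doc_freq, n, avg_len)
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for j in range(n):
            incoming = sum(scores[i] * weights[i][j] / out_sums[i]
                           for i in range(n) if out_sums[i] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize_sketch(sentences, top_k=2):
    """Keep the top_k highest-ranked sentences, in their original order."""
    scores = rank_sentences(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]


# Sentences made up for illustration (not taken from the article).
sample = ['topiramate reduced weight gain',
          'topiramate reduced symptoms',
          'the weather was sunny']
print(summarize_sketch(sample, top_k=2))
```

A sentence that shares no terms with the others receives only the baseline (1 - damping) / n score, which is why the unrelated third sentence is dropped in this toy example.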


In [43]:
print('********** Example of one of the articles: *********')
print('The title is: {}'.format(titles[3][0]))
print('The authors are: {}'.format(authors[3][0]))
print('The date of publication is: {}'.format(publication_dates[3][0]))
print(articles[3])

********** Example of one of the articles: *********
The title is: The Virus Connection: How Viruses Affect Psychiatric Pathologies
The authors are: Jacqueline A. Hobbs, MD, PhD, DFAPA
The date of publication is: 
    September 20, 2016
 
September 20, 2016March 20, 2018
This activity offers CE credits for:1. Physicians (CME)2. Other

To understand the pathogenic impact of viruses on mental illness.

At the end of this CME activity, participants should be able to:
• Name viruses that can infect the brain
• Discuss the neurodevelopmental hypothesis of schizophrenia
• Recognize genetic and environmental factors that may contribute to the pathogenesis of mental illness
• Describe possible mechanisms by which viruses may contribute to mental illness

This continuing medical education activity is intended for psychiatrists, psychologists, primary care physicians, physician assistants, nurse practitioners, and other health care professionals who seek to improve their care for patients with me

In [44]:
print(' ******* Summary ******* ')
summary = summarize(articles[3], word_count=200)
print(summary)

 ******* Summary ******* 
This continuing medical education activity is intended for psychiatrists, psychologists, primary care physicians, physician assistants, nurse practitioners, and other health care professionals who seek to improve their care for patients with mental health disorders.
This activity has been planned and implemented in accordance with the Essential Areas and policies of the Accreditation Council for Continuing Medical Education (ACCME) through the joint providership of CME Outfitters, LLC, and .
CME Outfitters, LLC, is accredited by the ACCME to provide continuing medical education for physicians.
Faculty must disclose to the participants any relationships with commercial companies whose products or devices may be mentioned in faculty presentations, or with the commercial supporter of this CME/CE activity.
CME Outfitters, LLC, has evaluated, identified, and attempted to resolve any potential conflicts of interest through a rigorous content validation procedure, us