# Analyzing food and drug interactions through PubMed abstracts

In [1]:
import pubmed.downloader as pb

The _pubmed.downloader_ module implements a _PubMedQuery_ class that searches the PubMed database for a search term and returns at most _max\_results_ abstracts.

In [2]:
search_term = 'ACE inhibitor'
max_results = 3
query = pb.PubMedQuery(search_term, max_results)

After creating a _PubMedQuery_ instance, call _PubMedQuery.id\_getter_ to read the IDs of the articles in the search results.

In [3]:
ids = query.id_getter()
print(ids)

28576479,28573254,28571891
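
For orientation, a search like this can be expressed against NCBI's E-utilities `esearch` endpoint. The sketch below is not the _pubmed.downloader_ implementation: the helper names `build_esearch_url` and `parse_id_list` are hypothetical, and the sample response is hand-written from the IDs printed above so that no network access is needed.

```python
import json

try:  # Python 3
    from urllib.parse import urlencode
except ImportError:  # Python 2
    from urllib import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, max_results):
    """Return an esearch URL asking for at most max_results PubMed IDs."""
    params = [("db", "pubmed"), ("term", term),
              ("retmax", max_results), ("retmode", "json")]
    return ESEARCH_URL + "?" + urlencode(params)


def parse_id_list(response_text):
    """Extract the ID list from an esearch JSON response as a comma-separated string."""
    payload = json.loads(response_text)
    return ",".join(payload["esearchresult"]["idlist"])


url = build_esearch_url("ACE inhibitor", 3)

# Hand-written sample in the shape of an esearch JSON response (not a live result).
sample = '{"esearchresult": {"idlist": ["28576479", "28573254", "28571891"]}}'
ids = parse_id_list(sample)
```

Keeping URL construction separate from response parsing lets the parsing step be checked offline against a saved response.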


In [4]:
query.abstract_getter(ids)

{0: u'Aldosterone breakthrough (ABT) is the condition in which angiotensin converting enzyme inhibitors (ACEIs) and/or angiotensin receptor blockers fail to effectively suppress the activity of the renin angiotensin aldosterone system. The objective of this study was to determine if ABT occurs in dogs with naturally occurring myxomatous mitral valve disease\xa0receiving an ACEI, using the urine aldosterone to creatinine ratio (UAldo:C) as a measure of renin angiotensin aldosterone system activation.',
 1: u'This study includes 39\xa0dogs with myxomatous mitral valve disease. A UAldo:C cut-off definition (derived from a normal population of healthy, adult, and client-owned dogs) was used to determine the prevalence of ABT in this population. Spearman analysis and univariate logistic regression were used to evaluate the relationship between UAldo:C and ABT (yes/no) and eight variables (age, serum K(+) concentration, serum creatinine concentration, ACEI therapy duration and ACEI dosage, f

Because pairing IDs with abstracts is a little tricky (some search result files are formatted differently from others), _abstract\_getter_ simply assigns a generic ID (a counter) to each abstract.

Fetching the IDs one by one would pair each ID correctly with its abstract text, but it would take much longer, and we are unlikely to use this information.
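
The counter-based keying can be illustrated with a minimal sketch. `number_abstracts` is a hypothetical helper, not part of _pubmed.downloader_, and the strings are placeholders for real abstract text.

```python
def number_abstracts(abstracts):
    """Key each abstract by its position in the result list (0, 1, 2, ...)."""
    return dict(enumerate(abstracts))


# Placeholder strings stand in for real abstract text.
numbered = number_abstracts(["first abstract", "second abstract"])
print(numbered[0])
```

The keys only record the order in which abstracts were read, so they cannot be used to look up the original PubMed record.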

## Downloading abstracts to local JSON files

The _download\_all\_abstracts_ function saves the abstracts to JSON files, _max\_results_ abstracts per file, as shown below.

In [5]:
%%time
# reset the counter before calling the download_all_abstracts method as we have already called it once above
pb.PubMedQuery.COUNT = 0
max_results = 500
pb.download_all_abstracts(search_term, max_results)

Saving to pbabstract1.json
500/50407 downloaded
Saving to pbabstract2.json
1000/50407 downloaded
Saving to pbabstract3.json
1500/50407 downloaded
Saving to pbabstract4.json
2000/50407 downloaded
Saving to pbabstract5.json
2500/50407 downloaded
Saving to pbabstract6.json
3000/50407 downloaded
Saving to pbabstract7.json
3500/50407 downloaded
Saving to pbabstract8.json
4000/50407 downloaded
Saving to pbabstract9.json
4500/50407 downloaded
Saving to pbabstract10.json
5000/50407 downloaded
Saving to pbabstract11.json
5500/50407 downloaded
Saving to pbabstract12.json
6000/50407 downloaded
Saving to pbabstract13.json
6500/50407 downloaded
Saving to pbabstract14.json
7000/50407 downloaded
Saving to pbabstract15.json
7500/50407 downloaded
Saving to pbabstract16.json
8000/50407 downloaded
Saving to pbabstract17.json
8500/50407 downloaded
Saving to pbabstract18.json
9000/50407 downloaded
Saving to pbabstract19.json
9500/50407 downloaded
Saving to pbabstract20.json
10000/50407 downloaded
Saving to
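
The batching pattern visible in the log (one numbered JSON file per batch, with a running total) can be sketched as follows. This is an illustration under assumptions, not the module's code: `save_in_batches` is a hypothetical helper, the layout of each JSON file is a guess, and the abstracts are placeholder strings written to a temporary directory.

```python
import json
import os
import tempfile


def save_in_batches(abstracts, batch_size, out_dir, prefix="pbabstract"):
    """Write abstracts to numbered JSON files, batch_size abstracts per file."""
    paths = []
    for start in range(0, len(abstracts), batch_size):
        batch = abstracts[start:start + batch_size]
        path = os.path.join(out_dir, "%s%d.json" % (prefix, len(paths) + 1))
        with open(path, "w") as handle:
            # JSON object keys are strings, so the running counter is stored as text.
            json.dump({str(start + i): text for i, text in enumerate(batch)}, handle)
        paths.append(path)
        print("%d/%d downloaded" % (start + len(batch), len(abstracts)))
    return paths


out_dir = tempfile.mkdtemp()
placeholder_abstracts = ["abstract %d" % i for i in range(5)]
paths = save_in_batches(placeholder_abstracts, 2, out_dir)

# Read the last file back to confirm the round trip.
with open(paths[-1]) as handle:
    last_batch = json.load(handle)
```

Writing one file per batch keeps each file small and means an interrupted download loses at most the batch in progress.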

### Adding Jython Code for Parser
<https://github.com/vpekar/stanford-parser-in-jython>

1. Download the parser from http://nlp.stanford.edu/downloads/lex-parser.shtml
2. Unpack it into a local directory and add the path to stanford-parser.jar to the Jython classpath
3. Pass the path to englishPCFG.ser.gz as an argument to StanfordParser

In [13]:
!java -jar jython.jar stanford.py

Traceback (innermost last):
  (no code object) at line 0
  File "stanford.py", line 127
	        tag = 'Z' if parent == None else parent.value()
	                  ^
SyntaxError: invalid syntax
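
The `SyntaxError` points at a conditional expression (`x if cond else y`), a form that was added in Python 2.5. One plausible cause, assuming the `jython.jar` used here implements an older language version, is that the interpreter cannot parse that line. Under that assumption, the same logic can be written as a plain `if`/`else`. `FakeNode` and `tag_for` are hypothetical stand-ins for illustration, not code from `stanford.py`.

```python
class FakeNode(object):
    """Hypothetical stand-in for a parse-tree node exposing value()."""

    def __init__(self, label):
        self._label = label

    def value(self):
        return self._label


def tag_for(parent):
    """Same result as: 'Z' if parent == None else parent.value()"""
    if parent is None:
        tag = 'Z'
    else:
        tag = parent.value()
    return tag


print(tag_for(None))
print(tag_for(FakeNode('NP')))
```

Upgrading to a Jython release that supports Python 2.5 or later syntax would be the alternative to editing the script.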


In [18]:
import sys

# Jython reports a platform string containing 'java'; CPython does not.
try:
    assert 'java' in sys.platform
except AssertionError:
    raise Exception("The script should be run from Jython!")

Exception: The script should be run from Jython!
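
The exception is expected here because the notebook kernel is not Jython. Where a notebook should keep running instead of stopping, the same check can be written as a function that returns a boolean. This is a minimal sketch; `running_on_jython` is a hypothetical helper name.

```python
import sys


def running_on_jython():
    """Return True when the interpreter reports a Java platform, as Jython does."""
    return sys.platform.startswith('java')


if not running_on_jython():
    print("Not running under Jython; skipping the parser step.")
```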

### Code for Training Sentiment Analysis
<https://nlp.stanford.edu/sentiment/code.html>