In [1]:
# -*- coding: utf-8 -*-

# REST client for the Europe PMC (PubMed) API


[Europe PubMed Central](https://europepmc.org/RestfulWebService) offers a REST API.

For example, http://www.ebi.ac.uk/europepmc/webservices/rest/search?query=p53&format=json gives:

    resultList: {
        result: [
            {
            id: "26174762",
            pmid: "26174762",
            title: "Combining intracellular antibodies to restore function of mutated p53 in cancer.",
            authorString: "Chan G, Jordaan G, Nishimura RN, Weisbart RH.",
            journalTitle: "Int J Cancer",
            [...]
            doi: "10.1002/ijc.29685"
            }
            [...]
        ]
    }
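
The query string in that URL can also be assembled from its parts, so that search terms containing spaces or `&` are escaped properly. This is only a minimal sketch: `SEARCH_URL` and `build_search_url` are hypothetical names introduced here, and `pageSize` is the one extra parameter taken from the URL used later in this notebook.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Endpoint copied from the example URL above.
SEARCH_URL = 'http://www.ebi.ac.uk/europepmc/webservices/rest/search'

def build_search_url(query, **extra):
    """Return the search URL for `query`, percent-encoding every parameter.

    Hypothetical helper: `extra` holds optional parameters such as pageSize.
    """
    params = [('query', query), ('format', 'json')]
    params.extend(sorted(extra.items()))
    return SEARCH_URL + '?' + urlencode(params)

print(build_search_url('p53'))
print(build_search_url('p53 AND mdm2', pageSize=25))
```

With `urlencode`, the space in `p53 AND mdm2` is written as `+`, so the query reaches the server as a single parameter.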
    
Let's crawl that API programmatically:

In [4]:
from __future__ import print_function
import requests # pip install requests

API_BASE = 'http://www.ebi.ac.uk/europepmc/webservices/rest/search?query={}&format=json&resulttype=core&pageSize=1000'

query = 'p53' # an example of a protein, but you can specify anything...

res = requests.get(API_BASE.format(query))
res.raise_for_status() # fail early on an HTTP error instead of parsing an error page
json = res.json()
json[u'resultList'][u'result'][0] # inspect 1st result to see what fields are available.

{u'abstractText': u'TP53 is a tumor suppressor gene that is mutated in 50% of cancers, and its function is tightly regulated by the E3 ligase, Mdm2. Both p53 and Mdm2 are localized in the cell nucleus, a site that is impervious to therapeutic regulation by most antibodies. We identified a cell-penetrating lupus monoclonal anti-DNA antibody, mAb 3E10, that targets the nucleus, and we engineered mAb 3E10 to function as an intranuclear transport system to deliver therapeutic antibodies into the nucleus as bispecific single chain Fv (scFv) fragments. Bispecific scFvs composed of 3E10 include PAb421 (3E10-PAb421) that binds p53 and restores the function of mutated p53, and 3G5 (3E10-3G5) that binds Mdm2 and prevents destruction of p53 by Mdm2. We documented the therapeutic efficacy of these bispecific scFvs separately in previous studies. In this study, we show that combination therapy with 3E10-PAb421 and 3E10-3G5 augments growth inhibition of cells with p53 mutations compared to the effec
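
A single request returns at most `pageSize` results, so crawling a larger result set means asking for further pages. The sketch below assumes the service supports cursor-based paging through a `cursorMark` request parameter and a `nextCursorMark` field in the response; check both against the Europe PMC documentation before relying on them. `iter_results` and `fetch_page` are hypothetical names, and the page fetcher is passed in as an argument so the paging logic can be tried without network access.

```python
SEARCH_URL = 'http://www.ebi.ac.uk/europepmc/webservices/rest/search'

def iter_results(fetch, query, page_size=1000, max_pages=10):
    """Yield results one by one, following the cursor from page to page.

    `fetch(query, cursor, page_size)` must return the decoded JSON of one page.
    `max_pages` is a safety cap so an unexpected response cannot loop forever.
    """
    cursor = '*'  # assumption: '*' asks for the first page
    for _ in range(max_pages):
        page = fetch(query, cursor, page_size)
        results = page.get('resultList', {}).get('result', [])
        for r in results:
            yield r
        next_cursor = page.get('nextCursorMark')
        # Stop on an empty page, a missing cursor, or a cursor that did not move.
        if not results or not next_cursor or next_cursor == cursor:
            break
        cursor = next_cursor

def fetch_page(query, cursor, page_size):
    """Fetch one page over HTTP (needs network access)."""
    import requests  # imported here so iter_results works without it
    res = requests.get(SEARCH_URL, params={
        'query': query, 'format': 'json', 'resulttype': 'core',
        'pageSize': page_size, 'cursorMark': cursor})
    res.raise_for_status()
    return res.json()

# Example (performs real requests):
# for r in iter_results(fetch_page, 'p53', max_pages=3):
#     print(r.get('pmid'), r.get('title'))
```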

In [7]:
import io

def sanitize(r, s):
    """Return field s of record r with tabs replaced by spaces, or '' if it is missing."""
    try:
        return r[s].replace(u'\t', u' ')
    except (KeyError, AttributeError):
        return u''

results = json[u'resultList'][u'result']
# io.open with an explicit encoding writes UTF-8 on both Python 2 and Python 3.
with io.open('pubmed_results.tsv', 'w', encoding='utf-8') as outf:
    for i, r in enumerate(results):
        print(i, r['title'][:60], 'http://www.ncbi.nlm.nih.gov/pubmed/{}'.format(r['pmid']))
        outf.write(u'{}\t{}\t{}\t{}\n'.format(
                sanitize(r, 'pmid'),
                sanitize(r, 'title'),
                sanitize(r, 'authorString'),
                sanitize(r, 'abstractText')))

0 Combining intracellular antibodies to restore function of mu http://www.ncbi.nlm.nih.gov/pubmed/26174762
1 PDL1 Regulation by p53 via miR-34. http://www.ncbi.nlm.nih.gov/pubmed/26577528
2 p14(ARF) Prevents Proliferation of Aneuploid Cells by Induci http://www.ncbi.nlm.nih.gov/pubmed/25752701
3 Intelligent DNA machine for the ultrasensitive colorimetric  http://www.ncbi.nlm.nih.gov/pubmed/26291961
4 Reactivation of p53 by a Cytoskeletal Sensor to Control the  http://www.ncbi.nlm.nih.gov/pubmed/26464464
5 Calcitonin Gene-Related Peptide Improves Hypoxia-Induced Inf http://www.ncbi.nlm.nih.gov/pubmed/26430901
6 Sevoflurane Preconditioning Confers Neuroprotection via Anti http://www.ncbi.nlm.nih.gov/pubmed/26463923
7 Gallic acid induces apoptosis in human cervical epithelial c http://www.ncbi.nlm.nih.gov/pubmed/26059022
8 S-Adenosylmethionine Affects ERK1/2 and Stat3 Pathways and I http://www.ncbi.nlm.nih.gov/pubmed/26174106
9 Synergistic therapeutic effects of Schiff's base cross-linke 
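
To check the file, it can be read back with the standard `csv` module. This is a small sketch for Python 3; `COLUMNS` and `read_results` are hypothetical names, and the column order is taken from the `outf.write` call above.

```python
import csv
import io

# Column order used when the file was written: pmid, title, authorString, abstractText.
COLUMNS = ['pmid', 'title', 'authors', 'abstract']

def read_results(path):
    """Return the rows of the TSV file as a list of dicts keyed by COLUMNS."""
    with io.open(path, encoding='utf-8', newline='') as inf:
        # QUOTE_NONE keeps quotation marks inside titles as literal characters.
        reader = csv.reader(inf, delimiter='\t', quoting=csv.QUOTE_NONE)
        return [dict(zip(COLUMNS, row)) for row in reader]

# rows = read_results('pubmed_results.tsv')
# print(len(rows), rows[0]['title'])
```

Note that `sanitize` only replaces tabs, so a field containing a line break would still split a row in two when read back.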