# Jupyter Notebooks

By Tom Hohenstein

You use Jupyter Notebooks to: 

+ Execute code 
+ Document and explain your work using markdown 
+ Share your work with others 
+ Present your work


Let's look at an example... 

In [9]:
my_list = [1, 3, 6, 5, 7, 8, 10]
for i in my_list: 
    x = i**2
    print("the square of " + str(i) + " is " + str(x))

the square of 1 is 1
the square of 3 is 9
the square of 6 is 36
the square of 5 is 25
the square of 7 is 49
the square of 8 is 64
the square of 10 is 100


## Converting NCBI XML to BibTeX



### Import Modules and Libraries 

In [25]:
# work with xml like json -> https://github.com/martinblech/xmltodict
import xmltodict

# make simple http requests -> http://docs.python-requests.org/en/master/
import requests 

# work with json -> https://docs.python.org/3/library/json.html
import json 

In [27]:
x = [1, 2, 3, 4]
for i in x: 
    print(i * 2)

2
4
6
8


### Functions for our work

### getDOI 

getDOI takes an article's title and queries the Crossref API for a matching DOI. The author is hard-coded as "Wong" in the query. The function returns the DOI from the first result Crossref gives us.

In [29]:
def getDOI(title):
    url = "http://api.crossref.org/works"
    # the author is fixed to "Wong"; requests URL-encodes the parameters for us
    params = {"query.author": "Wong", "query.title": title}
    r = requests.get(url, params=params)
    j = json.loads(r.text)
    # take the DOI of the first result Crossref returns
    doi = j["message"]["items"][0]["DOI"]
    return doi
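
To see what the request URL looks like without calling Crossref, the query string can be built with the standard library alone. This is only a sketch: `build_crossref_query` is a hypothetical helper that is not part of this notebook, the sample title is made up, and it simply mirrors the `query.author` and `query.title` parameters used in getDOI.

```python
from urllib.parse import urlencode

def build_crossref_query(title, author="Wong"):
    """Hypothetical helper: return the Crossref query URL for a title and author."""
    base = "http://api.crossref.org/works?"
    # urlencode escapes reserved characters and turns spaces into "+"
    params = {"query.author": author, "query.title": title}
    return base + urlencode(params)

# made-up title, chosen because it contains spaces and an "&"
print(build_crossref_query("Gene expression & cell fate"))
```

Escaping matters because a title containing `&` or `=` would otherwise be read as extra query parameters.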

### doi2bib

doi2bib takes a DOI and returns the item's metadata as BibTeX. I used [Joshua Ryan Smith's gist](https://gist.github.com/jrsmith3/5513926) for this function.


In [30]:
###  from https://gist.github.com/jrsmith3/5513926 

def doi2bib(doi):
    url = "http://dx.doi.org/" + doi
    headers = {"accept": "application/x-bibtex"}
    r = requests.get(url, headers = headers)
    return r.text
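
One thing to keep in mind: `requests.get` does not raise an exception for an HTTP error status on its own, so for a DOI that fails to resolve, `r.text` may hold an error page rather than BibTeX. Below is a minimal sketch of a sanity check, assuming a BibTeX entry always begins with `@`, an entry type, and `{`. `looks_like_bibtex` is a hypothetical helper, not part of the gist, and the sample strings are made up.

```python
import re

def looks_like_bibtex(text):
    """Hypothetical check: does the text start with something like '@article{'?"""
    # leading whitespace is allowed; the entry type is a run of letters
    return re.match(r"\s*@[A-Za-z]+\s*\{", text) is not None

print(looks_like_bibtex("@article{Wong_2015, title={Example}}"))
print(looks_like_bibtex("<html><body>Not Found</body></html>"))
```

A check like this could be run on the return value of doi2bib before anything is written to disk.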

### Let's do some work! 

### Let's load the xml file we have 

In [31]:
### note the file must be in the same folder as this notebook 
with open("WongJY_NCBI.xml") as f: 
    doc = xmltodict.parse(f.read())

### Alright, now we can loop through our xml and:

+ find the title for the current item 
+ get the DOI using getDOI
+ get the BibTeX data using doi2bib
+ open a file "wong.bib" and append our BibTeX data


In [21]:
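for item in doc["DocumentSummarySet"]["DocumentSummary"]:
    title = item['Item']["Title"]
    doi = getDOI(title)
    bibtex = doi2bib(doi)
    # append mode: re-running this cell adds the same entries again
    with open("wong.bib", "a") as myfile:
        # end each entry with a newline so entries stay on separate lines
        myfile.write(bibtex + "\n")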
  

### Let's check our work 

A little Unix here: each BibTeX entry starts with "@", so counting the "@" characters in "wong.bib" gives a rough count of the entries we wrote.


In [32]:
%%bash 
grep -o "@" wong.bib | wc -l

     186
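
The same check can be done in Python. This is a rough sketch, assuming each BibTeX entry starts on its own line with `@`, an entry type, and `{`. `count_entries` is a hypothetical helper and the sample text is made up.

```python
import re

def count_entries(bibtex_text):
    """Hypothetical helper: count entries by looking for '@type{' at the start of a line."""
    return len(re.findall(r"^\s*@[A-Za-z]+\s*\{", bibtex_text, flags=re.MULTILINE))

# made-up sample: two entries, one of which has an "@" inside a field
sample = """@article{a1,
  title={First},
  note={contact: someone@example.org}
}
@book{b2,
  title={Second}
}
"""
print(count_entries(sample))
```

Unlike counting every "@" character, this ignores an "@" that appears inside a field, such as an email address.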


## Thank you : ) 

In [29]:
#
# Things to ignore below!!!
#
# aka - these are not part of the presentation but format the presentation

In [23]:
%%html
<style>
    body.rise-enabled:after {
      content: url(pres/bu-library-logo.svg);
      position: fixed;
      bottom: 3.5em;
      left: 3.5em; 
    } 
    .fa-4x {
        font-size: 2em;
    }
    #help_b, #help_b:before {
        position: fixed;
        top: 0.5em !important;
        right: 0.6em !important;
        opacity: 0.6;
    }
    .rise-enabled .reveal .slide-number{
        display: none; 
    }
</style> 