# Ex 1. Interacting with online services

In this exercise, we will make some basic requests to obtain information from online services.

## Table of Contents

- [Part A: Getting online](#Part-A:-Getting-online)

- [Part B: Obtaining metadata from Crossref](#Part-B:-Obtaining-metadata-from-Crossref)

- [Part C: Obtaining a list of DOIs](#PART-C:-Obtaining-a-list-of-DOIs)

Get everything ready before we start the exercise.

In [None]:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append("../modules/orcid-python")
sys.path.append("../modules/pyalm")

import os
#os.environ['PLOS_API_KEY'] = 'user api key'

import requests
import time
import orcid

## Part A: Getting online

Back to [Table of Contents](#Table-of-Contents).

The first part of the exercise is simply to request a page online and receive a 200 HTTP response. We imported the `requests` library above and will use it to download the google.com page. This exercise is largely to ensure that you have a properly functioning internet connection!

<div class="alert alert-success">
An example of using the `requests` library is shown in the first notebook from the book chapter. Adapt that code to download the homepage of google.com and show that the HTTP status response code is 200.
</div>

NOTE FOR EHMAD/CHRISTINA: I'm going to start with all the code in there and then take some of it out. That at least proves that it works!

In [None]:
# First give the URL that you want to obtain
url = 'http://google.com'

# Then set up the request. Look at example Notebook #1 for this session and copy the name of the object that the
# request response will go into.
response = requests.get(url)

In [None]:
assert response.status_code == 200 # If you get the code above working this assertion should pass

## Part B: Obtaining metadata from Crossref

Back to [Table of Contents](#Table-of-Contents).

Now that we have a working connection, we will obtain some bibliographic metadata from Crossref. In the example notebook we obtained data for a single DOI. First we will replicate that for a different DOI.

<div class="alert alert-success">
Obtain the Crossref metadata for the DOI 10.1038/171740a0. Then obtain the names of the authors of this article and its title.
</div>

In [None]:
DOI = '10.1038/171740a0'
query = 'works/'
urlbase = 'http://api.crossref.org/'

# Now write the code to obtain the relevant JSON metadata and check that the status code is 200.
# In a second cell below, uncomment the code to look at the JSON output, then select the authors' surnames to print

url = urlbase+query+DOI
response = requests.get(url)
assert response.status_code == 200

In [None]:
j = response.json()
j

In [None]:
# Now use your knowledge of Python dictionaries to print the surname of each author and the article's title
surname1 = j['message']['author'][0]['family']
surname2 = j['message']['author'][1]['family']
title = j['message']['title'][0]
# print() with a single argument behaves the same under Python 2 and Python 3
print(surname1)
print(surname2)
print(title)

NOTE FOR EHMAD/CHRISTINA - The authors' surnames are Franklin and Gosling (possibly in capitals). This is the Rosalind Franklin article that provided the evidence backing up Watson and Crick's paper on the structure of DNA. I'm not sure how to set up the marking for this, but they should be able to pull out the surnames by inspecting the JSON.
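
Indexing with `[0]` and `[1]` works for this article because it has exactly two authors. A minimal sketch of a more general approach, looping over however many authors are present, might look like the following. The `sample` dictionary is a hand-written stand-in that only mimics the `message`/`author`/`title` layout used in the cell above; the names in it are placeholders rather than real Crossref output, and the helper name `author_surnames` is made up for illustration.

```python
def author_surnames(record):
    """Return the 'family' name of every author in a Crossref-style record."""
    authors = record.get('message', {}).get('author', [])
    # Assumption: an author entry may lack a 'family' key (for example a
    # group author), so such entries are skipped rather than raising KeyError
    return [author['family'] for author in authors if 'family' in author]


# Hypothetical stand-in for response.json(); all values are placeholders
sample = {
    'message': {
        'author': [
            {'given': 'A.', 'family': 'First'},
            {'given': 'B.', 'family': 'Second'},
            {'name': 'Some Consortium'},
        ],
        'title': ['A placeholder title'],
    }
}

surnames = author_surnames(sample)
sample_title = sample['message']['title'][0]
print(surnames)
print(sample_title)
```

In the notebook, the same helper could be called on the parsed response, for example `author_surnames(j)`.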

## PART C: Obtaining a list of DOIs

Back to [Table of Contents](#Table-of-Contents).

In this part of the exercise you will do something not covered in the class notes: obtain a set of DOIs from the Crossref search API. In the notes we show collecting DOIs from ORCID. In the second set of exercises we will show an example using a publisher API.

To prepare for this you will need to look at the API documentation at https://github.com/CrossRef/rest-api-doc/blob/master/rest_api.md. In particular look at the section on queries. You will see that quite sophisticated queries are possible. You may want to test some examples and look at the results below. Think carefully about how you want to construct the URL for your query as you do this. 
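
One way to keep a query URL readable is to build the query string from a dictionary instead of concatenating it by hand. The sketch below assumes Python 3 (under Python 2, `urlencode` lives in `urllib` rather than `urllib.parse`) and uses the `query` and `rows` parameters described in the API documentation. The helper name `build_works_url` and the value chosen for `rows` are purely illustrative.

```python
from urllib.parse import urlencode


def build_works_url(params, base='http://api.crossref.org/works'):
    """Append URL-encoded query parameters to the Crossref works endpoint."""
    # urlencode escapes each value, turning spaces into '+'
    return base + '?' + urlencode(params)


# 'rows' limits how many results come back; 5 is an arbitrary example value
example_url = build_works_url({'query': 'Martin Karplus', 'rows': 5})
print(example_url)
```

`requests.get` also accepts a `params` dictionary and performs the same encoding, so `requests.get(base, params=...)` is an equivalent option if you prefer not to build the string yourself.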

NOTE: Not too sure what to test here, just make sure they get back enough results? Probably the top ones should stay the top depending on the query but more an exercise in getting them playing with the parameters than anything else.


In [None]:
# Construct a URL to search and then from the JSON response collect the DOIs

url = 'http://api.crossref.org/works?query=Martin+Karplus'
response = requests.get(url)
j = response.json()

In [None]:
# Now collect the DOIs from the JSON response
dois = []
for item in j['message']['items']:
    dois.append(item['DOI'])

len(dois)

In [None]:
# There should be 20 DOIs because that is the default number of results that the Crossref API returns
assert len(dois) == 20
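
The loop above assumes every item carries a `DOI` key. A slightly more defensive sketch, shown here on a hand-written stand-in for the parsed response, skips any item without one. The DOIs in `sample_payload` are made-up placeholders, and `collect_dois` is a hypothetical helper name rather than part of any library.

```python
def collect_dois(payload):
    """Return the DOI of every item in a Crossref-style search response."""
    items = payload.get('message', {}).get('items', [])
    # Skip items that have no 'DOI' key instead of raising KeyError
    return [item['DOI'] for item in items if 'DOI' in item]


# Hypothetical stand-in for response.json(); the DOIs are placeholders
sample_payload = {
    'message': {
        'items': [
            {'DOI': '10.0000/placeholder-1'},
            {'DOI': '10.0000/placeholder-2'},
            {'title': ['An item without a DOI']},
        ]
    }
}

sample_dois = collect_dois(sample_payload)
print(len(sample_dois))
```

If you change the number of results requested in your query, remember that the expected count in the assertion above changes with it.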