### Request JSON data from an API endpoint

In this section, we will use the **requests** module to obtain JSON data from an API endpoint supplied by the UCSF Profiles application. Because Python can easily load JSON data into a dictionary structure, we can use the usual dictionary techniques to parse and analyze the information.

### A note on this data...

The service we will use in this section is an API provided by the UCSF Profiles application.

http://profiles.ucsf.edu/about/ForDevelopers.aspx

You might want to take a few minutes to visit this site, follow the "learn more and get started" link, and read about the service, the API, and how different apps are using it.

For this tutorial, we'll use the sample JSON link provided by the documentation:

http://api.profiles.ucsf.edu/json/v2/?ProfilesURLName=kirsten.bibbins-domingo&source=YourURLGoesHere&publications=full

First, let's start by importing the requests module to retrieve data from the web.  

In [1]:
import requests

We will use only a small part of this library - you may want to look through the tutorial and documentation at http://docs.python-requests.org/en/master/.

You can pass request parameters to the web service through the URL itself or through a payload dictionary. Both will work, though the payload approach is useful when the number of parameters makes the URL unwieldy. Both approaches are shown below, with the direct-URL version commented out.

In [2]:
#url = 'http://api.profiles.ucsf.edu/json/v2/?ProfilesURLName=kirsten.bibbins-domingo&source=YourURLGoesHere&publications=full'
url = 'http://api.profiles.ucsf.edu/json/v2/'
payload = {'ProfilesURLName': 'kirsten.bibbins-domingo', 'source':'YourURLGoesHere', 'publications':'full'}

In [3]:
# r = requests.get(url)
r = requests.get(url, params=payload)

The response object has a **url** attribute that shows the final URL built from the payload - this can also be useful if you'd like to look at the page directly in your browser.

In [4]:
r.url

'http://api.profiles.ucsf.edu/json/v2/?ProfilesURLName=kirsten.bibbins-domingo&source=YourURLGoesHere&publications=full'
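To make the payload approach less of a black box, here is a minimal sketch of how a dictionary of parameters becomes a query string. It uses `urlencode` from Python's standard library purely as an illustration. This is not the code that requests itself runs, though for simple parameters like these the result should match the URL shown above.

```python
from urllib.parse import urlencode

url = 'http://api.profiles.ucsf.edu/json/v2/'
payload = {
    'ProfilesURLName': 'kirsten.bibbins-domingo',
    'source': 'YourURLGoesHere',
    'publications': 'full',
}

# urlencode joins the key=value pairs with '&' and escapes unsafe characters
query_string = urlencode(payload)
full_url = url + '?' + query_string
print(full_url)
```

Building the URL from a dictionary also means you never have to escape spaces or special characters in parameter values by hand.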

Next, parse the JSON in the response body into a Python object.

In [5]:
data = r.json()

In [6]:
#uncomment to see the raw JSON response
#data

We can check the type to verify that the data we retrieved is now stored in Python as a dictionary.

In [7]:
type(data)

dict
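If you are curious how JSON types map onto Python types, the sketch below parses a small JSON string with the standard library's `json.loads`, which performs the same kind of conversion. The sample text is made up for illustration and only mimics the top-level shape of the real response.

```python
import json

# A tiny, made-up JSON document with the same top-level keys as the API response
raw_text = '{"api_notes": "sample", "Profiles": [{"FirstName": "Ada", "PublicationCount": 3}]}'

parsed = json.loads(raw_text)
first_profile = parsed['Profiles'][0]

print(type(parsed))                             # JSON object -> dict
print(type(parsed['Profiles']))                 # JSON array  -> list
print(type(first_profile['FirstName']))         # JSON string -> str
print(type(first_profile['PublicationCount']))  # JSON number -> int
```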

You can investigate the keys by visually inspecting the document, but JSON responses can be long, nested, and complex.  An easier way is to list the keys.

In [8]:
data.keys()

dict_keys(['api_notes', 'Profiles'])
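When you do want to look at the raw structure, pretty-printing a slice of it is often easier on the eyes than dumping the whole dictionary. The sketch below uses a small hypothetical stand-in for `data`; in the notebook you could pass the real `data` to `json.dumps` instead.

```python
import json

# Hypothetical stand-in for `data`; swap in the real response when running the notebook
sample = {
    'api_notes': 'sample',
    'Profiles': [{'FirstName': 'Ada', 'Keywords': ['epidemiology', 'cardiology']}],
}

# indent=2 puts each key and list item on its own indented line
pretty = json.dumps(sample, indent=2)

# Print only the first 200 characters so a long response stays readable
print(pretty[:200])
```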

Let's take a look at the data in the 'Profiles' section of this dictionary.

In [9]:
profiles = data['Profiles']

As before, a dictionary's values can be simple values or other data structures, such as lists or other dictionaries.

In [10]:
#commented out for length
#profiles

Looking at this data, we can see the [] denoting a list. Let's check the type and length.

In [11]:
print(type(profiles))
print(len(profiles))

<class 'list'>
1


There is only one element in this list, so let's grab it and take a look...

In [12]:
profile = profiles[0]

In [13]:
#profile

And in this case, we're back to dealing with a dictionary. So far, we have a dictionary (data) containing a list (profiles) whose single element is another dictionary (profile). Let's check the keys again.

In [14]:
profile.keys()

dict_keys(['ResearchActivitiesAndFunding', 'FirstName', 'PublicationCount', 'Address', 'Publications', 'Keywords', 'Title', 'Department', 'MediaLinks_beta', 'AwardOrHonors', 'ClinicalTrials', 'Titles', 'LastName', 'ProfilesURL', 'NIHGrants_beta', 'Email', 'Narrative', 'School', 'PhotoURL', 'WebLinks_beta', 'Education_Training', 'GlobalHealth_beta', 'Twitter_beta', 'Name', 'SlideShare_beta', 'Videos', 'FreetextKeywords'])

The remaining cells refer to this dictionary as **profileData**, so let's bind that name to the same first (and only) element of the list...

In [15]:
profileData = profiles[0]

... and confirm that the keys are unchanged.

In [16]:
profileData.keys()

dict_keys(['ResearchActivitiesAndFunding', 'FirstName', 'PublicationCount', 'Address', 'Publications', 'Keywords', 'Title', 'Department', 'MediaLinks_beta', 'AwardOrHonors', 'ClinicalTrials', 'Titles', 'LastName', 'ProfilesURL', 'NIHGrants_beta', 'Email', 'Narrative', 'School', 'PhotoURL', 'WebLinks_beta', 'Education_Training', 'GlobalHealth_beta', 'Twitter_beta', 'Name', 'SlideShare_beta', 'Videos', 'FreetextKeywords'])
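With this many keys, it can help to see which ones hold nested data before exploring further. The sketch below sorts keys into two groups by checking each value's type. It runs on a hypothetical, heavily trimmed stand-in for the profile dictionary; the values are invented, and in the notebook you could loop over the real dictionary instead.

```python
# Hypothetical, heavily trimmed stand-in for the profile dictionary
sample_profile = {
    'FirstName': 'Ada',
    'PublicationCount': 3,
    'Keywords': ['epidemiology', 'cardiology'],
    'ResearchActivitiesAndFunding': [{'Title': 'Example grant'}],
}

nested_keys = []
simple_keys = []
for key, value in sample_profile.items():
    # Lists and dictionaries can hold further data; anything else is a single value
    if isinstance(value, (list, dict)):
        nested_keys.append(key)
    else:
        simple_keys.append(key)

print('nested:', nested_keys)
print('simple:', simple_keys)
```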

At this point, you might want to try looking into different elements of this dictionary. Some are simple data points containing no further nested data; others contain lists or dictionaries. We'll take a closer look at **ResearchActivitiesAndFunding**.

In [17]:
researchAndFunding = profileData['ResearchActivitiesAndFunding']

This key maps to a list, this time with a larger (and, it turns out, variable) number of entries.  

In [18]:
len(researchAndFunding)

18

Let's take a look at the first entry.

In [19]:
researchAndFundingData = researchAndFunding[0]

In [20]:
researchAndFundingData

{'EndDate': '2024-06-30',
 'Role': 'Co-Principal Investigator',
 'SponsorAwardID': 'TL4GM118986',
 'StartDate': '2014-09-26',
 'Sponsor': 'NIH/NIGMS',
 'Title': 'SF BUILD: Enabling full representation in science'}

You can see from the key and value pairs that this dictionary does not contain any further nested data: each key maps to a string value.

From this final node in the JSON tree, we can get at the details of a particular research activity, including its Title.

In [21]:
researchAndFundingData['Title']

'SF BUILD: Enabling full representation in science'
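The step-by-step variables above can also be collapsed into one chained lookup once you know the path. The sketch below shows this on a hypothetical stand-in for `data`, trimmed to the path we followed, with invented values. It also shows `dict.get`, which is handy when you are not sure every entry carries a given key.

```python
# Hypothetical stand-in for `data`, trimmed to the path we followed
sample = {
    'api_notes': 'sample',
    'Profiles': [
        {'ResearchActivitiesAndFunding': [
            {'Title': 'Example grant', 'Sponsor': 'Example sponsor'},
        ]},
    ],
}

# dict key -> list index -> dict key -> list index -> dict key
title = sample['Profiles'][0]['ResearchActivitiesAndFunding'][0]['Title']
print(title)

# .get() returns a default instead of raising KeyError when a key is absent
first_entry = sample['Profiles'][0]['ResearchActivitiesAndFunding'][0]
end_date = first_entry.get('EndDate', 'unknown')
print(end_date)
```

Intermediate variables are still the better choice while you are exploring an unfamiliar response, since each one can be inspected on its own.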

***Exercise*** 

Try writing a loop that collects just the titles from a researcher's research and funding data.