# Welcome to Session 7 - Getting Web-Based Data

Much useful data is available online and can be accessed with a script. Application Programming Interfaces (APIs) make data available at URLs that a script can request. The data are delivered in a predictable format, ready for use.

## Requesting Data

### The requests library

To make a request using a URL, we can use the Python requests library. It is not part of the Python standard library, so it must be installed and then imported.

A request has two basic aspects:
1) The URL, which specifies the API address and the specific API application, or feature, that is requested
2) Data which forms the context of our request, to pass to the API

We'll use the [World Register of Marine Species (WoRMS) API](https://www.marinespecies.org/aphia.php?p=webservice) which is called Aphia.

In [None]:
import requests

# We'll use the API to get the currently accepted species ID for Sciaenops ocellatus
# The API URL root (used for all Aphia API applications) is https://www.marinespecies.org/rest
# The specific API application to get the accepted species ID is /AphiaIDByName/{ScientificName}

sciname = 'Sciaenops ocellatus'
aphiaID = requests.get('https://www.marinespecies.org/rest/AphiaIDByName/{0}'.format(sciname)).json()
print(aphiaID)



Let's take a closer look at our request string

requests.get('https://www.marinespecies.org/rest/AphiaIDByName/{0}'.format(sciname)).json()

1) requests.get()
    * We're calling the get() function of the requests library - this sends an HTTP GET request to a URL to get the data
2) 'https://www.marinespecies.org/rest/AphiaIDByName/{0}'.format(sciname)
    * We're dynamically constructing the URL.
    * We've written most of it as a string, but we're using a placeholder {0} for the species name.
    * The .format() method (of the preceding string) is where we provide the data for the placeholder.
3) .json()
    * The API will return the data as a JSON ("JavaScript Object Notation") object. So we call the .json() method of the response that requests.get() returns.
    * This 'unpacks' the JSON data into a familiar Python data structure of nested lists and dictionaries.


In this case, the API returns only a single item, so unpacking it gives us the AphiaID as an integer.
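
The pieces described above can be tried out without contacting the API. This is a minimal sketch, not the way requests works internally: `build_url` is a hypothetical helper, `json.loads` from the standard library stands in for roughly what `.json()` does to the response body, and the body text `'159335'` is an assumed example response.

```python
import json
from urllib.parse import quote

# URL root shared by all Aphia API applications
API_ROOT = 'https://www.marinespecies.org/rest'


def build_url(application, value):
    """Hypothetical helper: join the API root, the application name, and one value."""
    # quote() percent-encodes characters such as spaces so the URL is valid
    return '{0}/{1}/{2}'.format(API_ROOT, application, quote(str(value), safe=''))


url = build_url('AphiaIDByName', 'Sciaenops ocellatus')
print(url)

# Assumed example of a response body: a JSON document holding a single number
body_text = '159335'
aphia_id = json.loads(body_text)

# A JSON number unpacks into a Python int
print(type(aphia_id).__name__, aphia_id)
```

Encoding the space explicitly is a choice made for this sketch; in the cell above, the species name is passed as-is and requests takes care of the encoding.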

In [None]:
# Now let's use the AphiaID number to get vernaculars (common names) for Sciaenops ocellatus
# The specific API application to get the vernaculars for a given AphiaID is /AphiaVernacularsByAphiaID/{ID}


speciesVernacs = requests.get('https://www.marinespecies.org/rest/AphiaVernacularsByAphiaID/{0}'.format(aphiaID)).json()
print(speciesVernacs)

Here we have a chunk of data. We see square brackets and curly braces. We have lists and dictionaries.
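
Before working with data like this, it can help to check its shape. The sketch below uses a small hand-written sample rather than real API output; the key names (`vernacular`, `language_code`, `language`) are assumptions, so compare them with the keys in your own output.

```python
# Hand-written sample shaped like the vernaculars response (not real API output)
sample_vernacs = [
    {'vernacular': 'red drum', 'language_code': 'eng', 'language': 'English'},
    {'vernacular': 'example name', 'language_code': 'xxx', 'language': 'Example language'},
]

# The outer square brackets mean the whole thing is a list
print(type(sample_vernacs).__name__)
print(len(sample_vernacs))

# Each item inside curly braces is a dictionary
first = sample_vernacs[0]
print(type(first).__name__)

# The keys tell us which values we can look up in each dictionary
print(sorted(first.keys()))
print(first['language'])
```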

### Activity 1

We have a list of dictionaries, with each dictionary representing a vernacular. Iterate over the list and print the following for each dictionary:

"A vernacular for {species name} in {language} is {vernacular}"
e.g. "A vernacular for Sciaenops ocellatus in English is red drum"

Hint: Remember how we constructed the URL using a placeholder and the .format() method?
* You'll need to use three placeholders in sequential order (0,1,2)
* In the .format() section you'll need to list the three variables (two are dictionary references) separated by commas in the same order that you want to substitute them for the placeholders. 

In [None]:
# Tackle Activity 1 here





### Activity 2

Now let's make another request using the AphiaID to get the full species classification

The specific API application for this is /AphiaClassificationByAphiaID/{ID}



In [33]:
# Tackle Activity 2 here





{'AphiaID': 1, 'rank': 'Superdomain', 'scientificname': 'Biota', 'child': {'AphiaID': 2, 'rank': 'Kingdom', 'scientificname': 'Animalia', 'child': {'AphiaID': 1821, 'rank': 'Phylum', 'scientificname': 'Chordata', 'child': {'AphiaID': 146419, 'rank': 'Subphylum', 'scientificname': 'Vertebrata', 'child': {'AphiaID': 1828, 'rank': 'Infraphylum', 'scientificname': 'Gnathostomata', 'child': {'AphiaID': 152352, 'rank': 'Parvphylum', 'scientificname': 'Osteichthyes', 'child': {'AphiaID': 10194, 'rank': 'Gigaclass', 'scientificname': 'Actinopterygii', 'child': {'AphiaID': 843664, 'rank': 'Class', 'scientificname': 'Actinopteri', 'child': {'AphiaID': 293496, 'rank': 'Subclass', 'scientificname': 'Teleostei', 'child': {'AphiaID': 1517577, 'rank': 'Order', 'scientificname': 'Eupercaria incertae sedis', 'child': {'AphiaID': 125558, 'rank': 'Family', 'scientificname': 'Sciaenidae', 'child': {'AphiaID': 159334, 'rank': 'Genus', 'scientificname': 'Sciaenops', 'child': {'AphiaID': 159335, 'rank': 'Spe

This is a mess of nested dictionaries!

If we look closely, we see that we have a dictionary with four keys (AphiaID, rank, scientificname, and child). The first three have integer or string values, but the value of child is itself a dictionary. It has the same four keys, with child again being a dictionary. And so on, all the way through the classification down to the species dictionary, whose child has a value of None.
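
One way to make sense of a structure like this is to follow the child values in a loop until we reach None. The sketch below uses a shortened copy of the first three levels from the output above, with the innermost child set to None so the sample ends cleanly; `flatten_classification` is a hypothetical helper name.

```python
# Shortened sample: the first three levels of the classification,
# with the innermost 'child' set to None for this illustration
sample = {
    'AphiaID': 1, 'rank': 'Superdomain', 'scientificname': 'Biota',
    'child': {
        'AphiaID': 2, 'rank': 'Kingdom', 'scientificname': 'Animalia',
        'child': {
            'AphiaID': 1821, 'rank': 'Phylum', 'scientificname': 'Chordata',
            'child': None,
        },
    },
}


def flatten_classification(node):
    """Hypothetical helper: walk down the 'child' chain and collect (rank, name) pairs."""
    levels = []
    while node is not None:
        levels.append((node['rank'], node['scientificname']))
        # Step one level deeper; .get() returns None if there is no 'child' key
        node = node.get('child')
    return levels


for rank, name in flatten_classification(sample):
    print('{0}: {1}'.format(rank, name))
```

The same loop should work on the full classification, since every level has the same keys and the chain ends at None.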

### Question [n]

Question content [question is to reflect on theory just learned]

In [None]:
#Answer walkthrough is executable code

### Activity [n]

Activity description/task [activity is to exercise theory just learned]

In [None]:
#Tackle Activity [n] here [code space]




### Quiz

[hyperlink to a quiz here. Perhaps we can use Google Forms quizzes with multi-choice questions to help solidify learning]

### Challenge [this is homework or to do if the class finishes early]

Challenge description [challenge is to consolidate and practice content learned during this session]

In [None]:
#Tackle the challenge here [code space]




### Resources [web resources for reference or reading to help expand knowledge on this section's content]