---
syncID: 
title: "Querying Taxonomy Data with NEON API and Python"
description: "Querying the 'taxonomy/' NEON API endpoint with Python and navigating the response"
dateCreated: 2020-04-24
authors: Maxwell J. Burner
contributors: 
estimatedTime:
packagesLibraries: requests, json, pandas
topics: api
languagesTool: python
dataProduct:
code1: 
tutorialSeries: python-neon-api-series
urlTitle: neon_api_taxonomy
---

In this tutorial we will learn to query the *taxonomy/* endpoint of the NEON API using Python.

<div id="ds-objectives" markdown="1">

### Objectives
After completing this tutorial, you will be able to:

* Query the taxonomy endpoint of the NEON API to obtain taxonomic data
* Search NEON taxonomic data using different criteria
* Use the various options of the taxonomy endpoint to customize the results of a call
* Navigate the data returned by a call to the taxonomy endpoint of the NEON API
* Navigate the parent-child relationships between NEON locations


### Install Python Packages

* **requests**
* **json** 
* **pandas**



</div>

In this tutorial we will learn to use Python and the *taxonomy/* endpoint of the NEON API to query information from NEON's taxonomic data. 

NEON maintains a great deal of taxonomic data, used in species identification during field research. NEON taxonomy data can be obtained through the API, or through an interactive interface called the [Taxon Viewer](http://data.neonscience.org/static/taxon.html). Just as the *locations/* endpoint can provide more context for a location referenced in NEON studies, the *taxonomy/* endpoint can provide additional information on species identified in NEON observational data.




## Making the Request

Unlike other endpoints, the *taxonomy/* endpoint does not take a single target in its URL. Instead, the query can make use of a number of different options, which are named and assigned values in the URL. Each option is assigned a value with an equals sign, for example 'family=Pinaceae'; the options are placed after a '?' at the end of the URL and linked to each other by '&' signs.

Each call must have one of the following options, but cannot use multiple:
* **taxonTypeCode**, a short code that indicates which NEON taxonomy is being queried, such as FISH or BIRD
* One of the major taxonomic ranks from genus through kingdom (for example, **family**)
* **scientificName**, a specific name of format genus + specific epithet + (authority); this is used to search for one exact result

In addition, any number of the following options can also be added to modify the results of the query:
* **verbose** takes a 'true' for a more detailed response or 'false' for a shorter response
* **offset** takes an integer indicating the number of starting rows of the list of results to skip; the default is 0
* **limit** takes an integer indicating the maximum length of the list returned; the default is 50

Let's request data on up to 20 members of the Pine family, skipping none, with the short response.

In [43]:
import requests
import json

In [44]:
#Choose values for each option
SERVER = 'http://data.neonscience.org/api/v0/'
FAMILY = 'Pinaceae'
OFFSET = 0
LIMIT = 20
VERBOSE = 'false'

In [45]:
#Create 'options' portion of API call
OPTIONS = '?family={family}&offset={offset}&limit={limit}&verbose={verbose}'.format(
    family = FAMILY,
    offset = OFFSET,
    limit = LIMIT,
    verbose = VERBOSE)

#Print out the completed options string. This is what comes after the endpoint in the taxonomy API call
print(OPTIONS)

?family=Pinaceae&offset=0&limit=20&verbose=false


In [46]:
#Make request
pine_req = requests.get(SERVER+'taxonomy/'+OPTIONS)
pine_json = pine_req.json()
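
As an alternative to formatting the options string by hand, the same URL can be assembled with `urllib.parse.urlencode` from the Python standard library, which joins name/value pairs with '=' and '&' and escapes special characters. This is only a minimal sketch: `build_taxonomy_url` is a hypothetical helper name, not part of any NEON library, and the code builds the URL without contacting the API.

```python
from urllib.parse import urlencode, quote

# Base URL of the NEON API, as used in this tutorial
SERVER = 'http://data.neonscience.org/api/v0/'

def build_taxonomy_url(search_option, search_value, offset=0, limit=50, verbose=False):
    """Build a taxonomy/ URL from one search option plus the optional modifiers.

    build_taxonomy_url is a hypothetical helper written for this example.
    """
    params = {
        search_option: search_value,      # exactly one search option per call
        'offset': offset,
        'limit': limit,
        'verbose': str(verbose).lower(),  # the API takes 'true' or 'false'
    }
    # quote_via=quote encodes spaces as %20 rather than '+'
    return SERVER + 'taxonomy/?' + urlencode(params, quote_via=quote)

pine_url = build_taxonomy_url('family', 'Pinaceae', offset=0, limit=20)
print(pine_url)
```

The resulting string could be passed directly to `requests.get`, in the same way as `SERVER+'taxonomy/'+OPTIONS` above.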

## Navigating the Response

Unlike most API call responses, the taxonomy JSON has more elements at the uppermost level than just 'data'. The other elements report how many species entries were returned ('count') and how many entries match the query in total ('total'), that is, how many would have been returned if offset were zero and limit were unlimited.

It also includes two API urls. The 'prev' url requests the previous set of entries, of the same size as this set and matching the same parameters other than offset; it has no value if offset was zero. The 'next' url requests the next set of entries; it has a value only if the limit parameter caused some matching entries to be left out. These urls can be used to break up a larger API call into several segments: we ask for a smaller set than we actually want, then use the 'next' url to get the next set of entries in a separate call.

In [47]:
#Print out values in the top level of the pine_json, other than the 'data' entry.
for key in pine_json.keys():
    if(key != 'data'):
        print(key,':',pine_json[key])

count : 20
total : 131
prev : None
next : https://data.neonscience.org/api/v0/taxonomy?family=Pinaceae&verbose=false&offset=20&limit=20
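
The segmented approach described above can be sketched as a loop that keeps following the 'next' url until it is empty. This is a hedged illustration, not an official NEON client: `collect_pages` and `fetch` are hypothetical names, and the two "pages" below are tiny made-up placeholders that only mimic the shape of a taxonomy response, so the example runs without network access.

```python
def collect_pages(first_url, fetch, max_pages=10):
    """Follow 'next' urls, gathering the 'data' entries from each page.

    fetch is any callable that takes a url and returns the parsed JSON dict.
    max_pages is a safety cap so a long result set cannot loop indefinitely.
    """
    entries = []
    url = first_url
    pages = 0
    while url is not None and pages < max_pages:
        page = fetch(url)
        entries.extend(page['data'])
        url = page.get('next')   # None on the last page ends the loop
        pages += 1
    return entries

# Placeholder pages keyed by a fake url (not real NEON data)
fake_pages = {
    'page1': {'count': 2, 'total': 3, 'prev': None, 'next': 'page2',
              'data': [{'taxonID': 'A'}, {'taxonID': 'B'}]},
    'page2': {'count': 1, 'total': 3, 'prev': 'page1', 'next': None,
              'data': [{'taxonID': 'C'}]},
}

all_entries = collect_pages('page1', fake_pages.__getitem__)
print(len(all_entries))
```

Against the real API, `fetch` could be something like `lambda url: requests.get(url).json()`, with the first url built from the desired options.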


Within the 'data' attribute is a list with entries for each species returned by the call. Each species entry is a dictionary with attributes for:

- The full taxonomy, with a separate attribute for each taxonomic level
- The NEON taxonomy the data was obtained from (taxonTypeCode)
- The short taxon code used by NEON (taxonID, acceptedTaxonID)
- The author of the scientific name
- The common/vernacular name, if any
- The reference text used (nameAccordingToID)

In [9]:
#Print data for one species in the result
sample = pine_json['data'][7]
for key in sample.keys():
    print("%28s:   %20s" % (key, sample[key]))

               taxonTypeCode:                  PLANT
                     taxonID:                   ABFI
             acceptedTaxonID:                   ABFI
          dwc:scientificName:   Abies firma Siebold & Zucc.
dwc:scientificNameAuthorship:        Siebold & Zucc.
               dwc:taxonRank:                species
          dwc:vernacularName:               Momi fir
       dwc:nameAccordingToID:   http://plants.usda.gov (accessed 8/25/2014)
                 dwc:kingdom:                Plantae
                  dwc:phylum:          Coniferophyta
                   dwc:class:              Pinopsida
                   dwc:order:                Pinales
                  dwc:family:               Pinaceae
                   dwc:genus:                  Abies
             gbif:subspecies:                   None
                gbif:variety:                   None


The "dwc" at the beginning of many attribute names indicates that the terms used for those fields are matched to the terms used by Darwin Core, a standard maintained for sharing biodiversity information.
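
Since the prefix and the term are separated by a colon, the attribute names can be split apart with ordinary string methods. The sketch below groups field names by prefix; `split_prefix` is a hypothetical helper, and the dictionary holds a few fields copied from the Abies firma entry printed above rather than a live API response.

```python
def split_prefix(key):
    """Split 'dwc:genus' into ('dwc', 'genus'); keys without a prefix get ''."""
    prefix, sep, name = key.partition(':')
    return (prefix, name) if sep else ('', key)

# A few fields copied from the Abies firma entry shown in this tutorial
fir_entry = {
    'taxonTypeCode': 'PLANT',
    'taxonID': 'ABFI',
    'dwc:scientificName': 'Abies firma Siebold & Zucc.',
    'dwc:family': 'Pinaceae',
    'dwc:genus': 'Abies',
    'gbif:variety': None,
}

# Group the field names by the prefix they carry
by_prefix = {}
for key in fir_entry:
    prefix, name = split_prefix(key)
    by_prefix.setdefault(prefix, []).append(name)

print(by_prefix)
```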

In [10]:
#Print vernacular and species names
for species in pine_json['data']:
    print(species['dwc:vernacularName'],'|',species['dwc:scientificName'])

silver fir | Abies alba Mill.
Pacific silver fir | Abies amabilis (Douglas ex Loudon) Douglas ex Forbes
balsam fir | Abies balsamea (L.) Mill.
balsam fir | Abies balsamea (L.) Mill. var. balsamea
balsam fir | Abies balsamea (L.) Mill. var. phanerolepis Fernald
bristlecone fir | Abies bracteata (D. Don) D. Don ex Poit.
white fir | Abies concolor (Gord. & Glend.) Lindl. ex Hildebr.
Momi fir | Abies firma Siebold & Zucc.
Fraser fir | Abies fraseri (Pursh) Poir.
grand fir | Abies grandis (Douglas ex D. Don) Lindl.
Nikko fir | Abies homolepis Siebold & Zucc.
fir | Abies sp.
fir | Abies spp.
subalpine fir | Abies lasiocarpa (Hook.) Nutt.
corkbark fir | Abies lasiocarpa (Hook.) Nutt. var. arizonica (Merriam) Lemmon
subalpine fir | Abies lasiocarpa (Hook.) Nutt. var. lasiocarpa
Sierra white fir | Abies lowiana (Gordon & Glend.) A. Murray bis
California red fir | Abies magnifica A. Murray bis
California red fir | Abies magnifica A. Murray bis var. magnifica
Shasta red fir | Abies magnifica A. M

## Using Taxon Type Code

Let's try another, using taxonTypeCode this time. We'll look through some of the NEON Fish Taxonomy, but try the verbose description.

In [11]:
#Set options
SERVER = 'http://data.neonscience.org/api/v0/'
TAXONCODE = 'FISH'
OFFSET = 0
LIMIT = 20
VERBOSE = 'true'

In [12]:
#Create 'options' portion of API call
OPTIONS = '?taxonTypeCode={taxoncode}&offset={offset}&limit={limit}&verbose={verbose}'.format(
    taxoncode = TAXONCODE,
    offset = OFFSET,
    limit = LIMIT,
    verbose = VERBOSE)
print(OPTIONS)

?taxonTypeCode=FISH&offset=0&limit=20&verbose=true


In [13]:
#Make request
fish_req = requests.get(SERVER+'taxonomy/'+OPTIONS)
fish_json = fish_req.json()

In [14]:
#Print data for one species in the result
sample = fish_json['data'][7]
for key in sample.keys():
    print("%28s:   %20s" % (key, sample[key]))

               taxonTypeCode:                   FISH
                     taxonID:                 ACHSPP
             acceptedTaxonID:                 ACHSPP
          dwc:scientificName:         Achiridae spp.
dwc:scientificNameAuthorship:                   None
               dwc:taxonRank:                 family
          dwc:vernacularName:                   None
       taxonProtocolCategory:                 target
       dwc:nameAccordingToID:   http://www.itis.gov/ITISWebService/services/ITISService/getFullRecordFromTSN?tsn=202070 (accessed 08/31/2017)
    dwc:nameAccordingToTitle:   The Integrated Taxonomic Information System on-line database http://www.itis.gov/ITISWebService/services/ITISService/getFullRecordFromTSN?tsn=202070, accessed 31 Aug 2017
                 dwc:kingdom:               Animalia
             gbif:subkingdom:                   None
           gbif:infrakingdom:                   None
          gbif:superdivision:                   None
               gbif

This is a more verbose entry, so there are more attributes, though many lack values. The 'gbif' attributes indicate terms matched to those used by the Global Biodiversity Information Facility (GBIF).
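
Because so many verbose attributes are empty, it can be convenient to hide them when inspecting an entry. The following is a small sketch of one way to do that; `drop_empty` is a hypothetical helper, and the dictionary contains only a handful of fields copied from the Achiridae spp. entry printed above.

```python
def drop_empty(entry):
    """Return a copy of a species entry without the attributes whose value is None."""
    return {key: value for key, value in entry.items() if value is not None}

# A few fields copied from the verbose fish entry shown in this tutorial
verbose_sample = {
    'taxonTypeCode': 'FISH',
    'taxonID': 'ACHSPP',
    'dwc:scientificName': 'Achiridae spp.',
    'dwc:scientificNameAuthorship': None,
    'dwc:taxonRank': 'family',
    'dwc:vernacularName': None,
    'dwc:kingdom': 'Animalia',
    'gbif:subkingdom': None,
}

# Print only the attributes that actually carry a value
for key, value in drop_empty(verbose_sample).items():
    print("%28s:   %s" % (key, value))
```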

In [15]:
#Print common and scientific name for each fish
for species in fish_json['data']:
    print(species['dwc:vernacularName'],'|', species['dwc:scientificName'])

None | Acanthogobius flavimanus
None | Acantharchus pomotis
None | Acantharchus sp.
None | Acanthogobius sp.
None | Acantharchus spp.
None | Acanthogobius spp.
None | Achiridae sp.
None | Achiridae spp.
None | Acipenser brevirostrum
None | Acipenser fulvescens
None | Acipenser medirostris
None | Acipenser oxyrinchus
Gulf Sturgeon | Acipenser oxyrinchus desotoi
Atlantic sturgeon | Acipenser oxyrinchus oxyrinchus
Gulf Sturgeon | Acipenser oxyrhynchus desotoi
None | Acipenser oxyrhynchus
None | Acipenser sp.
None | Acipenseridae sp.
None | Acipenseriformes sp.
None | Acipenser spp.


## Finding a Specific Species

Many NEON data products, such as the land bird breeding counts used earlier, include species identification data in the form of species names. We can use the NEON *taxonomy/* endpoint to search for a specific species mentioned in a NEON study. Let's look at the 2018-06 Lower Teakettle Bird Counts again, and get more detail on one of the observed species.

In [16]:
import pandas as pd

In [17]:
#Establish target for API search
SITECODE = 'TEAK'
PRODUCTCODE = 'DP1.10003.001'

In [18]:
#Get data on available files
bird_request = requests.get(SERVER+'data/'+PRODUCTCODE+'/'+SITECODE+'/'+'2018-06')
bird_json = bird_request.json()

In [38]:
#Get url for basic count data
for file in bird_json['data']['files']:
    if('count' in file['name']):
        if('basic' in file['name']):
            bird_df = pd.read_csv(file['url'])

In [39]:
#View columns
bird_df.columns

Index(['uid', 'namedLocation', 'domainID', 'siteID', 'plotID', 'plotType',
       'pointID', 'startDate', 'eventID', 'pointCountMinute',
       'targetTaxaPresent', 'taxonID', 'scientificName', 'taxonRank',
       'vernacularName', 'observerDistance', 'detectionMethod',
       'visualConfirmation', 'sexOrAge', 'clusterSize', 'clusterCode',
       'identifiedBy'],
      dtype='object')

The *unique* method of a pandas series (which includes any single column of a dataframe) returns an array of the distinct values in that series, with all duplicates removed.

In [40]:
#Use pandas .unique method to see what species were observed
bird_df['scientificName'].unique()

array(['Setophaga coronata', 'Regulus satrapa', 'Empidonax oberholseri',
       'Piranga ludoviciana', 'Picoides albolarvatus', 'Junco hyemalis',
       'Geothlypis tolmiei', 'Turdus migratorius', 'Poecile gambeli',
       'Pipilo chlorurus', 'Empidonax hammondii', 'Sitta canadensis',
       'Passerella iliaca', 'Corvus corax', 'Spinus pinus',
       'Vireo cassinii', 'Cyanocitta stelleri', 'Catharus guttatus',
       'Melospiza lincolnii', 'Certhia americana', 'Picidae sp.',
       'Parulidae sp.', 'Aves sp.', 'Oreortyx pictus',
       'Setophaga occidentalis', 'Oreothlypis ruficapilla',
       'Spizella passerina', 'Haemorhous cassinii', 'Picoides villosus',
       'Sitta carolinensis', 'Sphyrapicus ruber',
       'Setophaga coronata auduboni', 'Contopus cooperi',
       'Cardellina pusilla', 'Haemorhous purpureus', 'Empidonax sp.', nan,
       'Spizella pusilla', 'Buteo sp.', 'Passerina amoena',
       'Myadestes townsendi', 'Vireo gilvus', 'Agelaius phoeniceus',
       'Dryocopus p

More information on 'Troglodytes aedon' would be interesting. When using a scientific name in a taxonomy API call, we replace any spaces in the name with '%20'; also, remember to capitalize the genus name, but not the species name.
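
Rather than replacing spaces by hand, the encoding can be left to `urllib.parse.quote` from the standard library. This is a minimal sketch; `encode_name` is a hypothetical helper name. It also shows that a name including an authority with an '&' gets that character escaped, which matters because a bare '&' would otherwise be read as the separator between options.

```python
from urllib.parse import quote

def encode_name(scientific_name):
    """Percent-encode a scientific name for use in a URL query string."""
    # safe='' means no reserved characters are left unescaped
    return quote(scientific_name, safe='')

print(encode_name('Troglodytes aedon'))
print(encode_name('Abies firma Siebold & Zucc.'))
```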

In [41]:
#Make request 
aedon_request = requests.get(SERVER+'taxonomy/'+'?scientificname=Troglodytes%20aedon')
aedon_json = aedon_request.json()

Because only a single result was returned, count and total entries will be one, and there will be no urls for the previous or next batch of entries.

It is important to note that the data element is still treated as a list; it is simply a list with only one element.

In [42]:
#Print elements of JSON other than data
for key in aedon_json.keys():
    if(key != 'data'):
        print(key,':',aedon_json[key])
print()

#Print elements of species dict in data list
for key in aedon_json['data'][0].keys():
    print(key,':',aedon_json['data'][0][key])

count : 1
total : 1
prev : None
next : None

taxonTypeCode : BIRD
taxonID : HOWR
acceptedTaxonID : HOWR
dwc:scientificName : Troglodytes aedon
dwc:scientificNameAuthorship : Vieillot
dwc:taxonRank : species
dwc:vernacularName : House Wren
dwc:nameAccordingToID : doi: 10.1642/AUK-15-73.1
dwc:kingdom : Animalia
dwc:phylum : Chordata
dwc:class : Aves
dwc:order : Passeriformes
dwc:family : Troglodytidae
dwc:genus : Troglodytes
gbif:subspecies : None
gbif:variety : None
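
Since 'data' is a list even when there is a single match, code that expects exactly one species may want to check the length before indexing. The sketch below shows one possible guard; `single_entry` is a hypothetical helper, and the dictionary is a trimmed stand-in shaped like the Troglodytes aedon response above, not a live API result.

```python
def single_entry(response_json):
    """Return the only species entry in a response, or raise if there is not exactly one."""
    data = response_json['data']
    if len(data) != 1:
        raise ValueError('expected exactly 1 entry, found %d' % len(data))
    return data[0]

# Trimmed stand-in using values from the House Wren response in this tutorial
aedon_like = {
    'count': 1, 'total': 1, 'prev': None, 'next': None,
    'data': [{'taxonID': 'HOWR', 'dwc:vernacularName': 'House Wren'}],
}

print(single_entry(aedon_like)['dwc:vernacularName'])
```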
