The FWS ECOS folks have put out a new REST service that provides a whole lot more than the very old XQuery service that some of our past work was based on. It should let us replace everything we are currently doing and add more functionality within the pysppin project. This notebook explores how the new service behaves.

The [new service](https://ecos.fws.gov/ecp/report/ad-hoc-documentation?catalogId=species&reportId=species#document) is based on a tool called pullReports that appears to expose certain related tables from the relational database backend of the ECOS ESA system. The data structure it returns is reasonable enough, but it needs some reshaping before it can be pulled straight into things like dataframes for processing.

The following code blocks query the service and reformat the data into a list of key/value records.

In [1]:
import requests
import pandas as pd

In [2]:
# Simple example query for listed species
listed_spp = requests.get("https://ecos.fws.gov/ecp/pullreports/catalog/species/report/species/export?filter=%2Fspecies%40status_category%20%3D%20%27Listed%27&format=json").json()
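
The percent-encoded filter in the URL above is hard to read and edit by hand. As a minimal sketch, the same URL can be assembled from a readable filter expression with the standard library. The helper name `build_export_url` is hypothetical and not part of the service or pysppin; the filter expression is simply the decoded form of the one used above.

```python
from urllib.parse import urlencode, quote

# Export endpoint copied from the query above
BASE_URL = "https://ecos.fws.gov/ecp/pullreports/catalog/species/report/species/export"


def build_export_url(filter_expr, fmt="json"):
    """Hypothetical helper: percent-encode a filter expression and attach it to the export endpoint."""
    # quote_via=quote encodes spaces as %20 (not +), matching the URL used above
    query = urlencode({"filter": filter_expr, "format": fmt}, quote_via=quote)
    return f"{BASE_URL}?{query}"


listed_url = build_export_url("/species@status_category = 'Listed'")
print(listed_url)
```

The resulting string can be passed to `requests.get` exactly as in the cell above.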


In [3]:
# Column definition from meta section
listed_spp["meta"]["columns"]

[{'tablePath': '/species', 'displayName': 'Common Name', 'id': 'cn'},
 {'tablePath': '/species', 'displayName': 'Scientific Name', 'id': 'sn'},
 {'tablePath': '/species',
  'displayName': 'ESA Listing Status',
  'id': 'status'},
 {'tablePath': '/species', 'displayName': 'Entity Description', 'id': 'desc'},
 {'tablePath': '/species',
  'displayName': 'ESA Listing Date',
  'id': 'listing_date'}]

In [4]:
# Number of data records
len(listed_spp["data"])

2403

In [5]:
# Example record
listed_spp["data"][0]

["Abbott's booby",
 {'value': 'Papasula (=Sula) abbotti',
  'url': 'https://ecos.fws.gov/ecp/species/1470'},
 'Endangered',
 'Wherever found',
 '06-14-1976']

The functions below pair the column names (as keys) with the values from each data record and return a list of dictionaries that can be readily loaded into various types of dataframes. This is pretty clunky at this point, and more experimentation is needed to run through all the various data the service can return and to handle any other data structures that turn up. I'm not entirely sure why they decided to put the scientific name together with an identifier/URL in the single "column" for sn, but I broke that out into separate fields here to keep the data structure as simple as possible. I'm sure there are also better ways to do this than with a loop; I just didn't have time to figure that out right away. I added an option to use either the displayName or the id from the column metadata as the field name.

In [6]:
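def build_data_dict(record, column_meta, field_name_type="displayName"):
    """Map one data record onto its column names (displayName or id)."""
    new_record = dict()

    for index, value in enumerate(record):
        column_name = column_meta[index][field_name_type]
        if isinstance(value, dict):
            # Flatten nested values such as {"value": ..., "url": ...} into separate fields
            for k, v in value.items():
                new_record[f"{column_name}_{k}"] = v
        else:
            # Keep every other value (None, numbers, strings) so no column is silently dropped
            new_record[column_name] = value

    return new_record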

def build_data_list(record_list, column_meta, field_name_type="displayName"):
    dataset = [build_data_dict(r, column_meta, field_name_type) for r in record_list]
    return dataset
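
On the question of avoiding the loop: one possible alternative is to let pandas do the work. This is only a sketch, run against two sample records copied from the output earlier in this notebook. It assumes that a column holding dictionaries holds them in every row, which has not been checked against everything the service can return. The function name `records_to_frame` is hypothetical.

```python
import pandas as pd

# Column metadata and two sample records copied from the output shown earlier
column_meta = [
    {"tablePath": "/species", "displayName": "Common Name", "id": "cn"},
    {"tablePath": "/species", "displayName": "Scientific Name", "id": "sn"},
    {"tablePath": "/species", "displayName": "ESA Listing Status", "id": "status"},
    {"tablePath": "/species", "displayName": "Entity Description", "id": "desc"},
    {"tablePath": "/species", "displayName": "ESA Listing Date", "id": "listing_date"},
]
records = [
    ["Abbott's booby",
     {"value": "Papasula (=Sula) abbotti", "url": "https://ecos.fws.gov/ecp/species/1470"},
     "Endangered", "Wherever found", "06-14-1976"],
    ["Acklins ground iguana",
     {"value": "Cyclura rileyi nuchalis", "url": "https://ecos.fws.gov/ecp/species/27"},
     "Threatened", "Wherever found", "06-22-1983"],
]


def records_to_frame(records, column_meta, field_name_type="id"):
    """Hypothetical helper: build a dataframe and expand dict-valued columns."""
    names = [c[field_name_type] for c in column_meta]
    df = pd.DataFrame(records, columns=names)
    for name in names:
        # Assumption: a column is expanded only if every value in it is a dict
        if df[name].map(lambda v: isinstance(v, dict)).all():
            expanded = pd.json_normalize(df[name].tolist()).add_prefix(f"{name}_")
            df = pd.concat([df.drop(columns=name), expanded], axis=1)
    return df


df_sample = records_to_frame(records, column_meta)
print(df_sample.columns.tolist())
```

The expanded fields end up at the right-hand side of the frame (here `sn_value` and `sn_url`), so the column order differs from the loop-based version.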

In [7]:
# Build an array of key/value pair objects
listed_spp_array = build_data_list(listed_spp["data"], listed_spp["meta"]["columns"])

In [8]:
# Show some of the data
listed_spp_array[:5]

[{'Common Name': "Abbott's booby",
  'Scientific Name_value': 'Papasula (=Sula) abbotti',
  'Scientific Name_url': 'https://ecos.fws.gov/ecp/species/1470',
  'ESA Listing Status': 'Endangered',
  'Entity Description': 'Wherever found',
  'ESA Listing Date': '06-14-1976'},
 {'Common Name': 'Aboriginal Prickly-apple',
  'Scientific Name_value': 'Harrisia (=Cereus) aboriginum (=gracilis)',
  'Scientific Name_url': 'https://ecos.fws.gov/ecp/species/2833',
  'ESA Listing Status': 'Endangered',
  'Entity Description': 'Wherever found',
  'ESA Listing Date': '11-25-2013'},
 {'Common Name': 'Acklins ground iguana',
  'Scientific Name_value': 'Cyclura rileyi nuchalis',
  'Scientific Name_url': 'https://ecos.fws.gov/ecp/species/27',
  'ESA Listing Status': 'Threatened',
  'Entity Description': 'Wherever found',
  'ESA Listing Date': '06-22-1983'},
 {'Common Name': 'Acuna Cactus',
  'Scientific Name_value': 'Echinomastus erectocentrus var. acunensis',
  'Scientific Name_url': 'https://ecos.fws.go

In [9]:
# Put the data into a Pandas dataframe
df_listed_spp = pd.DataFrame(listed_spp_array)

In [10]:
# Show what it looks like
df_listed_spp.head(5)

,Common Name,ESA Listing Date,ESA Listing Status,Entity Description,Scientific Name_url,Scientific Name_value
0,Abbott's booby,06-14-1976,Endangered,Wherever found,https://ecos.fws.gov/ecp/species/1470,Papasula (=Sula) abbotti
1,Aboriginal Prickly-apple,11-25-2013,Endangered,Wherever found,https://ecos.fws.gov/ecp/species/2833,Harrisia (=Cereus) aboriginum (=gracilis)
2,Acklins ground iguana,06-22-1983,Threatened,Wherever found,https://ecos.fws.gov/ecp/species/27,Cyclura rileyi nuchalis
3,Acuna Cactus,10-31-2013,Endangered,Wherever found,https://ecos.fws.gov/ecp/species/5785,Echinomastus erectocentrus var. acunensis
4,Addax,09-02-2005,Endangered,Wherever found,https://ecos.fws.gov/ecp/species/1486,Addax nasomaculatus
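
The listing dates come back as strings. Judging by values such as 11-25-2013 and 10-31-2013, they look like month-day-year, so a sketch like the following could convert them to real dates for sorting or filtering. The format string is an assumption based only on the records shown here, and `errors="coerce"` turns anything that does not match into `NaT` rather than raising.

```python
import pandas as pd

# Two rows copied from the dataframe output above
sample = pd.DataFrame(
    {
        "Common Name": ["Abbott's booby", "Addax"],
        "ESA Listing Date": ["06-14-1976", "09-02-2005"],
    }
)

# Assumption: dates are MM-DD-YYYY, as the sample records suggest
sample["ESA Listing Date"] = pd.to_datetime(
    sample["ESA Listing Date"], format="%m-%d-%Y", errors="coerce"
)

listing_years = sample["ESA Listing Date"].dt.year.tolist()
print(listing_years)
```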


In [11]:
# List numbers of records by listing status values
df_listed_spp[["ESA Listing Status","Scientific Name_url"]].groupby("ESA Listing Status").count()

ESA Listing Status,Scientific Name_url
Endangered,1850
"Experimental Population, Non-Essential",65
Similarity of Appearance (Threatened),12
Threatened,476
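
A shorter way to get the same kind of summary might be `value_counts`, sketched here on the five rows shown earlier. Note one difference from the groupby above: `value_counts` counts rows per status, while `count()` on the URL column counts only rows where the URL is not null.

```python
import pandas as pd

# Listing status values copied from the five rows shown in the dataframe output above
sample = pd.DataFrame(
    {
        "Common Name": [
            "Abbott's booby",
            "Aboriginal Prickly-apple",
            "Acklins ground iguana",
            "Acuna Cactus",
            "Addax",
        ],
        "ESA Listing Status": [
            "Endangered",
            "Endangered",
            "Threatened",
            "Endangered",
            "Endangered",
        ],
    }
)

# Count rows for each listing status, most frequent first
status_counts = sample["ESA Listing Status"].value_counts()
print(status_counts)
```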
