# SGCN National List

The full [national list](https://www1.usgs.gov/csas/swap/national_list.html) of SGCN species across 2005 and 2015 requires a relatively complex query that counts the states reporting each species in each year. There may be a way to drive this from some feature of the Elasticsearch index on the full original data that I haven't figured out yet, but so far I have only been able to produce it with a SQL statement.
```sql
 SELECT scientificname_accepted AS scientificname,
    (array_agg(taxonomicauthorityid_accepted ORDER BY sgcn_year DESC))[1] AS taxonomicauthorityid,
    (array_agg(commonname_submitted ORDER BY sgcn_year DESC))[1] AS commonname,
    (array_agg(taxonomicgroup_submitted ORDER BY sgcn_year DESC))[1] AS taxonomicgroup,
    sum(((sgcn_year = 2005))::integer) AS sgcn2005,
    sum(((sgcn_year = 2015))::integer) AS sgcn2015
   FROM sgcn.sgcn
  WHERE taxonomicauthorityid_accepted <> ''
  GROUP BY scientificname_accepted
```  
The SQL statement selects only those records that have an accepted taxonomic authority ID, which is the basic definition of what ends up on the national list. Running it live is far too costly on the system, so I built a view in GC2 from this select statement and indexed the view in Elasticsearch as sgcn_nationallist. Queries against that index are much more responsive.
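
To make the aggregation easier to follow, here is a rough Python equivalent of what the SQL statement above computes. It is only a sketch: the function name `build_national_list` and the input shape (a list of dicts keyed by the sgcn.sgcn column names) are assumptions for illustration, and the real view is still built from the SQL.

```python
from collections import defaultdict


def build_national_list(records):
    """Roll state records up to one row per accepted scientific name.

    `records` is assumed to be a list of dicts keyed by sgcn.sgcn column names.
    """
    groups = defaultdict(list)
    for rec in records:
        # WHERE taxonomicauthorityid_accepted <> '' (also skips NULL/None values)
        if not rec.get("taxonomicauthorityid_accepted"):
            continue
        # GROUP BY scientificname_accepted
        groups[rec["scientificname_accepted"]].append(rec)

    national_list = []
    for name, recs in groups.items():
        # (array_agg(... ORDER BY sgcn_year DESC))[1]: take values from a record
        # in the most recent year; ties are broken arbitrarily, as in the SQL.
        latest = max(recs, key=lambda r: r["sgcn_year"])
        national_list.append({
            "scientificname": name,
            "taxonomicauthorityid": latest["taxonomicauthorityid_accepted"],
            "commonname": latest["commonname_submitted"],
            "taxonomicgroup": latest["taxonomicgroup_submitted"],
            # sum((sgcn_year = N)::integer): number of records per reporting year
            "sgcn2005": sum(r["sgcn_year"] == 2005 for r in recs),
            "sgcn2015": sum(r["sgcn_year"] == 2015 for r in recs),
        })
    return national_list
```

Note that the common name and taxonomic group come from a single record in the latest year, so if states submitted different values for the same species, only one of them shows up on the list.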

## UPDATE
The whole SGCN system has been completely reengineered, but I tried to keep the final output close to what the SWAP app has been built against so far. The sgcn_nationallist view and Elasticsearch index should have the same structure as before, but the underlying data are all new. Here are a couple of caveats:

* The commonname now comes mostly from English vernacular names discovered in ITIS when those are available. Otherwise, there is currently a query that pulls common name from the originally submitted information. This will be improved to be a dynamic conditional query later, but I ran out of time for the moment.
* The taxonomic group still comes from what the states originally submitted, so it is blank for some entries. This will be improved once Abby provides a mapping from ITIS taxonomic levels to some logical grouping that we want to put the national list into.
* The underlying data from the states are also all new here. I built a whole new process that reads directly from the source data repository in ScienceBase and processes source files into records in the new sgcn.sgcn table (new sgcn schema in the GC2 instance). Those records are then processed using a different method of checking taxonomy against name authorities. Currently, the final data include only the most solid matches on ITIS. WoRMS taxonomic checks have not been completed to fill in some of the blanks, and the ITIS matching algorithm can be improved to find additional matches. I took a fairly conservative approach to the matching process, so additional matches will likely be found in the future and expand the "SGCN National List."
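
The common name fallback described in the first caveat could be expressed along these lines. This is a minimal sketch, not the query currently in place: the function name `choose_common_name` and the shape of the ITIS vernacular entries (dicts with `name` and `language` keys) are hypothetical.

```python
def choose_common_name(itis_vernaculars, submitted_name):
    """Prefer an English vernacular name from ITIS, else the submitted name.

    `itis_vernaculars` is assumed to be a list of dicts with hypothetical
    `name` and `language` keys; `submitted_name` is what the state supplied.
    """
    for entry in itis_vernaculars or []:
        language = (entry.get("language") or "").lower()
        if language == "english" and entry.get("name"):
            return entry["name"]
    # No usable ITIS vernacular: fall back to the originally submitted name
    return submitted_name or ""
```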

```python
import requests
from IPython.display import display
```

```python
# Class to render tables
class ListTable(list):
    def _repr_html_(self):
        html = ["<table>"]
        for row in self:
            html.append("<tr>")

            for col in row:
                html.append("<td>{0}</td>".format(col))

            html.append("</tr>")
        html.append("</table>")
        return ''.join(html)
```

This query returns results from the Elasticsearch index for the sgcn_nationallist view. As written, the request asks for 25 results starting at offset 25 (`size=25&from=25`), so results will need to be paginated for the SWAP online app. I included the taxonomic authority ID as a reference. Those ITIS and WoRMS URLs return a machine-readable response and do not support content negotiation, so if we want to include them in the UI, we would need to translate the ID into something for humans.

```python
sgcnNationalListURL = "https://gc2.mapcentia.com/api/v1/elasticsearch/search/bcb/sgcn/sgcn_nationallist?size=25&from=25"
sgcnNationalList = requests.get(sgcnNationalListURL).json()

tableNationalList = ListTable()
tableNationalList.append(['Scientific Name', 'Common Name', '2005', '2015', 'Taxonomic Group', 'Taxonomic Authority ID/Link'])

# Each hit carries the view's columns under _source.properties
for hit in sgcnNationalList['hits']['hits']:
    props = hit['_source']['properties']
    tableNationalList.append([
        props['scientificname'],
        props['commonname'],
        props['sgcn2005'],
        props['sgcn2015'],
        props['taxonomicgroup'],
        props['taxonomicauthorityid'],
    ])

display(tableNationalList)
```

| Scientific Name | Common Name | 2005 | 2015 | Taxonomic Group | Taxonomic Authority ID/Link |
| --- | --- | --- | --- | --- | --- |
| Acupalpus rectangulus | no common name | 1 | 0 | | http://services.itis.gov/?q=tsn:932424 |
| Acyrtops beryllinus | emerald clingfish | 1 | 0 | Fish | http://services.itis.gov/?q=tsn:164472 |
| Adalia bipunctata | twospotted lady beetle | 0 | 1 | Insects | http://services.itis.gov/?q=tsn:114341 |
| Adlumia fungosa | mountain-fringe | 0 | 1 | Plants | http://services.itis.gov/?q=tsn:18897 |
| Adrityla cucullata | A Millipede | 1 | 0 | Myriapods | http://services.itis.gov/?q=tsn:570545 |
| Aeoloplides rotundipennis | Grasshopper | 1 | 0 | | http://services.itis.gov/?q=tsn:657836 |
| Aerodramus vanikorensis bartschi | Mariana Gray Swiftlet | 2 | 0 | | http://services.itis.gov/?q=tsn:554964 |
| Aeropedellus clavatus | Club-horned Grasshopper | 1 | 1 | Insects | http://services.itis.gov/?q=tsn:657840 |
| Aeshna canadensis | Canada Darner | 5 | 4 | Insects | http://services.itis.gov/?q=tsn:185977 |
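
For the pagination and ID translation mentioned above, something like the following might work as a starting point. It is a sketch under assumptions: `iter_hits`, `itis_tsn`, and `itis_report_url` are hypothetical helper names, `fetch_page` is any callable that returns the parsed JSON for a given `size` and `from` offset, and the human-readable ITIS report URL pattern should be verified before it goes into the UI.

```python
import re

# Matches the TSN in IDs such as http://services.itis.gov/?q=tsn:185977
TSN_PATTERN = re.compile(r"tsn:(\d+)")


def iter_hits(fetch_page, size=25):
    """Yield every hit, advancing the `from` offset until a page comes back short.

    `fetch_page(size, offset)` must return the parsed Elasticsearch response.
    """
    offset = 0
    while True:
        hits = fetch_page(size, offset)["hits"]["hits"]
        yield from hits
        if len(hits) < size:
            break
        offset += size


def itis_tsn(authority_id):
    """Return the TSN from an ITIS-style authority ID, or None if there is none."""
    match = TSN_PATTERN.search(authority_id or "")
    return match.group(1) if match else None


def itis_report_url(authority_id):
    """Build a human-facing ITIS link (URL pattern is an assumption to verify)."""
    tsn = itis_tsn(authority_id)
    if tsn is None:
        return None
    return ("https://www.itis.gov/servlet/SingleRpt/SingleRpt"
            "?search_topic=TSN&search_value=" + tsn)
```

With `requests`, `fetch_page` could be as simple as `lambda size, offset: requests.get(baseURL, params={"size": size, "from": offset}).json()`, where `baseURL` is the sgcn_nationallist search URL without its query string. Elasticsearch limits `from` + `size` to 10,000 by default, so very deep paging may need a different approach. WoRMS IDs are not handled here and simply return `None`.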
