This process registers unique species names from the SGCN source data in the Taxonomic Information Registry (TIR). Each unique name is then examined by TIR processes to find matches with taxonomic authorities. The resulting taxonomic matching decisions are used to create a nationally synthesized list of taxa that states have listed as Species of Greatest Conservation Need (SGCN).

A registration consists of a set of key/value pairs inserted into the registration property of the TIR table. The property is a PostgreSQL hstore column, which accommodates registration vectors whose attributes vary. Every registration has the following:
* source - Logical name specifying the source of the registration ("SGCN" in this case)
* registrationDate - Date/time stamp of the registration

Most TIR registrations will have a "scientificname" property containing the name string used as a primary identifier. Some TIR registrations will have other identifiers that come from source material.
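
As an illustration of the key/value structure described above, the following is a minimal sketch of how a dictionary of registration properties could be rendered as an hstore text literal. The helper name `to_hstore_literal` and the fixed date are assumptions made for this example and are not part of the notebook. The sketch handles only the hstore layer, where backslashes and double quotes are backslash-escaped; single quotes would still need to be doubled before the literal is embedded in a SQL string.

```python
import datetime

def to_hstore_literal(pairs):
    """Render a dict as an hstore text literal: "key"=>"value","key2"=>"value2"."""
    def quote(text):
        # Inside a double-quoted hstore key or value, backslashes and
        # double quotes are escaped with a backslash.
        return '"' + str(text).replace('\\', '\\\\').replace('"', '\\"') + '"'
    return ",".join(quote(key) + "=>" + quote(value) for key, value in pairs.items())

# Illustrative registration; the date is a fixed example value.
registration = {
    "source": "SGCN",
    "registrationDate": datetime.datetime(2017, 5, 26, 16, 20, 1).isoformat(),
    "scientificname": "Abacion tessalatum",
}
literal = to_hstore_literal(registration)
print(literal)
```

Building the literal from a dictionary keeps the required `source` and `registrationDate` keys next to any optional ones, so registrations with different attributes can share the same code path.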

SGCN registrations include the common names and taxonomic groups supplied by the states. For each scientific name, these values are collected with an array_agg function and a DISTINCT operator, then joined into a comma-separated string of unique values. These values can then be reasoned on in TIR processing.
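
To show what this aggregation produces, here is a rough pure-Python sketch of the same grouping logic. The rows and the function name `aggregate_names` are hypothetical and exist only for this example. In the notebook itself, PostgreSQL does this work, and it may order the joined values differently depending on the database collation.

```python
# Hypothetical rows: (scientificname_submitted, commonname_submitted, taxonomicgroup_submitted)
rows = [
    ("Abacion wilhelminae", "millipede", "Other Invertebrates"),
    ("Abacion wilhelminae", "Millipede", "Other Invertebrates"),
    ("Abacion wilhelminae", "", "Other Invertebrates"),
    ("Abagrotis barnesi", "A noctuid moth", "Insects"),
    ("", "Unnamed record", "Insects"),
]

def aggregate_names(rows):
    grouped = {}
    for name, commonname, group in rows:
        if name == "":
            continue  # mirrors WHERE scientificname_submitted <> ''
        entry = grouped.setdefault(name, {"commonnames": set(), "taxonomicgroups": set()})
        # Empty strings are skipped, like the CASE ... ELSE NULL END in the query.
        if commonname:
            entry["commonnames"].add(commonname)
        if group:
            entry["taxonomicgroups"].add(group)
    # Sets give the DISTINCT behavior; sorting makes the joined string deterministic.
    return {
        name: {key: ",".join(sorted(values)) for key, values in entry.items()}
        for name, entry in grouped.items()
    }

result = aggregate_names(rows)
print(result["Abacion wilhelminae"])
```

Note that "millipede" and "Millipede" are kept as separate values because the comparison is case-sensitive, which matches the behavior of DISTINCT on text.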

In [45]:
import requests,datetime,configparser
from IPython.display import display

In [46]:
# Get API keys and any other config details from a file that is external to the code.
config = configparser.RawConfigParser()
with open(r'../config/stuff.py') as configFile:
    config.read_file(configFile)

dt = datetime.datetime.utcnow().isoformat()

In [60]:
# Build base URL with API key using input from the external config.
def getBaseURL():
    gc2APIKey = config.get('apiKeys','apiKey_GC2_BCB').replace('"','')
    apiBaseURL = "https://gc2.mapcentia.com/api/v1/sql/bcb?key="+gc2APIKey
    return apiBaseURL

In [61]:
# Basic function to insert registration info pairs into TIR
def idsToTIR(recordInfoPairs):
    # Build query string
    insertSQL = "INSERT INTO tir.tir2 (registration) VALUES ('"+recordInfoPairs+"')"
    # Execute query
    response = requests.get(getBaseURL()+"&q="+insertSQL).json()
    return response
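
The function above appends the SQL text directly to the URL. Characters such as `&`, `#`, or `+` in a registration value could then be misread as URL syntax. The following sketch, using only the standard library, shows one possible way to percent-encode the statement before sending it. The helper name `build_query_url` and the `EXAMPLE_KEY` placeholder are assumptions for illustration, not part of the notebook or of the GC2 API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_query_url(base_url, sql):
    # urlencode percent-encodes spaces, quotes, '&' and other reserved
    # characters so the whole SQL statement travels as a single q parameter.
    return base_url + "&" + urlencode({"q": sql})

# Placeholder key; a real key would come from the external config file.
example_base = "https://gc2.mapcentia.com/api/v1/sql/bcb?key=EXAMPLE_KEY"
example_sql = "INSERT INTO tir.tir2 (registration) VALUES ('\"source\"=>\"SGCN\",\"commonnames\"=>\"Salt & pepper moth\"')"

url = build_query_url(example_base, example_sql)
print(url)

# Round trip: decoding the query string recovers the original SQL text.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(decoded == example_sql)
```

Passing the statement through `requests.get(url, params={"q": sql})` would be another way to get the same encoding, since `requests` encodes `params` values itself.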

In [62]:
q_sgcn = "SELECT scientificname_submitted scientificname, \
    array_to_string(array_agg(DISTINCT CASE WHEN commonname_submitted <> '' THEN commonname_submitted ELSE NULL END),',') commonnames, \
    array_to_string(array_agg(DISTINCT CASE WHEN taxonomicgroup_submitted <> '' THEN taxonomicgroup_submitted ELSE NULL END),',') taxonomicgroups \
    FROM sgcn.sgcn \
    WHERE scientificname_submitted <> '' \
    GROUP BY scientificname_submitted"
r_sgcn = requests.get(getBaseURL()+"&q="+q_sgcn).json()

In [63]:
recordCount = 0
numProcessed = 0
for sgcn in r_sgcn['features']:
    props = sgcn['properties']
    # Double any single quotes so the values survive inside the SQL string literal.
    recordInfoPairs = '"registrationDate" => "'+dt+'"'
    recordInfoPairs = recordInfoPairs+',"source"=>"SGCN"'
    recordInfoPairs = recordInfoPairs+',"scientificname"=>"'+props['scientificname'].replace("'","''")+'"'
    recordInfoPairs = recordInfoPairs+',"commonnames"=>"'+props['commonnames'].replace("'","''")+'"'
    recordInfoPairs = recordInfoPairs+',"taxonomicgroups"=>"'+props['taxonomicgroups'].replace("'","''")+'"'
    recordCount = recordCount + 1
    try:
        print (recordInfoPairs)
        # Uncomment to insert the registration into TIR:
        # print (props['scientificname'], idsToTIR(recordInfoPairs))
        numProcessed = numProcessed + 1
    except Exception as e:
        print ("Problem with: "+recordInfoPairs+" ("+str(e)+")")
print ("Unique records processed: "+str(recordCount))

"registrationDate" => "2017-05-26T16:20:01.045624","source"=>"SGCN","scientificname"=>"Abacion tessalatum","commonnames"=>"A millipede","taxonomicgroups"=>"Myriapods,Other Invertebrates"
"registrationDate" => "2017-05-26T16:20:01.045624","source"=>"SGCN","scientificname"=>"Abacion wilhelminae","commonnames"=>"millipede,Millipede","taxonomicgroups"=>"Other Invertebrates"
"registrationDate" => "2017-05-26T16:20:01.045624","source"=>"SGCN","scientificname"=>"Abagrotis barnesi","commonnames"=>"A noctuid moth","taxonomicgroups"=>"Insects"
"registrationDate" => "2017-05-26T16:20:01.045624","source"=>"SGCN","scientificname"=>"Abagrotis brunneipennis","commonnames"=>"Yankee Dart","taxonomicgroups"=>"Insects"
"registrationDate" => "2017-05-26T16:20:01.045624","source"=>"SGCN","scientificname"=>"Abagrotis crumbi benjamini","commonnames"=>"Benjamin''s abagrotis","taxonomicgroups"=>"Insects"
"registrationDate" => "2017-05-26T16:20:01.045624","source"=>"SGCN","scientificname"=>"Abagrotis nefascia",