In the SGCN case, we use WoRMS only as an additional source for taxonomic name matching, although WoRMS offers other data, such as species traits, that we could use at some future point. Because of this, we only need to pick up the names left unmatched in the ITIS cache to build the messages for a WoRMS processing queue. Each time the ITIS process runs and does not find a valid name, it sends a WoRMS message to the cache. This workflow runs the WoRMS processor on all of those messages. For valid WoRMS records, the process sends summary properties to the message queue that is used to infuse those properties into the SGCN master table and place species on the "SGCN National List."

In [1]:
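import pysgcn
# Instantiate the SGCN helper that exposes the message queue and processing functions
sgcn = pysgcn.sgcn.Sgcn()

from joblib import Parallel, delayed
from tqdm import tqdm

# Queue of names to check against WoRMS, and the source identifier passed to the processor
mq = "mq_worms_check"
sppin_source = "worms"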

In [2]:
# Retrieve every message in the WoRMS check queue, then show the count and a sample message
messages = sgcn.sql_mq.get_all_records("mq", mq)
print(len(messages))
messages[0]

496


{'id': 'a58f6f4b5828b20291de32e935f8d85f97801592',
 'date_inserted': '2019-12-23T14:33:00.590789',
 'body': {'source': {'type': 'List of Scientific Names',
   'name_source': 'ITIS Search'},
  'sppin_key': 'Scientific Name:Euphyes niveilinea'}}
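
Each message body carries an `sppin_key` of the form `<key type>:<value>`, as the sample above shows. The sketch below illustrates how such a key could be split into its two parts for inspection. The helper name `split_sppin_key` is hypothetical and is not part of pysgcn, which may parse keys differently. The sample message is copied from the output above.

```python
def split_sppin_key(sppin_key):
    """Split a key such as 'Scientific Name:Euphyes niveilinea' into (key_type, value).

    Hypothetical helper for illustration only; not a pysgcn function.
    """
    # Split on the first colon only, in case the value itself contains a colon
    key_type, separator, value = sppin_key.partition(":")
    if not separator:
        raise ValueError(f"Expected '<key type>:<value>', got: {sppin_key!r}")
    return key_type.strip(), value.strip()


# Sample message copied from the queue output shown above
sample_message = {
    "id": "a58f6f4b5828b20291de32e935f8d85f97801592",
    "date_inserted": "2019-12-23T14:33:00.590789",
    "body": {
        "source": {
            "type": "List of Scientific Names",
            "name_source": "ITIS Search",
        },
        "sppin_key": "Scientific Name:Euphyes niveilinea",
    },
}

key_type, name = split_sppin_key(sample_message["body"]["sppin_key"])
print(key_type, "|", name)
```

Splitting on the first colon only keeps the value intact even if a name contains a colon of its own.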

In [3]:
%%time
# Run the WoRMS processor on every queued message, using up to 8 worker threads
Parallel(n_jobs=8, prefer="threads")(
    delayed(sgcn.process_sppin_source_search_term)(
        message_queue=mq,
        sppin_source=sppin_source,
        message_id=message["id"],
        message_body=message["body"],
    )
    for message in tqdm(messages)
)

100%|██████████| 496/496 [00:54<00:00,  9.11it/s]


CPU times: user 8.21 s, sys: 4.95 s, total: 13.2 s
Wall time: 56.3 s


['MESSAGE PROCESSED: Scientific Name:Euphyes niveilinea',
 'MESSAGE PROCESSED: Scientific Name:Stegasta bosquella',
 'MESSAGE PROCESSED: Scientific Name:Incisalia polios',
 'MESSAGE PROCESSED: Scientific Name:Acrolepiopsis leucoscia',
 'MESSAGE PROCESSED: Scientific Name:Zomaria interuptolineana',
 'MESSAGE PROCESSED: Scientific Name:Aterpia approximana',
 'MESSAGE PROCESSED: Scientific Name:Palus delector',
 'MESSAGE PROCESSED: Scientific Name:Elaphe emoryi',
 'MESSAGE PROCESSED: Scientific Name:Palus luteocephalus',
 'MESSAGE PROCESSED: Scientific Name:Paraphlepsius rossi',
 'MESSAGE PROCESSED: Scientific Name:Paraphlepsius electus',
 'MESSAGE PROCESSED: Scientific Name:Tetralopha baptisiella',
 'MESSAGE PROCESSED: Scientific Name:Prionapteryx nebulifera',
 'MESSAGE PROCESSED: Scientific Name:Carectocultus perstrialis',
 'MESSAGE PROCESSED: Scientific Name:Paraphlepsius altus',
 'MESSAGE PROCESSED: Scientific Name:Diacyclops clandestinus',
 'MESSAGE PROCESSED: Scientific Name:Paraphl