GBIF processing follows the same pattern as the other SppIn information gatherers. When running locally, I pull all messages from the queue and process them in parallel, capping the worker count (eight threads here) so that we do not overwhelm the GBIF API. When running in a Lambda environment, we will similarly need to throttle the number of concurrent connections we open to the GBIF API.
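
As a rough illustration of the throttling idea outside of joblib, the sketch below caps the number of in-flight calls with a fixed-size thread pool from the standard library. The names `run_throttled`, `MAX_CONCURRENT`, and `fake_gbif_call` are hypothetical and not part of pysgcn; the cap of 8 simply mirrors the `n_jobs=8` setting used in this notebook, and the right value for a Lambda deployment is an open question.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

MAX_CONCURRENT = 8  # assumed cap, mirroring n_jobs=8 in this notebook


def run_throttled(worker, items, max_concurrent=MAX_CONCURRENT):
    """Apply worker to each item with at most max_concurrent calls in flight.

    Results come back in the same order as items.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(worker, items))


# Hypothetical stand-in for a GBIF request; it only records how many
# calls are running at once so the cap can be observed.
_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_gbif_call(term):
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate network latency
    with _lock:
        _active -= 1
    return "MESSAGE PROCESSED: " + term


results = run_throttled(fake_gbif_call, ["name %d" % i for i in range(40)])
print(len(results), "processed; peak concurrency:", peak_active)
```

In this notebook the real worker would presumably be a small wrapper around `sgcn.process_sppin_source_search_term`, called with the same keyword arguments used in the cells below.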

In [1]:
import pysgcn
sgcn = pysgcn.sgcn.Sgcn()

from joblib import Parallel, delayed
from tqdm import tqdm

mq = "mq_gbif_check"
sppin_source = "gbif"

In [2]:
messages = sgcn.sql_mq.get_all_records("mq", mq)
print(len(messages))
messages[99]

24152


{'id': 'f9dd0b457ec9df8e28662f11b056ddc46b6bb7c0',
 'date_inserted': '2019-12-23T14:12:49.184380',
 'body': {'source': {'type': 'List of Scientific Names',
   'name_source': 'ITIS Search'},
  'sppin_key': 'Scientific Name:Ametropus ammophilus'}}
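
Judging only from the sample message above, the `sppin_key` appears to combine a key type and a search value separated by the first colon. A minimal sketch of pulling those apart follows; `split_sppin_key` is a hypothetical helper, not a pysgcn function, and the `type:value` layout is an assumption drawn from this one record.

```python
def split_sppin_key(sppin_key):
    """Split a key such as 'Scientific Name:Ametropus ammophilus'.

    Assumes a 'type:value' layout and splits on the first colon only,
    so any colons inside the value are preserved.
    """
    key_type, _, value = sppin_key.partition(":")
    return key_type.strip(), value.strip()


# Sample record copied from the cell output above.
sample = {
    "id": "f9dd0b457ec9df8e28662f11b056ddc46b6bb7c0",
    "body": {"sppin_key": "Scientific Name:Ametropus ammophilus"},
}
key_type, name = split_sppin_key(sample["body"]["sppin_key"])
print(key_type, "|", name)
```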

In [3]:
%%time
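# Process every queued message on 8 threads; the thread count caps
# how many GBIF API calls are in flight at once.
Parallel(n_jobs=8, prefer="threads")(
    delayed(sgcn.process_sppin_source_search_term)(
        message_queue=mq,
        sppin_source=sppin_source,
        message_id=message["id"],
        message_body=message["body"],
    )
    for message in tqdm(messages)
)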

100%|██████████| 24152/24152 [1:14:53<00:00,  5.37it/s] 


CPU times: user 15min 12s, sys: 5min 12s, total: 20min 24s
Wall time: 1h 14min 55s


['MESSAGE PROCESSED: Scientific Name:Sorex merriami',
 'MESSAGE PROCESSED: Scientific Name:Recurvirostra americana',
 'MESSAGE PROCESSED: Scientific Name:Magnipelta mycophaga',
 'MESSAGE PROCESSED: Scientific Name:Melanoplus idaho',
 'MESSAGE PROCESSED: Scientific Name:Pekania pennanti pennanti',
 'MESSAGE PROCESSED: Scientific Name:Martes pennanti',
 'MESSAGE PROCESSED: Scientific Name:Cryptomastix harfordiana',
 'MESSAGE PROCESSED: Scientific Name:Cascadoperla trictura',
 'MESSAGE PROCESSED: Scientific Name:Sweltsa gaufini',
 'MESSAGE PROCESSED: Scientific Name:Melanoplus trigeminus',
 'MESSAGE PROCESSED: Scientific Name:Myotis californicus',
 'MESSAGE PROCESSED: Scientific Name:Carduelis psaltria',
 'MESSAGE PROCESSED: Scientific Name:Spinus psaltria',
 'MESSAGE PROCESSED: Scientific Name:Melanoplus lemhiensis',
 'MESSAGE PROCESSED: Scientific Name:Oreohelix jugalis',
 'MESSAGE PROCESSED: Scientific Name:Leiothlypis virginiae',
 'MESSAGE PROCESSED: Scientific Name:Vermivora virginia