# GBIF API request for species naming explanation

In [10]:
import json
import requests

import pandas as pd

### Introduction: example request for info

The requirements, together with an example, are discussed in the following issue:
https://github.com/LifeWatchINBO/invasive-t0-occurrences/issues/6

Let us test the example by making a request:

In [12]:
temp = requests.get("http://api.gbif.org/v1/species/match?verbose=false&kingdom=Plantae&name=Heracleum%20mantegazziaum&strict=false")

In [13]:
temp.json()

{'usageKey': 3034825,
 'scientificName': 'Heracleum mantegazzianum Sommier & Levier',
 'canonicalName': 'Heracleum mantegazzianum',
 'rank': 'SPECIES',
 'status': 'ACCEPTED',
 'confidence': 96,
 'matchType': 'FUZZY',
 'kingdom': 'Plantae',
 'phylum': 'Tracheophyta',
 'order': 'Apiales',
 'family': 'Apiaceae',
 'genus': 'Heracleum',
 'species': 'Heracleum mantegazzianum',
 'kingdomKey': 6,
 'phylumKey': 7707728,
 'classKey': 220,
 'orderKey': 1351,
 'familyKey': 6720,
 'genusKey': 3034824,
 'speciesKey': 3034825,
 'synonym': False,
 'class': 'Magnoliopsida'}
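
The query string in the request above is written out by hand, including the `%20` for the space in the name. As a minimal sketch, the same URL can be assembled from separate parameters with the standard library. The helper name `build_match_url` and its defaults are assumptions made for illustration and are not part of the `gbif_species_name_match` module.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the request above.
GBIF_MATCH_URL = "http://api.gbif.org/v1/species/match"


def build_match_url(name, kingdom=None, strict=False, verbose=False):
    """Return the species/match URL for a name (hypothetical helper)."""
    # The API expects lowercase booleans in the query string.
    params = {"verbose": str(verbose).lower()}
    if kingdom is not None:
        params["kingdom"] = kingdom
    params["name"] = name
    params["strict"] = str(strict).lower()
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return GBIF_MATCH_URL + "?" + urlencode(params, quote_via=quote)


url = build_match_url("Heracleum mantegazziaum", kingdom="Plantae")
print(url)
```

The resulting string is identical to the URL used in the request above, so it can be passed to `requests.get` unchanged. Note that the name in this request is misspelled, which is consistent with the `FUZZY` match type in the response.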

### Development of the building blocks of our small application

Two main functions are available to extract the information:
* `extract_gbif_species_names_info`: performs a request for a given species/kingdom combination
* `extract_species_information`: performs this request iteratively for every species in a provided list, such as https://github.com/LifeWatchINBO/invasive-t0-occurrences/blob/master/species-list/species-list.tsv

Importing them, together with the helper `extract_gbif_accepted_key`, makes them available:

In [14]:
from gbif_species_name_match import (extract_gbif_species_names_info, 
                                     extract_species_information,
                                     extract_gbif_accepted_key)

### Extract info from GBIF about provided species

`extract_species_information` provides the functionality to request species info from the GBIF API for every name in a file. It can be used as a Python function or from the command line:

#### Python function

In [15]:
!cat opiliones.csv

scientificName
Amilenus aurantiacus 
Dicranopalpus ramosus
Homalenotus quadridentatus
Lacinius horridus 
Lacinius ephippiatus 
Leiobunum blackwalli 
Leiobunum rotundum 
Lophopilio palpinalis 
Mitopus morio 
Nelima doriae 
Odiellus spinosus 
Oligolophus hanseni 
Oligolophus tridens 
Paroligolophus agrestis 
Opilio canestrinii 
Opilio parietinus 
Opilio saxatilis 
Phalangium opilio 
Platybunus pinetorum 
Rilaena triangularis 


In [16]:
updated_tsv = extract_species_information("opiliones.csv", 
                                          output=None,                              
                                          api_terms=["usageKey", 
                                                     "scientificName", 
                                                     "canonicalName",
                                                     "status", 
                                                     "rank", 
                                                     "matchType", 
                                                     "confidence"],
                                         )

Using only scientificName as name column for API request.


In [18]:
updated_tsv.head()
    

Unnamed: 0,scientificName,gbifapi_usageKey,gbifapi_scientificName,gbifapi_canonicalName,gbifapi_status,gbifapi_rank,gbifapi_matchType,gbifapi_confidence,gbifapi_acceptedKey,gbifapi_acceptedScientificName
0,Amilenus aurantiacus,4553516,"Amilenus aurantiacus (Simon, 1881)",Amilenus aurantiacus,ACCEPTED,SPECIES,EXACT,99,4553516,"Amilenus aurantiacus (Simon, 1881)"
1,Dicranopalpus ramosus,4553430,"Dicranopalpus ramosus (E.Simon, 1909)",Dicranopalpus ramosus,ACCEPTED,SPECIES,EXACT,99,4553430,"Dicranopalpus ramosus (E.Simon, 1909)"
2,Homalenotus quadridentatus,4553460,"Homalenotus quadridentatus (Cuvier, 1795)",Homalenotus quadridentatus,ACCEPTED,SPECIES,EXACT,99,4553460,"Homalenotus quadridentatus (Cuvier, 1795)"
3,Lacinius horridus,4553408,"Lacinius horridus (Panzer, 1794)",Lacinius horridus,ACCEPTED,SPECIES,EXACT,99,4553408,"Lacinius horridus (Panzer, 1794)"
4,Lacinius ephippiatus,4553401,"Lacinius ephippiatus (C.L.Koch, 1835)",Lacinius ephippiatus,ACCEPTED,SPECIES,EXACT,99,4553401,"Lacinius ephippiatus (C.L.Koch, 1835)"


The available options are:
* `output`: if `None`, nothing is written to file and a pandas DataFrame is returned; if a string, the output is written to a tsv file
* `api_terms`: either a list of existing terms, or `'all'` to include all the information returned by the API

In [19]:
extract_species_information("opiliones.csv", output=None, api_terms="all")

Using only scientificName as name column for API request.


Unnamed: 0,scientificName,gbifapi_acceptedKey,gbifapi_acceptedScientificName,gbifapi_canonicalName,gbifapi_class,gbifapi_classKey,gbifapi_confidence,gbifapi_family,gbifapi_familyKey,gbifapi_genus,...,gbifapi_orderKey,gbifapi_phylum,gbifapi_phylumKey,gbifapi_rank,gbifapi_scientificName,gbifapi_species,gbifapi_speciesKey,gbifapi_status,gbifapi_synonym,gbifapi_usageKey
0,Amilenus aurantiacus,4553516,"Amilenus aurantiacus (Simon, 1881)",Amilenus aurantiacus,Arachnida,367,99,Phalangiidae,3253416,Amilenus,...,907,Arthropoda,54,SPECIES,"Amilenus aurantiacus (Simon, 1881)",Amilenus aurantiacus,4553516,ACCEPTED,False,4553516
1,Dicranopalpus ramosus,4553430,"Dicranopalpus ramosus (E.Simon, 1909)",Dicranopalpus ramosus,Arachnida,367,99,Phalangiidae,3253416,Dicranopalpus,...,907,Arthropoda,54,SPECIES,"Dicranopalpus ramosus (E.Simon, 1909)",Dicranopalpus ramosus,4553430,ACCEPTED,False,4553430
2,Homalenotus quadridentatus,4553460,"Homalenotus quadridentatus (Cuvier, 1795)",Homalenotus quadridentatus,Arachnida,367,99,Phalangiidae,3253416,Homalenotus,...,907,Arthropoda,54,SPECIES,"Homalenotus quadridentatus (Cuvier, 1795)",Homalenotus quadridentatus,4553460,ACCEPTED,False,4553460
3,Lacinius horridus,4553408,"Lacinius horridus (Panzer, 1794)",Lacinius horridus,Arachnida,367,99,Phalangiidae,3253416,Lacinius,...,907,Arthropoda,54,SPECIES,"Lacinius horridus (Panzer, 1794)",Lacinius horridus,4553408,ACCEPTED,False,4553408
4,Lacinius ephippiatus,4553401,"Lacinius ephippiatus (C.L.Koch, 1835)",Lacinius ephippiatus,Arachnida,367,99,Phalangiidae,3253416,Lacinius,...,907,Arthropoda,54,SPECIES,"Lacinius ephippiatus (C.L.Koch, 1835)",Lacinius ephippiatus,4553401,ACCEPTED,False,4553401
5,Leiobunum blackwalli,4553482,"Leiobunum blackwalli Meade, 1861",Leiobunum blackwalli,Arachnida,367,99,Sclerosomatidae,3253390,Leiobunum,...,907,Arthropoda,54,SPECIES,"Leiobunum blackwalli Meade, 1861",Leiobunum blackwalli,4553482,ACCEPTED,False,4553482
6,Leiobunum rotundum,4553472,"Leiobunum rotundum (Latreille, 1798)",Leiobunum rotundum,Arachnida,367,99,Sclerosomatidae,3253390,Leiobunum,...,907,Arthropoda,54,SPECIES,"Leiobunum rotundum (Latreille, 1798)",Leiobunum rotundum,4553472,ACCEPTED,False,4553472
7,Lophopilio palpinalis,4553391,"Lophopilio palpinalis (J.F.W.Herbst, 1799)",Lophopilio palpinalis,Arachnida,367,99,Phalangiidae,3253416,Lophopilio,...,907,Arthropoda,54,SPECIES,"Lophopilio palpinalis (J.F.W.Herbst, 1799)",Lophopilio palpinalis,4553391,ACCEPTED,False,4553391
8,Mitopus morio,4553358,"Mitopus morio (Fabricus, 1779)",Mitopus morio,Arachnida,367,99,Phalangiidae,3253416,Mitopus,...,907,Arthropoda,54,SPECIES,"Mitopus morio (Fabricus, 1779)",Mitopus morio,4553358,ACCEPTED,False,4553358
9,Nelima doriae,2182032,"Nelima doriae (Canestrinii, 1871)",Nelima doriae,Arachnida,367,99,Sclerosomatidae,3253390,Nelima,...,907,Arthropoda,54,SPECIES,"Nelima doriae (Canestrinii, 1871)",Nelima doriae,2182032,ACCEPTED,False,2182032
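
The effect of `api_terms` and the `gbifapi_` prefix on the column names can be pictured with a small sketch that works on a single API response. This is an illustration under assumptions, not the module's actual implementation: the helper name `select_api_terms`, the empty-string default for missing terms, and the alphabetical ordering for `'all'` are all choices made here.

```python
DEFAULT_TERMS = ("usageKey", "scientificName", "canonicalName",
                 "status", "rank", "matchType", "confidence")


def select_api_terms(response, api_terms=None, prefix="gbifapi_"):
    """Keep the requested terms of one API response and prefix their names.

    Hypothetical helper; `response` is a dict like the one `temp.json()` returns.
    """
    if api_terms is None:
        api_terms = DEFAULT_TERMS
    elif api_terms == "all":
        # Assumption: with 'all', every key in the response is kept, sorted by name.
        api_terms = sorted(response)
    # Assumption: terms absent from the response become empty strings.
    return {prefix + term: response.get(term, "") for term in api_terms}


sample = {"usageKey": 4553516,
          "scientificName": "Amilenus aurantiacus (Simon, 1881)",
          "status": "ACCEPTED",
          "rank": "SPECIES"}
row = select_api_terms(sample, ["usageKey", "status", "matchType"])
print(row)
```

Each such dictionary corresponds to the `gbifapi_*` part of one row in the tables above.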


#### Command line

When working from the command line, the script takes the first argument as the input file and the last argument as the file to write the result to, both given as relative paths:

```bash
python gbif_species_name_extraction.py sample.csv sample_dump.csv
```

By default, the following terms are added to the tsv file:

```python
["usageKey", "scientificName", "canonicalName", "status", "rank", "matchType", "confidence"]
```
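
How the first and last arguments map to the input and output file can be sketched as follows. The function `parse_cli_arguments` is hypothetical and only illustrates the convention described above; the actual script may parse its arguments differently.

```python
def parse_cli_arguments(argv):
    """Return (input_file, output_file) from a list of command-line arguments.

    Hypothetical helper; in a script, `argv` would be `sys.argv[1:]`.
    """
    if len(argv) < 2:
        raise SystemExit("usage: python <script>.py INPUT_FILE OUTPUT_FILE")
    # First argument is the input file, last argument is the output file.
    return argv[0], argv[-1]


input_file, output_file = parse_cli_arguments(["sample.csv", "sample_dump.csv"])
print(input_file, output_file)
```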

### Extract info of individual species

The `extract_gbif_species_names_info` function is useful for a request of a single species/kingdom combination:

In [20]:
extract_gbif_species_names_info("Dinebra panicea (Retz.) P.M. Peterson & N. Snow var. brachiata (Steud.) P.M. Peterson & N. Snow", kingdom="Plantae")

{'usageKey': 8306067,
 'acceptedUsageKey': 4132112,
 'scientificName': 'Dinebra panicea var. brachiata (Steud.) P.M.Peterson & N.Snow',
 'canonicalName': 'Dinebra panicea brachiata',
 'rank': 'VARIETY',
 'status': 'SYNONYM',
 'confidence': 100,
 'matchType': 'EXACT',
 'kingdom': 'Plantae',
 'phylum': 'Tracheophyta',
 'order': 'Poales',
 'family': 'Poaceae',
 'genus': 'Leptochloa',
 'species': 'Leptochloa panicea',
 'kingdomKey': 6,
 'phylumKey': 7707728,
 'classKey': 196,
 'orderKey': 1369,
 'familyKey': 3073,
 'genusKey': 2703864,
 'speciesKey': 2703872,
 'synonym': True,
 'class': 'Liliopsida'}
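
The response above describes a synonym, so `usageKey` refers to the synonym itself while `acceptedUsageKey` refers to the accepted taxon. A minimal sketch of picking the right key from a match response follows. The helper `accepted_key_from_match` is hypothetical, and it assumes that `acceptedUsageKey` is present only for synonyms, as the two example responses in this notebook suggest.

```python
def accepted_key_from_match(match):
    """Return the key of the accepted taxon for one species/match response."""
    if match.get("synonym") or match.get("status") == "SYNONYM":
        # Assumption: synonym responses carry `acceptedUsageKey`.
        return match.get("acceptedUsageKey")
    # For accepted taxa, the usage key is already the accepted key.
    return match.get("usageKey")


# Relevant fields copied from the two responses shown in this notebook.
synonym_match = {"usageKey": 8306067, "acceptedUsageKey": 4132112,
                 "status": "SYNONYM", "synonym": True}
accepted_match = {"usageKey": 3034825, "status": "ACCEPTED", "synonym": False}

print(accepted_key_from_match(synonym_match))
print(accepted_key_from_match(accepted_match))
```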

The `extract_gbif_accepted_key` function is useful to get the acceptedKey corresponding to any GBIF usage key:

In [21]:
extract_gbif_accepted_key(3025758)

('', '')
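
One possible reading of the empty tuple is that the requested usage has no separate accepted taxon. The sketch below shows how such a tuple could be derived from a name-usage record. It assumes that the record exposes the fields `acceptedKey` and `accepted` only for synonyms. The helper `accepted_from_usage` is hypothetical, and the accepted name in the example is a placeholder, not real data.

```python
def accepted_from_usage(usage):
    """Return (acceptedKey, accepted name) from a name-usage record.

    Hypothetical helper; falls back to empty strings when the fields are absent.
    """
    return usage.get("acceptedKey", ""), usage.get("accepted", "")


# Keys taken from the examples in this notebook; the name is a placeholder.
synonym_usage = {"key": 8306067, "acceptedKey": 4132112,
                 "accepted": "<accepted scientific name>"}
accepted_usage = {"key": 3034825}

print(accepted_from_usage(synonym_usage))
print(accepted_from_usage(accepted_usage))
```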