# iNaturalist status updates by state

Using `inat-aust-status-taxa.csv`, the file produced by `collate-status-taxa.ipynb`, generate the lists needed to update iNaturalist statuses.

**Next steps:**
State by state, establish the changes that need to be made:
    a. new - species that appear in the state lists but do not have a status in iNaturalist (new template)
    b. updates - changes to information that we added previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - statuses that we added previously (user_id = 708886) and that are now incorrect (update template, action='REMOVE')
    d. flags - statuses added by other users that need to be flagged
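
The four categories above can be read as a small decision rule over a merged table. The sketch below is illustrative only: the `classify_row` helper, the `in_state_list` column and the toy rows are assumptions made for the example, and only the `user_id` value 708886 comes from the text.

```python
import pandas as pd

OUR_USER_ID = "708886"  # user_id quoted in the steps above


def classify_row(row: pd.Series) -> str:
    """Assign one of the four change categories to a merged row (hypothetical helper)."""
    has_inat_status = pd.notna(row["status_id"])
    ours = row["user_id"] == OUR_USER_ID
    if row["in_state_list"] and not has_inat_status:
        return "new"      # a. listed by the state, no status in iNaturalist yet
    if row["in_state_list"] and ours:
        return "update"   # b. we added it earlier and it is still listed
    if not row["in_state_list"] and ours:
        return "remove"   # c. we added it earlier but it is no longer listed
    return "flag"         # d. status owned by another user, review by hand


# Made-up rows, one per category
toy = pd.DataFrame({
    "scientificName": ["Species a", "Species b", "Species c", "Species d"],
    "in_state_list": [True, True, False, True],
    "status_id": [None, "101", "102", "103"],
    "user_id": [None, "708886", "708886", "702203"],
})
toy["change"] = toy.apply(classify_row, axis=1)
print(toy[["scientificName", "change"]])
```

Treating every status owned by another user as a candidate flag is a deliberately cautious choice for the sketch; the real rule may be narrower.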

## Prep - common to all states
1. Read in the iNaturalist statuses and filter them to this state
2. Read in the iNaturalist taxa list
3. Read in the state sensitive list
4. Attempt to match the state statuses to an IUCN equivalent


### 1. iNaturalist statuses

In [161]:
import sys
import os
projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
sys.path.append(os.path.abspath(projectdir + "source-code/includes"))
import list_functions as lf
import pandas as pd

sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"


# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [162]:
def filter_state_statuses(stateregex: str, urlregex: str):
    """Return the iNaturalist statuses whose place or authority matches stateregex."""
    # NOTE: urlregex is accepted but not applied; URL matches are not part of the result
    matches_state = (
        taxastatus['place_display_name'].str.contains(stateregex, na=False)
        | taxastatus['place_name'].str.contains(stateregex, na=False)
        | taxastatus['authority'].str.contains(stateregex, na=False)
    )
    # keep each matching row once, ordered by taxon and then by user
    statedf = taxastatus[matches_state].drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])
inatstatuses = filter_state_statuses("Northern Territory|NT NRETAS", " ")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1001,162724,1134239,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/https://id.biod...,,open,...,Cryptandra,gemmata,,2020-09-25T18:01:37Z,Cryptandra gemmata,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
927,165448,12647,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Epthianura,crocea,,2021-09-17T08:46:17Z,Epthianura crocea,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
289,263901,1289379,702203.0,9994,,Atlas of Living Australia,VU,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Chloebia,gouldiae,,2022-10-20T02:55:37Z,Chloebia gouldiae,species,https://www.birds.cornell.edu/clementschecklis...,,,
1251,152435,19250,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Polytelis,alexandrae,,2019-08-27T01:09:01Z,Polytelis alexandrae,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
215,170163,20166,702203.0,9994,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,,...,Ninox,connivens,,2021-07-28T02:17:03Z,Ninox connivens,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
1249,152433,38633,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Bellatorias,obiri,,2019-04-30T15:19:21Z,Bellatorias obiri,species,http://reptile-database.reptarium.cz/search.ph...,,,
1252,152436,40743,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,,,,,Hipposideros stenotis,,,Narrow-eared Roundleaf Bat,False,[1431118]
1248,152432,41326,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Macroderma,gigas,,2019-08-27T01:58:05Z,Macroderma gigas,species,http://www.catalogueoflife.org/annual-checklis...,,,
986,139906,698942,,9994,,,least concern,https://bie.ala.org.au/species/http://id.biodi...,Atlas of Living Australia (ALA),,...,Duboisia,hopwoodii,,2019-02-16T13:19:18Z,Duboisia hopwoodii,species,https://www.ala.org.au/,,,
1247,152431,73180,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Pezoporus,occidentalis,,2019-08-27T01:09:27Z,Pezoporus occidentalis,species,http://www.birdlife.org/datazone/speciesfactsh...,,,


### 2. iNaturalist taxonomy

In [163]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

# read taxa.csv straight from the archive; the context managers close both handles
with zipfile.ZipFile(sourcedir + filename) as z:
    with z.open('taxa.csv') as from_archive:
        inattaxa = pd.read_csv(from_archive, dtype=str)
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


### 3. State lists
Sensitive list: `geoprivacy` = `obscured`. This overrides the `open` value given to species on the conservation list.

In [164]:
# %%script echo skipping  # leave this line commented out to download the datasets from lists.ala.org.au and save them locally
# Download the list data and save it locally as CSV

sensitivelist = lf.download_ala_list("https://lists.ala.org.au/ws/speciesListItems/dr492?max=10000&includeKVP=true")
sensitivelist = lf.kvp_to_columns(sensitivelist)
sensitivelist.to_csv(sourcedir + "nt-ala-sensitive.csv", index=False)

conservationlist = lf.download_ala_list("https://lists-test.ala.org.au/ws/speciesListItems/dr651?max=10000&includeKVP=true")
conservationlist = lf.kvp_to_columns(conservationlist)
conservationlist.to_csv(sourcedir + "nt-ala-conservation.csv", index=False)

In [165]:
# Read sensitive list data
sensitivelist = pd.read_csv(sourcedir + "nt-ala-sensitive.csv", dtype=str)
sensitivelist['status'] = 'sensitive'
sensitivelist['nt_geoprivacy'] = 'obscured'
# sensitivelist = sensitivelist.rename(columns={'scientificName':'nt_scientificName'})
sensitivelist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,vernacular name,Criteria,Denature level,generalisation,status,nt_geoprivacy
0,2058921,Macroderma gigas,Ghost Bat,Macroderma gigas,https://biodiversity.org.au/afd/taxa/63bc796a-...,dr492,"[{'key': 'vernacular name', 'value': 'Ghost Ba...",Ghost Bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured
1,2058928,Hipposideros stenotis,Northern Leaf-nosed Bat,Hipposideros stenotis,https://biodiversity.org.au/afd/taxa/26fe0f53-...,dr492,"[{'key': 'vernacular name', 'value': 'Northern...",Northern Leaf-nosed bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured
2,2058925,Hipposideros inornata,Arnhem Leaf-nosed Bat,Hipposideros inornatus,https://biodiversity.org.au/afd/taxa/5d2dab40-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhem L...",Arnhem Leaf-nosed Bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured
3,2058920,Pezoporus occidentalis,Night Parrot,Pezoporus occidentalis,https://biodiversity.org.au/afd/taxa/c630f3b0-...,dr492,"[{'key': 'vernacular name', 'value': 'Night Pa...",Night Parrot,,Round coordinate value to 0.5 decimal degrees ...,100km,sensitive,obscured
4,2058926,Polytelis alexandrae,Alexandra's Parrot,Polytelis alexandrae,https://biodiversity.org.au/afd/taxa/be7a08f5-...,dr492,"[{'key': 'vernacular name', 'value': 'Princess...",Princess Parrot,nesting records,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured
5,2058929,Falco hypoleucos,Grey Falcon,Falco (Hierofalco) hypoleucos,https://biodiversity.org.au/afd/taxa/4c73a934-...,dr492,"[{'key': 'vernacular name', 'value': 'Grey Fal...",Grey Falcon,nesting records,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured
6,2058922,Bellatorias obiri,Arnhem Land Egernia,Bellatorias obiri,https://biodiversity.org.au/afd/taxa/2afc8501-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhemla...",Arnhemland Egernia,,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured
7,2058923,Attacus wardi,Atlas Moth,Attacus wardi,https://biodiversity.org.au/afd/taxa/8a05008e-...,dr492,"[{'key': 'vernacular name', 'value': 'Atlas Mo...",Atlas Moth,,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured
8,2058924,Ogyris iphis doddi,Dodd’s Azure,Ogyris iphis doddi,https://biodiversity.org.au/afd/taxa/ae3ab4c9-...,dr492,"[{'key': 'vernacular name', 'value': 'Dodd’s A...",Dodd’s Azure,,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured
9,2058927,Candalides geminus,,Erina geminus geminus,https://biodiversity.org.au/afd/taxa/14d46baa-...,dr492,"[{'key': 'vernacular name', 'value': 'Twin Dus...",Twin Dusky-blue,eastern sub-population,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured


In [166]:
# Read conservation list data
conservationlist = pd.read_csv(sourcedir + "nt-ala-conservation.csv", dtype=str)
conservationlist['nt_geoprivacy'] = 'open'
# conservationlist = conservationlist.rename(columns={'scientificName':'nt_scientificName'})

conservationlist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,status,sourceStatus,taxonRemarks,nt_geoprivacy
0,2640283,Dasyuroides byrnei,Kowari,Dasyuroides byrnei,https://biodiversity.org.au/afd/taxa/c342ff42-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Mammal,open
1,2640455,Phascogale calura,Red-tailed Phascogale,Phascogale calura,https://biodiversity.org.au/afd/taxa/36b436b1-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Mammal,open
2,2640350,Pseudomys fieldi,Shark Bay Mouse,Pseudomys fieldi,https://biodiversity.org.au/afd/taxa/edcf01fa-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Mammal,open
3,2640368,Dasyurus geoffroii,Western Quoll,Dasyurus geoffroii,https://biodiversity.org.au/afd/taxa/a2260672-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Mammal,open
4,2640338,Acacia latzii,Latz's Wattle,Acacia latzii,https://id.biodiversity.org.au/node/apni/2906346,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable,,open
...,...,...,...,...,...,...,...,...,...,...,...
199,2640385,Leipoa ocellata,Malleefowl,Leipoa ocellata,https://biodiversity.org.au/afd/taxa/c44c9098-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Bird,open
200,2640324,Dasyurus hallucatus,Digul [gogo-yimidir],Dasyurus hallucatus,https://biodiversity.org.au/afd/taxa/5d7aeda8-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Mammal,open
201,2640334,Pedionomus torquatus,Plains-wanderer,Pedionomus torquatus,https://biodiversity.org.au/afd/taxa/30b4b2e5-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Bird,open
202,2640315,Vincentrachia desmonda,Saddle Creek Rocksnail,Vincentrachia desmonda,https://biodiversity.org.au/afd/taxa/2ff5b7ab-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Invertebrate,open


In [167]:
# Fix duplicate/conflicts detected
conservationlist.at[59, 'nt_geoprivacy'] = 'obscured'
conservationlist.at[159, 'nt_geoprivacy'] = 'obscured'
conservationlist.at[77, 'nt_geoprivacy'] = 'obscured'
conservationlist.at[169, 'nt_geoprivacy'] = 'obscured'
conservationlist.at[176, 'nt_geoprivacy'] = 'obscured'
conservationlist.at[104, 'nt_geoprivacy'] = 'obscured'
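
The row labels hard-coded above are the same six that `slistloc` reports further down, i.e. the conservation-list rows whose `name` also appears on the sensitive list. A label-independent way to express that, shown here on made-up rows (the toy frames and the `on_sensitive_list` name are assumptions for the example), might look like this:

```python
import pandas as pd

# Toy stand-ins for the two lists (hypothetical rows)
sensitive = pd.DataFrame({"name": ["Attacus wardi", "Falco hypoleucos"]})
conservation = pd.DataFrame({
    "name": ["Acacia latzii", "Attacus wardi", "Falco hypoleucos"],
    "nt_geoprivacy": ["open", "open", "open"],
})

# Mark every conservation row whose name also appears on the sensitive list
on_sensitive_list = conservation["name"].isin(sensitive["name"])
conservation.loc[on_sensitive_list, "nt_geoprivacy"] = "obscured"
print(conservation)
```

Matching on `name` keeps working if a fresh download changes the row order, whereas fixed labels would then point at different species.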


In [168]:
# The sensitive list does not carry a status,
# so take it from the conservation list for the species that appear on both
slist = sensitivelist['name']
slistloc = conservationlist.loc[conservationlist['name'].isin(slist)]
mlist = sensitivelist.merge(slistloc[['status', 'name']], how="left", on="name")
mlist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,vernacular name,Criteria,Denature level,generalisation,status_x,nt_geoprivacy,status_y
0,2058921,Macroderma gigas,Ghost Bat,Macroderma gigas,https://biodiversity.org.au/afd/taxa/63bc796a-...,dr492,"[{'key': 'vernacular name', 'value': 'Ghost Ba...",Ghost Bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured,
1,2058928,Hipposideros stenotis,Northern Leaf-nosed Bat,Hipposideros stenotis,https://biodiversity.org.au/afd/taxa/26fe0f53-...,dr492,"[{'key': 'vernacular name', 'value': 'Northern...",Northern Leaf-nosed bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured,
2,2058925,Hipposideros inornata,Arnhem Leaf-nosed Bat,Hipposideros inornatus,https://biodiversity.org.au/afd/taxa/5d2dab40-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhem L...",Arnhem Leaf-nosed Bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured,
3,2058920,Pezoporus occidentalis,Night Parrot,Pezoporus occidentalis,https://biodiversity.org.au/afd/taxa/c630f3b0-...,dr492,"[{'key': 'vernacular name', 'value': 'Night Pa...",Night Parrot,,Round coordinate value to 0.5 decimal degrees ...,100km,sensitive,obscured,Endangered
4,2058926,Polytelis alexandrae,Alexandra's Parrot,Polytelis alexandrae,https://biodiversity.org.au/afd/taxa/be7a08f5-...,dr492,"[{'key': 'vernacular name', 'value': 'Princess...",Princess Parrot,nesting records,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured,Vulnerable
5,2058929,Falco hypoleucos,Grey Falcon,Falco (Hierofalco) hypoleucos,https://biodiversity.org.au/afd/taxa/4c73a934-...,dr492,"[{'key': 'vernacular name', 'value': 'Grey Fal...",Grey Falcon,nesting records,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured,Vulnerable
6,2058922,Bellatorias obiri,Arnhem Land Egernia,Bellatorias obiri,https://biodiversity.org.au/afd/taxa/2afc8501-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhemla...",Arnhemland Egernia,,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured,Endangered
7,2058923,Attacus wardi,Atlas Moth,Attacus wardi,https://biodiversity.org.au/afd/taxa/8a05008e-...,dr492,"[{'key': 'vernacular name', 'value': 'Atlas Mo...",Atlas Moth,,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured,Vulnerable
8,2058924,Ogyris iphis doddi,Dodd’s Azure,Ogyris iphis doddi,https://biodiversity.org.au/afd/taxa/ae3ab4c9-...,dr492,"[{'key': 'vernacular name', 'value': 'Dodd’s A...",Dodd’s Azure,,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured,Endangered
9,2058927,Candalides geminus,,Erina geminus geminus,https://biodiversity.org.au/afd/taxa/14d46baa-...,dr492,"[{'key': 'vernacular name', 'value': 'Twin Dus...",Twin Dusky-blue,eastern sub-population,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured,


In [169]:
# prefer the conservation-list status; fall back to the 'sensitive' placeholder
sensitivelist['status'] = mlist['status_y'].fillna(mlist['status_x'])
sensitivelist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,vernacular name,Criteria,Denature level,generalisation,status,nt_geoprivacy
0,2058921,Macroderma gigas,Ghost Bat,Macroderma gigas,https://biodiversity.org.au/afd/taxa/63bc796a-...,dr492,"[{'key': 'vernacular name', 'value': 'Ghost Ba...",Ghost Bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured
1,2058928,Hipposideros stenotis,Northern Leaf-nosed Bat,Hipposideros stenotis,https://biodiversity.org.au/afd/taxa/26fe0f53-...,dr492,"[{'key': 'vernacular name', 'value': 'Northern...",Northern Leaf-nosed bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured
2,2058925,Hipposideros inornata,Arnhem Leaf-nosed Bat,Hipposideros inornatus,https://biodiversity.org.au/afd/taxa/5d2dab40-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhem L...",Arnhem Leaf-nosed Bat,cave roost related records,Round coordinate value to 0.05 decimal degrees...,10km,sensitive,obscured
3,2058920,Pezoporus occidentalis,Night Parrot,Pezoporus occidentalis,https://biodiversity.org.au/afd/taxa/c630f3b0-...,dr492,"[{'key': 'vernacular name', 'value': 'Night Pa...",Night Parrot,,Round coordinate value to 0.5 decimal degrees ...,100km,Endangered,obscured
4,2058926,Polytelis alexandrae,Alexandra's Parrot,Polytelis alexandrae,https://biodiversity.org.au/afd/taxa/be7a08f5-...,dr492,"[{'key': 'vernacular name', 'value': 'Princess...",Princess Parrot,nesting records,Round coordinate value to 0.1 decimal degrees ...,10km,Vulnerable,obscured
5,2058929,Falco hypoleucos,Grey Falcon,Falco (Hierofalco) hypoleucos,https://biodiversity.org.au/afd/taxa/4c73a934-...,dr492,"[{'key': 'vernacular name', 'value': 'Grey Fal...",Grey Falcon,nesting records,Round coordinate value to 0.1 decimal degrees ...,10km,Vulnerable,obscured
6,2058922,Bellatorias obiri,Arnhem Land Egernia,Bellatorias obiri,https://biodiversity.org.au/afd/taxa/2afc8501-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhemla...",Arnhemland Egernia,,Round coordinate value to 0.1 decimal degrees ...,10km,Endangered,obscured
7,2058923,Attacus wardi,Atlas Moth,Attacus wardi,https://biodiversity.org.au/afd/taxa/8a05008e-...,dr492,"[{'key': 'vernacular name', 'value': 'Atlas Mo...",Atlas Moth,,Round coordinate value to 0.1 decimal degrees ...,10km,Vulnerable,obscured
8,2058924,Ogyris iphis doddi,Dodd’s Azure,Ogyris iphis doddi,https://biodiversity.org.au/afd/taxa/ae3ab4c9-...,dr492,"[{'key': 'vernacular name', 'value': 'Dodd’s A...",Dodd’s Azure,,Round coordinate value to 0.1 decimal degrees ...,10km,Endangered,obscured
9,2058927,Candalides geminus,,Erina geminus geminus,https://biodiversity.org.au/afd/taxa/14d46baa-...,dr492,"[{'key': 'vernacular name', 'value': 'Twin Dus...",Twin Dusky-blue,eastern sub-population,Round coordinate value to 0.1 decimal degrees ...,10km,sensitive,obscured
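
The back-fill above relies on `mlist` having the same row order and index as `sensitivelist`, which holds as long as each name matches at most one conservation-list row. A name-keyed lookup avoids that dependency. The following is a sketch on made-up rows, not the notebook's own method; `status_by_name` and the toy frames are assumptions.

```python
import pandas as pd

sensitive = pd.DataFrame({
    "name": ["Macroderma gigas", "Pezoporus occidentalis", "Attacus wardi"],
    "status": ["sensitive", "sensitive", "sensitive"],
})
conservation = pd.DataFrame({
    "name": ["Pezoporus occidentalis", "Attacus wardi", "Acacia latzii"],
    "status": ["Endangered", "Vulnerable", "Vulnerable"],
})

# One status per name; drop_duplicates guards against repeated names
status_by_name = conservation.drop_duplicates("name").set_index("name")["status"]

# Look up each sensitive species; keep the placeholder where no match exists
sensitive["status"] = sensitive["name"].map(status_by_name).fillna(sensitive["status"])
print(sensitive)
```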


In [170]:
slistloc

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,status,sourceStatus,taxonRemarks,nt_geoprivacy
59,2640339,Attacus wardi,Atlas Moth,Attacus wardi,https://biodiversity.org.au/afd/taxa/8a05008e-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable,Invertebrate,obscured
77,2640439,Falco hypoleucos,Grey Falcon,Falco (Hierofalco) hypoleucos,https://biodiversity.org.au/afd/taxa/4c73a934-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable,Bird,obscured
104,2640349,Polytelis alexandrae,Alexandra's Parrot,Polytelis alexandrae,https://biodiversity.org.au/afd/taxa/be7a08f5-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable,Bird,obscured
159,2640427,Bellatorias obiri,Arnhem Land Gorges Skink,Bellatorias obiri,https://biodiversity.org.au/afd/taxa/2afc8501-...,dr651,"[{'key': 'status', 'value': 'Endangered'}, {'k...",Endangered,Endangered,Reptile,obscured
169,2640464,Ogyris iphis doddi,Dodd’s Azure,Ogyris iphis doddi,https://biodiversity.org.au/afd/taxa/ae3ab4c9-...,dr651,"[{'key': 'status', 'value': 'Endangered'}, {'k...",Endangered,Endangered,Invertebrate,obscured
176,2640414,Pezoporus occidentalis,Night Parrot,Pezoporus occidentalis,https://biodiversity.org.au/afd/taxa/c630f3b0-...,dr651,"[{'key': 'status', 'value': 'Endangered'}, {'k...",Endangered,Endangered,Bird,obscured


In [171]:
# join them in a way that works for iNaturalist (e.g. sensitive list: geoprivacy = 'obscured')
statelist = pd.concat([sensitivelist[['name','status', 'nt_geoprivacy', 'lsid']],
                    conservationlist[['name','status', 'nt_geoprivacy', 'lsid']]]).drop_duplicates()
# retrieve binomial and trinomial names from GBIF
parsednames = lf.gbifparse(statelist)
parsednames.to_csv(sourcedir + "nt-gbif.csv", index=False)
statelist = statelist.merge(
    parsednames[['scientificName', 'canonicalName', 'canonicalNameComplete', 'type', 'rankMarker']],
    how="left", left_on="name", right_on="scientificName")
numfullstatelist = len(statelist.index)
statelist['scientificName'] = statelist['canonicalName']
statelist

Unnamed: 0,name,status,nt_geoprivacy,lsid,scientificName,canonicalName,canonicalNameComplete,type,rankMarker
0,Macroderma gigas,sensitive,obscured,https://biodiversity.org.au/afd/taxa/63bc796a-...,Macroderma gigas,Macroderma gigas,Macroderma gigas,SCIENTIFIC,sp.
1,Hipposideros stenotis,sensitive,obscured,https://biodiversity.org.au/afd/taxa/26fe0f53-...,Hipposideros stenotis,Hipposideros stenotis,Hipposideros stenotis,SCIENTIFIC,sp.
2,Hipposideros inornata,sensitive,obscured,https://biodiversity.org.au/afd/taxa/5d2dab40-...,Hipposideros inornata,Hipposideros inornata,Hipposideros inornata,SCIENTIFIC,sp.
3,Pezoporus occidentalis,Endangered,obscured,https://biodiversity.org.au/afd/taxa/c630f3b0-...,Pezoporus occidentalis,Pezoporus occidentalis,Pezoporus occidentalis,SCIENTIFIC,sp.
4,Polytelis alexandrae,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/be7a08f5-...,Polytelis alexandrae,Polytelis alexandrae,Polytelis alexandrae,SCIENTIFIC,sp.
...,...,...,...,...,...,...,...,...,...
203,Leipoa ocellata,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/c44c9098-...,Leipoa ocellata,Leipoa ocellata,Leipoa ocellata,SCIENTIFIC,sp.
204,Dasyurus hallucatus,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/5d7aeda8-...,Dasyurus hallucatus,Dasyurus hallucatus,Dasyurus hallucatus,SCIENTIFIC,sp.
205,Pedionomus torquatus,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/30b4b2e5-...,Pedionomus torquatus,Pedionomus torquatus,Pedionomus torquatus,SCIENTIFIC,sp.
206,Vincentrachia desmonda,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/2ff5b7ab-...,Vincentrachia desmonda,Vincentrachia desmonda,Vincentrachia desmonda,SCIENTIFIC,sp.


In [172]:
# Identify records that won't comply with iNaturalist species names
noncomply = statelist[statelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) ]
noncomply

Unnamed: 0,name,status,nt_geoprivacy,lsid,scientificName,canonicalName,canonicalNameComplete,type,rankMarker
37,Gleichenia sp. Victoria River,Vulnerable,open,ALA_DR651_28,Gleichenia sp.Victoria,Gleichenia sp.Victoria,Gleichenia sp.Victoria,INFORMAL,sp.
40,Hibbertia sp. South Magela,Vulnerable,open,ALA_DR651_31,Hibbertia sp.South,Hibbertia sp.South,Hibbertia sp.South,INFORMAL,sp.
60,Triodia sp. Matt Wilson,Vulnerable,open,ALA_DR651_51,Triodia sp.Matt,Triodia sp.Matt,Triodia sp.Matt,INFORMAL,sp.
62,Typhonium sp. Sandover,Vulnerable,open,ALA_DR651_53,Typhonium sp.Sandover,Typhonium sp.Sandover,Typhonium sp.Sandover,INFORMAL,sp.
145,Burmannia sp. Bathurst Island,Endangered,open,ALA_DR651_139,Burmannia sp.Bathurst,Burmannia sp.Bathurst,Burmannia sp.Bathurst,INFORMAL,sp.
147,Erythroxylum sp. Cholmondely Creek,Endangered,open,ALA_DR651_141,Erythroxylum sp.Cholmondely,Erythroxylum sp.Cholmondely,Erythroxylum sp.Cholmondely,INFORMAL,sp.
151,Livistona mariae subsp. Mariae,Endangered,open,https://id.biodiversity.org.au/node/apni/8570955,Livistona mariae subsp.,Livistona mariae subsp.,Livistona mariae subsp.,INFORMAL,subsp.


In [173]:
# remove records that do not comply
statelist = statelist[~statelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) ]
statelist = pd.DataFrame(statelist[['scientificName','status','nt_geoprivacy','lsid']]).drop_duplicates()
statelist

Unnamed: 0,scientificName,status,nt_geoprivacy,lsid
0,Macroderma gigas,sensitive,obscured,https://biodiversity.org.au/afd/taxa/63bc796a-...
1,Hipposideros stenotis,sensitive,obscured,https://biodiversity.org.au/afd/taxa/26fe0f53-...
2,Hipposideros inornata,sensitive,obscured,https://biodiversity.org.au/afd/taxa/5d2dab40-...
3,Pezoporus occidentalis,Endangered,obscured,https://biodiversity.org.au/afd/taxa/c630f3b0-...
4,Polytelis alexandrae,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/be7a08f5-...
...,...,...,...,...
203,Leipoa ocellata,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/c44c9098-...
204,Dasyurus hallucatus,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/5d7aeda8-...
205,Pedionomus torquatus,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/30b4b2e5-...
206,Vincentrachia desmonda,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/2ff5b7ab-...


In [174]:
parsednames['type'].unique()

array(['SCIENTIFIC', 'INFORMAL'], dtype=object)

In [175]:
# check for duplicates with conflicting information

In [176]:
# names that still occur more than once after the non-compliant records are removed
dupinformation = statelist.groupby('scientificName').filter(lambda x: len(x) > 1)
dupinformation

Unnamed: 0,scientificName,status,nt_geoprivacy,lsid
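
The check above returns every repeated name. To isolate only the repeats that disagree on `status` or `nt_geoprivacy`, one option is to count distinct values per name. This is a sketch on made-up rows; the toy frame and the `conflicting_names` variable are assumptions.

```python
import pandas as pd

toy = pd.DataFrame({
    "scientificName": ["Species a", "Species a", "Species b", "Species c", "Species c"],
    "status": ["Vulnerable", "Endangered", "Vulnerable", "Extinct", "Extinct"],
    "nt_geoprivacy": ["open", "open", "open", "open", "obscured"],
})

# Count distinct values per name in each column that should agree
distinct = toy.groupby("scientificName")[["status", "nt_geoprivacy"]].nunique()

# A name conflicts when any of those columns holds more than one value
conflicting_names = distinct[(distinct > 1).any(axis=1)].index
conflicts = toy[toy["scientificName"].isin(conflicting_names)]
print(conflicts)
```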


### 4. Equivalent IUCN statuses

In [177]:
# iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild', 'Extinct'}
statelist.groupby(['status'])['status'].count()

status
Critically Endangered     20
Endangered                49
Extinct                   11
Vulnerable               117
sensitive                  4
Name: status, dtype: int64

In [178]:
iucnStatusMappings = {
    'sensitive':'Vulnerable',
    'critically endangered': 'Critically Endangered',
    'endangered': 'Endangered',
    'extinct': 'Extinct',
    'not listed': 'Vulnerable',
    'vulnerable': 'Vulnerable'
}
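
As a rough illustration of how such a mapping can be applied, the sketch below normalises case and whitespace before the lookup and reports any value the dictionary does not cover. The sample statuses are made up, and the fallback to `Vulnerable` mirrors the `fillna` used when the export fields are prepared later in this notebook.

```python
import pandas as pd

iucn_status_mappings = {  # same pairs as the dictionary above
    "sensitive": "Vulnerable",
    "critically endangered": "Critically Endangered",
    "endangered": "Endangered",
    "extinct": "Extinct",
    "not listed": "Vulnerable",
    "vulnerable": "Vulnerable",
}

# Hypothetical input, including stray whitespace and an unmapped value
statuses = pd.Series(["Vulnerable", " Endangered ", "sensitive", "Near Threatened"])

normalised = statuses.str.strip().str.lower()
mapped = normalised.map(iucn_status_mappings)

# Values with no entry in the dictionary, worth reviewing before they are defaulted
unmapped = sorted(statuses[mapped.isna()].unique())
iucn_equivalent = mapped.fillna("Vulnerable")
print(unmapped)
print(iucn_equivalent.tolist())
```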

### 5. Determine best place ID to use

In [179]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()

place_id  place_name          place_display_name    
9994      Northern Territory  Northern Territory, AU    12
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State sensitive list on scientificName

1. Match - updates; even if the statuses are the same we'll update the links and values anyway
2. No match - statuses to be added (additions)
   2.1 No match and no taxonomy - search for synonyms
   2.2 No match
3. Merge in the other direction to see whether there are deletions
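
One way to get all three outcomes from a single pass is an outer merge with pandas' `indicator` flag. This is a sketch of that idea on made-up rows, not the approach the cells below take (they use a left merge); the toy frames and the `updates`, `additions` and `possible_removals` names are assumptions.

```python
import pandas as pd

state = pd.DataFrame({"scientificName": ["Species a", "Species b"]})
inat = pd.DataFrame({
    "scientificName": ["Species b", "Species c"],
    "status_id": ["101", "102"],
})

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'
both = state.merge(inat, how="outer", on="scientificName", indicator=True)

updates = both[both["_merge"] == "both"]                   # 1. match
additions = both[both["_merge"] == "left_only"]            # 2. no match
possible_removals = both[both["_merge"] == "right_only"]   # 3. only in iNaturalist
print(both)
```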


In [180]:
# join to see which listed species already have a status in iNaturalist, based on scientificName
inatcols = ['status_id', 'scientificName', 'taxon_id', 'user_id', 'description', 'iucn',
            'authority', 'status', 'geoprivacy', 'place_id', 'place_display_name']
mergedstatuses = (
    statelist[['scientificName', 'nt_geoprivacy', 'lsid']]
    .merge(inatstatuses[inatcols], how="left", on='scientificName', suffixes=(None, '_inat'))
    .sort_values(['scientificName'])
)
mergedstatuses


Unnamed: 0,scientificName,nt_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,status,geoprivacy,place_id,place_display_name
138,Abrodictyum obscurum,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,,,,
182,Acacia equisetifolia,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,,,,
14,Acacia latzii,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,,,,
139,Acacia peuce,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,,,,
15,Acacia praetermissa,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
155,Xylopia monosperma,open,https://id.biodiversity.org.au/node/apni/2903202,,,,,,,,,,
61,Zeuxine oblonga,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,,,,
64,Zyzomys maini,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,,,,
165,Zyzomys palatalis,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,,,,


In [181]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
mergedstatuses['new_authority'] = "Territory Parks and Wildlife Conservation Act 1976"
mergedstatuses['new_description'] = "Listed as Threatened - refer to https://nt.gov.au/environment"
biesearchurl = "https://bie.ala.org.au/species/" # eg + "https://id.biodiversity.org.au/node/apni/2894366"
mergedstatuses['new_url'] =  biesearchurl + mergedstatuses['lsid']
mergedstatuses['new_geoprivacy'] = "obscured"
mergedstatuses['new_place_id'] = '9994'  # Northern Territory
mergedstatuses['new_username'] = 'peggydnew'
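# NOTE: 'status' here is the existing iNaturalist status brought in by the merge
# (the state-list status was not carried into mergedstatuses),
# so rows without an iNaturalist match fall back to 'Vulnerable' / 'Threatened'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['status'].fillna('Threatened')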
mergedstatuses

Unnamed: 0,scientificName,nt_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,status,...,place_id,place_display_name,new_authority,new_description,new_url,new_geoprivacy,new_place_id,new_username,new_iucn_equivalent,new_status
138,Abrodictyum obscurum,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
182,Acacia equisetifolia,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
14,Acacia latzii,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
139,Acacia peuce,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
15,Acacia praetermissa,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
155,Xylopia monosperma,open,https://id.biodiversity.org.au/node/apni/2903202,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
61,Zeuxine oblonga,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Vulnerable,Threatened
64,Zyzomys maini,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Vulnerable,Threatened
165,Zyzomys palatalis,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Vulnerable,Threatened
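
The `new_iucn_equivalent` column above relies on `iucnStatusMappings`, which is defined earlier in the notebook and not shown here. A minimal sketch of the same normalise, map and fall-back pattern, assuming a hypothetical dictionary `example_mappings` (the real mapping may contain different keys):

```python
import pandas as pd

# Hypothetical mapping for illustration only; the notebook's real
# iucnStatusMappings is defined elsewhere and may differ.
example_mappings = {
    "endangered": "Endangered",
    "vulnerable": "Vulnerable",
    "critically endangered": "Critically Endangered",
}

# Made-up raw statuses with mixed case, stray whitespace, a gap and an unknown value
statuses = pd.Series([" Endangered", "VULNERABLE ", None, "data deficient"])

# Normalise case and whitespace, look each value up in the dictionary,
# then fall back to 'Vulnerable' wherever the lookup found nothing.
iucn_equivalent = (
    statuses.str.lower().str.strip().map(example_mappings).fillna("Vulnerable")
)
print(iucn_equivalent.tolist())
```

Because the fallback applies to every unmatched value, a misspelt or unexpected status is also treated as Vulnerable, so it may be worth listing the unmatched values before filling them.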


## Updates

In [182]:
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
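# rows that already have an iNaturalist status; copy so later assignments don't touch mergedstatuses
updates = mergedstatuses[mergedstatuses['status_id'].notnull()].copy()
updates = updates.sort_values('scientificName')  # sort_values returns a new frame, so assign it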
updates['action'] = 'UPDATE'
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
6,UPDATE,Bellatorias obiri,152433,38633,endangered,Endangered,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
63,UPDATE,Hipposideros inornatus,152434,74425,endangered,Endangered,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
1,UPDATE,Hipposideros stenotis,152436,40743,endangered,Endangered,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
0,UPDATE,Macroderma gigas,152432,41326,endangered,Endangered,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
3,UPDATE,Pezoporus occidentalis,152431,73180,endangered,Endangered,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
4,UPDATE,Polytelis alexandrae,152435,19250,endangered,Endangered,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
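
The update and addition cells in this notebook repeat the same select, strip-prefix and rename steps. A minimal sketch of a shared helper, assuming a hypothetical function `to_template` and made-up rows (neither is part of the notebook):

```python
import pandas as pd

# Column order of the update template, as listed in the cell comments
TEMPLATE_COLUMNS = ["action", "taxon_name", "id", "taxon_id", "status",
                    "iucn_equivalent", "authority", "url", "geoprivacy",
                    "place_id", "username", "description"]

def to_template(df, action, status_id_col, taxon_id_col):
    """Hypothetical helper: shape merged rows into the template layout."""
    out = df.sort_values("scientificName").copy()  # sort_values is not in-place
    out["action"] = action
    new_cols = [c for c in out.columns if c.startswith("new_")]
    # Keep only the key columns and the new_* values, dropping the old ones
    out = out[["action", "scientificName", status_id_col, taxon_id_col] + new_cols]
    out = out.rename(columns={"scientificName": "taxon_name",
                              status_id_col: "id",
                              taxon_id_col: "taxon_id"})
    out.columns = [c.removeprefix("new_") for c in out.columns]  # Python 3.9+
    return out[TEMPLATE_COLUMNS]

# Made-up rows; the ids and text values are placeholders
merged = pd.DataFrame({
    "scientificName": ["Zyzomys maini", "Acacia peuce"],
    "status_id": ["2", "1"],
    "taxon_id": ["45377", "465191"],
    "status": ["old", "old"],
    "new_status": ["Threatened", "Threatened"],
    "new_iucn_equivalent": ["Vulnerable", "Vulnerable"],
    "new_authority": ["Example Act", "Example Act"],
    "new_url": ["https://example.org/a", "https://example.org/b"],
    "new_geoprivacy": ["obscured", "obscured"],
    "new_place_id": ["9994", "9994"],
    "new_username": ["example_user", "example_user"],
    "new_description": ["example", "example"],
})
updates_demo = to_template(merged, "UPDATE", "status_id", "taxon_id")
print(updates_demo[["action", "taxon_name", "id", "taxon_id", "status"]])
```

For additions, where the taxon id comes from the taxa list's `id` column, the call would be `to_template(rows, "ADD", "status_id", "id")`.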


## No status in iNaturalist via straight scientificName match
NT records that have no existing status in iNaturalist. These are matched against the iNaturalist taxa list by exact scientificName to find a taxon id.

In [183]:
# to add: NT records that have no iNaturalist status yet
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,scientificName,nt_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,status,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Abrodictyum obscurum,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,,...,,,,,,,,,,
1,Acacia equisetifolia,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,equisetifolia,,2022-04-06T22:05:25Z,species,https://eol.org/pages/49426174
2,Acacia latzii,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,latzii,,2022-04-07T02:06:43Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Acacia peuce,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,peuce,,2022-04-06T23:48:13Z,species,http://www.catalogueoflife.org/annual-checklis...
4,Acacia praetermissa,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,praetermissa,,2022-04-05T02:10:02Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
190,Xylopia monosperma,open,https://id.biodiversity.org.au/node/apni/2903202,,,,,,,,...,,,,,,,,,,
191,Zeuxine oblonga,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,oblonga,,2021-07-16T02:50:44Z,species,http://www.catalogueoflife.org/annual-checklis...
192,Zyzomys maini,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,maini,,2019-08-27T01:49:16Z,species,http://www.catalogueoflife.org/annual-checklis...
193,Zyzomys palatalis,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,palatalis,,2019-11-22T22:46:28Z,species,http://www.iucnredlist.org/details/23327/0
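
The left merge above leaves the taxa columns empty for names iNaturalist does not know. A minimal sketch of the same match using pandas' `indicator=True` option, which labels each row explicitly; the two small frames are made up for illustration:

```python
import pandas as pd

# Made-up stand-ins for the state records and the iNaturalist taxa list
state_records = pd.DataFrame(
    {"scientificName": ["Acacia peuce", "Abrodictyum obscurum"]}
)
inat_taxa = pd.DataFrame(
    {"id": ["465191"], "scientificName": ["Acacia peuce"]}
)

# indicator=True adds a '_merge' column: 'both' for matched names,
# 'left_only' for state records with no taxon of the same name
matched = state_records.merge(
    inat_taxa, how="left", on="scientificName", indicator=True
)
has_taxon = matched[matched["_merge"] == "both"]
no_taxon = matched[matched["_merge"] == "left_only"]
print(no_taxon["scientificName"].tolist())
```

Passing `validate="many_to_one"` to `merge` would additionally raise an error if the taxa list held the same scientificName more than once, which would otherwise duplicate state records silently.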


In [184]:
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions

Unnamed: 0,scientificName,nt_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,status,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
1,Acacia equisetifolia,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,equisetifolia,,2022-04-06T22:05:25Z,species,https://eol.org/pages/49426174
2,Acacia latzii,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,latzii,,2022-04-07T02:06:43Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Acacia peuce,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,peuce,,2022-04-06T23:48:13Z,species,http://www.catalogueoflife.org/annual-checklis...
4,Acacia praetermissa,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,praetermissa,,2022-04-05T02:10:02Z,species,http://www.catalogueoflife.org/annual-checklis...
5,Acacia undoolyana,open,https://id.biodiversity.org.au/node/apni/2891259,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,undoolyana,,2022-04-05T02:34:14Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
188,Vidumelon wattii,open,https://biodiversity.org.au/afd/taxa/3aedec6e-...,,,,,,,,...,Mollusca,Gastropoda,Stylommatophora,Camaenidae,Vidumelon,wattii,,2021-10-29T15:37:54Z,species,http://www.catalogueoflife.org/annual-checklis...
191,Zeuxine oblonga,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,oblonga,,2021-07-16T02:50:44Z,species,http://www.catalogueoflife.org/annual-checklis...
192,Zyzomys maini,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,maini,,2019-08-27T01:49:16Z,species,http://www.catalogueoflife.org/annual-checklis...
193,Zyzomys palatalis,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,palatalis,,2019-11-22T22:46:28Z,species,http://www.iucnredlist.org/details/23327/0


In [185]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = noinatstatus[noinatstatus['id'].notna()].copy()
additions = additions.sort_values(['scientificName'])  # sort_values returns a new frame, so assign it
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                      'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,ADD,Acacia equisetifolia,,1253756,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
2,ADD,Acacia latzii,,1254327,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
3,ADD,Acacia peuce,,465191,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
4,ADD,Acacia praetermissa,,1254561,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
5,ADD,Acacia undoolyana,,1254884,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
...,...,...,...,...,...,...,...,...,...,...,...,...
188,ADD,Vidumelon wattii,,114966,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
191,ADD,Zeuxine oblonga,,369267,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://id.biod...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
192,ADD,Zyzomys maini,,45377,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
193,ADD,Zyzomys palatalis,,75238,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://bie.ala.org.au/species/https://biodive...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...


In [186]:
# write these to the file
pd.concat([updates,additions]).to_csv(sourcedir + "nt.csv", index=False)
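
Before the combined file is uploaded, it may be worth confirming that every row carries the ids its action needs. A minimal sketch, assuming that UPDATE rows require an existing status `id` and that every row requires a `taxon_id`; the frames are cut-down examples and the CSV is written to memory rather than to `nt.csv`:

```python
import io
import pandas as pd

# Cut-down examples with only the key columns
updates_demo = pd.DataFrame({
    "action": ["UPDATE"], "taxon_name": ["Macroderma gigas"],
    "id": ["152432"], "taxon_id": ["41326"],
})
additions_demo = pd.DataFrame({
    "action": ["ADD"], "taxon_name": ["Acacia peuce"],
    "id": [None], "taxon_id": ["465191"],
})

combined = pd.concat([updates_demo, additions_demo], ignore_index=True)

# Assumed rules: an UPDATE without a status id, or any row without a
# taxon id, cannot be applied and should be reviewed first
missing_status_id = combined[(combined["action"] == "UPDATE") & combined["id"].isna()]
missing_taxon_id = combined[combined["taxon_id"].isna()]

# Write to an in-memory buffer to inspect the output without touching disk
buffer = io.StringIO()
combined.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```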

In [187]:
# what didn't match to a taxon?
unknownToInat = noinatstatus[noinatstatus['id'].isna()]
unknownToInat

Unnamed: 0,scientificName,nt_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,status,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Abrodictyum obscurum,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,,...,,,,,,,,,,
19,Babingtonia behrii,open,https://id.biodiversity.org.au/taxon/apni/5131...,,,,,,,,...,,,,,,,,,,
20,Baumea arthrophylla,open,https://id.biodiversity.org.au/node/apni/2890775,,,,,,,,...,,,,,,,,,,
22,Bettongia lesueur graii,open,https://biodiversity.org.au/afd/taxa/7a019451-...,,,,,,,,...,,,,,,,,,,
32,Candalides geminus,obscured,https://biodiversity.org.au/afd/taxa/14d46baa-...,,,,,,,,...,,,,,,,,,,
46,Ctenotus rimacola camptris,open,https://biodiversity.org.au/afd/taxa/ef8aa425-...,,,,,,,,...,,,,,,,,,,
54,Dienia montana,open,https://id.biodiversity.org.au/taxon/apni/5140...,,,,,,,,...,,,,,,,,,,
55,Dirutrachia sublevata,open,https://biodiversity.org.au/afd/taxa/75839d82-...,,,,,,,,...,,,,,,,,,,
57,Elaeocarpus miegei,open,https://id.biodiversity.org.au/node/apni/2890315,,,,,,,,...,,,,,,,,,,
58,Eleocharis papillosa,open,https://id.biodiversity.org.au/node/apni/2901498,,,,,,,,...,,,,,,,,,,


### Are there any that need to be removed?
Counts for this run (see the stats table at the end of this section):

- NT sensitive list count: 10
- NT conservation list count: 204
- NT iNat statuses count: 12
- Updates to iNat statuses: 6
- Additional iNat statuses: 141
- NT statuses we can't find a taxon match for in iNaturalist: 54
- Total: (explainable via the various genus/section entries that we matched to in the taxonomy)
- iNat statuses left over that may need checking: 6

In [188]:
# existing NT inat statuses not covered by an update (additions have no existing status, so only updates are checked)
notaddedupdated = inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]
notaddedupdated

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1001,162724,1134239,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/https://id.biod...,,open,...,Cryptandra,gemmata,,2020-09-25T18:01:37Z,Cryptandra gemmata,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
927,165448,12647,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Epthianura,crocea,,2021-09-17T08:46:17Z,Epthianura crocea,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
289,263901,1289379,702203.0,9994,,Atlas of Living Australia,VU,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Chloebia,gouldiae,,2022-10-20T02:55:37Z,Chloebia gouldiae,species,https://www.birds.cornell.edu/clementschecklis...,,,
215,170163,20166,702203.0,9994,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,,...,Ninox,connivens,,2021-07-28T02:17:03Z,Ninox connivens,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
986,139906,698942,,9994,,,least concern,https://bie.ala.org.au/species/http://id.biodi...,Atlas of Living Australia (ALA),,...,Duboisia,hopwoodii,,2019-02-16T13:19:18Z,Duboisia hopwoodii,species,https://www.ala.org.au/,,,
1,234788,918383,702203.0,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
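
The leftover statuses above still have to be sorted into removals (statuses we added ourselves) and flags (statuses added by other users), as set out in the next steps at the top of the notebook. A minimal sketch of that split, assuming `user_id` was read as text and can carry a trailing `.0` as in the output above; the rows are made up:

```python
import pandas as pd

OUR_USER_ID = "708886"  # user id quoted in the notebook introduction

# Made-up stand-ins for inatstatuses and updates
inat_demo = pd.DataFrame({
    "status_id": ["1", "2", "3"],
    "taxon_id": ["10", "20", "30"],
    "user_id": ["708886", "702203.0", "708886"],
})
updates_demo = pd.DataFrame({"taxon_id": ["10"]})

# Anti-join: keep statuses whose taxon is not already being updated
leftover = inat_demo[~inat_demo["taxon_id"].isin(updates_demo["taxon_id"])]

# Strip a trailing '.0' so '702203.0' and '702203' compare as equal text
user = leftover["user_id"].str.replace(r"\.0$", "", regex=True)

removal_candidates = leftover[user == OUR_USER_ID]  # ours: candidate REMOVE rows
flag_candidates = leftover[user != OUR_USER_ID]     # others: candidates to flag
print(removal_candidates["status_id"].tolist(), flag_candidates["status_id"].tolist())
```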


In [189]:
# Stats
numsensitive = len(sensitivelist.index)
numconservation = len(conservationlist.index)
numupdates  = len(updates.index)
numadditions  = len(additions.index)
numnoinatstatus = len(noinatstatus.index)
numunknownToInat = len(unknownToInat.index)
numnotaddedupdated = len(notaddedupdated.index)
numnoncomply = len(noncomply.index)
numcomply = len(statelist.index)
numdupinfo = len(dupinformation.index)
d = {'Sensitive': [numsensitive],
    'Conservation': [numconservation],
    'Statelist merge': [numfullstatelist],
    'Species iNat Comply' : [numcomply],
    'Species iNat non-Comply': [numnoncomply],
    'Duplicate Information': [numdupinfo],
    'Updates': [numupdates],
    'Additions': [numadditions],
    'Not added updated': [numnotaddedupdated],
    'No Inat Status': [numnoinatstatus],
    'Unknown to Inat': [numunknownToInat]}

statsdf = pd.DataFrame(data=d)
statsdf

Unnamed: 0,Sensitive,Conservation,Statelist merge,Species iNat Comply,Species iNat non-Comply,Duplicate Information,Updates,Additions,Not added updated,No Inat Status,Unknown to Inat
0,10,204,208,201,7,0,6,141,6,195,54
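
The counts in the stats table can be cross-checked against each other. A minimal sketch, assuming the three relationships below are the intended ones (they are inferred from how the frames are derived, not stated in the notebook):

```python
import pandas as pd

# Counts copied from the stats table above
stats = pd.Series({
    "Statelist merge": 208,
    "Species iNat Comply": 201,
    "Species iNat non-Comply": 7,
    "Updates": 6,
    "Additions": 141,
    "No Inat Status": 195,
    "Unknown to Inat": 54,
})

# Each check pairs a description with whether the identity holds
checks = {
    "comply + non-comply == statelist merge":
        stats["Species iNat Comply"] + stats["Species iNat non-Comply"]
        == stats["Statelist merge"],
    "updates + no inat status == comply":
        stats["Updates"] + stats["No Inat Status"] == stats["Species iNat Comply"],
    "additions + unknown == no inat status":
        stats["Additions"] + stats["Unknown to Inat"] == stats["No Inat Status"],
}
for description, holds in checks.items():
    print(f"{description}: {holds}")
```

A failed check would suggest that a merge duplicated or dropped rows, for example where a scientificName appears more than once in the taxa list.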
