# iNaturalist status updates by state - VIC

Using the file produced in the collate-status-taxa.ipynb: `inat-aust-status-taxa.csv`, generate lists to update iNaturalist statuses

## Prep - common to all states
1. Read in the inaturalist statuses & filter out this state
2. Read in the inaturalist taxa list
3. Read in the state sensitive and conservation list
4. Attempt to match the state statuses to an IUCN equivalent
5. Determine the best placeID to use for this state

**Next steps:**
Establish the changes that need to be made. Read in the sensitive and conservation list for each state.
    a. new - any new species that appear in the state lists but do not have a status in inaturalist (new template)
    b. updates - any changes to information which was added by us previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - any statuses which were added by us previously (user_id = 708886) list which are incorrect (update template, action='REMOVE')
    d. flags - are there any statuses by other users that need to be flagged?

### 1. iNaturalist statuses

In [1]:
import pandas as pd

# projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"

# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [2]:
def filter_state_statuses(stateregex: str, urlregex: str):
    authoritydf = taxastatus['authority'].drop_duplicates().sort_values()
    authoritydf = authoritydf[pd.Series(authoritydf).str.contains(stateregex)]
    urldf = taxastatus['url'].drop_duplicates().sort_values()
    urldf = urldf[pd.Series(urldf).str.contains(urlregex)]
    placedisplaydf = taxastatus['place_display_name'].drop_duplicates().sort_values()
    placedisplaydf = placedisplaydf[pd.Series(placedisplaydf).str.contains(stateregex)]
    placedf = taxastatus['place_name'].drop_duplicates().sort_values()
    placedf = placedf[pd.Series(placedf).str.contains(stateregex)]
    # concat all and remove duplicates
    statedf = pd.concat([taxastatus.apply(lambda row: row[taxastatus['place_display_name'].isin(placedisplaydf)]),
                         taxastatus.apply(lambda row: row[taxastatus['place_name'].isin(placedf)]),
                         taxastatus.apply(lambda row: row[taxastatus['url'].isin(urldf)]),
                         taxastatus.apply(
                             lambda row: row[taxastatus['authority'].isin(authoritydf)])]).drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])

inatstatuses = filter_state_statuses(" VIC |Victoria|VICTORIA|Vic","vic.gov.au")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
159,264604,100611,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Threatened,https://www.environment.vic.gov.au/conserving-...,,open,...,Euastacus,armatus,,2022-06-06T16:36:21Z,Euastacus armatus,species,http://www.iucnredlist.org/apps/redlist/details,,,
158,264603,100616,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Euastacus,bispinosus,,2022-06-06T16:26:39Z,Euastacus bispinosus,species,http://www.iucnredlist.org/apps/redlist/details,,,
2371,153834,100619,708886,7830,16656,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euastacus,claytoni,,2020-05-28T05:05:59Z,Euastacus claytoni,species,http://www.iucnredlist.org/apps/redlist/details,,,
2388,153867,100620,708886,7830,16656,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euastacus,crassus,,2020-05-28T05:04:27Z,Euastacus crassus,species,http://www.iucnredlist.org/apps/redlist/details,,,
3316,265501,100657,3249428,7830,,Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Euastacus,yanga,,2022-06-14T09:17:17Z,Euastacus yanga,species,http://www.iucnredlist.org/apps/redlist/details,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2865,153813,99966,708886,7830,16656,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Engaeus,sternalis,,2022-06-10T13:58:03Z,Engaeus sternalis,species,http://www.iucnredlist.org/apps/redlist/details,,,
2386,153863,99967,708886,7830,16656,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Engaeus,strictifrons,,2020-05-28T05:03:37Z,Engaeus strictifrons,species,http://www.iucnredlist.org/apps/redlist/details,,,
163,264608,99969,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Engaeus,tuberculatus,,2022-07-02T08:00:10Z,Engaeus tuberculatus,species,http://www.iucnredlist.org/apps/redlist/details,,,
2484,153828,99970,708886,7830,16656,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Engaeus,urostrictus,,2022-07-18T14:12:03Z,Engaeus urostrictus,species,http://www.iucnredlist.org/apps/redlist/details,,,


### 2. iNaturalist taxonomy

In [3]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

z=zipfile.ZipFile(sourcedir + filename)

with z.open('taxa.csv') as from_archive:
    inattaxa = pd.read_csv(from_archive,dtype=str)
z.close()
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


### 3. State lists

Get the ALA Conservation and Sensitive lists


In [5]:
# %%script echo skipping # comment this line to download dataset from lists.ala.org.au the web and save locally
import sys
import os
sys.path.append(os.path.abspath(projectdir + "source-code/includes"))
import list_functions as lf

sensitivelist = lf.download_ala_list("https://lists-test.ala.org.au/ws/speciesListItems/dr18669?max=10000&includeKVP=true")
sensitivelist = lf.kvp_to_columns(sensitivelist)
sensitivelist['vba_geoprivacy'] = "obscured"
sensitivelist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,taxonID,scientificNameAuthority,primaryDiscipline,speciesGroup,ffgactstatus,vicadvisorystatus,restrictedFlag,modified,extractDate,status,sourceStatus,epbcactStatus,vba_geoprivacy
0,2803999,Engaeus australis,Freshwater Crayfish Or Yabby,Engaeus australis,https://biodiversity.org.au/afd/taxa/feada41f-...,dr18669,"[{'key': 'taxonID', 'value': '1686'}, {'key': ...",1686,"Riek, 1969",Aquatic fauna,"Mussels, decapod crustacea",Critically Endangered,Vulnerable,rest,2013-12-18,2023-01-16,Critically Endangered,Critically Endangered,,obscured
0,2804013,Engaeus fultoni,Otway Burrowing Crayfish,Engaeus fultoni,https://biodiversity.org.au/afd/taxa/7994c955-...,dr18669,"[{'key': 'taxonID', 'value': '1674'}, {'key': ...",1674,"Smith & Schuster, 1913",Aquatic fauna,"Mussels, decapod crustacea",Vulnerable,Vulnerable,rest,2013-12-18,2023-01-16,Vulnerable,Vulnerable,,obscured
0,2804088,Engaeus mallacoota,Mallacoota Burrowing Crayfish,Engaeus mallacoota,https://biodiversity.org.au/afd/taxa/bf6f5d52-...,dr18669,"[{'key': 'taxonID', 'value': '1694'}, {'key': ...",1694,"Horwitz, 1990",Aquatic fauna,"Mussels, decapod crustacea",Critically Endangered,Vulnerable,rest,2013-12-17,2023-01-16,Critically Endangered,Critically Endangered,,obscured
0,2804052,Engaeus phyllocercus,Narracan Burrowing Crayfish,Engaeus phyllocercus,https://biodiversity.org.au/afd/taxa/bb2b1f80-...,dr18669,"[{'key': 'taxonID', 'value': '1695'}, {'key': ...",1695,"Smith & Schuster, 1913",Aquatic fauna,"Mussels, decapod crustacea",Endangered,Endangered,rest,2012-11-07,2023-01-16,Endangered,Endangered,,obscured
0,2803985,Engaeus rostrogaleatus,Strzelecki Burrowing Crayfish,Engaeus rostrogaleatus,https://biodiversity.org.au/afd/taxa/cd66d8b6-...,dr18669,"[{'key': 'taxonID', 'value': '1683'}, {'key': ...",1683,"Horwitz, 1990",Aquatic fauna,"Mussels, decapod crustacea",Endangered,Endangered,rest,2012-11-07,2023-01-16,Endangered,Endangered,,obscured
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
0,2804096,Synamphisopus ambiguus,Phreatoic Isopod,Synamphisopus ambiguus,https://biodiversity.org.au/afd/taxa/bc3b9067-...,dr18669,"[{'key': 'taxonID', 'value': '75168'}, {'key':...",75168,"(Sheard, 1936)",Terrestrial fauna,Invertebrates,Vulnerable,Vulnerable,rest,2010-09-16,2023-01-16,Vulnerable,Vulnerable,,obscured
0,2804078,Synamphisopus doegi,Phreatoic Isopod,Synamphisopus doegi,https://biodiversity.org.au/afd/taxa/fdb51ee6-...,dr18669,"[{'key': 'taxonID', 'value': '75169'}, {'key':...",75169,"Wilson & Keable, 2002",Terrestrial fauna,Invertebrates,Vulnerable,Vulnerable,rest,2012-11-20,2023-01-16,Vulnerable,Vulnerable,,obscured
0,2804004,Varanus rosenbergi,Heath Monitor,Varanus rosenbergi,https://biodiversity.org.au/afd/taxa/a01a6bb4-...,dr18669,"[{'key': 'taxonID', 'value': '12287'}, {'key':...",12287,,Terrestrial fauna,Reptiles,Critically Endangered,Endangered,rest,2020-04-14,2023-01-16,Critically Endangered,Critically Endangered,,obscured
0,2804077,Vermicella annulata,Bandy Bandy,Vermicella annulata,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...,dr18669,"[{'key': 'taxonID', 'value': '12734'}, {'key':...",12734,,Terrestrial fauna,Reptiles,Endangered,Vulnerable,rest,2018-08-03,2023-01-16,Endangered,Endangered,,obscured


In [6]:
conservationlist = lf.download_ala_list("https://lists-test.ala.org.au/ws/speciesListItems/dr655?max=10000&includeKVP=true")
conservationlist = lf.kvp_to_columns(conservationlist)
conservationlist['vba_geoprivacy'] = conservationlist['restrictedFlag'].apply(lambda x: 'obscured' if -pd.isnull(x) else 'open')
conservationlist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,taxonID,scientificNameAuthority,primaryDiscipline,...,ffgactstatus,vicadvisorystatus,modified,extractDate,status,sourceStatus,epbcactStatus,restrictedFlag,establishmentMeans,vba_geoprivacy
0,2803160,Ambassis agassizii,Agassiz's Glassfish,Ambassis agassizii,https://biodiversity.org.au/afd/taxa/b0ff773c-...,dr655,"[{'key': 'taxonID', 'value': '4864'}, {'key': ...",4864,"Steindachner, 1867",Aquatic fauna,...,Extinct,Regionally extinct,2013-04-04,2023-01-16,Extinct,Extinct,,,,obscured
0,2803307,Bidyanus bidyanus,Silver Perch,Bidyanus bidyanus,https://biodiversity.org.au/afd/taxa/05866f31-...,dr655,"[{'key': 'taxonID', 'value': '528544'}, {'key'...",528544,"(Mitchell, 1838)",Aquatic fauna,...,Endangered,Vulnerable,2016-05-24,2023-01-16,Endangered,Endangered,Critically Endangered,,,obscured
0,2802393,Chelodina expansa,Broad-shelled Turtle,Chelodina (Macrochelodina) expansa,https://biodiversity.org.au/afd/taxa/fc7d0724-...,dr655,"[{'key': 'taxonID', 'value': '5133'}, {'key': ...",5133,"Gray, 1857",Aquatic fauna,...,Endangered,Endangered,2014-11-20,2023-01-16,Endangered,Endangered,,,,obscured
0,2803025,Craterocephalus fluviatilis,Murray Hardyhead,Craterocephalus fluviatilis,https://biodiversity.org.au/afd/taxa/50568ccf-...,dr655,"[{'key': 'taxonID', 'value': '4784'}, {'key': ...",4784,"McCulloch, 1912",Aquatic fauna,...,Critically Endangered,Critically endangered,2013-04-04,2023-01-16,Critically Endangered,Critically Endangered,Endangered,,,obscured
0,2802906,Emydura macquarii,Southern River Turtles,Emydura macquarii,https://biodiversity.org.au/afd/taxa/39c22a1e-...,dr655,"[{'key': 'taxonID', 'value': '5135'}, {'key': ...",5135,,Aquatic fauna,...,Critically Endangered,Vulnerable,2013-07-02,2023-01-16,Critically Endangered,Critically Endangered,,,,obscured
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
0,2802846,Varanus varius,Lace Monitor,Varanus varius,https://biodiversity.org.au/afd/taxa/6338346a-...,dr655,"[{'key': 'taxonID', 'value': '12283'}, {'key':...",12283,,Terrestrial fauna,...,Endangered,Endangered,2013-04-29,2023-01-16,Endangered,Endangered,,,,obscured
0,2803060,Vermicella annulata,Bandy Bandy,Vermicella annulata,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...,dr655,"[{'key': 'taxonID', 'value': '12734'}, {'key':...",12734,,Terrestrial fauna,...,Endangered,Vulnerable,2018-08-03,2023-01-16,Endangered,Endangered,,rest,,open
0,2803837,Victaphanta compacta,Otway Black Snail,Victaphanta compacta,https://biodiversity.org.au/afd/taxa/e9582432-...,dr655,"[{'key': 'taxonID', 'value': '15050'}, {'key':...",15050,"(Cox & Hedley, 1912)",Terrestrial fauna,...,Endangered,Endangered,2010-12-02,2023-01-16,Endangered,Endangered,,rest,,open
0,2802531,Xenus cinereus,Terek Sandpiper,Xenus cinereus,https://biodiversity.org.au/afd/taxa/4090ad27-...,dr655,"[{'key': 'taxonID', 'value': '10160'}, {'key':...",10160,,Terrestrial fauna,...,Endangered,Endangered,2010-12-02,2023-01-16,Endangered,Endangered,,,,obscured


In [9]:
# join them in a way that works for inat (eg sensitive list, geoprivacy = 'obscured'
statelist = pd.concat([sensitivelist[['taxonID', 'name', 'status', 'vba_geoprivacy', 'lsid']],
                       conservationlist[['taxonID', 'name', 'status', 'vba_geoprivacy', 'lsid']]]).drop_duplicates()
statelist = statelist.rename(columns={'taxonID':'vba_taxonID','name':'vba_name','status':'vba_status'})
statelist

Unnamed: 0,vba_taxonID,vba_name,vba_status,vba_geoprivacy,lsid
0,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...
0,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...
0,1694,Engaeus mallacoota,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf6f5d52-...
0,1695,Engaeus phyllocercus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/bb2b1f80-...
0,1683,Engaeus rostrogaleatus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/cd66d8b6-...
...,...,...,...,...,...
0,12283,Varanus varius,Endangered,obscured,https://biodiversity.org.au/afd/taxa/6338346a-...
0,12734,Vermicella annulata,Endangered,open,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...
0,15050,Victaphanta compacta,Endangered,open,https://biodiversity.org.au/afd/taxa/e9582432-...
0,10160,Xenus cinereus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/4090ad27-...


In [14]:
# check for duplicates with conflicting information
statelist.groupby('vba_taxonID').filter(lambda x: len(x) > 1)


Unnamed: 0,vba_taxonID,vba_name,vba_status,vba_geoprivacy,lsid
0,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...
0,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...
0,1694,Engaeus mallacoota,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf6f5d52-...
0,1695,Engaeus phyllocercus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/bb2b1f80-...
0,1683,Engaeus rostrogaleatus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/cd66d8b6-...
...,...,...,...,...,...
0,75168,Synamphisopus ambiguus,Vulnerable,open,https://biodiversity.org.au/afd/taxa/bc3b9067-...
0,75169,Synamphisopus doegi,Vulnerable,open,https://biodiversity.org.au/afd/taxa/fdb51ee6-...
0,12287,Varanus rosenbergi,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/a01a6bb4-...
0,12734,Vermicella annulata,Endangered,open,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...


Use the GBIF names parser to clean up the names

In [21]:
# %%script echo skipping # comment this line to run the gbif parser again the web and save a file locally

import requests

namesonly = alasensitivelist['name']
url = "https://api.gbif.org/v1/parser/name"
headers = {'content-type' : 'application/json'}
data = namesonly.to_json(orient="values")
params = {'name':data}
r = requests.post(url=url,data=data,headers=headers)
results = pd.read_json(r.text)
results.to_csv(sourcedir + "wa-gbif.csv")
results

Unnamed: 0,scientificName,type,genusOrAbove,specificEpithet,parsed,parsedPartially,canonicalName,canonicalNameComplete,canonicalNameWithMarker,rankMarker,strain,infraSpecificEpithet,authorship,notho,bracketAuthorship,bracketYear,cultivarEpithet
0,Abildgaardia pachyptera,SCIENTIFIC,Abildgaardia,pachyptera,True,False,Abildgaardia pachyptera,Abildgaardia pachyptera,Abildgaardia pachyptera,sp.,,,,,,,
1,Abutilon sp. Hamelin (A.M. Ashby 2196),INFORMAL,Abutilon,,True,False,Abutilon spec.,Abutilon spec. Hamelin,Abutilon spec. Hamelin,sp.,Hamelin,,,,,,
2,Abutilon sp. Onslow (F. Smith s.n. 10/9/61),INFORMAL,Abutilon,,True,False,Abutilon spec.,Abutilon spec. Onslow,Abutilon spec. Onslow,sp.,Onslow,,,,,,
3,Abutilon sp. Pritzelianum (S. van Leeuwen 5095),INFORMAL,Abutilon,,True,False,Abutilon spec.,Abutilon spec. Pritzelianum,Abutilon spec. Pritzelianum,sp.,Pritzelianum,,,,,,
4,Abutilon sp. Quobba (H. Demarz 3858),INFORMAL,Abutilon,,True,False,Abutilon spec.,Abutilon spec. Quobba,Abutilon spec. Quobba,sp.,Quobba,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4450,Zephyrarchaea mainae,SCIENTIFIC,Zephyrarchaea,mainae,True,False,Zephyrarchaea mainae,Zephyrarchaea mainae,Zephyrarchaea mainae,sp.,,,,,,,
4451,Zephyrarchaea marki,SCIENTIFIC,Zephyrarchaea,marki,True,False,Zephyrarchaea marki,Zephyrarchaea marki,Zephyrarchaea marki,sp.,,,,,,,
4452,Zephyrarchaea melindae,SCIENTIFIC,Zephyrarchaea,melindae,True,False,Zephyrarchaea melindae,Zephyrarchaea melindae,Zephyrarchaea melindae,sp.,,,,,,,
4453,Zephyrarchaea robinsi,SCIENTIFIC,Zephyrarchaea,robinsi,True,False,Zephyrarchaea robinsi,Zephyrarchaea robinsi,Zephyrarchaea robinsi,sp.,,,,,,,


In [19]:
%%script echo skipping single gbif parser test
import requests
t = requests.get(url="https://api.gbif.org/v1/parser/name?name=Calandrinia sp. Berry Springs (M.O. Parker 855) PN)")
t.text

<Response [200]>

Merge the parsed names back into the dataset

In [54]:
alasensitivelist = pd.read_csv(sourcedir + "wa-ala.csv", dtype=str)
parsednames = pd.read_csv(sourcedir + "wa-gbif.csv", dtype=str)
alasensitivelist = alasensitivelist.merge(parsednames[['scientificName','canonicalName','canonicalNameComplete','type','rankMarker']],how="left",left_on="name",right_on="scientificName")

***Report - unsuccessful parsed names with important statuses*** - these are being excluded iNaturalist won't accept the names

In [55]:
alasensitivelist[(alasensitivelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) & (alasensitivelist['sourceStatus'].isin(['CD','CR','EN','VU'])))][['name','scientificName_x','scientificName_y','lsid','canonicalName','canonicalNameComplete','type','rankMarker']]
#cols for debugging
#alasensitivelist[['name','scientificName_x','scientificName_y','lsid','canonicalName','canonicalNameComplete','type','rankMarker']]


Unnamed: 0,name,scientificName_x,scientificName_y,lsid,canonicalName,canonicalNameComplete,type,rankMarker
370,Andersonia sp. Saxatilis (F. & J. Hort 3324),Andersonia sp. Saxatilis (F. & J.Hort 3324),Andersonia sp. Saxatilis (F. & J. Hort 3324),https://id.biodiversity.org.au/node/apni/2900142,Andersonia spec.,Andersonia spec. Saxatilis,INFORMAL,sp.
926,Chamelaucium sp. Cataby (G.J. Keighery 11009),Chamelaucium sp. Cataby (G.J.Keighery 11009),Chamelaucium sp. Cataby (G.J. Keighery 11009),https://id.biodiversity.org.au/node/apni/2911451,Chamelaucium spec.,Chamelaucium spec. Cataby,INFORMAL,sp.
1138,Darwinia sp. Mt Heywood (R. Davis 11066),Darwinia sp. Mt Heywood (R.Davis 11066),Darwinia sp. Mt Heywood (R. Davis 11066),https://id.biodiversity.org.au/node/apni/2910287,Darwinia spec.,Darwinia spec. Mt Heywood,INFORMAL,sp.
1384,Eremophila glabra subsp. Scaddan (C. Turley s....,Eremophila glabra subsp. Scaddan (C.Turley s.n...,Eremophila glabra subsp. Scaddan (C. Turley s....,https://id.biodiversity.org.au/node/apni/2908854,Eremophila glabra subsp.,Eremophila glabra subsp. Scaddan,INFORMAL,subsp.
2006,Grevillea sp. Gillingarra (R.J. Cranfield 4087),Grevillea sp. Gillingarra (R.J.Cranfield 4087),Grevillea sp. Gillingarra (R.J. Cranfield 4087),ALA_DR656_926,Grevillea spec.,Grevillea spec. Gillingarra,INFORMAL,sp.
2223,Hypocalymma angustifolium subsp. Hutt River (S...,Hypocalymma angustifolium subsp. Hutt River (S...,Hypocalymma angustifolium subsp. Hutt River (S...,https://id.biodiversity.org.au/node/apni/2917460,Hypocalymma angustifolium subsp.,Hypocalymma angustifolium subsp. Hutt River,INFORMAL,subsp.
2234,Hypocalymma sp. Cascade (R. Bruhn 20896),Hypocalymma sp. Cascade (R.Bruhn 20896),Hypocalymma sp. Cascade (R. Bruhn 20896),https://id.biodiversity.org.au/node/apni/2886939,Hypocalymma spec.,Hypocalymma spec. Cascade,INFORMAL,sp.
2349,Lambertia orbifolia subsp. Scott River Plains ...,Lambertia orbifolia subsp. Scott River Plains ...,Lambertia orbifolia subsp. Scott River Plains ...,https://id.biodiversity.org.au/node/apni/2894654,Lambertia orbifolia subsp.,Lambertia orbifolia subsp. Scott River Plains,INFORMAL,subsp.
2508,Leucopogon sp. Manypeaks (A.S. George 6488),Leucopogon sp. Manypeaks (A.S.George 6488),Leucopogon sp. Manypeaks (A.S. George 6488),https://id.biodiversity.org.au/node/apni/2918123,Leucopogon spec.,Leucopogon spec. Manypeaks,INFORMAL,sp.
2614,Melaleuca sp. Wanneroo (G.J. Keighery 16705),Melaleuca sp. Wanneroo (G.J.Keighery 16705),Melaleuca sp. Wanneroo (G.J. Keighery 16705),https://id.biodiversity.org.au/node/apni/2903320,Melaleuca spec.,Melaleuca spec. Wanneroo,INFORMAL,sp.


Prepare final list for matching to inaturalist names

In [56]:
alasensitivelist = alasensitivelist[~alasensitivelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) ] # remove 543 INFORMAL, 3 CULTIVAR, 14 HYBRID
alasensitivelist['wa_geoprivacy'] = 'obscured'
alasensitivelist['wa_taxonID'] = alasensitivelist['taxonId']#.apply(lambda x: int(float(x)))
alasensitivelist['wa_scientificName'] = alasensitivelist['canonicalName']
alasensitivelist['wa_status'] = alasensitivelist['status']
statelist = pd.DataFrame(alasensitivelist[['wa_taxonID','wa_scientificName','wa_status','wa_geoprivacy','lsid']])
statelist

Unnamed: 0,wa_taxonID,wa_scientificName,wa_status,wa_geoprivacy,lsid
0,50593,Abildgaardia pachyptera,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/name/apni/51389644
6,14044,Acacia adinophylla,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2898130
7,44442,Acacia adjutrices,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/taxon/apni/5128...
8,16110,Acacia alata platyptera,"Priority 4: Rare, Near Threatened",obscured,https://id.biodiversity.org.au/node/apni/2904348
9,13074,Acacia alexandri,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2904701
...,...,...,...,...,...
4450,,Zephyrarchaea mainae,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/61b8777b-...
4451,,Zephyrarchaea marki,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/c135a409-...
4452,,Zephyrarchaea melindae,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/df8d4917-...
4453,,Zephyrarchaea robinsi,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/038e56d8-...


### 4. Equivalent IUCN statuses

In [57]:
iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild','Extinct'}
statelist.groupby(['wa_status'])['wa_status'].count()

wa_status
Conservation Dependent                 6
Critically Endangered                210
Endangered                           194
Extinct                               37
Migratory                             94
Other Specially Protected              4
Priority 1: Poorly-known species     885
Priority 2: Poorly-known species     827
Priority 3: Poorly-known species     981
Priority 4: Rare, Near Threatened    397
Vulnerable                           260
Name: wa_status, dtype: int64

In [58]:
# these will be used to populate the iucn_equivalent field
iucnStatusMappings = {
    'conservation dependent': 'Vulnerable',
    'critically endangered': 'Critically Endangered',
    'endangered':'Endangered',
    'extinct':'Extinct',
    'migratory':'Vulnerable',
    'other specially protencted':'Vulnerable',
    'priority 1: poorly-known species':'Data Deficient',
    'priority 2: poorly-known species':'Data Deficient',
    'priority 3: poorly-known species':'Data Deficient',
    'priority 4: rare, near threatened':'Vulnerable',
    'vulnerable':'Vulnerable',
    'not evaluated':'Not Evaluated',
    'data deficient':'Data Deficient',
    'least concern':'Least Concern',
    'special least concern':'Least Concern',
    'near threatened':'Near Threatened',
    'extinct in the wild':'Extinct in the Wild',
}

### 5. Determine best place ID to use

In [61]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 6827 - note for extract


place_id  place_name         place_display_name   
                                                        1
6744      Australia          Australia                  1
6827      Western Australia  Western Australia, AU    982
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State lists on scientificName

1. Match - updates, even if the statuses are the same we'll update the links and values anyway
2. No match - statuses to be added (additions)
   1.1 No match and no taxnomy - search for synonyms
   1.2 No match
3. Merge the other direction to see if there are deletes?


In [64]:
# join to see which lists already have a status in inaturalist based on scientificName
mergedstatuses = statelist[['wa_taxonID','wa_scientificName','wa_status','wa_geoprivacy','lsid']].merge(inatstatuses[['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']],how="left",left_on='wa_scientificName',right_on='scientificName',suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses


Unnamed: 0,wa_taxonID,wa_scientificName,wa_status,wa_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,iucn,authority,status,geoprivacy,place_id,place_display_name
1,14044,Acacia adinophylla,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2898130,152923,Acacia adinophylla,898581,708886,,40,WA Department of Environment and Convservation,endangered,obscured,6827,"Western Australia, AU"
2,44442,Acacia adjutrices,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/taxon/apni/5128...,153375,Acacia adjutrices,898583,708886,,40,WA Department of Environment and Convservation,endangered,obscured,6827,"Western Australia, AU"
4,13074,Acacia alexandri,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2904701,153652,Acacia alexandri,898592,708886,,40,WA Department of Environment and Convservation,endangered,obscured,6827,"Western Australia, AU"
5,14046,Acacia ampliata,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2919087,153206,Acacia ampliata,827789,708886,,40,WA Department of Environment and Convservation,endangered,obscured,6827,"Western Australia, AU"
6,14047,Acacia amyctica,Priority 2: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2902736,153042,Acacia amyctica,898602,708886,,40,WA Department of Environment and Convservation,endangered,obscured,6827,"Western Australia, AU"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3871,,Turnix varius scintillans,Endangered,obscured,https://biodiversity.org.au/afd/taxa/a5f510c9-...,,,,,,,,,,,
3872,,Tursiops aduncus,Migratory,obscured,https://biodiversity.org.au/afd/taxa/0cfe42e3-...,,,,,,,,,,,
3873,,Tyto novaehollandiae kimberli,Priority 1: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/d1a27333-...,,,,,,,,,,,
3874,,Tyto novaehollandiae novaehollandiae,Priority 3: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/44488be2-...,,,,,,,,,,,


In [66]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
#mergedstatuses['new_description'] = "Listed as sensitive - refer to https://www.dpaw.wa.gov.au/plants-and-animals/threatened-species-and-communities"
# url is either a florabase url or bie page
florabaseurl = "https://florabase.dpaw.wa.gov.au/browse/profile/"
biesearchurl = "https://bie.ala.org.au/species/" # eg + "https://id.biodiversity.org.au/node/apni/2894366"
mergedstatuses['new_url'] = mergedstatuses.apply(lambda x: biesearchurl + x['lsid'] if pd.isna(x['wa_taxonID']) else florabaseurl + x['wa_taxonID'],axis=1)
floradescrurl = "Listed as Confidential - refer to https://www.dpaw.wa.gov.au/plants-and-animals/threatened-species-and-communities/threatened-plants"
faunadescrurl = "Listed as Confidential - refer to https://www.dpaw.wa.gov.au/plants-and-animals/threatened-species-and-communities/threatened-animals"
mergedstatuses['new_description'] = mergedstatuses.apply(lambda x: faunadescrurl if pd.isna(x['wa_taxonID']) else floradescrurl,axis=1)
mergedstatuses['new_authority'] = "WA Deparment of Biodiversity, Conservation and Attractions"
mergedstatuses.rename(columns={'wa_geoprivacy':'new_geoprivacy'},inplace=True)
mergedstatuses['new_place_id'] = '6827'  # Queensland, AU
mergedstatuses['new_username'] = 'peggydnew'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['wa_status'].fillna('Sensitive')
mergedstatuses

Unnamed: 0,wa_taxonID,wa_scientificName,wa_status,new_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,...,geoprivacy,place_id,place_display_name,new_url,new_description,new_authority,new_place_id,new_username,new_iucn_equivalent,new_status
1,14044,Acacia adinophylla,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2898130,152923,Acacia adinophylla,898581,708886,,...,obscured,6827,"Western Australia, AU",https://florabase.dpaw.wa.gov.au/browse/profil...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Endangered,Priority 1: Poorly-known species
2,44442,Acacia adjutrices,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/taxon/apni/5128...,153375,Acacia adjutrices,898583,708886,,...,obscured,6827,"Western Australia, AU",https://florabase.dpaw.wa.gov.au/browse/profil...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Endangered,Priority 3: Poorly-known species
4,13074,Acacia alexandri,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2904701,153652,Acacia alexandri,898592,708886,,...,obscured,6827,"Western Australia, AU",https://florabase.dpaw.wa.gov.au/browse/profil...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Endangered,Priority 3: Poorly-known species
5,14046,Acacia ampliata,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2919087,153206,Acacia ampliata,827789,708886,,...,obscured,6827,"Western Australia, AU",https://florabase.dpaw.wa.gov.au/browse/profil...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Endangered,Priority 1: Poorly-known species
6,14047,Acacia amyctica,Priority 2: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2902736,153042,Acacia amyctica,898602,708886,,...,obscured,6827,"Western Australia, AU",https://florabase.dpaw.wa.gov.au/browse/profil...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Endangered,Priority 2: Poorly-known species
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3871,,Turnix varius scintillans,Endangered,obscured,https://biodiversity.org.au/afd/taxa/a5f510c9-...,,,,,,...,,,,https://bie.ala.org.au/species/https://biodive...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Vulnerable,Endangered
3872,,Tursiops aduncus,Migratory,obscured,https://biodiversity.org.au/afd/taxa/0cfe42e3-...,,,,,,...,,,,https://bie.ala.org.au/species/https://biodive...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Vulnerable,Migratory
3873,,Tyto novaehollandiae kimberli,Priority 1: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/d1a27333-...,,,,,,...,,,,https://bie.ala.org.au/species/https://biodive...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Vulnerable,Priority 1: Poorly-known species
3874,,Tyto novaehollandiae novaehollandiae,Priority 3: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/44488be2-...,,,,,,...,,,,https://bie.ala.org.au/species/https://biodive...,Listed as Confidential - refer to https://www....,"WA Deparment of Biodiversity, Conservation and...",6827,peggydnew,Vulnerable,Priority 3: Poorly-known species


## Updates

In [69]:
# those that need to be updated - we found a status
mergedstatuses[mergedstatuses['status_id'].notnull()][['wa_scientificName','wa_status','status_id','taxon_id','status','new_geoprivacy','geoprivacy','authority','user_id']]

Unnamed: 0,wa_scientificName,wa_status,status_id,taxon_id,status,new_geoprivacy,geoprivacy,authority,user_id
1,Acacia adinophylla,Priority 1: Poorly-known species,152923,898581,endangered,obscured,obscured,WA Department of Environment and Convservation,708886
2,Acacia adjutrices,Priority 3: Poorly-known species,153375,898583,endangered,obscured,obscured,WA Department of Environment and Convservation,708886
4,Acacia alexandri,Priority 3: Poorly-known species,153652,898592,endangered,obscured,obscured,WA Department of Environment and Convservation,708886
5,Acacia ampliata,Priority 1: Poorly-known species,153206,827789,endangered,obscured,obscured,WA Department of Environment and Convservation,708886
6,Acacia amyctica,Priority 2: Poorly-known species,153042,898602,endangered,obscured,obscured,WA Department of Environment and Convservation,708886
...,...,...,...,...,...,...,...,...,...
3894,Zephyrarchaea melindae,Vulnerable,153596,828667,vulnerable,obscured,obscured,WA Department of Environment and Convservation,708886
3895,Zephyrarchaea robinsi,Vulnerable,153176,828668,vulnerable,obscured,obscured,WA Department of Environment and Convservation,708886
3314,Zeuxine oblonga,Priority 2: Poorly-known species,169907,369267,NT,obscured,,Atlas of Living Australia,702203
3315,Zeuxine oblonga,Priority 2: Poorly-known species,153758,369267,NT,obscured,obscured,WA Department of Environment and Convservation,708886


In [70]:
# updates - create the data frame
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
updates = pd.DataFrame(mergedstatuses[mergedstatuses['status_id'].notnull()])
updates.sort_values('scientificName')
updates['action'] = 'UPDATE'
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,UPDATE,Acacia adinophylla,152923,898581,Priority 1: Poorly-known species,Endangered,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
2,UPDATE,Acacia adjutrices,153375,898583,Priority 3: Poorly-known species,Endangered,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
4,UPDATE,Acacia alexandri,153652,898592,Priority 3: Poorly-known species,Endangered,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
5,UPDATE,Acacia ampliata,153206,827789,Priority 1: Poorly-known species,Endangered,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
6,UPDATE,Acacia amyctica,153042,898602,Priority 2: Poorly-known species,Endangered,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
...,...,...,...,...,...,...,...,...,...,...,...,...
3894,UPDATE,Zephyrarchaea melindae,153596,828667,Vulnerable,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://bie.ala.org.au/species/https://biodive...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3895,UPDATE,Zephyrarchaea robinsi,153176,828668,Vulnerable,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://bie.ala.org.au/species/https://biodive...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3314,UPDATE,Zeuxine oblonga,169907,369267,Priority 2: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3315,UPDATE,Zeuxine oblonga,153758,369267,Priority 2: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....


## No status in iNaturalist via straight scientificName match
The WA records that didn't match up to a status in iNaturalist

In [72]:
# to add: those that have no inaturalist status - 532!!
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="wa_scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,wa_taxonID,wa_scientificName,wa_status,new_geoprivacy,lsid,status_id,scientificName_x,taxon_id,user_id,description,...,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName_y,taxonRank,references
0,50593,Abildgaardia pachyptera,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/name/apni/51389644,,,,,,...,,,,,,,,,,
1,16110,Acacia alata platyptera,"Priority 4: Rare, Near Threatened",obscured,https://id.biodiversity.org.au/node/apni/2904348,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,alata,platyptera,2019-02-16T06:09:01Z,Acacia alata platyptera,variety,http://www.ubio.org/browser/details.php?nameba...
2,14585,Acacia ancistrophylla lissophylla,Priority 2: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2916096,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,ancistrophylla,lissophylla,2022-03-07T22:20:25Z,Acacia ancistrophylla lissophylla,variety,https://powo.science.kew.org/taxon/urn:lsid:ip...
3,14048,Acacia ancistrophylla perarcuata,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2910813,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,ancistrophylla,perarcuata,2021-07-28T02:21:59Z,Acacia ancistrophylla perarcuata,variety,https://eol.org/pages/50482478
4,14725,Acacia ataxiphylla ataxiphylla,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2903075,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3019,,Turnix varius scintillans,Endangered,obscured,https://biodiversity.org.au/afd/taxa/a5f510c9-...,,,,,,...,Aves,Charadriiformes,Turnicidae,Turnix,varius,scintillans,2018-12-19T06:54:39Z,Turnix varius scintillans,subspecies,http://www.birds.cornell.edu/clementschecklist...
3020,,Tursiops aduncus,Migratory,obscured,https://biodiversity.org.au/afd/taxa/0cfe42e3-...,,,,,,...,Mammalia,Artiodactyla,Delphinidae,Tursiops,aduncus,,2019-11-23T00:16:07Z,Tursiops aduncus,species,http://www.catalogueoflife.org/annual-checklis...
3021,,Tyto novaehollandiae kimberli,Priority 1: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/d1a27333-...,,,,,,...,Aves,Strigiformes,Tytonidae,Tyto,novaehollandiae,kimberli,2018-12-19T08:22:30Z,Tyto novaehollandiae kimberli,subspecies,
3022,,Tyto novaehollandiae novaehollandiae,Priority 3: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/44488be2-...,,,,,,...,Aves,Strigiformes,Tytonidae,Tyto,novaehollandiae,novaehollandiae,2018-12-19T08:22:32Z,Tyto novaehollandiae novaehollandiae,subspecies,


In [73]:
noinatstatus[noinatstatus['id'].notna()] # there's no status but there is a matching inat taxon (id is the taxon id)
# note: "Dendrobium" matches to both genus and section

Unnamed: 0,wa_taxonID,wa_scientificName,wa_status,new_geoprivacy,lsid,status_id,scientificName_x,taxon_id,user_id,description,...,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName_y,taxonRank,references
1,16110,Acacia alata platyptera,"Priority 4: Rare, Near Threatened",obscured,https://id.biodiversity.org.au/node/apni/2904348,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,alata,platyptera,2019-02-16T06:09:01Z,Acacia alata platyptera,variety,http://www.ubio.org/browser/details.php?nameba...
2,14585,Acacia ancistrophylla lissophylla,Priority 2: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2916096,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,ancistrophylla,lissophylla,2022-03-07T22:20:25Z,Acacia ancistrophylla lissophylla,variety,https://powo.science.kew.org/taxon/urn:lsid:ip...
3,14048,Acacia ancistrophylla perarcuata,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2910813,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,ancistrophylla,perarcuata,2021-07-28T02:21:59Z,Acacia ancistrophylla perarcuata,variety,https://eol.org/pages/50482478
6,31784,Acacia barrettiorum,Priority 2: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2917300,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,barrettiorum,,2022-04-07T01:35:57Z,Acacia barrettiorum,species,https://eol.org/pages/49426101
7,41461,Acacia bartlei,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2895538,,,,,,...,Magnoliopsida,Fabales,Fabaceae,Acacia,bartlei,,2022-04-07T01:36:00Z,Acacia bartlei,species,https://eol.org/pages/49426080
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3018,,Turgenitubulus christenseni,Endangered,obscured,https://biodiversity.org.au/afd/taxa/06192667-...,,,,,,...,Gastropoda,Stylommatophora,Camaenidae,Turgenitubulus,christenseni,,2021-10-29T18:32:13Z,Turgenitubulus christenseni,species,http://www.catalogueoflife.org/annual-checklis...
3019,,Turnix varius scintillans,Endangered,obscured,https://biodiversity.org.au/afd/taxa/a5f510c9-...,,,,,,...,Aves,Charadriiformes,Turnicidae,Turnix,varius,scintillans,2018-12-19T06:54:39Z,Turnix varius scintillans,subspecies,http://www.birds.cornell.edu/clementschecklist...
3020,,Tursiops aduncus,Migratory,obscured,https://biodiversity.org.au/afd/taxa/0cfe42e3-...,,,,,,...,Mammalia,Artiodactyla,Delphinidae,Tursiops,aduncus,,2019-11-23T00:16:07Z,Tursiops aduncus,species,http://www.catalogueoflife.org/annual-checklis...
3021,,Tyto novaehollandiae kimberli,Priority 1: Poorly-known species,obscured,https://biodiversity.org.au/afd/taxa/d1a27333-...,,,,,,...,Aves,Strigiformes,Tytonidae,Tyto,novaehollandiae,kimberli,2018-12-19T08:22:30Z,Tyto novaehollandiae kimberli,subspecies,


In [75]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions['scientificName'] = additions['wa_scientificName']
#additions['new_status'] = additions['wa_status']
additions.sort_values(['scientificName'])
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                  'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,ADD,Acacia alata platyptera,,145423,"Priority 4: Rare, Near Threatened",Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
2,ADD,Acacia ancistrophylla lissophylla,,1361077,Priority 2: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3,ADD,Acacia ancistrophylla perarcuata,,1252488,Priority 3: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
6,ADD,Acacia barrettiorum,,1252534,Priority 2: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
7,ADD,Acacia bartlei,,1252535,Priority 3: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://florabase.dpaw.wa.gov.au/browse/profil...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
...,...,...,...,...,...,...,...,...,...,...,...,...
3018,ADD,Turgenitubulus christenseni,,1242615,Endangered,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://bie.ala.org.au/species/https://biodive...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3019,ADD,Turnix varius scintillans,,708564,Endangered,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://bie.ala.org.au/species/https://biodive...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3020,ADD,Tursiops aduncus,,41481,Migratory,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://bie.ala.org.au/species/https://biodive...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....
3021,ADD,Tyto novaehollandiae kimberli,,732085,Priority 1: Poorly-known species,Vulnerable,"WA Deparment of Biodiversity, Conservation and...",https://bie.ala.org.au/species/https://biodive...,obscured,6827,peggydnew,Listed as Confidential - refer to https://www....


In [82]:
all = pd.concat([updates,additions])
all.to_csv(sourcedir + "wa.csv", index=False )

In [76]:
# what didnt match to a taxon?
noinatstatus[noinatstatus['id'].isna()]

Unnamed: 0,wa_taxonID,wa_scientificName,wa_status,new_geoprivacy,lsid,status_id,scientificName_x,taxon_id,user_id,description,...,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName_y,taxonRank,references
0,50593,Abildgaardia pachyptera,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/name/apni/51389644,,,,,,...,,,,,,,,,,
4,14725,Acacia ataxiphylla ataxiphylla,Priority 3: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2903075,,,,,,...,,,,,,,,,,
5,14687,Acacia ataxiphylla magna,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2905184,,,,,,...,,,,,,,,,,
9,44472,Acacia besleyi,Priority 1: Poorly-known species,obscured,https://id.biodiversity.org.au/taxon/apni/5128...,,,,,,...,,,,,,,,,,
21,14060,Acacia chapmanii chapmanii,Priority 2: Poorly-known species,obscured,https://id.biodiversity.org.au/node/apni/2887942,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3004,,Stygiochiropus sympatricus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/969d8510-...,,,,,,...,,,,,,,,,,
3005,,Stygocyclopia australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/e13aa7e8-...,,,,,,...,,,,,,,,,,
3006,,Thalassarche carteri,Endangered,obscured,https://biodiversity.org.au/afd/taxa/8368ea93-...,,,,,,...,,,,,,,,,,
3009,,Thalassarche impavida,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/710a7945-...,,,,,,...,,,,,,,,,,


In [79]:
noinatstatus[noinatstatus['id'].isna()].groupby('wa_status').size()

wa_status
Conservation Dependent                 1
Critically Endangered                 51
Endangered                            37
Extinct                               11
Migratory                              4
Priority 1: Poorly-known species     354
Priority 2: Poorly-known species     260
Priority 3: Poorly-known species     274
Priority 4: Rare, Near Threatened     67
Vulnerable                            68
dtype: int64

### are there any that need to be removed?
qld list count: 2517
qld inat statuses count: 653

updates to inat status: 570
additional inat status: 1355
qld statuses we can't find a taxon match for in iNaturalist: 606
total: 2531 (explainable via the various genus/section entries that we matched to in the taxonomy)

inat statuses left over: 653-570=83 that may need checking against the above

In [80]:
# inat statuses that aren't in added or updated
inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]


Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
616,157227,1023268,,6827,,,NT,https://bie.ala.org.au/species/Ptilotus%20daphne,Atlas of Living Australia,obscured,...,Ptilotus,daphne,,2020-02-19T07:06:40Z,Ptilotus daphne,species,http://plantsoftheworldonline.org/,,,
624,157244,1023283,,6827,,,NT,https://bie.ala.org.au/species/Ptilotus%20seri...,Atlas of Living Australia,obscured,...,Ptilotus,sericostachyus,,2020-02-19T07:06:49Z,Ptilotus sericostachyus,species,http://plantsoftheworldonline.org/,,,
664,158009,1042669,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Corinomala,tumida,,2021-10-29T05:06:21Z,Corinomala tumida,species,,,,
1801,153169,107044,708886,6827,16654,WA Department of Environment and Convservation,critically endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Ningbingia,australis,,2021-10-29T09:25:06Z,Ningbingia australis,species,http://www.catalogueoflife.org/annual-checklis...,,,
1947,153332,108173,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Pachysaga,munggai,,2011-08-04T08:56:40Z,Pachysaga munggai,species,http://www.catalogueoflife.org/annual-checklis...,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1872,153220,866315,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Acacia,deltoidea,,2022-04-07T01:42:14Z,Acacia deltoidea,species,,,,
1711,153064,878695,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Youwanjela,wilsoni,,2021-10-29T14:01:16Z,Youwanjela wilsoni,species,https://eol.org/pages/49878898,,,
1753,153112,898346,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euphorbia,occidentaustralica,,2020-02-19T16:40:34Z,Euphorbia occidentaustralica,species,,,,
1690,153038,898351,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euphorbia,inappendiculata,,2020-02-19T16:40:42Z,Euphorbia inappendiculata,species,,,,
