# iNaturalist status updates by state

Using `inat-aust-status-taxa.csv`, the file produced by `collate-status-taxa.ipynb`, generate the lists needed to update iNaturalist statuses.

**Next steps:**
State by state, establish the changes that need to be made:

- a. new - species that appear in the state lists but do not have a status in iNaturalist (new template)
- b. updates - changes to information that we added previously (user_id = 708886) (update template, action='UPDATE')
- c. removals - statuses that we added previously (user_id = 708886) and that are now incorrect (update template, action='REMOVE')
- d. flags - statuses added by other users that may need to be flagged
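
The four categories above can be sketched as a small classification over the existing iNaturalist statuses and the current state list. This is an illustration only: `classify_changes`, the record layout and the placeholder names are hypothetical, records are assumed to match on scientific name, and "incorrect" is read here as "no longer on the state list", which is an assumption. Only the user ID 708886 comes from the notes above.

```python
OUR_USER_ID = "708886"  # the account whose statuses we maintain (from the notes above)


def classify_changes(state_names, inat_rows, our_user_id=OUR_USER_ID):
    """Sort names into new / update / remove / flag buckets.

    state_names: set of scientific names on the current state list
    inat_rows:   list of dicts with 'scientificName' and 'user_id' keys
    (hypothetical layout, for illustration only)
    """
    existing = {row["scientificName"] for row in inat_rows}
    changes = {"new": [], "update": [], "remove": [], "flag": []}
    # a. new: on the state list but no status in iNaturalist yet
    changes["new"] = sorted(state_names - existing)
    for row in inat_rows:
        name = row["scientificName"]
        if row["user_id"] == our_user_id:
            # b./c. statuses we added: update if still listed, otherwise remove
            changes["update" if name in state_names else "remove"].append(name)
        else:
            # d. statuses added by other users are only candidates for flagging
            changes["flag"].append(name)
    return changes


# Placeholder names and a made-up second user id, not real records
sample = classify_changes(
    {"Genus alpha", "Genus beta"},
    [
        {"scientificName": "Genus alpha", "user_id": "708886"},
        {"scientificName": "Genus gamma", "user_id": "708886"},
        {"scientificName": "Genus delta", "user_id": "123"},
    ],
)
print(sample)
```

Statuses from other users land in `flag` regardless of whether they agree with the state list, since that decision needs a manual look.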

## Prep - common to all states
1. Read in the iNaturalist statuses and filter to this state's records
2. Read in the iNaturalist taxa list
3. Read in the state sensitive list
4. Attempt to match the state statuses to an IUCN equivalent


### 1. iNaturalist statuses

In [3]:
import pandas as pd
import sys
import os
projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
#projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
sys.path.append(os.path.abspath(projectdir + "source-code/includes"))
import list_functions as lf

sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"

# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [4]:
def filter_state_statuses(stateregex: str, urlregex: str):
    """Return every status row that matches the state by authority, place name or url."""
    # a row is kept if any one of the four columns matches its pattern
    statemask = (taxastatus['authority'].str.contains(stateregex)
                 | taxastatus['place_display_name'].str.contains(stateregex)
                 | taxastatus['place_name'].str.contains(stateregex)
                 | taxastatus['url'].str.contains(urlregex))
    return taxastatus[statemask].sort_values(['taxon_id', 'user_id'])
inatstatuses = filter_state_statuses("ACT Government|Australian Capital Territory| ACT, AU", ".act.gov")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
2465,152234,100611,,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euastacus,armatus,,2022-06-06T16:36:21Z,Euastacus armatus,species,http://www.iucnredlist.org/apps/redlist/details,,,
270,180956,1130138,3669610.0,12986,,ACT Government,EN,http://www.environment.gov.au/cgi-bin/sprat/pu...,,obscured,...,Prasophyllum,petilum,,2022-08-18T08:30:22Z,Prasophyllum petilum,species,http://www.catalogueoflife.org/annual-checklis...,,,
3305,266735,1255190,5159763.0,12986,,ACT Government,Endangered,https://www.environment.act.gov.au/nature-cons...,Listed as endangered in the ACT,obscured,...,Lepidium,ginninderrense,,2021-07-27T21:58:53Z,Lepidium ginninderrense,species,http://www.catalogueoflife.org/annual-checklis...,,,
2662,152233,138646,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Swainsona,recta,,2022-06-08T16:59:25Z,Swainsona recta,species,,,,
1078,152222,25262,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Pseudophryne,pengilleyi,,2018-11-27T04:54:16Z,Pseudophryne pengilleyi,species,http://research.amnh.org/vz/herpetology/amphib...,,,
1082,152227,36937,708886.0,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Delma,impar,,2019-06-06T05:58:40Z,Delma impar,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
1090,152221,36957,708886.0,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Aprasia,parapulchella,,2022-04-07T23:19:10Z,Aprasia parapulchella,species,http://reptile-database.reptarium.cz/search.ph...,,,
1076,152230,39441,708886.0,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Varanus,rosenbergi,,2021-07-28T09:18:05Z,Varanus rosenbergi,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
727,267488,42983,3249428.0,12986,,Australian Government,EN,http://www.environment.gov.au/cgi-bin/sprat/pu...,,open,...,Phascolarctos,cinereus,,2020-04-02T21:20:45Z,Phascolarctos cinereus,species,http://www.catalogueoflife.org/annual-checklis...,,,
1079,152224,45210,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Pseudomys,fumeus,,2019-11-22T22:45:54Z,Pseudomys fumeus,species,http://www.catalogueoflife.org/annual-checklis...,,,
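
One detail of the filter patterns is worth checking: pandas' `str.contains` treats its argument as a regular expression by default, so each dot in `.act.gov` matches any character. A small sketch with the standard `re` module shows the difference that escaping makes. The second URL is made up for the purpose.

```python
import re

urls = [
    "https://www.environment.act.gov.au/nature-conservation",  # intended match
    "https://impact-gov.example/report",  # made-up URL, not an ACT site
]

loose = ".act.gov"  # each '.' matches any single character
strict = re.escape(".act.gov")  # matches the literal text '.act.gov' only

loose_hits = [u for u in urls if re.search(loose, u)]
strict_hits = [u for u in urls if re.search(strict, u)]

print(len(loose_hits), len(strict_hits))
```

Whether the looser pattern picks up any unwanted rows in the real data is not something this sketch can show; it only illustrates why the two patterns are not equivalent.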


### 2. iNaturalist taxonomy

In [5]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

z=zipfile.ZipFile(sourcedir + filename)

with z.open('taxa.csv') as from_archive:
    inattaxa = pd.read_csv(from_archive,dtype=str)
z.close()
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...
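
The cell above reads `taxa.csv` straight out of the Darwin Core archive without unpacking it. The same pattern is sketched below with a tiny in-memory archive so that it runs without the download (the two rows are taken from the preview above, trimmed to three columns). It uses `with` blocks so the archive is closed even if reading fails, and the `taxon_id_by_name` lookup is an illustrative name, not part of the notebook.

```python
import csv
import io
import zipfile

# Build a small stand-in archive in memory; the real file is inaturalist-taxonomy.dwca.zip
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "taxa.csv",
        "id,scientificName,taxonRank\n1,Animalia,kingdom\n2,Chordata,phylum\n",
    )

# Read the member directly from the archive, keeping every value as a string
with zipfile.ZipFile(buffer) as archive:
    with archive.open("taxa.csv") as member:
        reader = csv.DictReader(io.TextIOWrapper(member, encoding="utf-8"))
        taxa = list(reader)

# Index by scientific name so a state-list name can be resolved to a taxon id
taxon_id_by_name = {row["scientificName"]: row["id"] for row in taxa}
print(taxon_id_by_name["Chordata"])
```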


### 3. State lists
Species on the sensitive list get `geoprivacy` = `obscured`, which overrides the default of `open`.

In [7]:
%%script echo skipping # comment this line to download dataset from lists.ala.org.au the web and save locally
# Download the sensitive and conservation lists from lists.ala.org.au and save them locally as CSV

sensitivelist = lf.download_ala_list("https://lists.ala.org.au/ws/speciesListItems/dr2627?max=10000&includeKVP=true")
sensitivelist = lf.kvp_to_columns(sensitivelist)
sensitivelist.to_csv(sourcedir + "act-ala-sensitive.csv", index=False)

conservationlist = lf.download_ala_list("https://lists.ala.org.au/ws/speciesListItems/dr649?max=10000&includeKVP=true")
conservationlist = lf.kvp_to_columns(conservationlist)
conservationlist.to_csv(sourcedir + "act-ala-conservation.csv", index=False)

skipping # comment this line to download dataset from lists.ala.org.au the web and save locally


In [6]:
# Read sensitive list data
sensitivelist = pd.read_csv(sourcedir + "act-ala-sensitive.csv", dtype=str)
#sensitivelist = sensitivelist.rename(columns={'T S Profile I D':'bionetTaxonID'})
#sensitivelist = sensitivelist.rename(columns={'conservation status': 'status'})
sensitivelist['act_geoprivacy'] = 'obscured'
sensitivelist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,generalisation description,vernacular name,conservation status,Authority,category,generalisation,Reason,act_geoprivacy
0,510630,Prasophyllum petilum,Tarengo Leek Orchid,Prasophyllum petilum,https://id.biodiversity.org.au/taxon/apni/5140...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,A leek orchid,Endangered,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured
1,510637,Corunastylis ectopa,,Genoplesium ectopum,https://id.biodiversity.org.au/taxon/apni/5140...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Brindabella Midge Orchid,Endangered,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured
2,510639,Caladenia actensis,Canberra Spider Orchid,Caladenia actensis,https://id.biodiversity.org.au/taxon/apni/5139...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Canberra Spider Orchid,Endangered,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured
3,510635,Lepidium ginninderrense,Ginninderra Peppercress,Lepidium ginninderrense,https://id.biodiversity.org.au/taxon/apni/5126...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Ginninderra Peppercress,Endangered,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured
4,510646,Pterostylis oreophila,Blue-tongued Orchid,Pterostylis oreophila,https://id.biodiversity.org.au/taxon/apni/5141...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Kiandra Greenhood,Critically endangered,Environment Protection and Biodiversity Conser...,High,1km,Disturbance and collection,obscured
5,510644,Swainsona recta,Small Purple-pea,Swainsona recta,https://id.biodiversity.org.au/node/apni/2916272,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Small Purple Pea,Endangered,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured
6,510634,Muehlenbeckia tuggeranong,Tuggeranong Lignum,Muehlenbeckia tuggeranong,https://id.biodiversity.org.au/node/apni/2896218,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Tuggeranong Lignum,Endangered,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured
7,510638,Tympanocryptis pinguicolla,Grassland Earless Dragon,Tympanocryptis pinguicolla,https://biodiversity.org.au/afd/taxa/5bceebc1-...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,,Endangered,ACT Nature Conservation Act 2014,Very High,1km,Disturbance and collection,obscured
8,510627,Pseudophryne pengilleyi,Northern Corroboree Frog,Pseudophryne pengilleyi,https://biodiversity.org.au/afd/taxa/cc5315bf-...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,,Critically endangered,ACT Nature Conservation Act 2014,Very High,1km,Disturbance and collection,obscured
9,510640,Perunga ochracea,Perunga Grasshopper,Perunga ochracea,https://biodiversity.org.au/afd/taxa/ab03e158-...,dr2627,"[{'key': 'generalisation description', 'value'...",Generalise to 0.01 or 1km,Perunga Grasshopper,Vulnerable,ACT Nature Conservation Act 2014,High,1km,Disturbance and collection,obscured


In [8]:
# Read conservation list data
conservationlist = pd.read_csv(sourcedir + "act-ala-conservation.csv", dtype=str)
#conservationlist = conservationlist.rename(columns={'T S Profile I D':'bionetTaxonID'})
conservationlist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,category,taxonomicstatus,sourcestatus,status,authority,ngunnawal,comments,synonym,taxonremarks
0,4221235,Anthochaera phrygia,Regent Honeyeater,Anthochaera (Xanthomyza) phrygia,https://biodiversity.org.au/afd/taxa/31869a0e-...,dr649,"[{'key': 'category', 'value': 'Bird'}, {'key':...",Bird,accepted,Critically Endangered,Critically Endangered,Nature Conservation Act 2014 (ACT),,,,
1,4221247,Aprasia parapulchella,Pink-tailed Worm-lizard,Aprasia parapulchella,https://biodiversity.org.au/afd/taxa/0d74fa05-...,dr649,"[{'key': 'category', 'value': 'Reptile'}, {'ke...",Reptile,accepted,Vulnernable,Vulnernable,Nature Conservation Act 2014 (ACT),Banburung,,,
2,4221214,Bettongia gaimardi,Tasmanian Bettong,Bettongia gaimardi,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,dr649,"[{'key': 'category', 'value': 'Mammal'}, {'key...",Mammal,accepted,Regionally Conservation Dependent,Regionally Conservation Dependent,Nature Conservation Act 2014 (ACT),Ngaluda,Regionally Conservation Dependent,,
3,4221226,Bidyanus bidyanus,Silver Perch,Bidyanus bidyanus,https://biodiversity.org.au/afd/taxa/05866f31-...,dr649,"[{'key': 'category', 'value': 'Fish'}, {'key':...",Fish,accepted,Endangered,Endangered,Nature Conservation Act 2014 (ACT),Dhingur,,,
4,4221220,Bossiaea grayi,Murrumbidgee Bossiaea,Bossiaea grayi,https://id.biodiversity.org.au/node/apni/2910201,dr649,"[{'key': 'category', 'value': 'Plant'}, {'key'...",Plant,accepted,Endangered,Endangered,Nature Conservation Act 2014 (ACT),,,,
5,4221238,Botaurus poiciloptilus,Australasian Bittern,Botaurus poiciloptilus,https://biodiversity.org.au/afd/taxa/dba78701-...,dr649,"[{'key': 'category', 'value': 'Bird'}, {'key':...",Bird,accepted,Endangered,Endangered,Nature Conservation Act 2014 (ACT),,included in ACT Migratory Species Action Plan,,
6,4221241,Caladenia actensis,Canberra Spider Orchid,Caladenia actensis,https://id.biodiversity.org.au/taxon/apni/5139...,dr649,"[{'key': 'category', 'value': 'Plant'}, {'key'...",Plant,accepted,Critically Endangered,Critically Endangered,Nature Conservation Act 2014 (ACT),,,Arachnorchis actensis,
7,4221261,Calyptorhynchus lathami lathami,Glossy Black-cockatoo,Calyptorhynchus (Calyptorhynchus) lathami lathami,https://biodiversity.org.au/afd/taxa/9823b764-...,dr649,"[{'key': 'category', 'value': 'Bird'}, {'key':...",Bird,accepted,Vulnernable,Vulnernable,Nature Conservation Act 2014 (ACT),,Calyptorhynchus lathami species rank considere...,,
8,4221212,Climacteris picumnus victoriae,Brown Treecreeper (eastern Subspecies),Climacteris (Climacteris) picumnus victoriae,https://biodiversity.org.au/afd/taxa/fe69a214-...,dr649,"[{'key': 'category', 'value': 'Bird'}, {'key':...",Bird,accepted,Vulnernable,Vulnernable,Nature Conservation Act 2014 (ACT),,,,
9,4221255,Corunastylis ectopa,,Genoplesium ectopum,https://id.biodiversity.org.au/taxon/apni/5140...,dr649,"[{'key': 'category', 'value': 'Plant'}, {'key'...",Plant,accepted,Critically Endangered,Critically Endangered,Nature Conservation Act 2014 (ACT),,,,


In [10]:
# quick fix for a misspelled status value
conservationlist['status'] = conservationlist['status'].replace('Vulnernable', 'Vulnerable')
#conservationlist[['a_status','status']].drop_duplicates()

In [11]:
# join them in a way that works for iNaturalist (e.g. species on the sensitive list get geoprivacy = 'obscured')
statelist = conservationlist[['name','lsid','status','authority']].merge(sensitivelist[['name','lsid','act_geoprivacy']], how="left",on='name')
statelist = statelist[['name','lsid_x','act_geoprivacy','status','authority']]
statelist

Unnamed: 0,name,lsid_x,act_geoprivacy,status,authority
0,Anthochaera phrygia,https://biodiversity.org.au/afd/taxa/31869a0e-...,,Critically Endangered,Nature Conservation Act 2014 (ACT)
1,Aprasia parapulchella,https://biodiversity.org.au/afd/taxa/0d74fa05-...,obscured,Vulnerable,Nature Conservation Act 2014 (ACT)
2,Bettongia gaimardi,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,,Regionally Conservation Dependent,Nature Conservation Act 2014 (ACT)
3,Bidyanus bidyanus,https://biodiversity.org.au/afd/taxa/05866f31-...,,Endangered,Nature Conservation Act 2014 (ACT)
4,Bossiaea grayi,https://id.biodiversity.org.au/node/apni/2910201,obscured,Endangered,Nature Conservation Act 2014 (ACT)
5,Botaurus poiciloptilus,https://biodiversity.org.au/afd/taxa/dba78701-...,,Endangered,Nature Conservation Act 2014 (ACT)
6,Caladenia actensis,https://id.biodiversity.org.au/taxon/apni/5139...,obscured,Critically Endangered,Nature Conservation Act 2014 (ACT)
7,Calyptorhynchus lathami lathami,https://biodiversity.org.au/afd/taxa/9823b764-...,,Vulnerable,Nature Conservation Act 2014 (ACT)
8,Climacteris picumnus victoriae,https://biodiversity.org.au/afd/taxa/fe69a214-...,,Vulnerable,Nature Conservation Act 2014 (ACT)
9,Corunastylis ectopa,https://id.biodiversity.org.au/taxon/apni/5140...,obscured,Critically Endangered,Nature Conservation Act 2014 (ACT)
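
The left join above is what carries the sensitive-list flag across: species that are absent from the sensitive list come through with a missing `act_geoprivacy`. A minimal sketch with placeholder species names follows. The `fillna("open")` default mirrors how `new_geoprivacy` is derived later in this notebook.

```python
import pandas as pd

# Hypothetical miniature lists; names are placeholders, not real taxa
conservation = pd.DataFrame(
    {"name": ["Genus alpha", "Genus beta"], "status": ["Endangered", "Vulnerable"]}
)
sensitive = pd.DataFrame({"name": ["Genus beta"], "act_geoprivacy": ["obscured"]})

# how="left" keeps every conservation row, matched or not
joined = conservation.merge(sensitive, how="left", on="name")

# Unmatched rows have NaN in act_geoprivacy; treat those as open
joined["geoprivacy"] = joined["act_geoprivacy"].fillna("open")
print(joined[["name", "geoprivacy"]])
```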


In [48]:
# %%script echo skipping # comment this line to download dataset
# retrieve binomial and trinomial names from GBIF
parsednames = lf.gbifparse(statelist)
parsednames.to_csv(sourcedir + "act-gbif.csv", index=False)


In [12]:
parsednames = pd.read_csv(sourcedir + "act-gbif.csv")
statelist = statelist.merge(parsednames,how="left",left_on="name",right_on="scientificName")
statelist = statelist.drop_duplicates()
numfullstatelist = len(statelist.index)
statelist['scientificName'] = statelist['canonicalName']
statelist

Unnamed: 0,name,lsid_x,act_geoprivacy,status,authority,scientificName,type,genusOrAbove,specificEpithet,parsed,parsedPartially,canonicalName,canonicalNameComplete,canonicalNameWithMarker,rankMarker,infraSpecificEpithet
0,Anthochaera phrygia,https://biodiversity.org.au/afd/taxa/31869a0e-...,,Critically Endangered,Nature Conservation Act 2014 (ACT),Anthochaera phrygia,SCIENTIFIC,Anthochaera,phrygia,True,False,Anthochaera phrygia,Anthochaera phrygia,Anthochaera phrygia,sp.,
1,Aprasia parapulchella,https://biodiversity.org.au/afd/taxa/0d74fa05-...,obscured,Vulnerable,Nature Conservation Act 2014 (ACT),Aprasia parapulchella,SCIENTIFIC,Aprasia,parapulchella,True,False,Aprasia parapulchella,Aprasia parapulchella,Aprasia parapulchella,sp.,
2,Bettongia gaimardi,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,,Regionally Conservation Dependent,Nature Conservation Act 2014 (ACT),Bettongia gaimardi,SCIENTIFIC,Bettongia,gaimardi,True,False,Bettongia gaimardi,Bettongia gaimardi,Bettongia gaimardi,sp.,
3,Bidyanus bidyanus,https://biodiversity.org.au/afd/taxa/05866f31-...,,Endangered,Nature Conservation Act 2014 (ACT),Bidyanus bidyanus,SCIENTIFIC,Bidyanus,bidyanus,True,False,Bidyanus bidyanus,Bidyanus bidyanus,Bidyanus bidyanus,sp.,
4,Bossiaea grayi,https://id.biodiversity.org.au/node/apni/2910201,obscured,Endangered,Nature Conservation Act 2014 (ACT),Bossiaea grayi,SCIENTIFIC,Bossiaea,grayi,True,False,Bossiaea grayi,Bossiaea grayi,Bossiaea grayi,sp.,
5,Botaurus poiciloptilus,https://biodiversity.org.au/afd/taxa/dba78701-...,,Endangered,Nature Conservation Act 2014 (ACT),Botaurus poiciloptilus,SCIENTIFIC,Botaurus,poiciloptilus,True,False,Botaurus poiciloptilus,Botaurus poiciloptilus,Botaurus poiciloptilus,sp.,
6,Caladenia actensis,https://id.biodiversity.org.au/taxon/apni/5139...,obscured,Critically Endangered,Nature Conservation Act 2014 (ACT),Caladenia actensis,SCIENTIFIC,Caladenia,actensis,True,False,Caladenia actensis,Caladenia actensis,Caladenia actensis,sp.,
7,Calyptorhynchus lathami lathami,https://biodiversity.org.au/afd/taxa/9823b764-...,,Vulnerable,Nature Conservation Act 2014 (ACT),Calyptorhynchus lathami lathami,SCIENTIFIC,Calyptorhynchus,lathami,True,False,Calyptorhynchus lathami lathami,Calyptorhynchus lathami lathami,Calyptorhynchus lathami lathami,infrasp.,lathami
8,Climacteris picumnus victoriae,https://biodiversity.org.au/afd/taxa/fe69a214-...,,Vulnerable,Nature Conservation Act 2014 (ACT),Climacteris picumnus victoriae,SCIENTIFIC,Climacteris,picumnus,True,False,Climacteris picumnus victoriae,Climacteris picumnus victoriae,Climacteris picumnus victoriae,infrasp.,victoriae
9,Corunastylis ectopa,https://id.biodiversity.org.au/taxon/apni/5140...,obscured,Critically Endangered,Nature Conservation Act 2014 (ACT),Corunastylis ectopa,SCIENTIFIC,Corunastylis,ectopa,True,False,Corunastylis ectopa,Corunastylis ectopa,Corunastylis ectopa,sp.,


In [13]:

# Identify records that won't comply with iNaturalist species names
noncomply = statelist[statelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) ]
noncomply

Unnamed: 0,name,lsid_x,act_geoprivacy,status,authority,scientificName,type,genusOrAbove,specificEpithet,parsed,parsedPartially,canonicalName,canonicalNameComplete,canonicalNameWithMarker,rankMarker,infraSpecificEpithet


In [14]:
# remove records that do not comply
statelist = statelist[~statelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) ]
#statelist = pd.DataFrame(statelist[['scientificName','status','bionet_geoprivacy','lsid']]).drop_duplicates()
statelist

Unnamed: 0,name,lsid_x,act_geoprivacy,status,authority,scientificName,type,genusOrAbove,specificEpithet,parsed,parsedPartially,canonicalName,canonicalNameComplete,canonicalNameWithMarker,rankMarker,infraSpecificEpithet
0,Anthochaera phrygia,https://biodiversity.org.au/afd/taxa/31869a0e-...,,Critically Endangered,Nature Conservation Act 2014 (ACT),Anthochaera phrygia,SCIENTIFIC,Anthochaera,phrygia,True,False,Anthochaera phrygia,Anthochaera phrygia,Anthochaera phrygia,sp.,
1,Aprasia parapulchella,https://biodiversity.org.au/afd/taxa/0d74fa05-...,obscured,Vulnerable,Nature Conservation Act 2014 (ACT),Aprasia parapulchella,SCIENTIFIC,Aprasia,parapulchella,True,False,Aprasia parapulchella,Aprasia parapulchella,Aprasia parapulchella,sp.,
2,Bettongia gaimardi,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,,Regionally Conservation Dependent,Nature Conservation Act 2014 (ACT),Bettongia gaimardi,SCIENTIFIC,Bettongia,gaimardi,True,False,Bettongia gaimardi,Bettongia gaimardi,Bettongia gaimardi,sp.,
3,Bidyanus bidyanus,https://biodiversity.org.au/afd/taxa/05866f31-...,,Endangered,Nature Conservation Act 2014 (ACT),Bidyanus bidyanus,SCIENTIFIC,Bidyanus,bidyanus,True,False,Bidyanus bidyanus,Bidyanus bidyanus,Bidyanus bidyanus,sp.,
4,Bossiaea grayi,https://id.biodiversity.org.au/node/apni/2910201,obscured,Endangered,Nature Conservation Act 2014 (ACT),Bossiaea grayi,SCIENTIFIC,Bossiaea,grayi,True,False,Bossiaea grayi,Bossiaea grayi,Bossiaea grayi,sp.,
5,Botaurus poiciloptilus,https://biodiversity.org.au/afd/taxa/dba78701-...,,Endangered,Nature Conservation Act 2014 (ACT),Botaurus poiciloptilus,SCIENTIFIC,Botaurus,poiciloptilus,True,False,Botaurus poiciloptilus,Botaurus poiciloptilus,Botaurus poiciloptilus,sp.,
6,Caladenia actensis,https://id.biodiversity.org.au/taxon/apni/5139...,obscured,Critically Endangered,Nature Conservation Act 2014 (ACT),Caladenia actensis,SCIENTIFIC,Caladenia,actensis,True,False,Caladenia actensis,Caladenia actensis,Caladenia actensis,sp.,
7,Calyptorhynchus lathami lathami,https://biodiversity.org.au/afd/taxa/9823b764-...,,Vulnerable,Nature Conservation Act 2014 (ACT),Calyptorhynchus lathami lathami,SCIENTIFIC,Calyptorhynchus,lathami,True,False,Calyptorhynchus lathami lathami,Calyptorhynchus lathami lathami,Calyptorhynchus lathami lathami,infrasp.,lathami
8,Climacteris picumnus victoriae,https://biodiversity.org.au/afd/taxa/fe69a214-...,,Vulnerable,Nature Conservation Act 2014 (ACT),Climacteris picumnus victoriae,SCIENTIFIC,Climacteris,picumnus,True,False,Climacteris picumnus victoriae,Climacteris picumnus victoriae,Climacteris picumnus victoriae,infrasp.,victoriae
9,Corunastylis ectopa,https://id.biodiversity.org.au/taxon/apni/5140...,obscured,Critically Endangered,Nature Conservation Act 2014 (ACT),Corunastylis ectopa,SCIENTIFIC,Corunastylis,ectopa,True,False,Corunastylis ectopa,Corunastylis ectopa,Corunastylis ectopa,sp.,


In [15]:
parsednames['type'].unique()

array(['SCIENTIFIC'], dtype=object)

In [16]:
# check for duplicates with conflicting information
dupinformation = statelist.groupby('scientificName').filter(lambda x: len(x) > 1)#.sort('size',ascending=False)
dupinformation

Unnamed: 0,name,lsid_x,act_geoprivacy,status,authority,scientificName,type,genusOrAbove,specificEpithet,parsed,parsedPartially,canonicalName,canonicalNameComplete,canonicalNameWithMarker,rankMarker,infraSpecificEpithet
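
The check above lists every name that occurs more than once. To narrow that down to names whose rows actually disagree, one option is to count the distinct statuses per name. This is a sketch on made-up rows, and `conflicting` is an illustrative name that does not appear in the notebook.

```python
import pandas as pd

rows = pd.DataFrame(
    {
        "scientificName": ["Genus alpha", "Genus alpha", "Genus beta", "Genus beta"],
        "status": ["Endangered", "Vulnerable", "Vulnerable", "Vulnerable"],
    }
)

# Number of distinct statuses recorded for each name
status_counts = rows.groupby("scientificName")["status"].nunique()

# Names with more than one distinct status carry conflicting information
conflicting = status_counts[status_counts > 1].index.tolist()
print(conflicting)
```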


### 4. Equivalent IUCN statuses

In [17]:
# iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild', 'Extinct'}
statelist.groupby(['status'])['status'].count()

status
Critically Endangered                 7
Endangered                           19
Regionally Conservation Dependent     1
Vulnerable                           27
Name: status, dtype: int64

In [18]:
iucnStatusMappings = {
    'critically endangered':'Critically Endangered',
    'vulnerable':'Vulnerable',
    'regionally conservation dependent':'Vulnerable',
    'endangered': 'Endangered'
}
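
A mapping like this is easier to trust when unmapped values are surfaced rather than silently defaulted. The sketch below applies the same dictionary after lower-casing and trimming, and collects anything it cannot map. `map_to_iucn` and the sample input are illustrative only. The misspelled value is the one seen in the conservation list above.

```python
iucn_status_mappings = {
    "critically endangered": "Critically Endangered",
    "vulnerable": "Vulnerable",
    "regionally conservation dependent": "Vulnerable",
    "endangered": "Endangered",
}


def map_to_iucn(statuses, mapping):
    """Return (mapped, unmapped): mapped values by input, plus inputs with no equivalent."""
    mapped, unmapped = {}, []
    for status in statuses:
        key = status.strip().lower()  # normalise case and stray whitespace
        if key in mapping:
            mapped[status] = mapping[key]
        else:
            unmapped.append(status)
    return mapped, unmapped


mapped, unmapped = map_to_iucn(
    ["Critically Endangered", " Vulnerable ", "Vulnernable"], iucn_status_mappings
)
print(mapped)
print(unmapped)
```

Anything left in `unmapped` is a prompt to extend the dictionary or correct the source value before a default is applied.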

### 5. Determine best place ID to use

In [19]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 12986

place_id  place_name                    place_display_name              
12986     Australian Capital Territory  Australian Capital Territory, AU    20
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State sensitive list on scientificName

1. Match - updates; even if the statuses are the same, we'll update the links and values anyway
2. No match - statuses to be added (additions)
   2.1 No match and no taxonomy - search for synonyms
   2.2 No match
3. Merge in the other direction to see if there are deletes
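
The three cases in the list above can also be separated in one pass with an outer join and pandas' `indicator=True` option, which adds a `_merge` column recording which side each row came from. This is a sketch on made-up data and not the notebook's own approach, which uses a left join.

```python
import pandas as pd

# Placeholder names and status ids, for illustration only
state = pd.DataFrame({"scientificName": ["Genus alpha", "Genus beta"]})
inat = pd.DataFrame(
    {"scientificName": ["Genus beta", "Genus gamma"], "status_id": ["11", "12"]}
)

both = state.merge(inat, how="outer", on="scientificName", indicator=True)

# 1. on both sides: existing statuses to update
to_update = both.loc[both["_merge"] == "both", "scientificName"].tolist()
# 2. state list only: statuses to add
to_add = both.loc[both["_merge"] == "left_only", "scientificName"].tolist()
# 3. iNaturalist only: candidates for removal, to be reviewed
to_review = both.loc[both["_merge"] == "right_only", "scientificName"].tolist()

print(to_update, to_add, to_review)
```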


In [20]:
# join to see which listed species already have a status in iNaturalist, based on scientificName
inatcols = ['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']
mergedstatuses = statelist[['scientificName','status','act_geoprivacy','lsid_x','authority']].merge(
    inatstatuses[inatcols], how="left", on='scientificName', suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses

Unnamed: 0,scientificName,status,act_geoprivacy,lsid_x,authority,status_id,taxon_id,user_id,description,iucn,authority_inat,status_inat,geoprivacy,place_id,place_display_name
0,Anthochaera phrygia,Critically Endangered,,https://biodiversity.org.au/afd/taxa/31869a0e-...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
1,Aprasia parapulchella,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/0d74fa05-...,Nature Conservation Act 2014 (ACT),152221.0,36957.0,708886.0,,30.0,ACT Government,vulnerable,obscured,12986.0,"Australian Capital Territory, AU"
2,Bettongia gaimardi,Regionally Conservation Dependent,,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
3,Bidyanus bidyanus,Endangered,,https://biodiversity.org.au/afd/taxa/05866f31-...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
4,Bossiaea grayi,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2910201,Nature Conservation Act 2014 (ACT),152220.0,795624.0,708886.0,,40.0,ACT Government,endangered,obscured,12986.0,"Australian Capital Territory, AU"
5,Botaurus poiciloptilus,Endangered,,https://biodiversity.org.au/afd/taxa/dba78701-...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
6,Caladenia actensis,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5139...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
7,Calyptorhynchus lathami lathami,Vulnerable,,https://biodiversity.org.au/afd/taxa/9823b764-...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
8,Climacteris picumnus victoriae,Vulnerable,,https://biodiversity.org.au/afd/taxa/fe69a214-...,Nature Conservation Act 2014 (ACT),,,,,,,,,,
9,Corunastylis ectopa,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5140...,Nature Conservation Act 2014 (ACT),,,,,,,,,,


In [21]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
mergedstatuses['new_authority'] = "Nature Conservation Act 2014 (ACT)"
mergedstatuses['new_description'] = "Listed as Threatened - refer to https://www.environment.act.gov.au/nature-conservation/conservation-and-ecological-communities/threatened-species-factsheets"
biesearchurl = "https://bie.ala.org.au/species/" # eg + "https://id.biodiversity.org.au/node/apni/2894366"
mergedstatuses['new_url'] =  biesearchurl + mergedstatuses['lsid_x']
# mergedstatuses['new_url'] = mergedstatuses.apply(lambda x: biesearchurl + x['lsid'] if pd.isna(x['taxon_id']) else baseurl + x['taxon_id],axis=1)
mergedstatuses['new_geoprivacy'] = mergedstatuses['act_geoprivacy'].apply(lambda x: 'open' if pd.isnull(x) else x)
mergedstatuses['new_place_id'] = '12986'  # Australian Capital Territory
mergedstatuses['new_username'] = 'peggydnew'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['status'].fillna('Threatened')
mergedstatuses

Unnamed: 0,scientificName,status,act_geoprivacy,lsid_x,authority,status_id,taxon_id,user_id,description,iucn,...,place_id,place_display_name,new_authority,new_description,new_url,new_geoprivacy,new_place_id,new_username,new_iucn_equivalent,new_status
0,Anthochaera phrygia,Critically Endangered,,https://biodiversity.org.au/afd/taxa/31869a0e-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Critically Endangered,Critically Endangered
1,Aprasia parapulchella,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/0d74fa05-...,Nature Conservation Act 2014 (ACT),152221.0,36957.0,708886.0,,30.0,...,12986.0,"Australian Capital Territory, AU",Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Vulnerable,Vulnerable
2,Bettongia gaimardi,Regionally Conservation Dependent,,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Vulnerable,Regionally Conservation Dependent
3,Bidyanus bidyanus,Endangered,,https://biodiversity.org.au/afd/taxa/05866f31-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Endangered,Endangered
4,Bossiaea grayi,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2910201,Nature Conservation Act 2014 (ACT),152220.0,795624.0,708886.0,,40.0,...,12986.0,"Australian Capital Territory, AU",Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Endangered,Endangered
5,Botaurus poiciloptilus,Endangered,,https://biodiversity.org.au/afd/taxa/dba78701-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Endangered,Endangered
6,Caladenia actensis,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5139...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Critically Endangered,Critically Endangered
7,Calyptorhynchus lathami lathami,Vulnerable,,https://biodiversity.org.au/afd/taxa/9823b764-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Vulnerable,Vulnerable
8,Climacteris picumnus victoriae,Vulnerable,,https://biodiversity.org.au/afd/taxa/fe69a214-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Vulnerable,Vulnerable
9,Corunastylis ectopa,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5140...,Nature Conservation Act 2014 (ACT),,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Critically Endangered,Critically Endangered


## Updates

In [22]:
# updates: state-list taxa that already have an iNaturalist status (status_id is set)
# template columns: action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
updates = mergedstatuses[mergedstatuses['status_id'].notnull()].copy()
updates = updates.sort_values('scientificName')  # sort_values returns a new frame, so assign it back
updates['action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
# drop the literal "new_" prefix so the column names match the template
updates.columns = updates.columns.str.replace("new_", "", regex=False)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,UPDATE,Aprasia parapulchella,152221,36957,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
4,UPDATE,Bossiaea grayi,152220,795624,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
13,UPDATE,Delma impar,152227,36937,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
14,UPDATE,Euastacus armatus,152234,100611,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
24,UPDATE,Lepidium ginninderrense,266735,1255190,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
29,UPDATE,Maccullochella macquariensis,152226,85287,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
30,UPDATE,Macquaria australasica,152235,51333,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
34,UPDATE,Perunga ochracea,152229,762250,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
38,UPDATE,Phascolarctos cinereus,267488,42983,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
41,UPDATE,Prasophyllum petilum,180956,1130138,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
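
The cell above marks every matched record as `UPDATE`, whether or not anything differs from what is already in iNaturalist. If only genuine changes are wanted, a comparison along the following lines might help. This is a minimal sketch on toy data: the `inat_status` and `inat_geoprivacy` column names are hypothetical stand-ins for wherever the existing iNaturalist values end up after the merge, and the existing values shown are invented for illustration.

```python
import pandas as pd

# Toy stand-in for `mergedstatuses`. The inat_* columns and their values are
# hypothetical; only the new_* column names come from the notebook.
merged_demo = pd.DataFrame({
    "scientificName": ["Aprasia parapulchella", "Bossiaea grayi"],
    "inat_status": ["vulnerable", "rare"],
    "inat_geoprivacy": ["obscured", "obscured"],
    "new_status": ["Vulnerable", "Endangered"],
    "new_geoprivacy": ["obscured", "obscured"],
})

def changed_rows(df, pairs):
    """Return rows where any (existing, new) column pair differs, ignoring case and padding."""
    changed = pd.Series(False, index=df.index)
    for old, new in pairs:
        old_vals = df[old].fillna("").str.strip().str.casefold()
        new_vals = df[new].fillna("").str.strip().str.casefold()
        changed |= old_vals != new_vals
    return df[changed]

needs_update = changed_rows(
    merged_demo,
    [("inat_status", "new_status"), ("inat_geoprivacy", "new_geoprivacy")],
)
```

In this toy frame only the second row differs, so `needs_update` would hold that one record. Comparing case-insensitively is a design choice made here because existing statuses appear in lower case in the iNaturalist export.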


## No status in iNaturalist via straight scientificName match
These are the ACT records that did not match an existing status in iNaturalist.

In [23]:
# to add: state-list taxa that have no iNaturalist status
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to a taxon in iNaturalist (exact scientificName match)
noinatstatus = noinatstatus.merge(inattaxa, how="left", on="scientificName")
noinatstatus

Unnamed: 0,scientificName,status,act_geoprivacy,lsid_x,authority,status_id,taxon_id,user_id,description,iucn,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Anthochaera phrygia,Critically Endangered,,https://biodiversity.org.au/afd/taxa/31869a0e-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Passeriformes,Meliphagidae,Anthochaera,phrygia,,2022-06-09T16:08:32Z,species,http://www.birds.cornell.edu/clementschecklist...
1,Bettongia gaimardi,Regionally Conservation Dependent,,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Mammalia,Diprotodontia,Potoroidae,Bettongia,gaimardi,,2020-07-20T02:39:44Z,species,http://www.catalogueoflife.org/annual-checklis...
2,Bidyanus bidyanus,Endangered,,https://biodiversity.org.au/afd/taxa/05866f31-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Actinopterygii,Centrarchiformes,Terapontidae,Bidyanus,bidyanus,,2019-11-24T03:53:09Z,species,http://www.fishbase.org
3,Botaurus poiciloptilus,Endangered,,https://biodiversity.org.au/afd/taxa/dba78701-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Pelecaniformes,Ardeidae,Botaurus,poiciloptilus,,2022-06-09T16:43:19Z,species,http://www.birdlife.org/datazone/speciesfactsh...
4,Caladenia actensis,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5139...,Nature Conservation Act 2014 (ACT),,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Caladenia,actensis,,2022-08-18T08:16:27Z,species,
5,Calyptorhynchus lathami lathami,Vulnerable,,https://biodiversity.org.au/afd/taxa/9823b764-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Psittaciformes,Cacatuidae,Calyptorhynchus,lathami,lathami,2018-12-19T06:38:49Z,subspecies,
6,Climacteris picumnus victoriae,Vulnerable,,https://biodiversity.org.au/afd/taxa/fe69a214-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Passeriformes,Climacteridae,Climacteris,picumnus,victoriae,2021-11-24T01:01:34Z,subspecies,http://www.birds.cornell.edu/clementschecklist...
7,Corunastylis ectopa,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5140...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
8,Daphoenositta chrysoptera,Vulnerable,,https://biodiversity.org.au/afd/taxa/8bf4b8b0-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Passeriformes,Neosittidae,Daphoenositta,chrysoptera,,2020-01-16T00:06:57Z,species,
9,Dasyurus maculatus maculatus,Vulnerable,,https://biodiversity.org.au/afd/taxa/a4ef7496-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,


In [24]:
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions

Unnamed: 0,scientificName,status,act_geoprivacy,lsid_x,authority,status_id,taxon_id,user_id,description,iucn,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Anthochaera phrygia,Critically Endangered,,https://biodiversity.org.au/afd/taxa/31869a0e-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Passeriformes,Meliphagidae,Anthochaera,phrygia,,2022-06-09T16:08:32Z,species,http://www.birds.cornell.edu/clementschecklist...
1,Bettongia gaimardi,Regionally Conservation Dependent,,https://biodiversity.org.au/afd/taxa/19c9bfdf-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Mammalia,Diprotodontia,Potoroidae,Bettongia,gaimardi,,2020-07-20T02:39:44Z,species,http://www.catalogueoflife.org/annual-checklis...
2,Bidyanus bidyanus,Endangered,,https://biodiversity.org.au/afd/taxa/05866f31-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Actinopterygii,Centrarchiformes,Terapontidae,Bidyanus,bidyanus,,2019-11-24T03:53:09Z,species,http://www.fishbase.org
3,Botaurus poiciloptilus,Endangered,,https://biodiversity.org.au/afd/taxa/dba78701-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Pelecaniformes,Ardeidae,Botaurus,poiciloptilus,,2022-06-09T16:43:19Z,species,http://www.birdlife.org/datazone/speciesfactsh...
4,Caladenia actensis,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5139...,Nature Conservation Act 2014 (ACT),,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Caladenia,actensis,,2022-08-18T08:16:27Z,species,
5,Calyptorhynchus lathami lathami,Vulnerable,,https://biodiversity.org.au/afd/taxa/9823b764-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Psittaciformes,Cacatuidae,Calyptorhynchus,lathami,lathami,2018-12-19T06:38:49Z,subspecies,
6,Climacteris picumnus victoriae,Vulnerable,,https://biodiversity.org.au/afd/taxa/fe69a214-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Passeriformes,Climacteridae,Climacteris,picumnus,victoriae,2021-11-24T01:01:34Z,subspecies,http://www.birds.cornell.edu/clementschecklist...
8,Daphoenositta chrysoptera,Vulnerable,,https://biodiversity.org.au/afd/taxa/8bf4b8b0-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Aves,Passeriformes,Neosittidae,Daphoenositta,chrysoptera,,2020-01-16T00:06:57Z,species,
10,Dasyurus viverrinus,Endangered,,https://biodiversity.org.au/afd/taxa/52149285-...,Nature Conservation Act 2014 (ACT),,,,,,...,Chordata,Mammalia,Dasyuromorphia,Dasyuridae,Dasyurus,viverrinus,,2022-06-11T10:30:40Z,species,http://www.catalogueoflife.org/annual-checklis...
11,Eucalyptus aggregata,Vulnerable,,https://id.biodiversity.org.au/node/apni/2900093,Nature Conservation Act 2014 (ACT),,,,,,...,Tracheophyta,Magnoliopsida,Myrtales,Myrtaceae,Eucalyptus,aggregata,,2019-02-12T10:45:19Z,species,http://www.catalogueoflife.org/annual-checklis...


In [25]:
# there's no status but there is a matching iNat taxon (id is the taxon id)
additions = noinatstatus[noinatstatus['id'].notna()].copy()
additions = additions.sort_values('scientificName')  # assign back; sort_values is not in-place
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
# drop the literal "new_" prefix so the column names match the template
additions.columns = additions.columns.str.replace("new_", "", regex=False)
# status_id is empty for additions; the iNat taxon id becomes taxon_id
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                      'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
0,ADD,Anthochaera phrygia,,144707,Critically Endangered,Critically Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
1,ADD,Bettongia gaimardi,,42996,Regionally Conservation Dependent,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
2,ADD,Bidyanus bidyanus,,95759,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
3,ADD,Botaurus poiciloptilus,,5032,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
4,ADD,Caladenia actensis,,1127611,Critically Endangered,Critically Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://id.biod...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
5,ADD,Calyptorhynchus lathami lathami,,720267,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
6,ADD,Climacteris picumnus victoriae,,713108,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
8,ADD,Daphoenositta chrysoptera,,979682,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
10,ADD,Dasyurus viverrinus,,40167,Endangered,Endangered,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://biodive...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...
11,ADD,Eucalyptus aggregata,,774108,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://bie.ala.org.au/species/https://id.biod...,open,12986,peggydnew,Listed as Threatened - refer to https://www.en...


In [26]:
# combine updates and additions, then write them to the file
fulllist = pd.concat([updates, additions])
fulllist.to_csv(sourcedir + "act.csv", index=False)

In [28]:
# check for taxon names that appear more than once (these may carry conflicting information)
dupinformation = fulllist.groupby('taxon_name').filter(lambda x: len(x) > 1)
dupinformation

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
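
The check above returns every repeated `taxon_name`, including repeats that agree with each other. To separate harmless repeats from true conflicts, something like the following could be used. It is a sketch on toy rows (not real output), and the choice of `status` and `geoprivacy` as the columns to compare is an assumption.

```python
import pandas as pd

# Toy stand-in for `fulllist`; the rows are illustrative only.
fulllist_demo = pd.DataFrame({
    "action": ["UPDATE", "ADD", "ADD", "ADD"],
    "taxon_name": ["Delma impar", "Delma impar",
                   "Bidyanus bidyanus", "Bidyanus bidyanus"],
    "status": ["Vulnerable", "Endangered", "Endangered", "Endangered"],
    "geoprivacy": ["obscured", "obscured", "open", "open"],
})

def conflicting_duplicates(df, key="taxon_name", compare=("status", "geoprivacy")):
    """Return rows whose key repeats and whose compared columns disagree."""
    dups = df[df.duplicated(subset=key, keep=False)]
    if dups.empty:
        return dups
    # Count distinct values per compared column within each repeated name.
    distinct = dups.groupby(key)[list(compare)].nunique(dropna=False)
    conflicted = distinct.index[(distinct > 1).any(axis=1)]
    return dups[dups[key].isin(conflicted)]

conflicts = conflicting_duplicates(fulllist_demo)
```

Here the two `Bidyanus bidyanus` rows agree and are left out, while the two `Delma impar` rows disagree on `status` and are returned.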


In [29]:
# what didn't match to a taxon?
unknownToInat = noinatstatus[noinatstatus['id'].isna()]
unknownToInat

Unnamed: 0,scientificName,status,act_geoprivacy,lsid_x,authority,status_id,taxon_id,user_id,description,iucn,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
7,Corunastylis ectopa,Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5140...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
9,Dasyurus maculatus maculatus,Vulnerable,,https://biodiversity.org.au/afd/taxa/a4ef7496-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
12,Gadopsis bispinosus,Vulnerable,,https://biodiversity.org.au/afd/taxa/347bd80e-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
13,Gentiana baeuerlenii,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2892776,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
20,Litoria aurea,Vulnerable,,https://biodiversity.org.au/afd/taxa/e7adefa4-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
22,Litoria raniformis,Vulnerable,,https://biodiversity.org.au/afd/taxa/6e63311a-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
24,Mastacomys fuscus mordicus,Vulnerable,,https://biodiversity.org.au/afd/taxa/70e10fcf-...,Nature Conservation Act 2014 (ACT),,,,,,...,,,,,,,,,,
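
Several of the unmatched names are trinomials, and an exact `scientificName` match cannot find them if iNaturalist only holds the species. One possible follow-up is to look for the species-level binomial and treat any hit as a candidate for manual review. The sketch below uses toy data; the helper name `species_binomial`, the `fallbackName` column and the taxon id `"T1"` are all hypothetical.

```python
import pandas as pd

def species_binomial(name):
    """Return the first two words of a trinomial, or None for shorter names."""
    parts = str(name).split()
    return " ".join(parts[:2]) if len(parts) >= 3 else None

# Toy stand-ins for `unknownToInat` and `inattaxa`; the id is a placeholder.
unmatched_demo = pd.DataFrame({
    "scientificName": ["Dasyurus maculatus maculatus", "Gadopsis bispinosus"],
})
inattaxa_demo = pd.DataFrame({
    "scientificName": ["Dasyurus maculatus"],
    "id": ["T1"],
    "taxonRank": ["species"],
})

unmatched_demo["fallbackName"] = unmatched_demo["scientificName"].map(species_binomial)
candidates = unmatched_demo.dropna(subset=["fallbackName"]).merge(
    inattaxa_demo.rename(columns={"scientificName": "fallbackName"}),
    how="inner",
    on="fallbackName",
)
```

A species-level hit is only a suggestion: applying a subspecies status to the whole species may be wrong, and names that differ at genus level would still need a manual or synonym-based lookup.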


In [30]:
# iNat statuses that are not covered by the updates
# (additions have no existing iNat status, so only updates need checking)
inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1076,152230,39441,708886,12986,16649,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Varanus,rosenbergi,,2021-07-28T09:18:05Z,Varanus rosenbergi,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
1086,152232,568018,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Maratus,calcitrans,,2021-08-26T01:12:23Z,Maratus calcitrans,species,,,,
2417,152223,74547,708886,12986,16649,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,,...,Mastacomys,fuscus,,2020-07-20T02:56:32Z,Mastacomys fuscus,species,http://www.iucnredlist.org/details/18563/0,,,
1080,152225,761022,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Keyacris,scurra,,2018-08-08T08:18:31Z,Keyacris scurra,species,http://orthoptera.speciesfile.org,,,
1085,152231,781565,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Cooraboorama,canberrae,,2018-09-27T04:49:50Z,Cooraboorama canberrae,species,http://www.catalogueoflife.org/annual-checklis...,,,


### Are there any that need to be removed?
- ACT sensitive list count: 25
- ACT iNat statuses count: 53

- updates to iNat status: 56
- additional iNat status: 45
- ACT statuses we can't find a taxon match for in iNaturalist: 13
- total: 78 (explainable via the various genus/section entries that we matched to in the taxonomy)

iNat statuses left over that may need checking:

In [31]:
# iNat statuses that are not covered by the updates
# (additions have no existing iNat status, so only updates need checking)
notaddedupdated = inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]
notaddedupdated

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1076,152230,39441,708886,12986,16649,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Varanus,rosenbergi,,2021-07-28T09:18:05Z,Varanus rosenbergi,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
1086,152232,568018,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Maratus,calcitrans,,2021-08-26T01:12:23Z,Maratus calcitrans,species,,,,
2417,152223,74547,708886,12986,16649,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,,...,Mastacomys,fuscus,,2020-07-20T02:56:32Z,Mastacomys fuscus,species,http://www.iucnredlist.org/details/18563/0,,,
1080,152225,761022,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Keyacris,scurra,,2018-08-08T08:18:31Z,Keyacris scurra,species,http://orthoptera.speciesfile.org,,,
1085,152231,781565,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Cooraboorama,canberrae,,2018-09-27T04:49:50Z,Cooraboorama canberrae,species,http://www.catalogueoflife.org/annual-checklis...,,,
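
If some of these leftover statuses do turn out to be incorrect, they could be written out in the update template with `action='REMOVE'`, restricted to statuses added under our own account. The sketch below assumes string-typed ids and uses two of the displayed rows plus one invented other-user row; the function name `build_removals` and the reduced column set are hypothetical, and each record would still need checking by hand before removal.

```python
import pandas as pd

OUR_USER_ID = "708886"  # user_id quoted in the notebook introduction

# Toy stand-in for `notaddedupdated`; the last row is invented to show filtering.
leftover_demo = pd.DataFrame({
    "status_id": ["152230", "152232", "999999"],
    "taxon_id": ["39441", "568018", "123"],
    "user_id": ["708886", "708886", "1138587"],
    "scientificName": ["Varanus rosenbergi", "Maratus calcitrans", "Hypothetical taxon"],
    "place_id": ["12986", "12986", "12986"],
})

def build_removals(leftover, user_id=OUR_USER_ID):
    """Return REMOVE rows for statuses created by `user_id` only."""
    ours = leftover[leftover["user_id"].astype(str) == str(user_id)].copy()
    ours["action"] = "REMOVE"
    out = ours.rename(columns={"scientificName": "taxon_name", "status_id": "id"})
    return out[["action", "taxon_name", "id", "taxon_id", "place_id"]].sort_values("taxon_name")

removals = build_removals(leftover_demo)
```

Statuses created by other users are deliberately excluded here, since those are flagged rather than removed.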


In [32]:
# Stats: row counts for each intermediate table built above
# (numfullstatelist is not set in this cell; it is assumed to come from an earlier cell)
numsensitive = len(sensitivelist.index)
numconservation = len(conservationlist.index)
numupdates = len(updates.index)
numadditions = len(additions.index)
numnoinatstatus = len(noinatstatus.index)
numunknownToInat = len(unknownToInat.index)
numnotaddedupdated = len(notaddedupdated.index)
numnoncomply = len(noncomply.index)
numcomply = len(statelist.index)
numdupinfo = len(dupinformation.index)
d = {'Sensitive': [numsensitive],
     'Conservation': [numconservation],
     'Statelist merge': [numfullstatelist],
     'Species iNat Comply': [numcomply],
     'Species iNat non-Comply': [numnoncomply],
     'Duplicate Information': [numdupinfo],
     'Updates': [numupdates],
     'Additions': [numadditions],
     'Not added updated': [numnotaddedupdated],
     'No Inat Status': [numnoinatstatus],
     'Unknown to Inat': [numunknownToInat]}

statsdf = pd.DataFrame(data=d)
statsdf

Unnamed: 0,Sensitive,Conservation,Statelist merge,Species iNat Comply,Species iNat non-Comply,Duplicate Information,Updates,Additions,Not added updated,No Inat Status,Unknown to Inat
0,25,54,54,54,0,0,15,32,5,39,7
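
In the row above, Updates + Additions + Unknown to Inat (15 + 32 + 7) equals the 54 state-list records, and Additions + Unknown to Inat (32 + 7) equals the 39 records with no iNat status. A check of this kind can be automated. The sketch below works on toy frames with made-up names, assumes each name appears once per frame, and uses the `indicator` option of `DataFrame.merge` to split the records in a single pass.

```python
import pandas as pd

# Toy stand-ins for the state list, the iNat statuses and the iNat taxa.
statelist_demo = pd.DataFrame({"scientificName": ["A a", "B b", "C c", "D d"]})
inatstatuses_demo = pd.DataFrame({"scientificName": ["A a", "E e"],
                                  "status_id": ["1", "2"]})
inattaxa_demo = pd.DataFrame({"scientificName": ["A a", "B b", "C c", "E e"],
                              "id": ["10", "20", "30", "50"]})

# An outer merge with indicator=True labels each row by where it was found.
merged = statelist_demo.merge(inatstatuses_demo, how="outer",
                              on="scientificName", indicator=True)

n_updates = int((merged["_merge"] == "both").sum())        # on both lists
no_status = merged[merged["_merge"] == "left_only"]        # state list only
n_leftover = int((merged["_merge"] == "right_only").sum()) # iNat only

has_taxon = no_status["scientificName"].isin(inattaxa_demo["scientificName"])
n_additions = int(has_taxon.sum())
n_unknown = int((~has_taxon).sum())

# Every state-list record should land in exactly one bucket.
assert n_updates + n_additions + n_unknown == len(statelist_demo)
```

If the final assertion fails on real data, duplicate names in one of the inputs are a likely cause.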
