# iNaturalist status updates by state - QLD

Using the file produced in the collate-status-taxa.ipynb: `inat-aust-status-taxa.csv`, generate lists to update iNaturalist statuses

**Next steps:**
Establish the changes that need to be made. Read in the sensitive and conservation list for each state.
    a. new - any new species that appear in the state lists but do not have a status in inaturalist (new template)
    b. updates - any changes to information which was added by us previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - any statuses which were added by us previously (user_id = 708886) list which are incorrect (update template, action='REMOVE')
    d. flags - are there any statuses by other users that need to be flagged?

## Prep - common to all states
1. Read in the inaturalist statuses & filter out this state
2. Read in the inaturalist taxa list
3. Read in the state sensitive and conservation list
4. Attempt to match the state statuses to an IUCN equivalent
5. Determine the best placeID to use for this state


### 1. iNaturalist statuses

In [1]:
import pandas as pd

# projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"


# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [2]:
def filter_state_statuses(stateregex: str, urlregex: str):
    authoritydf = taxastatus['authority'].drop_duplicates().sort_values()
    authoritydf = authoritydf[pd.Series(authoritydf).str.contains(stateregex)]
    urldf = taxastatus['url'].drop_duplicates().sort_values()
    urldf = urldf[pd.Series(urldf).str.contains(urlregex)]
    placedisplaydf = taxastatus['place_display_name'].drop_duplicates().sort_values()
    placedisplaydf = placedisplaydf[pd.Series(placedisplaydf).str.contains(stateregex)]
    placedf = taxastatus['place_name'].drop_duplicates().sort_values()
    placedf = placedf[pd.Series(placedf).str.contains(stateregex)]
    # concat all and remove duplicates
    statedf = pd.concat([taxastatus.apply(lambda row: row[taxastatus['place_display_name'].isin(placedisplaydf)]),
                         taxastatus.apply(lambda row: row[taxastatus['place_name'].isin(placedf)]),
                         taxastatus.apply(lambda row: row[taxastatus['url'].isin(urldf)]),
                         taxastatus.apply(
                             lambda row: row[taxastatus['authority'].isin(authoritydf)])]).drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])


inatstatuses = filter_state_statuses(" WA |WEST AUST|West Aust|WESTERN AUSTRALIA|Western Australia", ".wa.gov.au")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1990,153386,100948,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Fibulacamptus,bisetosus,,2021-10-28T19:35:25Z,Fibulacamptus bisetosus,species,http://www.iucnredlist.org/apps/redlist/details,,,
2158,153585,101164,708886,6827,16654,WA Department of Environment and Convservation,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Galaxiella,munda,,2019-11-23T07:13:44Z,Galaxiella munda,species,http://www.fishbase.org,,,
1982,153373,101165,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Galaxiella,nigrostriata,,2019-11-23T07:13:25Z,Galaxiella nigrostriata,species,http://www.fishbase.org,,,
2346,153802,101474,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Glacidorbis,occidentalis,,2021-10-29T17:47:10Z,Glacidorbis occidentalis,species,http://www.catalogueoflife.org/annual-checklis...,,,
2019,153420,101509,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Glyphis,garricki,,2019-04-18T19:04:37Z,Glyphis garricki,species,http://www.fishbase.org,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2311,153762,99344,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Diplodactylus,capensis,,2018-11-17T23:47:54Z,Diplodactylus capensis,species,http://reptile-database.reptarium.cz/search.ph...,,,
2039,153443,99634,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Dupucharopa,millestriata,,2021-10-29T15:16:07Z,Dupucharopa millestriata,species,http://www.catalogueoflife.org/annual-checklis...,,,
2185,153613,99973,708886,6827,16654,WA Department of Environment and Convservation,critically endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,,,,,Engaewa pseudoreducta,,,,False,[]
2293,153742,99974,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Engaewa,reducta,,2020-05-28T04:59:18Z,Engaewa reducta,species,http://www.iucnredlist.org/apps/redlist/details,,,


### 2. iNaturalist taxonomy

In [3]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

z=zipfile.ZipFile(sourcedir + filename)

with z.open('taxa.csv') as from_archive:
    inattaxa = pd.read_csv(from_archive,dtype=str)
z.close()
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


### 3. State lists

Get the ALA Sensitive list: `geoprivacy` = `obscured`
Everything on the WA list is sensitive


In [None]:
%%script echo skipping # comment this line to download dataset from lists.ala.org.au the web and save locally

import sys
import os
sys.path.append(os.path.abspath(projectdir + "source-code/includes"))
import list_functions as lf
alasensitivelist = lf.download_ala_list("https://lists-test.ala.org.au/ws/speciesListItems/dr18406?max=10000&includeKVP=true")
alasensitivelist = lf.kvp_to_columns(alasensitivelist)
alasensitivelist.to_csv(sourcedir + "wa-ala.csv")

Use the GBIF names parser to clean up the names

In [79]:
%%script echo skipping # comment this line to run the gbif parser again the web and save a file locally

import requests

namesonly = alasensitivelist['name']
url = "https://api.gbif.org/v1/parser/name"
headers = {'content-type' : 'application/json'}
data = namesonly.to_json(orient="values")
params = {'name':data}
r = requests.post(url=url,data=data,headers=headers)
results = pd.read_json(r.text)
results.to_csv(sourcedir + "wa-gbif.csv")
results

In [105]:
alasensitivelist = pd.read_csv(sourcedir + "wa-ala.csv", dtype=str)
parsednames = pd.read_csv(sourcedir + "wa-gbif.csv", dtype=str)
alasensitivelist = alasensitivelist.merge(parsednames[['scientificName','canonicalName']],how="left",left_on="name",right_on="scientificName")
alasensitivelist['wa_geoprivacy'] = 'obscured'
alasensitivelist['waTaxonID'] = alasensitivelist['W A Taxon Id']#.apply(lambda x: int(float(x)))
alasensitivelist['scientificName'] = alasensitivelist['canonicalName']
alasensitivelist['status'] = alasensitivelist['sourceStatus']
alasensitivelist

Unnamed: 0.1,Unnamed: 0,id,name,commonName,scientificName_x,lsid,dataResourceUid,kvpValues,W A Taxon Id,status,...,sensitivityZoneId,county,municipality,verbatimLocality,taxonRemarks,scientificName_y,canonicalName,wa_geoprivacy,waTaxonID,scientificName
0,0,2586820,Tetratheca aphylla subsp. aphylla,,Tetratheca aphylla subsp. aphylla,https://id.biodiversity.org.au/node/apni/2902462,dr18406,"[{'key': 'W A Taxon Id', 'value': '29489'}, {'...",29489,Vulnerable,...,WA,GOLD,KALGOORLIE,Helena & Aurora Ranges,,Tetratheca aphylla subsp. aphylla,Tetratheca aphylla aphylla,obscured,29489,Tetratheca aphylla aphylla
1,0,2589732,Tetratheca harperi,Jackson Tetratheca,Tetratheca harperi,https://id.biodiversity.org.au/node/apni/2896021,dr18406,"[{'key': 'W A Taxon Id', 'value': '4534'}, {'k...",4534,Vulnerable,...,WA,GOLD,KALGOORLIE,Mt Jackson,,Tetratheca harperi,Tetratheca harperi,obscured,4534,Tetratheca harperi
2,0,2587391,Atriplex yeelirrie,,Atriplex yeelirrie,https://id.biodiversity.org.au/taxon/apni/5127...,dr18406,"[{'key': 'W A Taxon Id', 'value': '46173'}, {'...",46173,Vulnerable,...,WA,GOLD,KALGOORLIE,Yeelirrie Stn.,,Atriplex yeelirrie,Atriplex yeelirrie,obscured,46173,Atriplex yeelirrie
3,0,2589866,Tetratheca paynterae subsp. cremnobata,,Tetratheca paynterae subsp. cremnobata,https://id.biodiversity.org.au/node/apni/2891052,dr18406,"[{'key': 'W A Taxon Id', 'value': '23987'}, {'...",23987,Vulnerable,...,WA,GOLD,KALGOORLIE,Die Hardy Ranges,,Tetratheca paynterae subsp. cremnobata,Tetratheca paynterae cremnobata,obscured,23987,Tetratheca paynterae cremnobata
4,0,2588285,Acacia shapelleae,Shapelle's Wattle,Acacia shapelleae,https://id.biodiversity.org.au/taxon/apni/5128...,dr18406,"[{'key': 'W A Taxon Id', 'value': '44470'}, {'...",44470,Vulnerable,...,WA,GOLD,KALGOORLIE,"Bungalbin Hill, Helena and Aurora Range",,Acacia shapelleae,Acacia shapelleae,obscured,44470,Acacia shapelleae
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4199,0,2587070,Leporillus conditor,Greater Stick-nest Rat,Leporillus conditor,https://biodiversity.org.au/afd/taxa/5ebc59f2-...,dr18406,"[{'key': 'status', 'value': 'Conservation Depe...",,Conservation Dependent,...,WA,MWST,,,Conservation Dependent under the Biodiversity ...,Leporillus conditor,Leporillus conditor,obscured,,Leporillus conditor
4200,0,2589588,Phascogale tapoatafa wambenger,South-western Brush-tailed Phascogale,Phascogale tapoatafa wambenger,https://biodiversity.org.au/afd/taxa/05cba699-...,dr18406,"[{'key': 'status', 'value': 'Conservation Depe...",,Conservation Dependent,...,WA,"MWSTWHTB,SCST,SWAN,SWSTWARR",,,Conservation Dependent under the Biodiversity ...,Phascogale tapoatafa wambenger,Phascogale tapoatafa wambenger,obscured,,Phascogale tapoatafa wambenger
4201,0,2587603,Bettongia lesueur subsp. (Barrow and Boodie Is...,Boodie (barrow And Boodie Islands),Bettongia lesueur subsp. (Barrow and Boodie Is...,ALA_DR2201_4202,dr18406,"[{'key': 'status', 'value': 'Conservation Depe...",,Conservation Dependent,...,WA,PILB,,,Conservation Dependent under the Biodiversity ...,Bettongia lesueur subsp. (Barrow and Boodie Is...,Bettongia lesueur subsp.,obscured,,Bettongia lesueur subsp.
4202,0,2588671,Phascogale calura,Red-tailed Phascogale,Phascogale calura,https://biodiversity.org.au/afd/taxa/36b436b1-...,dr18406,"[{'key': 'status', 'value': 'Conservation Depe...",,Conservation Dependent,...,WA,"WHTB,SCST,SWAN,SWST",,,Conservation Dependent under the Biodiversity ...,Phascogale calura,Phascogale calura,obscured,,Phascogale calura


In [106]:
statelist = pd.DataFrame(alasensitivelist[['waTaxonID','scientificName','status','wa_geoprivacy','lsid']])
statelist

Unnamed: 0,waTaxonID,scientificName,status,wa_geoprivacy,lsid
0,29489,Tetratheca aphylla aphylla,Vulnerable,obscured,https://id.biodiversity.org.au/node/apni/2902462
1,4534,Tetratheca harperi,Vulnerable,obscured,https://id.biodiversity.org.au/node/apni/2896021
2,46173,Atriplex yeelirrie,Vulnerable,obscured,https://id.biodiversity.org.au/taxon/apni/5127...
3,23987,Tetratheca paynterae cremnobata,Vulnerable,obscured,https://id.biodiversity.org.au/node/apni/2891052
4,44470,Acacia shapelleae,Vulnerable,obscured,https://id.biodiversity.org.au/taxon/apni/5128...
...,...,...,...,...,...
4199,,Leporillus conditor,Conservation Dependent,obscured,https://biodiversity.org.au/afd/taxa/5ebc59f2-...
4200,,Phascogale tapoatafa wambenger,Conservation Dependent,obscured,https://biodiversity.org.au/afd/taxa/05cba699-...
4201,,Bettongia lesueur subsp.,Conservation Dependent,obscured,ALA_DR2201_4202
4202,,Phascogale calura,Conservation Dependent,obscured,https://biodiversity.org.au/afd/taxa/36b436b1-...


### 4. Equivalent IUCN statuses

In [107]:
iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild','Extinct'}
statelist.groupby(['status'])['status'].count()

status
Conservation Dependent                                                            7
Critically Endangered                                                           217
Endangered                                                                      198
Extinct                                                                          38
Other Specially Protected                                                         8
Priority                                                                        219
Priority 1 - Poorly known Species                                              1072
Priority 2 - Poorly known Species                                               835
Priority 3 - Poorly known species                                               981
Priority 4 - Rare , Near Threatened and other species in need of monitoring     366
Vulnerable                                                                      263
Name: status, dtype: int64

In [20]:
# map these statuses back to the original values
waStatusMappings = {
    'CD & MI': 'Special Conservation Interest',
    'MI & P1': 'P1 - Poorly known species',
    'MI & P3': 'P3 - Poorly known species',
    'Migratory & P4': 'P4 - Rare, Near Threatened and other species in need of monitoring',
    'P4 - Poorly known species': 'P4 - Rare, Near Threatened and other species in need of monitoring'
}
statelist['new_wa_status'] = statelist['status'].str.strip().map(waStatusMappings).fillna(statelist['status'])
#mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
statelist[['status','new_wa_status']].drop_duplicates()

Unnamed: 0,status,new_wa_status
0,P1 - Poorly known species,P1 - Poorly known species
1,P2 - Poorly known species,P2 - Poorly known species
2,P3 - Poorly known species,P3 - Poorly known species
8,P4 - Poorly known species,"P4 - Rare, Near Threatened and other species i..."
16,Threatened Flora,Threatened Flora
131,Presumed Extinct,Presumed Extinct
3859,Vulnerable,Vulnerable
3864,Migratory,Migratory
3866,Critically Endangered,Critically Endangered
3886,Endangered,Endangered


In [8]:
iucnStatusMappings = {
    'critically endangered': 'Critically Endangered',
    'vulnerable':'Vulnerable',
    'not evaluated':'Not Evaluated',
    'data deficient':'Data Deficient',
    'least concern':'Least Concern',
    'special least concern':'Least Concern',
    'near threatened':'Near Threatened',
    'endangered':'Endangered',
    'extinct in the wild':'Extinct in the Wild',
    'extinct':'Extinct',
    'confidential':'Vulnerable'
}

### 5. Determine best place ID to use

In [9]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 7308

place_id  place_name             place_display_name           
                                                                    5
144315    Brisbane City          Brisbane City                      2
153119    South East Queensland  South East Queensland, QL, AU      1
18870     Cairns - Pt B          Cairns - Pt B, QL, AU              1
19232     Yarrabah               Yarrabah, QL, AU                   1
6744      Australia              Australia                         12
7308      Queensland             Queensland, AU                   631
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State lists on scientificName

1. Match - updates, even if the statuses are the same we'll update the links and values anyway
2. No match - statuses to be added (additions)
   1.1 No match and no taxnomy - search for synonyms
   1.2 No match
3. Merge the other direction to see if there are deletes?


In [21]:
# join to see which lists already have a status in inaturalist based on scientificName
mergedstatuses = statelist[['waTaxonID','scientificName','new_wa_status','wa_geoprivacy']].merge(inatstatuses[['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']],how="left",left_on='scientificName',right_on='scientificName',suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses


Unnamed: 0,waTaxonID,scientificName,new_wa_status,wa_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status,geoprivacy,place_id,place_display_name
3860,,Abebaioscia troglodytes,Vulnerable,obscured,,,,,,,,,,
0,50593.0,Abildgaardia pachyptera,P1 - Poorly known species,obscured,,,,,,,,,,
1,14112.0,Abutilon sp. Hamelin (A.M. Ashby 2196),P2 - Poorly known species,obscured,,,,,,,,,,
2,14110.0,Abutilon sp. Onslow (F. Smith s.n. 10/9/61),P3 - Poorly known species,obscured,,,,,,,,,,
3,43021.0,Abutilon sp. Pritzelianum (S. van Leeuwen 5095),P3 - Poorly known species,obscured,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3857,29060.0,Zeuxine oblonga,P2 - Poorly known species,obscured,153758,369267,708886,,20,WA Department of Environment and Convservation,NT,obscured,6827,"Western Australia, AU"
3856,29060.0,Zeuxine oblonga,P2 - Poorly known species,obscured,169907,369267,702203,,20,Atlas of Living Australia,NT,,6827,"Western Australia, AU"
3858,36237.0,Zornia areolata,P1 - Poorly known species,obscured,,,,,,,,,,
3859,34478.0,Zornia sp. West Kimberley (C.A. Gardner 9942),P3 - Poorly known species,obscured,,,,,,,,,,


In [11]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
mergedstatuses['new_authority'] = "WA Deparment of Biodiversity, Conservation and Attractions"
mergedstatuses['new_description'] = "Listed as Confidential - refer to https://www.data.qld.gov.au/dataset/queensland-confidential-species"
#mergedstatuses['new_url'] = "https://apps.des.qld.gov.au/species-search/details/?id=" + mergedstatuses['wildnetTaxonID'].astype(str)
florabaseurl = "https://florabase.dpaw.wa.gov.au/browse/profile/"
biesearchurl = "https://bie.ala.org.au/species/https://id.biodiversity.org.au/node/apni/2894366"
mergedstatuses['new_url'] = mergedstatuses.apply(lambda x: biesearchurl + x['lsid'] if pd.isna(x['waTaxonID']) else florabaseurl + x['waTaxonID'],axis=1)
floradescrurl = "Listed as Confidential - refer to https://www.dpaw.wa.gov.au/plants-and-animals/threatened-species-and-communities/threatened-plants"
faunadescrurl = "Listed as Confidential - refer to https://www.dpaw.wa.gov.au/plants-and-animals/threatened-species-and-communities/threatened-animals"
mergedstatuses['new_description'] = mergedstatuses.apply(lambda x: faunadescrurl if pd.isna(x['waTaxonID']) else floradescrurl,axis=1)
mergedstatuses.rename(columns={'wildnet_geoprivacy':'new_geoprivacy'},inplace=True)
mergedstatuses['new_place_id'] = '7308'  # Queensland, AU
mergedstatuses['new_username'] = 'peggydnew'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['status'].fillna('Confidential')
mergedstatuses

Unnamed: 0,wildnetTaxonID,scientificName,status,new_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,...,geoprivacy,place_id,place_display_name,new_authority,new_description,new_url,new_place_id,new_username,new_iucn_equivalent,new_status
1676,33360,Abrodictyum brassii,Special least concern,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Least Concern,Special least concern
1677,33358,Abrodictyum caudatum,Special least concern,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Least Concern,Special least concern
1678,33359,Abrodictyum obscurum,Special least concern,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Least Concern,Special least concern
1769,11892,Acacia acrionastes,Near Threatened,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Near Threatened,Near Threatened
1770,14898,Acacia ammophila,Vulnerable,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Vulnerable,Vulnerable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2333,31680,Zieria scopulus,Critically Endangered,open,168010,1097883,58320,,50,Queensland Nature Conservation Act 1992,...,,7308,"Queensland, AU",Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Critically Endangered,Critically Endangered
2334,28217,Zieria vagans,Critically Endangered,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Critically Endangered,Critically Endangered
2335,3296,Zieria verrucosa,Vulnerable,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Vulnerable,Vulnerable
2336,41054,Zieria wilhelminae,Critically Endangered,open,,,,,,,...,,,,Qld Department of Environment and Science,Listed as Confidential - refer to https://www....,https://apps.des.qld.gov.au/species-search/det...,7308,peggydnew,Critically Endangered,Critically Endangered


## Updates

In [12]:
# those that need to be updated - we found a status
mergedstatuses[mergedstatuses['status_id'].notnull()][['scientificName','status','status_inat','new_geoprivacy','geoprivacy','authority','user_id']]

Unnamed: 0,scientificName,status,status_inat,new_geoprivacy,geoprivacy,authority,user_id
1775,Acacia attenuata,Vulnerable,VU,open,,Queensland,702203
1776,Acacia barakulensis,Vulnerable,VU,open,obscured,QLD DEHP,702203
1777,Acacia baueri baueri,Vulnerable,VU,open,,Queensland,702203
1778,Acacia calantha,Near Threatened,NT,open,,Queensland Government,702203
1787,Acacia hockingsii,Vulnerable,VU,open,open,Queensland Nature Conservation Act 1992,3669610
...,...,...,...,...,...,...,...
2318,Zieria bifida,Endangered,EN,open,,Queensland Nature Conservation Act 1992,702203
2321,Zieria collina,Vulnerable,VU,open,open,Nature Conservation Act 1992,702203
2323,Zieria exsul,Critically Endangered,CR,open,,Queensland,702203
2326,Zieria gymnocarpa,Critically Endangered,CR,open,,QLD DEHP,409010


In [13]:
# updates - create the data frame
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
updates = pd.DataFrame(mergedstatuses[mergedstatuses['status_id'].notnull()])
updates.sort_values('scientificName')
updates['action'] = 'UPDATE'
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1775,UPDATE,Acacia attenuata,159918,898644,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1776,UPDATE,Acacia barakulensis,158182,1038687,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1777,UPDATE,Acacia baueri baueri,159921,1111700,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1778,UPDATE,Acacia calantha,160798,1121183,Near Threatened,Near Threatened,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1787,UPDATE,Acacia hockingsii,264938,1252519,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
...,...,...,...,...,...,...,...,...,...,...,...,...
2318,UPDATE,Zieria bifida,223195,1125311,Endangered,Endangered,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
2321,UPDATE,Zieria collina,262040,537414,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
2323,UPDATE,Zieria exsul,159923,1111558,Critically Endangered,Critically Endangered,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
2326,UPDATE,Zieria gymnocarpa,169061,1244660,Critically Endangered,Critically Endangered,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....


In [14]:
# investigation - which updates are mine (346), which are those not from me (80 not from me)
#updates[updates['user_id']=='708886'][['scientificName','new_status','status_inat','authority','new_authority','description','new_description','geoprivacy','new_geoprivacy']]
#updates[updates['user_id']!='708886'][['user_id','scientificName','new_status','status_inat','authority','new_authority','description','new_description','geoprivacy','new_geoprivacy']]
# those with different statuses
#updates[updates['new_status'].str.lower().str.strip() != updates['status_inat'].str.lower().str.strip()][['scientificName','new_status','status_inat','authority','new_authority','description','new_description','geoprivacy','new_geoprivacy']]
# users who've updated qld statuses who aren't me
#'https://www.inaturalist.org/users/220795','Steven Kurniawidjaja','neontetraploid','US'
#'https://www.inaturalist.org/users/3669610','Craig Robbins','craig-r','AU'
#'https://www.inaturalist.org/users/527710','James Kameron Mitchell','jameskm','US'
#'https://www.inaturalist.org/users/58320','lwnrngr','lwnrngr','NZ'
#'https://www.inaturalist.org/users/702203','Kitty Maurey','kitty12','CA'
#'https://www.inaturalist.org/users/717122','Miguel de Salas','mftasp','TAS'


## No status in iNaturalist via straight scientificName match
The Qld records that didn't match up to a status in iNaturalist

In [15]:
# to add: those that have no inaturalist status - 532!!
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,wildnetTaxonID,scientificName,status,new_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,33360,Abrodictyum brassii,Special least concern,open,,,,,,,...,,,,,,,,,,
1,33358,Abrodictyum caudatum,Special least concern,open,,,,,,,...,Tracheophyta,Polypodiopsida,Hymenophyllales,Hymenophyllaceae,Abrodictyum,caudatum,,2022-06-07T08:14:14Z,species,
2,33359,Abrodictyum obscurum,Special least concern,open,,,,,,,...,,,,,,,,,,
3,11892,Acacia acrionastes,Near Threatened,open,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,acrionastes,,2022-04-06T22:12:50Z,species,http://powo.science.kew.org/
4,14898,Acacia ammophila,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,ammophila,,2022-04-07T01:03:28Z,species,http://powo.science.kew.org/
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1956,9476,Zieria rimulosa,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,rimulosa,,2021-07-28T04:47:47Z,species,https://eol.org/pages/52204449
1957,28217,Zieria vagans,Critically Endangered,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,vagans,,2021-07-28T04:47:46Z,species,https://eol.org/pages/52204430
1958,3296,Zieria verrucosa,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,verrucosa,,2021-07-28T04:47:50Z,species,https://eol.org/pages/49431650
1959,41054,Zieria wilhelminae,Critically Endangered,open,,,,,,,...,,,,,,,,,,


In [16]:
noinatstatus[noinatstatus['id'].notna()] # there's no status but there is a matching inat taxon (id is the taxon id)
# note: "Dendrobium" matches to both genus and section

Unnamed: 0,wildnetTaxonID,scientificName,status,new_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
1,33358,Abrodictyum caudatum,Special least concern,open,,,,,,,...,Tracheophyta,Polypodiopsida,Hymenophyllales,Hymenophyllaceae,Abrodictyum,caudatum,,2022-06-07T08:14:14Z,species,
3,11892,Acacia acrionastes,Near Threatened,open,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,acrionastes,,2022-04-06T22:12:50Z,species,http://powo.science.kew.org/
4,14898,Acacia ammophila,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,ammophila,,2022-04-07T01:03:28Z,species,http://powo.science.kew.org/
5,3304,Acacia arbiana,Near Threatened,open,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,arbiana,,2022-04-06T22:14:00Z,species,http://powo.science.kew.org/
6,31097,Acacia argentina,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,argentina,,2022-04-03T23:48:35Z,species,http://powo.science.kew.org/
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1955,27791,Zieria obovata,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,obovata,,2020-02-19T19:36:10Z,species,https://eol.org/pages/52204447
1956,9476,Zieria rimulosa,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,rimulosa,,2021-07-28T04:47:47Z,species,https://eol.org/pages/52204449
1957,28217,Zieria vagans,Critically Endangered,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,vagans,,2021-07-28T04:47:46Z,species,https://eol.org/pages/52204430
1958,3296,Zieria verrucosa,Vulnerable,open,,,,,,,...,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,verrucosa,,2021-07-28T04:47:50Z,species,https://eol.org/pages/49431650


In [17]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions.sort_values(['scientificName'])
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                  'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,ADD,Abrodictyum caudatum,,451374,Special least concern,Least Concern,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
3,ADD,Acacia acrionastes,,898576,Near Threatened,Near Threatened,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
4,ADD,Acacia ammophila,,898600,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
5,ADD,Acacia arbiana,,898626,Near Threatened,Near Threatened,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
6,ADD,Acacia argentina,,898630,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
...,...,...,...,...,...,...,...,...,...,...,...,...
1955,ADD,Zieria obovata,,956303,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1956,ADD,Zieria rimulosa,,1247621,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1957,ADD,Zieria vagans,,1247619,Critically Endangered,Critically Endangered,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....
1958,ADD,Zieria verrucosa,,1247620,Vulnerable,Vulnerable,Qld Department of Environment and Science,https://apps.des.qld.gov.au/species-search/det...,open,7308,peggydnew,Listed as Confidential - refer to https://www....


In [18]:
# what didnt match to a taxon?
noinatstatus[noinatstatus['id'].isna()]


Unnamed: 0,wildnetTaxonID,scientificName,status,new_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,33360,Abrodictyum brassii,Special least concern,open,,,,,,,...,,,,,,,,,,
2,33359,Abrodictyum obscurum,Special least concern,open,,,,,,,...,,,,,,,,,,
9,41056,Acacia castorum,Vulnerable,open,,,,,,,...,,,,,,,,,,
14,40993,Acacia forsteri,Critically Endangered,open,,,,,,,...,,,,,,,,,,
19,14862,Acacia lauta,Vulnerable,open,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1930,6296,Xanthorrhoea sp. (Cape Bedford M.Gandini AQ601...,Special least concern,open,,,,,,,...,,,,,,,,,,
1937,41972,Xylosma craynii,Vulnerable,open,,,,,,,...,,,,,,,,,,
1940,41347,Zealandia pustulata pustulata,Special least concern,open,,,,,,,...,,,,,,,,,,
1942,40091,Zeuxine attenuata,Least concern,obscured,,,,,,,...,,,,,,,,,,


In [19]:
noinatstatus.groupby('status').size()

status
Critically Endangered    105
Endangered               223
Extinct                    5
Extinct in the wild       23
Least concern             92
Near Threatened          219
Special least concern    738
Vulnerable               465
dtype: int64

### are there any that need to be removed?
qld list count: 2517
qld inat statuses count: 653

updates to inat status: 570
additional inat status: 1355
qld statuses we can't find a taxon match for in iNaturalist: 606
total: 2531 (explainable via the various genus/section entries that we matched to in the taxonomy)

inat statuses left over: 653-570=83 that may need checking against the above

In [20]:
# inat statuses that aren't in added or updated
inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]


Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
102,159922,1019990,702203,7308,,Queensland,VU,https://www.legislation.qld.gov.au/view/html/i...,,,...,Acacia,baueri,,2022-04-06T22:03:46Z,Acacia baueri,species,http://www.catalogueoflife.org/annual-checklis...,,,
437,234929,104980,3669610,7308,,Queensland Nature Conservation Act 1992,VU,https://apps.des.qld.gov.au/species-search/det...,,,...,Maccullochella,peelii,,2022-01-28T04:34:59Z,Maccullochella peelii,species,http://www.fishbase.org,,,
18,167722,1073668,3669610,7308,,Queensland Government,LC,https://www.data.qld.gov.au/dataset/conservati...,"Least concern, locations still considered conf...",,...,Liparis,coelogynoides,,2022-07-12T13:44:01Z,Liparis coelogynoides,species,http://www.catalogueoflife.org/annual-checklis...,,,
308,264555,1074506,3669610,7308,,Queensland Department of Environment and Science,LC,https://apps.des.qld.gov.au/species-search/det...,,open,...,Eucalyptus,chloroclada,,2022-05-30T08:10:19Z,Eucalyptus chloroclada,species,http://www.catalogueoflife.org/annual-checklis...,,,
2614,161805,1091300,702203,7308,,Nature Conservation Act 1992,EN,http://155.187.2.69/cgi-bin/sprat/public/publi...,,,...,,,,,Plectranthus habrophyllus,,,,False,[1278873]
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
897,154451,937266,708886,7308,16653,QLD DEHP,endangered,https://data.qld.gov.au/dataset/conservation-s...,,obscured,...,,,,,Caladenia caerulea,,,blue finger-orchid,False,[552476]
837,154918,954413,708886,7308,16653,QLD DEHP,endangered,https://data.qld.gov.au/dataset/conservation-s...,,obscured,...,,,,,Oeceoclades pulchra,,,,False,[1357800]
2815,264331,961868,3669610,7308,,Queensland Nature Conservation Act 1992,LC,https://apps.des.qld.gov.au/species-search/det...,,open,...,Eucalyptus,thozetiana,,2022-05-21T15:39:09Z,Eucalyptus thozetiana,species,http://www.catalogueoflife.org/annual-checklis...,,,
1053,155106,963748,708886,7308,16653,QLD DEHP,endangered,https://data.qld.gov.au/dataset/conservation-s...,,obscured,...,Nervilia,simplex,,2020-02-18T23:50:59Z,Nervilia simplex,species,http://sciencepress.mnhn.fr/fr/periodiques/ada...,,,
