# iNaturalist status updates by state

Using the file produced in the collate-status-taxa.ipynb: `inat-aust-status-taxa.csv`, generate lists to update iNaturalist statuses

**Next steps:**
State by state establish the changes that need to be made:
    a. new - any new species that appear in the state lists but do not have a status in inaturalist (new template)
    b. updates - any changes to information which was added by us previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - any statuses which were added by us previously (user_id = 708886) list which are incorrect (update template, action='REMOVE')
    d. flags - are there any statuses by other users that need to be flagged?

## Prep - common to all states
1. Read in the inaturalist statuses & filter out this state
2. Read in the inaturalist taxa list
3. Read in the state sensitive list
4. Attempt to match the state statuses to an IUCN equivalent


### 1. iNaturalist statuses

In [233]:
import sys
import os
projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
sys.path.append(os.path.abspath(projectdir + "source-code/includes"))
import list_functions as lf
import pandas as pd

sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"


# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [234]:
def filter_state_statuses(stateregex: str, urlregex: str):
    authoritydf = taxastatus['authority'].drop_duplicates().sort_values()
    authoritydf = authoritydf[pd.Series(authoritydf).str.contains(stateregex)]
    urldf = taxastatus['url'].drop_duplicates().sort_values()
    urldf = urldf[pd.Series(urldf).str.contains(urlregex)]
    placedisplaydf = taxastatus['place_display_name'].drop_duplicates().sort_values()
    placedisplaydf = placedisplaydf[pd.Series(placedisplaydf).str.contains(stateregex)]
    placedf = taxastatus['place_name'].drop_duplicates().sort_values()
    placedf = placedf[pd.Series(placedf).str.contains(stateregex)]
    # concat all and remove duplicates
    statedf = pd.concat([taxastatus.apply(lambda row: row[taxastatus['place_display_name'].isin(placedisplaydf)]),
                         taxastatus.apply(lambda row: row[taxastatus['place_name'].isin(placedf)]),
                         taxastatus.apply(
                             lambda row: row[taxastatus['authority'].isin(authoritydf)])]).drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])
inatstatuses = filter_state_statuses("Northern Territory|NT NRETAS", " ")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1001,162724,1134239,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/https://id.biod...,,open,...,Cryptandra,gemmata,,2020-09-25T18:01:37Z,Cryptandra gemmata,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
927,165448,12647,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Epthianura,crocea,,2021-09-17T08:46:17Z,Epthianura crocea,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
289,263901,1289379,702203.0,9994,,Atlas of Living Australia,VU,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Chloebia,gouldiae,,2022-10-20T02:55:37Z,Chloebia gouldiae,species,https://www.birds.cornell.edu/clementschecklis...,,,
1251,152435,19250,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Polytelis,alexandrae,,2019-08-27T01:09:01Z,Polytelis alexandrae,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
215,170163,20166,702203.0,9994,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,,...,Ninox,connivens,,2021-07-28T02:17:03Z,Ninox connivens,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
1249,152433,38633,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Bellatorias,obiri,,2019-04-30T15:19:21Z,Bellatorias obiri,species,http://reptile-database.reptarium.cz/search.ph...,,,
1252,152436,40743,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,,,,,Hipposideros stenotis,,,Narrow-eared Roundleaf Bat,False,[1431118]
1248,152432,41326,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Macroderma,gigas,,2019-08-27T01:58:05Z,Macroderma gigas,species,http://www.catalogueoflife.org/annual-checklis...,,,
986,139906,698942,,9994,,,least concern,https://bie.ala.org.au/species/http://id.biodi...,Atlas of Living Australia (ALA),,...,Duboisia,hopwoodii,,2019-02-16T13:19:18Z,Duboisia hopwoodii,species,https://www.ala.org.au/,,,
1247,152431,73180,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Pezoporus,occidentalis,,2019-08-27T01:09:27Z,Pezoporus occidentalis,species,http://www.birdlife.org/datazone/speciesfactsh...,,,


### 2. iNaturalist taxonomy

In [235]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

z=zipfile.ZipFile(sourcedir + filename)

with z.open('taxa.csv') as from_archive:
    inattaxa = pd.read_csv(from_archive,dtype=str)
z.close()
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


### 3. State lists
Sensitive list: `geoprivacy` = `obscured`
overrides

In [236]:
# NT - No sensitive list
# Retrieve sensitive list ids from lists tool
# sensitivelist = pd.read_csv(listdir + "sensitive-lists/NT-sensitive.csv", dtype='str')  # NT sensitive list
sensitivelist = lf.download_ala_list("https://lists.ala.org.au/ws/speciesListItems/dr492?max=10000&includeKVP=true")
# sensitivelist = lf.kvp_to_columns(sensitivelist)
sensitivelist['bionet_geoprivacy'] = 'obscured'
sensitivelist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,bionet_geoprivacy
0,2058921,Macroderma gigas,Ghost Bat,Macroderma gigas,https://biodiversity.org.au/afd/taxa/63bc796a-...,dr492,"[{'key': 'vernacular name', 'value': 'Ghost Ba...",obscured
1,2058928,Hipposideros stenotis,Northern Leaf-nosed Bat,Hipposideros stenotis,https://biodiversity.org.au/afd/taxa/26fe0f53-...,dr492,"[{'key': 'vernacular name', 'value': 'Northern...",obscured
2,2058925,Hipposideros inornata,Arnhem Leaf-nosed Bat,Hipposideros inornatus,https://biodiversity.org.au/afd/taxa/5d2dab40-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhem L...",obscured
3,2058920,Pezoporus occidentalis,Night Parrot,Pezoporus occidentalis,https://biodiversity.org.au/afd/taxa/c630f3b0-...,dr492,"[{'key': 'vernacular name', 'value': 'Night Pa...",obscured
4,2058926,Polytelis alexandrae,Alexandra's Parrot,Polytelis alexandrae,https://biodiversity.org.au/afd/taxa/be7a08f5-...,dr492,"[{'key': 'vernacular name', 'value': 'Princess...",obscured
5,2058929,Falco hypoleucos,Grey Falcon,Falco (Hierofalco) hypoleucos,https://biodiversity.org.au/afd/taxa/4c73a934-...,dr492,"[{'key': 'vernacular name', 'value': 'Grey Fal...",obscured
6,2058922,Bellatorias obiri,Arnhem Land Egernia,Bellatorias obiri,https://biodiversity.org.au/afd/taxa/2afc8501-...,dr492,"[{'key': 'vernacular name', 'value': 'Arnhemla...",obscured
7,2058923,Attacus wardi,Atlas Moth,Attacus wardi,https://biodiversity.org.au/afd/taxa/8a05008e-...,dr492,"[{'key': 'vernacular name', 'value': 'Atlas Mo...",obscured
8,2058924,Ogyris iphis doddi,Dodd’s Azure,Ogyris iphis doddi,https://biodiversity.org.au/afd/taxa/ae3ab4c9-...,dr492,"[{'key': 'vernacular name', 'value': 'Dodd’s A...",obscured
9,2058927,Candalides geminus,,Erina geminus geminus,https://biodiversity.org.au/afd/taxa/14d46baa-...,dr492,"[{'key': 'vernacular name', 'value': 'Twin Dus...",obscured


In [237]:
# download conservation list from lists tool
conservationlist = lf.download_ala_list("https://lists.ala.org.au/ws/speciesListItems/dr651?max=10000&includeKVP=true")
conservationlist = lf.kvp_to_columns(conservationlist)
conservationlist['scientificName'] = conservationlist['scientificName'].str.replace('subsp. ', '', regex=False)
conservationlist['bionet_geoprivacy'] = 'open'
conservationlist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,status,sourceStatus,vernacular name,taxonRemarks,family,bionet_geoprivacy
0,4029368,Dasyuroides byrnei,Kowari,Dasyuroides byrnei,https://biodiversity.org.au/afd/taxa/c342ff42-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Kowari,Mammal,,open
0,4029258,Phascogale calura,Red-tailed Phascogale,Phascogale calura,https://biodiversity.org.au/afd/taxa/36b436b1-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Red-tailed phascogale,Mammal,,open
0,4029309,Pseudomys fieldi,Shark Bay Mouse,Pseudomys fieldi,https://biodiversity.org.au/afd/taxa/edcf01fa-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Shark Bay mouse,Mammal,,open
0,4029399,Dasyurus geoffroii,Western Quoll,Dasyurus geoffroii,https://biodiversity.org.au/afd/taxa/a2260672-...,dr651,"[{'key': 'status', 'value': 'Vulnerable'}, {'k...",Vulnerable,Vulnerable (extinct in NT),Western quoll,Mammal,,open
0,4029343,Acacia latzii,Latz's Wattle,Acacia latzii,https://id.biodiversity.org.au/node/apni/2906346,dr651,"[{'key': 'family', 'value': 'Fabaceae'}, {'key...",Vulnerable,Vulnerable,Latz's wattle,,Fabaceae,open
...,...,...,...,...,...,...,...,...,...,...,...,...,...
0,4029304,Leipoa ocellata,Malleefowl,Leipoa ocellata,https://biodiversity.org.au/afd/taxa/c44c9098-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Malleefowl,Bird,,open
0,4029285,Dasyurus hallucatus,Northern Quoll,Dasyurus hallucatus,https://biodiversity.org.au/afd/taxa/5d7aeda8-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Northern quoll,Mammal,,open
0,4029371,Pedionomus torquatus,Plains-wanderer,Pedionomus torquatus,https://biodiversity.org.au/afd/taxa/30b4b2e5-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Plains-wanderer,Bird,,open
0,4029227,Vincentrachia desmonda,Saddle Creek Rocksnail,Vincentrachia desmonda,https://biodiversity.org.au/afd/taxa/2ff5b7ab-...,dr651,"[{'key': 'status', 'value': 'Critically Endang...",Critically Endangered,Critically Endangered,Saddle Creek rocksnail,Invertebrate,,open


In [238]:
# join them in a way that works for inat (eg sensitive list, geoprivacy = 'obscured'
statelist = pd.concat([sensitivelist[['scientificName','bionet_geoprivacy', 'lsid']],
                    conservationlist[['scientificName','status','bionet_geoprivacy', 'lsid']]]).drop_duplicates()
statelist

Unnamed: 0,scientificName,bionet_geoprivacy,lsid,status
0,Macroderma gigas,obscured,https://biodiversity.org.au/afd/taxa/63bc796a-...,
1,Hipposideros stenotis,obscured,https://biodiversity.org.au/afd/taxa/26fe0f53-...,
2,Hipposideros inornatus,obscured,https://biodiversity.org.au/afd/taxa/5d2dab40-...,
3,Pezoporus occidentalis,obscured,https://biodiversity.org.au/afd/taxa/c630f3b0-...,
4,Polytelis alexandrae,obscured,https://biodiversity.org.au/afd/taxa/be7a08f5-...,
...,...,...,...,...
0,Leipoa ocellata,open,https://biodiversity.org.au/afd/taxa/c44c9098-...,Critically Endangered
0,Dasyurus hallucatus,open,https://biodiversity.org.au/afd/taxa/5d7aeda8-...,Critically Endangered
0,Pedionomus torquatus,open,https://biodiversity.org.au/afd/taxa/30b4b2e5-...,Critically Endangered
0,Vincentrachia desmonda,open,https://biodiversity.org.au/afd/taxa/2ff5b7ab-...,Critically Endangered


### 4. Equivalent IUCN statuses

In [239]:
# iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild', 'Extinct'}
# sensitivelist.groupby(['status'])['status'].count()
statelist.groupby(['status'])['status'].count()


status
Critically Endangered     20
Endangered                52
Extinct                   11
Vulnerable               121
Name: status, dtype: int64

In [240]:
iucnStatusMappings = {
    'Vulnerable':'Vulnerable'
}

### 5. Determine best place ID to use

In [241]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 6825

place_id  place_name          place_display_name    
9994      Northern Territory  Northern Territory, AU    12
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State sensitive list on scientificName

1. Match - updates, even if the statuses are the same we'll update the links and values anyway
2. No match - statuses to be added (additions)
   1.1 No match and no taxnomy - search for synonyms
   1.2 No match
3. Merge the other direction to see if there are deletes?


In [242]:
# join to see which lists already have a status in inaturalist based on scientificName
mergedstatuses = statelist[['scientificName','status','bionet_geoprivacy', 'lsid']].merge(inatstatuses[['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']],how="left",left_on='scientificName',right_on='scientificName',suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses


Unnamed: 0,scientificName,status,bionet_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,status_inat,geoprivacy,place_id,place_display_name
145,Abrodictyum obscurum,Endangered,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,,,,
195,Acacia equisetifolia,Critically Endangered,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,,,,
14,Acacia latzii,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,,,,
146,Acacia peuce,Endangered,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,,,,
15,Acacia praetermissa,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
165,Xylopia monosperma,Endangered,open,https://id.biodiversity.org.au/node/apni/2903202,,,,,,,,,,
65,Zeuxine oblonga,Vulnerable,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,,,,
68,Zyzomys maini,Vulnerable,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,,,,
176,Zyzomys palatalis,Endangered,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,,,,


In [243]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
mergedstatuses['new_authority'] = " Territory Parks and Wildlife Conservation Act 1976"
mergedstatuses['new_description'] = "Listed as Threatened - refer to https://nt.gov.au/environment"
mergedstatuses['new_url'] = mergedstatuses['lsid']
mergedstatuses['new_geoprivacy'] = "obscured"
mergedstatuses['new_place_id'] = '9994'  # Northern Territory
mergedstatuses['new_username'] = 'peggydnew'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['status'].fillna('Threatened')
mergedstatuses

Unnamed: 0,scientificName,status,bionet_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,...,place_id,place_display_name,new_authority,new_description,new_url,new_geoprivacy,new_place_id,new_username,new_iucn_equivalent,new_status
145,Abrodictyum obscurum,Endangered,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/node/apni/7402565,obscured,9994,peggydnew,Vulnerable,Endangered
195,Acacia equisetifolia,Critically Endangered,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/node/apni/2890781,obscured,9994,peggydnew,Vulnerable,Critically Endangered
14,Acacia latzii,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/node/apni/2906346,obscured,9994,peggydnew,Vulnerable,Vulnerable
146,Acacia peuce,Endangered,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/node/apni/2906202,obscured,9994,peggydnew,Vulnerable,Endangered
15,Acacia praetermissa,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/node/apni/2894855,obscured,9994,peggydnew,Vulnerable,Vulnerable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
165,Xylopia monosperma,Endangered,open,https://id.biodiversity.org.au/node/apni/2903202,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/node/apni/2903202,obscured,9994,peggydnew,Vulnerable,Endangered
65,Zeuxine oblonga,Vulnerable,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://id.biodiversity.org.au/taxon/apni/5141...,obscured,9994,peggydnew,Vulnerable,Vulnerable
68,Zyzomys maini,Vulnerable,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://biodiversity.org.au/afd/taxa/3f638397-...,obscured,9994,peggydnew,Vulnerable,Vulnerable
176,Zyzomys palatalis,Endangered,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://biodiversity.org.au/afd/taxa/54aa72cc-...,obscured,9994,peggydnew,Vulnerable,Endangered


## Updates

In [244]:
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
updates = pd.DataFrame(mergedstatuses[mergedstatuses['status_id'].notnull()])
updates.sort_values('scientificName')
updates['action'] = 'UPDATE'
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
6,UPDATE,Bellatorias obiri,152433,38633,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/2afc8501-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
169,UPDATE,Bellatorias obiri,152433,38633,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/2afc8501-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
82,UPDATE,Chloebia gouldiae,263901,1289379,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/27d27be2-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
67,UPDATE,Hipposideros inornatus,152434,74425,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/5d2dab40-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
2,UPDATE,Hipposideros inornatus,152434,74425,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/5d2dab40-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
1,UPDATE,Hipposideros stenotis,152436,40743,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/26fe0f53-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
0,UPDATE,Macroderma gigas,152432,41326,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/63bc796a-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
186,UPDATE,Pezoporus occidentalis,152431,73180,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/c630f3b0-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
3,UPDATE,Pezoporus occidentalis,152431,73180,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/c630f3b0-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
4,UPDATE,Polytelis alexandrae,152435,19250,Threatened,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/be7a08f5-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...


## No status in iNaturalist via straight scientificName match
The NT records that didn't match up to a status in iNaturalist

In [245]:
# to add: those that have no inaturalist status - 532!!
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,scientificName,status,bionet_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Abrodictyum obscurum,Endangered,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,...,,,,,,,,,,
1,Acacia equisetifolia,Critically Endangered,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,equisetifolia,,2022-04-06T22:05:25Z,species,https://eol.org/pages/49426174
2,Acacia latzii,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,latzii,,2022-04-07T02:06:43Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Acacia peuce,Endangered,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,peuce,,2022-04-06T23:48:13Z,species,http://www.catalogueoflife.org/annual-checklis...
4,Acacia praetermissa,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,praetermissa,,2022-04-05T02:10:02Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
198,Xylopia monosperma,Endangered,open,https://id.biodiversity.org.au/node/apni/2903202,,,,,,,...,,,,,,,,,,
199,Zeuxine oblonga,Vulnerable,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,oblonga,,2021-07-16T02:50:44Z,species,http://www.catalogueoflife.org/annual-checklis...
200,Zyzomys maini,Vulnerable,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,maini,,2019-08-27T01:49:16Z,species,http://www.catalogueoflife.org/annual-checklis...
201,Zyzomys palatalis,Endangered,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,palatalis,,2019-11-22T22:46:28Z,species,http://www.iucnredlist.org/details/23327/0


In [246]:
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions

Unnamed: 0,scientificName,status,bionet_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
1,Acacia equisetifolia,Critically Endangered,open,https://id.biodiversity.org.au/node/apni/2890781,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,equisetifolia,,2022-04-06T22:05:25Z,species,https://eol.org/pages/49426174
2,Acacia latzii,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2906346,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,latzii,,2022-04-07T02:06:43Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Acacia peuce,Endangered,open,https://id.biodiversity.org.au/node/apni/2906202,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,peuce,,2022-04-06T23:48:13Z,species,http://www.catalogueoflife.org/annual-checklis...
4,Acacia praetermissa,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2894855,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,praetermissa,,2022-04-05T02:10:02Z,species,http://www.catalogueoflife.org/annual-checklis...
5,Acacia undoolyana,Vulnerable,open,https://id.biodiversity.org.au/node/apni/2891259,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,undoolyana,,2022-04-05T02:34:14Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
196,Vidumelon wattii,Vulnerable,open,https://biodiversity.org.au/afd/taxa/3aedec6e-...,,,,,,,...,Mollusca,Gastropoda,Stylommatophora,Camaenidae,Vidumelon,wattii,,2021-10-29T15:37:54Z,species,http://www.catalogueoflife.org/annual-checklis...
199,Zeuxine oblonga,Vulnerable,open,https://id.biodiversity.org.au/taxon/apni/5141...,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,oblonga,,2021-07-16T02:50:44Z,species,http://www.catalogueoflife.org/annual-checklis...
200,Zyzomys maini,Vulnerable,open,https://biodiversity.org.au/afd/taxa/3f638397-...,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,maini,,2019-08-27T01:49:16Z,species,http://www.catalogueoflife.org/annual-checklis...
201,Zyzomys palatalis,Endangered,open,https://biodiversity.org.au/afd/taxa/54aa72cc-...,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,palatalis,,2019-11-22T22:46:28Z,species,http://www.iucnredlist.org/details/23327/0


In [247]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions.sort_values(['scientificName'])
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                  'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,ADD,Acacia equisetifolia,,1253756,Critically Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://id.biodiversity.org.au/node/apni/2890781,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
2,ADD,Acacia latzii,,1254327,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://id.biodiversity.org.au/node/apni/2906346,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
3,ADD,Acacia peuce,,465191,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://id.biodiversity.org.au/node/apni/2906202,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
4,ADD,Acacia praetermissa,,1254561,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://id.biodiversity.org.au/node/apni/2894855,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
5,ADD,Acacia undoolyana,,1254884,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://id.biodiversity.org.au/node/apni/2891259,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
...,...,...,...,...,...,...,...,...,...,...,...,...
196,ADD,Vidumelon wattii,,114966,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/3aedec6e-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
199,ADD,Zeuxine oblonga,,369267,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://id.biodiversity.org.au/taxon/apni/5141...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
200,ADD,Zyzomys maini,,45377,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/3f638397-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
201,ADD,Zyzomys palatalis,,75238,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://biodiversity.org.au/afd/taxa/54aa72cc-...,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...


In [248]:
# write these to the file
pd.concat([updates,additions]).to_csv(sourcedir + "nt.csv", index=False)

In [249]:
# what didnt match to a taxon?
unknownToInat = noinatstatus[noinatstatus['id'].isna()]
unknownToInat

Unnamed: 0,scientificName,status,bionet_geoprivacy,lsid,status_id,taxon_id,user_id,description,iucn,authority,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Abrodictyum obscurum,Endangered,open,https://id.biodiversity.org.au/node/apni/7402565,,,,,,,...,,,,,,,,,,
10,Amphidromus (Syndromus) cognatus,Vulnerable,open,https://biodiversity.org.au/afd/taxa/e7b52a3b-...,,,,,,,...,,,,,,,,,,
11,Amytornis (Amytornis) modestus indulkanna,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/1309cbb2-...,,,,,,,...,,,,,,,,,,
12,Amytornis (Amytornis) modestus modestus,Extinct,open,https://biodiversity.org.au/afd/taxa/8e0e9a30-...,,,,,,,...,,,,,,,,,,
13,Amytornis (Magnamytis) dorotheae,Endangered,open,https://biodiversity.org.au/afd/taxa/425c0e16-...,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
184,Typhonium mirabile,Endangered,open,https://id.biodiversity.org.au/node/apni/2900370,,,,,,,...,,,,,,,,,,
186,Typhonium sp. Sandover,Vulnerable,open,ALA_DR651_53,,,,,,,...,,,,,,,,,,
187,Typhonium taylorii,Endangered,open,https://id.biodiversity.org.au/node/apni/2906352,,,,,,,...,,,,,,,,,,
197,Vincentrachia desmonda,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/2ff5b7ab-...,,,,,,,...,,,,,,,,,,


### are there any that need to be removed?
NT sensitive list count: 0
NT inat statuses count: 12

updates to inat status: 4
additional inat status: 136
NT statuses we can't find a taxon match for in iNaturalist: 64
total: 166 (explainable via the various genus/section entries that we matched to in the taxonomy)

inat statuses left over: 12-4=8 that may need checking

In [250]:
# inat statuses that aren't in added or updated
inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]


Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1001,162724,1134239,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/https://id.biod...,,open,...,Cryptandra,gemmata,,2020-09-25T18:01:37Z,Cryptandra gemmata,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
927,165448,12647,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Epthianura,crocea,,2021-09-17T08:46:17Z,Epthianura crocea,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
215,170163,20166,702203.0,9994,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,,...,Ninox,connivens,,2021-07-28T02:17:03Z,Ninox connivens,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
986,139906,698942,,9994,,,least concern,https://bie.ala.org.au/species/http://id.biodi...,Atlas of Living Australia (ALA),,...,Duboisia,hopwoodii,,2019-02-16T13:19:18Z,Duboisia hopwoodii,species,https://www.ala.org.au/,,,
1,234788,918383,702203.0,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
