# iNaturalist status updates by state

Using the file produced by `collate-status-taxa.ipynb` (`inat-aust-status-taxa.csv`), generate the lists needed to update iNaturalist statuses.

**Next steps:**
State by state, establish the changes that need to be made:
    a. new - any species that appear in the state lists but do not have a status in iNaturalist (new template)
    b. updates - any changes to information that we added previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - any statuses that we added previously (user_id = 708886) which are incorrect (update template, action='REMOVE')
    d. flags - are there any statuses by other users that need to be flagged?
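
The four categories above can be sketched as a small helper. This is only an illustration, assuming a table of existing iNaturalist statuses with `scientificName` and `user_id` columns; the function name `classify_changes` and the toy rows are hypothetical, and anything landing in "flags" would still need manual review.

```python
import pandas as pd

OUR_USER_ID = "708886"  # user_id quoted in the list above; compared as a string

def classify_changes(state_names, inat):
    """Split existing statuses into the four buckets (hypothetical helper)."""
    in_state = inat["scientificName"].isin(state_names)
    ours = inat["user_id"] == OUR_USER_ID
    return {
        # on the state list but with no status in iNaturalist
        "new": sorted(set(state_names) - set(inat["scientificName"])),
        # added by us and still on the state list
        "updates": inat[in_state & ours],
        # added by us but no longer on the state list
        "removals": inat[~in_state & ours],
        # added by other users: candidates for flagging
        "flags": inat[~ours],
    }

# toy data, not taken from the real files
state_names = ["A a", "B b"]
inat = pd.DataFrame({
    "scientificName": ["B b", "C c", "D d"],
    "user_id": ["708886", "708886", "1"],
})
result = classify_changes(state_names, inat)
print(result["new"])
```

In the real data the `user_id` column may be formatted differently from the toy values here, so the comparison would need adjusting to match.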

## Prep - common to all states
1. Read in the iNaturalist statuses & filter to this state
2. Read in the iNaturalist taxa list
3. Read in the state sensitive list
4. Attempt to match the state statuses to an IUCN equivalent


### 1. iNaturalist statuses

In [51]:
import pandas as pd

projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
# projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"


# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [52]:
def filter_state_statuses(stateregex: str, urlregex: str):
    # keep rows whose authority, place name or place display name matches the state regex
    # (urlregex is accepted but URL matching is currently disabled)
    statemask = (taxastatus['place_display_name'].str.contains(stateregex)
                 | taxastatus['place_name'].str.contains(stateregex)
                 # | taxastatus['url'].str.contains(urlregex)
                 | taxastatus['authority'].str.contains(stateregex))
    statedf = taxastatus[statemask].drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])
inatstatuses = filter_state_statuses("Northern Territory|NT NRETAS", " ")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1001,162724,1134239,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/https://id.biod...,,open,...,Cryptandra,gemmata,,2020-09-25T18:01:37Z,Cryptandra gemmata,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
927,165448,12647,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Epthianura,crocea,,2021-09-17T08:46:17Z,Epthianura crocea,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
289,263901,1289379,702203.0,9994,,Atlas of Living Australia,VU,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Chloebia,gouldiae,,2022-10-20T02:55:37Z,Chloebia gouldiae,species,https://www.birds.cornell.edu/clementschecklis...,,,
1251,152435,19250,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Polytelis,alexandrae,,2019-08-27T01:09:01Z,Polytelis alexandrae,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
215,170163,20166,702203.0,9994,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,,...,Ninox,connivens,,2021-07-28T02:17:03Z,Ninox connivens,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
1249,152433,38633,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Bellatorias,obiri,,2019-04-30T15:19:21Z,Bellatorias obiri,species,http://reptile-database.reptarium.cz/search.ph...,,,
1252,152436,40743,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,,,,,Hipposideros stenotis,,,Narrow-eared Roundleaf Bat,False,[1431118]
1248,152432,41326,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Macroderma,gigas,,2019-08-27T01:58:05Z,Macroderma gigas,species,http://www.catalogueoflife.org/annual-checklis...,,,
986,139906,698942,,9994,,,least concern,https://bie.ala.org.au/species/http://id.biodi...,Atlas of Living Australia (ALA),,...,Duboisia,hopwoodii,,2019-02-16T13:19:18Z,Duboisia hopwoodii,species,https://www.ala.org.au/,,,
1247,152431,73180,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Pezoporus,occidentalis,,2019-08-27T01:09:27Z,Pezoporus occidentalis,species,http://www.birdlife.org/datazone/speciesfactsh...,,,


### 2. iNaturalist taxonomy

In [53]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

# the archive is read from sourcedir, so it must already have been downloaded there
with zipfile.ZipFile(sourcedir + filename) as z:
    with z.open('taxa.csv') as from_archive:
        inattaxa = pd.read_csv(from_archive, dtype=str)
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


### 3. State lists
Sensitive list overrides: `geoprivacy` = `obscured`

In [54]:
# NT - No sensitive list

# sensitivelist = pd.read_csv(listdir + "sensitive-lists/NT-sensitive.csv")  # NT sensitive list
# sensitivelist['scientificName'] = sensitivelist['scientificname'].str.replace('subsp. ', '', regex=False)
# sensitivelist['bionet_geoprivacy'] = 'obscured'
# # sensitivelist = sensitivelist.rename(columns={'taxonID':'wildnetTaxonID'})
# sensitivelist

In [55]:
conservationlist = pd.read_csv(listdir + "conservation-lists/NT-conservation.csv")  # NT conservation list
conservationlist['scientificName'] = conservationlist['scientificName'].str.replace('subsp. ', '', regex=False)
# conservationlist = conservationlist.rename(columns={'taxonID':'bionetTaxonID'})
# conservationlist['bionet_geoprivacy'] = conservationlist['sensitivityClass'].apply(gpcondition)
conservationlist['bionet_geoprivacy'] = 'open'
conservationlist

Unnamed: 0,family,status,sourceStatus,vernacularName,scientificName,taxonRemarks,bionet_geoprivacy
0,,Vulnerable,Vulnerable (extinct in NT),Kowari,Dasyuroides byrnei,Mammal,open
1,,Vulnerable,Vulnerable (extinct in NT),Red-tailed phascogale,Phascogale calura,Mammal,open
2,,Vulnerable,Vulnerable (extinct in NT),Shark Bay mouse,Pseudomys fieldi,Mammal,open
3,,Vulnerable,Vulnerable (extinct in NT),Western quoll,Dasyurus geoffroii,Mammal,open
4,Fabaceae,Vulnerable,Vulnerable,Latz's wattle,Acacia latzii,,open
...,...,...,...,...,...,...,...
199,,Critically Endangered,Critically Endangered,Malleefowl,Leipoa ocellata,Bird,open
200,,Critically Endangered,Critically Endangered,Northern quoll,Dasyurus hallucatus,Mammal,open
201,,Critically Endangered,Critically Endangered,Plains-wanderer,Pedionomus torquatus,Bird,open
202,,Critically Endangered,Critically Endangered,Saddle Creek rocksnail,Vincentrachia desmonda,Invertebrate,open


In [56]:
# join them in a way that works for inat (e.g. sensitive list, geoprivacy = 'obscured')
statelist =  conservationlist.drop_duplicates()
statelist

Unnamed: 0,family,status,sourceStatus,vernacularName,scientificName,taxonRemarks,bionet_geoprivacy
0,,Vulnerable,Vulnerable (extinct in NT),Kowari,Dasyuroides byrnei,Mammal,open
1,,Vulnerable,Vulnerable (extinct in NT),Red-tailed phascogale,Phascogale calura,Mammal,open
2,,Vulnerable,Vulnerable (extinct in NT),Shark Bay mouse,Pseudomys fieldi,Mammal,open
3,,Vulnerable,Vulnerable (extinct in NT),Western quoll,Dasyurus geoffroii,Mammal,open
4,Fabaceae,Vulnerable,Vulnerable,Latz's wattle,Acacia latzii,,open
...,...,...,...,...,...,...,...
199,,Critically Endangered,Critically Endangered,Malleefowl,Leipoa ocellata,Bird,open
200,,Critically Endangered,Critically Endangered,Northern quoll,Dasyurus hallucatus,Mammal,open
201,,Critically Endangered,Critically Endangered,Plains-wanderer,Pedionomus torquatus,Bird,open
202,,Critically Endangered,Critically Endangered,Saddle Creek rocksnail,Vincentrachia desmonda,Invertebrate,open


### 4. Equivalent IUCN statuses

In [57]:
iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild', 'Extinct'}
# sensitivelist.groupby(['status'])['status'].count()
statelist.groupby(['status'])['status'].count()

status
Critically Endangered     20
Endangered                52
Extinct                   11
Vulnerable               120
Vulnerable                 1
Name: status, dtype: int64
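
The two separate `Vulnerable` rows in the count above suggest that one value differs only by stray whitespace (an assumption, since the raw file is not shown here). A minimal sketch, using toy values, of how stripping whitespace before counting collapses such variants:

```python
from collections import Counter

# toy values; the trailing space on the second entry is an assumed example
raw_statuses = ["Vulnerable", "Vulnerable ", "Endangered"]

counts_raw = Counter(raw_statuses)                       # variants counted separately
counts_clean = Counter(s.strip() for s in raw_statuses)  # variants merged after strip()

print(counts_raw)
print(counts_clean)
```

The same idea applies to a DataFrame column via `.str.strip()` before the `groupby`.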

In [58]:
# keys are lower-case because the status is lower-cased and stripped before the lookup
iucnStatusMappings = {
    'vulnerable':'Vulnerable'
}
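
A fuller lookup could carry each state status through to its own IUCN equivalent rather than relying on a single default. The sketch below is illustrative only: the name `iucn_lookup`, the helper `to_iucn` and the one-to-one correspondence between NT status names and IUCN categories are all assumptions, not something the source lists confirm.

```python
import pandas as pd

# hypothetical lookup; keys are lower-case to match the normalised status text
iucn_lookup = {
    "vulnerable": "Vulnerable",
    "endangered": "Endangered",
    "critically endangered": "Critically Endangered",
    "extinct": "Extinct",
}

def to_iucn(status: pd.Series) -> pd.Series:
    """Normalise case and whitespace, then map; unmapped values stay missing."""
    return status.str.strip().str.lower().map(iucn_lookup)

# toy values, including one with a trailing space and one unknown status
example = pd.Series(["Vulnerable ", "Endangered", "Critically Endangered", "Extinct", "Unknown"])
mapped = to_iucn(example)
print(mapped)
```

Leaving unmapped values missing, instead of filling them with a default, makes any unexpected status visible before export.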

### 5. Determine best place ID to use

In [59]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 9994

place_id  place_name          place_display_name    
9994      Northern Territory  Northern Territory, AU    12
Name: place_id, dtype: int64

## Merge iNaturalist statuses with state list on scientificName

1. Match - updates; even if the statuses are the same, we'll update the links and values anyway
2. No match - statuses to be added (additions)
   2.1 No match and no taxonomy - search for synonyms
   2.2 No match
3. Merge the other direction to see if there are deletes?
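
The plan above can be sketched with a single outer merge and pandas' `indicator` flag, which labels each row by which side it came from. This is an alternative illustration rather than the approach used in the cells that follow (they use a left merge and test `status_id` for nulls); the two toy frames reuse a few names and IDs from the outputs in this notebook.

```python
import pandas as pd

# toy state list and toy iNaturalist statuses
state = pd.DataFrame({
    "scientificName": ["Acacia peuce", "Polytelis alexandrae"],
    "status": ["Endangered", "Vulnerable"],
})
inat = pd.DataFrame({
    "scientificName": ["Polytelis alexandrae", "Macroderma gigas"],
    "status_id": ["152435", "152432"],
})

# the _merge column is 'both', 'left_only' or 'right_only'
both = state.merge(inat, how="outer", on="scientificName", indicator=True)

matched = both[both["_merge"] == "both"]         # 1. match: updates
to_add = both[both["_merge"] == "left_only"]     # 2. no match: additions
to_check = both[both["_merge"] == "right_only"]  # 3. other direction: possible deletes

print(both[["scientificName", "_merge"]])
```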


In [60]:
# join to see which lists already have a status in inaturalist based on scientificName
mergedstatuses = statelist[['scientificName','status','bionet_geoprivacy']].merge(inatstatuses[['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']],how="left",left_on='scientificName',right_on='scientificName',suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses


Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,geoprivacy,place_id,place_display_name
135,Abrodictyum obscurum,Endangered,open,,,,,,,,,,
185,Acacia equisetifolia,Critically Endangered,open,,,,,,,,,,
4,Acacia latzii,Vulnerable,open,,,,,,,,,,
136,Acacia peuce,Endangered,open,,,,,,,,,,
5,Acacia praetermissa,Vulnerable,open,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
155,Xylopia monosperma,Endangered,open,,,,,,,,,,
55,Zeuxine oblonga,Vulnerable,open,,,,,,,,,,
58,Zyzomys maini,Vulnerable,open,,,,,,,,,,
166,Zyzomys palatalis,Endangered,open,,,,,,,,,,


In [61]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
mergedstatuses['new_authority'] = "Territory Parks and Wildlife Conservation Act 1976"
mergedstatuses['new_description'] = "Listed as Threatened - refer to https://nt.gov.au/environment"
mergedstatuses['new_url'] = "https://nt.gov.au/environment"
mergedstatuses['new_geoprivacy'] = "obscured"
mergedstatuses['new_place_id'] = '9994'  # Northern Territory
mergedstatuses['new_username'] = 'peggydnew'
# NOTE: any status without an entry in iucnStatusMappings falls back to 'Vulnerable'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['status'].fillna('Threatened')
mergedstatuses

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,place_id,place_display_name,new_authority,new_description,new_url,new_geoprivacy,new_place_id,new_username,new_iucn_equivalent,new_status
135,Abrodictyum obscurum,Endangered,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Endangered
185,Acacia equisetifolia,Critically Endangered,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Critically Endangered
4,Acacia latzii,Vulnerable,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Vulnerable
136,Acacia peuce,Endangered,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Endangered
5,Acacia praetermissa,Vulnerable,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Vulnerable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
155,Xylopia monosperma,Endangered,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Endangered
55,Zeuxine oblonga,Vulnerable,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Vulnerable
58,Zyzomys maini,Vulnerable,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Vulnerable
166,Zyzomys palatalis,Endangered,open,,,,,,,,...,,,Territory Parks and Wildlife Conservation Act...,Listed as Threatened - refer to https://nt.gov...,https://nt.gov.au/environment,obscured,9994,peggydnew,Vulnerable,Endangered


## Updates

In [62]:
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
updates = pd.DataFrame(mergedstatuses[mergedstatuses['status_id'].notnull()])
updates = updates.sort_values('scientificName')
updates['action'] = 'UPDATE'
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
159,UPDATE,Bellatorias obiri,152433,38633,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
57,UPDATE,Hipposideros inornatus,152434,74425,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
176,UPDATE,Pezoporus occidentalis,152431,73180,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
104,UPDATE,Polytelis alexandrae,152435,19250,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...


## No status in iNaturalist via straight scientificName match
The NT records that didn't match up to a status in iNaturalist

In [63]:
# to add: those that have no iNaturalist status
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Abrodictyum obscurum,Endangered,open,,,,,,,,...,,,,,,,,,,
1,Acacia equisetifolia,Critically Endangered,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,equisetifolia,,2022-04-06T22:05:25Z,species,https://eol.org/pages/49426174
2,Acacia latzii,Vulnerable,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,latzii,,2022-04-07T02:06:43Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Acacia peuce,Endangered,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,peuce,,2022-04-06T23:48:13Z,species,http://www.catalogueoflife.org/annual-checklis...
4,Acacia praetermissa,Vulnerable,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,praetermissa,,2022-04-05T02:10:02Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,Xylopia monosperma,Endangered,open,,,,,,,,...,,,,,,,,,,
196,Zeuxine oblonga,Vulnerable,open,,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,oblonga,,2021-07-16T02:50:44Z,species,http://www.catalogueoflife.org/annual-checklis...
197,Zyzomys maini,Vulnerable,open,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,maini,,2019-08-27T01:49:16Z,species,http://www.catalogueoflife.org/annual-checklis...
198,Zyzomys palatalis,Endangered,open,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,palatalis,,2019-11-22T22:46:28Z,species,http://www.iucnredlist.org/details/23327/0


In [64]:
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
1,Acacia equisetifolia,Critically Endangered,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,equisetifolia,,2022-04-06T22:05:25Z,species,https://eol.org/pages/49426174
2,Acacia latzii,Vulnerable,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,latzii,,2022-04-07T02:06:43Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Acacia peuce,Endangered,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,peuce,,2022-04-06T23:48:13Z,species,http://www.catalogueoflife.org/annual-checklis...
4,Acacia praetermissa,Vulnerable,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,praetermissa,,2022-04-05T02:10:02Z,species,http://www.catalogueoflife.org/annual-checklis...
5,Acacia undoolyana,Vulnerable,open,,,,,,,,...,Tracheophyta,Magnoliopsida,Fabales,Fabaceae,Acacia,undoolyana,,2022-04-05T02:34:14Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,Vidumelon wattii,Vulnerable,open,,,,,,,,...,Mollusca,Gastropoda,Stylommatophora,Camaenidae,Vidumelon,wattii,,2021-10-29T15:37:54Z,species,http://www.catalogueoflife.org/annual-checklis...
196,Zeuxine oblonga,Vulnerable,open,,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,oblonga,,2021-07-16T02:50:44Z,species,http://www.catalogueoflife.org/annual-checklis...
197,Zyzomys maini,Vulnerable,open,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,maini,,2019-08-27T01:49:16Z,species,http://www.catalogueoflife.org/annual-checklis...
198,Zyzomys palatalis,Endangered,open,,,,,,,,...,Chordata,Mammalia,Rodentia,Muridae,Zyzomys,palatalis,,2019-11-22T22:46:28Z,species,http://www.iucnredlist.org/details/23327/0


In [65]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions = additions.sort_values(['scientificName'])
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                  'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,ADD,Acacia equisetifolia,,1253756,Critically Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
2,ADD,Acacia latzii,,1254327,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
3,ADD,Acacia peuce,,465191,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
4,ADD,Acacia praetermissa,,1254561,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
5,ADD,Acacia undoolyana,,1254884,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
...,...,...,...,...,...,...,...,...,...,...,...,...
193,ADD,Vidumelon wattii,,114966,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
196,ADD,Zeuxine oblonga,,369267,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
197,ADD,Zyzomys maini,,45377,Vulnerable,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...
198,ADD,Zyzomys palatalis,,75238,Endangered,Vulnerable,Territory Parks and Wildlife Conservation Act...,https://nt.gov.au/environment,obscured,9994,peggydnew,Listed as Threatened - refer to https://nt.gov...


In [66]:
# write these to the file
pd.concat([updates,additions]).to_csv(sourcedir + "nt.csv", index=False)

In [67]:
# what didn't match to a taxon?
unknownToInat = noinatstatus[noinatstatus['id'].isna()]
unknownToInat

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Abrodictyum obscurum,Endangered,open,,,,,,,,...,,,,,,,,,,
6,Acanthophsis hawkei,Vulnerable,open,,,,,,,,...,,,,,,,,,,
9,Amperia spicata,Vulnerable,open,,,,,,,,...,,,,,,,,,,
19,Babingtonia behrii,Vulnerable,open,,,,,,,,...,,,,,,,,,,
20,Baumea arthrophylla,Endangered,open,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
181,Typhonium mirabile,Endangered,open,,,,,,,,...,,,,,,,,,,
183,Typhonium taylori,Endangered,open,,,,,,,,...,,,,,,,,,,
184,Typhonium sp. Sandover,Vulnerable,open,,,,,,,,...,,,,,,,,,,
194,Vincentrachia desmonda,Critically Endangered,open,,,,,,,,...,,,,,,,,,,
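
Some of the unmatched names above may differ from an iNaturalist name only by a spelling variant. One possible way to shortlist candidates for manual review is fuzzy matching with the standard library's `difflib`. This is a sketch under assumptions: the candidate list is a toy stand-in for the `scientificName` column of the taxonomy, the helper name `suggest_names` is hypothetical, and the 0.85 cutoff is an arbitrary starting point. True synonyms with unrelated spellings would not be found this way.

```python
from difflib import get_close_matches

# toy stand-in for the iNaturalist scientificName values
candidate_names = ["Acanthophis hawkei", "Acacia peuce"]

def suggest_names(name, candidates, cutoff=0.85):
    """Return up to three close spellings of `name` for manual review."""
    return get_close_matches(name, candidates, n=3, cutoff=cutoff)

close = suggest_names("Acanthophsis hawkei", candidate_names)
none_found = suggest_names("Typhonium sp. Sandover", candidate_names)

print(close)
print(none_found)
```

Any suggestion should be confirmed by hand before a status is attached to the suggested taxon.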


### Are there any that need to be removed?
NT sensitive list count: 0
NT inat statuses count: 12

updates to inat status: 4
additional inat status: 136
NT statuses we can't find a taxon match for in iNaturalist: 64
total: 166 (explainable via the various genus/section entries that we matched to in the taxonomy)

inat statuses left over: 12-4=8 that may need checking

In [68]:
# inat statuses whose taxon is not in the updates
inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]


Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1001,162724,1134239,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/https://id.biod...,,open,...,Cryptandra,gemmata,,2020-09-25T18:01:37Z,Cryptandra gemmata,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
927,165448,12647,702203.0,9994,,Northern Territory,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Epthianura,crocea,,2021-09-17T08:46:17Z,Epthianura crocea,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
289,263901,1289379,702203.0,9994,,Atlas of Living Australia,VU,https://bie.ala.org.au/species/urn:lsid:biodiv...,,open,...,Chloebia,gouldiae,,2022-10-20T02:55:37Z,Chloebia gouldiae,species,https://www.birds.cornell.edu/clementschecklis...,,,
215,170163,20166,702203.0,9994,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/urn:lsid:biodiv...,,,...,Ninox,connivens,,2021-07-28T02:17:03Z,Ninox connivens,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
1252,152436,40743,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,,,,,Hipposideros stenotis,,,Narrow-eared Roundleaf Bat,False,[1431118]
1248,152432,41326,708886.0,9994,16652.0,NT NRETAS,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Macroderma,gigas,,2019-08-27T01:58:05Z,Macroderma gigas,species,http://www.catalogueoflife.org/annual-checklis...,,,
986,139906,698942,,9994,,,least concern,https://bie.ala.org.au/species/http://id.biodi...,Atlas of Living Australia (ALA),,...,Duboisia,hopwoodii,,2019-02-16T13:19:18Z,Duboisia hopwoodii,species,https://www.ala.org.au/,,,
1,234788,918383,702203.0,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
