# iNaturalist status updates by state

Using `inat-aust-status-taxa.csv`, the file produced by `collate-status-taxa.ipynb`, generate the lists needed to update iNaturalist statuses.

**Next steps:**
State by state, establish the changes that need to be made:
    a. new - any species that appear in the state lists but do not have a status in iNaturalist (new template)
    b. updates - any changes to information that we added previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - any statuses that we added previously (user_id = 708886) and that are now incorrect (update template, action='REMOVE')
    d. flags - any statuses added by other users that need to be flagged
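
The four categories above can be read as a small decision rule. The sketch below is only an illustration of that rule: the function name `classify_change` and its arguments are hypothetical, and it assumes `user_id` is held as the plain string `708886`.

```python
from typing import Optional

OUR_USER_ID = "708886"  # the user_id quoted in the steps above


def classify_change(in_state_list: bool, inat_user_id: Optional[str]) -> str:
    """Classify one taxon; inat_user_id is None when iNaturalist holds no status for it."""
    if inat_user_id is None:
        # nothing in iNaturalist yet: add it only if the state lists it
        return "new" if in_state_list else "none"
    if inat_user_id == OUR_USER_ID:
        # we own the status: refresh it if still listed, otherwise queue it for removal
        return "update" if in_state_list else "remove"
    # someone else owns the status: we can only flag it for review
    return "flag"


print(classify_change(True, None))       # new
print(classify_change(True, "708886"))   # update
print(classify_change(False, "708886"))  # remove
print(classify_change(True, "12345"))    # flag
```

Whether a status owned by another user really needs flagging is still a manual decision; the rule only narrows down the candidates.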

## Prep - common to all states
1. Read in the iNaturalist statuses and filter to this state
2. Read in the iNaturalist taxa list
3. Read in the state sensitive list
4. Attempt to match the state statuses to an IUCN equivalent


### 1. iNaturalist statuses

In [119]:
import pandas as pd

projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
# projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"


# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [120]:
def filter_state_statuses(stateregex: str, urlregex: str):
    """Return the iNaturalist statuses whose place name, URL or authority matches this state."""
    # the file was read with na_filter=False and dtype=str, so every column is a plain string
    mask = (taxastatus['place_display_name'].str.contains(stateregex)
            | taxastatus['place_name'].str.contains(stateregex)
            | taxastatus['url'].str.contains(urlregex)
            | taxastatus['authority'].str.contains(stateregex))
    # a row matching several criteria is kept once; exact duplicate rows are dropped
    statedf = taxastatus[mask].drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])
inatstatuses = filter_state_statuses("ACT Government", ".act.gov")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
2465,152234,100611,,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euastacus,armatus,,2022-06-06T16:36:21Z,Euastacus armatus,species,http://www.iucnredlist.org/apps/redlist/details,,,
270,180956,1130138,3669610.0,12986,,ACT Government,EN,http://www.environment.gov.au/cgi-bin/sprat/pu...,,obscured,...,Prasophyllum,petilum,,2022-08-18T08:30:22Z,Prasophyllum petilum,species,http://www.catalogueoflife.org/annual-checklis...,,,
3305,266735,1255190,5159763.0,12986,,ACT Government,Endangered,https://www.environment.act.gov.au/nature-cons...,Listed as endangered in the ACT,obscured,...,Lepidium,ginninderrense,,2021-07-27T21:58:53Z,Lepidium ginninderrense,species,http://www.catalogueoflife.org/annual-checklis...,,,
2662,152233,138646,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Swainsona,recta,,2022-06-08T16:59:25Z,Swainsona recta,species,,,,
1078,152222,25262,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Pseudophryne,pengilleyi,,2018-11-27T04:54:16Z,Pseudophryne pengilleyi,species,http://research.amnh.org/vz/herpetology/amphib...,,,
1082,152227,36937,708886.0,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,private,...,Delma,impar,,2019-06-06T05:58:40Z,Delma impar,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
1090,152221,36957,708886.0,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Aprasia,parapulchella,,2022-04-07T23:19:10Z,Aprasia parapulchella,species,http://reptile-database.reptarium.cz/search.ph...,,,
1076,152230,39441,708886.0,12986,16649.0,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Varanus,rosenbergi,,2021-07-28T09:18:05Z,Varanus rosenbergi,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
1079,152224,45210,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Pseudomys,fumeus,,2019-11-22T22:45:54Z,Pseudomys fumeus,species,http://www.catalogueoflife.org/annual-checklis...,,,
1088,152235,51333,708886.0,12986,16649.0,ACT Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Macquaria,australasica,,2019-11-24T03:18:43Z,Macquaria australasica,species,http://www.fishbase.org,,,
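
One thing to watch in the call above: `str.contains` treats its pattern as a regular expression by default, so each dot in `.act.gov` matches any character. A minimal sketch of the difference, using made-up URLs, with `re.escape` as one possible way to make the match literal:

```python
import re

import pandas as pd

# Made-up URLs for illustration only
urls = pd.Series([
    "https://www.environment.act.gov.au/nature-conservation",
    "https://example.org/contact-govern",
])

loose = urls.str.contains(".act.gov")              # dots match any character
strict = urls.str.contains(re.escape(".act.gov"))  # dots must be literal dots

print(loose.tolist())
print(strict.tolist())
```

The unescaped pattern also accepts the second URL (via `tact-gov`), while the escaped one does not.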


### 2. iNaturalist taxonomy

In [121]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

# the archive is expected to be in sourcedir already; context managers close it even if the read fails
with zipfile.ZipFile(sourcedir + filename) as z, z.open('taxa.csv') as from_archive:
    inattaxa = pd.read_csv(from_archive,dtype=str)
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...
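
Because the state lists are later matched to this taxonomy on `scientificName`, it can help to check first whether any name occurs more than once, since a repeated name would duplicate rows in a merge. The sketch below uses a made-up three-row taxonomy; keeping the first row per name is an assumption, not a rule taken from this notebook.

```python
import pandas as pd

# Made-up miniature taxonomy; "Bb bb" stands in for a name shared by two taxa
taxa = pd.DataFrame({
    "id": ["1", "2", "3"],
    "scientificName": ["Aa aa", "Bb bb", "Bb bb"],
    "taxonRank": ["species", "species", "section"],
})

# every row whose name is not unique in the taxonomy
ambiguous = taxa[taxa.duplicated("scientificName", keep=False)]

# one id per name (first occurrence wins, which is an arbitrary choice)
lookup = taxa.drop_duplicates("scientificName").set_index("scientificName")["id"]

print(ambiguous["id"].tolist())
print(lookup["Bb bb"])
```

Any rows reported in `ambiguous` are worth reviewing by hand before the resulting `taxon_id` is trusted.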


### 3. State lists
Sensitive list: `geoprivacy` = `obscured` (overrides)

In [122]:
sensitivelist = pd.read_csv(listdir + "sensitive-lists/ACT-sensitive.csv")  # ACT sensitive list
sensitivelist['scientificName'] = sensitivelist['scientificname'].str.replace('subsp. ', '', regex=False)
sensitivelist['bionet_geoprivacy'] = 'obscured'
# sensitivelist = sensitivelist.rename(columns={'taxonID':'wildnetTaxonID'})
sensitivelist

Unnamed: 0,category,scientificname,vernacularname,taxonrank,taxonomicstatus,kingdom,phylum,class,order,family,sourcestatus,status,authority,ngunnawal,comments,synonym,taxonremarks,scientificName,bionet_geoprivacy
0,Bird,Anthochaera phrygia,Regent Honeyeater,species,accepted,Animalia,Chordata,Aves,Passeriformes,Meliphagidae,Critically Endangered,Critically Endangered,Nature Conservation Act 2014 (ACT),,,,,Anthochaera phrygia,obscured
1,Reptile,Aprasia parapulchella,Pink-tailed Worm-lizard,species,accepted,Animalia,Chordata,Reptilia,Squamata,Pygopodidae,Vulnernable,Vulnernable,Nature Conservation Act 2014 (ACT),Banburung,,,,Aprasia parapulchella,obscured
2,Mammal,Bettongia gaimardi,Eastern Bettong,species,accepted,Animalia,Chordata,Mammalia,Diprotodontia,Potoroidae,Regionally Conservation Dependent,,Nature Conservation Act 2014 (ACT),,Regionally Conservation Dependent,,,Bettongia gaimardi,obscured
3,Fish,Bidyanus bidyanus,Silver Perch,species,accepted,Animalia,Chordata,Actinopterygii,Perciformes,Terapontidae,Endangered,Endangered,Nature Conservation Act 2014 (ACT),Dhingur,,,,Bidyanus bidyanus,obscured
4,Plant,Bossiaea grayi,Murrumbidgee Bossiaea,species,accepted,Plantae,Charophyta,Equisetopsida,Fabales,Fabaceae,Endangered,Endangered,Nature Conservation Act 2014 (ACT),,,,,Bossiaea grayi,obscured
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
103,Plant,Swainsona recta,Small Purple Pea,species,accepted,Plantae,Charophyta,Equisetopsida,Fabales,Fabaceae,Endangered,Endangered,Nature Conservation Act 2014 (ACT),,,,,Swainsona recta,obscured
104,Invertebrate,Synemon plana,Golden Sun Moth,species,accepted,Animalia,Arthropoda,invertibratea,Lepidoptera,Castniidae,Endangered,Endangered,Nature Conservation Act 2014 (ACT),,,,,Synemon plana,obscured
105,Plant,Thesium australe,Austral Toadflax,species,accepted,Plantae,Charophyta,Equisetopsida,Malpighiales,Santalaceae,Vulnernable,Vulnernable,Nature Conservation Act 2014 (ACT),,,,,Thesium australe,obscured
106,Reptile,Tympanocryptis pinguicolla,Grassland Earless Dragon,species,homotypic synonym,Animalia,Chordata,Reptilia,Squamata,Agamidae,Endangered,Endangered,Nature Conservation Act 2014 (ACT),Bidjiwang,Homotypic synonym of Tympanocryptis lineata,Tympanocryptis lineata,Homotypic synonym of Tympanocryptis lineata,Tympanocryptis pinguicolla,obscured


In [123]:
conservationlist = pd.read_csv(listdir + "conservation-lists/ACT-conservation.csv")  # ACT conservation list
conservationlist['scientificName'] = conservationlist['scientificName'].str.replace('subsp. ', '', regex=False)
# conservationlist = conservationlist.rename(columns={'taxonID':'bionetTaxonID'})
# conservationlist['bionet_geoprivacy'] = conservationlist['sensitivityClass'].apply(gpcondition)
conservationlist['bionet_geoprivacy'] = 'obscured'
conservationlist

Unnamed: 0,scientificName,vernacularName,status,taxonRemarks,sourceStatus,bionet_geoprivacy
0,Anthochaera phrygia,Regent Honeyeater,Critically Endangered,,Critically Endangered,obscured
1,Lathamus discolor,Swift Parrot,Critically Endangered,,Critically Endangered,obscured
2,Pseudophryne pengilleyi,Northern Corroboree Frog,Critically Endangered,,Critically Endangered,obscured
3,Caladenia actensis,Canberra Spider Orchid,Critically Endangered,,Critically Endangered,obscured
4,Corunastylis ectopa,Brindabella Midge Orchid,Critically Endangered,,Critically Endangered,obscured
5,Pterostylis oreophila,Kiandra Greenhood,Critically Endangered,,Critically Endangered,obscured
6,Litoria castanea,Yellow-spotted Bell Frog,Critically Endangered,locally extinct,Critically Endangered,obscured
7,Gentiana baeuerlenii,Baeuerlen's Gentian,Endangered,,Endangered,obscured
8,Prasophyllum petilum,Tarengo Leek Orchid,Endangered,,Endangered,obscured
9,Rutidosis leptorhynchoides,Button Wrinklewort,Endangered,,Endangered,obscured


In [124]:
fullstatelist = pd.concat([sensitivelist[['scientificName','status','bionet_geoprivacy']],
                    conservationlist[['scientificName','status','bionet_geoprivacy']]])
fullstatelist

Unnamed: 0,scientificName,status,bionet_geoprivacy
0,Anthochaera phrygia,Critically Endangered,obscured
1,Aprasia parapulchella,Vulnernable,obscured
2,Bettongia gaimardi,,obscured
3,Bidyanus bidyanus,Endangered,obscured
4,Bossiaea grayi,Endangered,obscured
...,...,...,...
48,Eucalyptus aggregata,Vulnerable,obscured
49,Pomaderris pallida,Vulnerable,obscured
50,Thesium australe,Vulnerable,obscured
51,Hirundapus caudacutus,Vulnerable,obscured


In [125]:
# join them in a way that works for iNaturalist (e.g. sensitive list, geoprivacy = 'obscured')
statelist = pd.concat([sensitivelist[['scientificName','status','bionet_geoprivacy']],
                    conservationlist[['scientificName','status','bionet_geoprivacy']]]).drop_duplicates()
statelist

Unnamed: 0,scientificName,status,bionet_geoprivacy
0,Anthochaera phrygia,Critically Endangered,obscured
1,Aprasia parapulchella,Vulnernable,obscured
2,Bettongia gaimardi,,obscured
3,Bidyanus bidyanus,Endangered,obscured
4,Bossiaea grayi,Endangered,obscured
...,...,...,...
94,Pomaderris pallida,Vulnerable,obscured
97,Pseudomys novaehollandiae,Vulnerable,obscured
99,Pteropus poliocephalus,Vulnerable,obscured
50,Thesium australe,Vulnerable,obscured
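
In the output above, `drop_duplicates` leaves some species in the list twice because the sensitive list spells the status `Vulnernable` while the conservation list spells it `Vulnerable`. A minimal sketch of normalising the spelling before de-duplicating, using a two-row stand-in for the real lists:

```python
import pandas as pd

# Stand-ins for the two lists; only the misspelling differs
sensitive = pd.DataFrame({"scientificName": ["Thesium australe"], "status": ["Vulnernable"]})
conservation = pd.DataFrame({"scientificName": ["Thesium australe"], "status": ["Vulnerable"]})

combined = pd.concat([sensitive, conservation], ignore_index=True)
raw_rows = len(combined.drop_duplicates())    # the misspelling keeps both rows

# correct the known misspelling, then de-duplicate
combined["status"] = combined["status"].str.strip().replace({"Vulnernable": "Vulnerable"})
clean_rows = len(combined.drop_duplicates())

print(raw_rows, clean_rows)
```

Correcting the spelling in the source CSV would have the same effect and would keep the notebook simpler.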


In [106]:
# check for duplicates with conflicting information
# statelist.groupby('bionetTaxonID').filter(lambda x: len(x) > 1)#.sort('size',ascending=False)
# statelist.groupby('scientificName').filter(lambda x: len(x) > 1)#.sort('size',ascending=False)
#df.groupby('hash').filter(lambda group: len(group) > 1).sort('size', ascending=False)
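
The commented-out checks above rely on the old `DataFrame.sort` method. A sketch of one way to run the same kind of check with current pandas is shown here on made-up rows; it lists names that carry more than one distinct status, counting a missing status as a value of its own.

```python
import pandas as pd

# Made-up rows: "Aa aa" has two statuses, "Cc cc" has a status and a blank
statelist = pd.DataFrame({
    "scientificName": ["Aa aa", "Aa aa", "Bb bb", "Bb bb", "Cc cc", "Cc cc"],
    "status": ["Endangered", "Vulnerable", "Endangered", "Endangered", "Vulnerable", None],
})

# number of distinct statuses per name; dropna=False counts a missing status too
status_counts = statelist.groupby("scientificName")["status"].nunique(dropna=False)
conflicting_names = status_counts[status_counts > 1].index

conflicts = statelist[statelist["scientificName"].isin(conflicting_names)]
print(conflicts)
```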

### 4. Equivalent IUCN statuses

In [126]:
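# note: the last two entries must be separated by a comma; `'Extinct in the Wild' and 'Extinct'` evaluates to just 'Extinct'
iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild', 'Extinct'}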
sensitivelist.groupby(['status'])['status'].count()

status
Critically Endangered    14
Endangered               38
Vulnerable               26
Vulnernable              28
Name: status, dtype: int64

In [127]:
iucnStatusMappings = {
    'Vulnernable':'Vulnerable'
}
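
If the lookup is applied after lower-casing the status, the dictionary keys need to be lower-case as well. The sketch below shows one possible shape for such a mapping. The name `iucn_map` is hypothetical, only the statuses counted above are covered, and anything unmapped is left blank for manual review rather than defaulted.

```python
import pandas as pd

# Hypothetical lower-case mapping covering the statuses counted above
iucn_map = {
    "critically endangered": "Critically Endangered",
    "endangered": "Endangered",
    "vulnerable": "Vulnerable",
    "vulnernable": "Vulnerable",  # misspelling present in the sensitive list
}

statuses = pd.Series(["Critically Endangered", "Vulnernable", " Endangered ", None])

# normalise case and whitespace, then look up; unmapped or missing values stay NaN
mapped = statuses.str.lower().str.strip().map(iucn_map)
print(mapped.tolist())
```

Leaving unmapped values as NaN makes statuses such as a blank or `Conservation Dependent` easy to find and decide on individually.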

### 5. Determine best place ID to use

In [128]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 12986

place_id  place_name                    place_display_name              
12986     Australian Capital Territory  Australian Capital Territory, AU    18
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State sensitive list on scientificName

1. Match - updates; even if the statuses are the same, we'll update the links and values anyway
2. No match - statuses to be added (additions)
   2.1 No match and no taxonomy - search for synonyms
   2.2 No match
3. Merge in the other direction to see whether there are any deletes
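
The plan above can be sketched with the `indicator` option of `DataFrame.merge`, which labels each row by where it was found. The frames below are made-up miniatures, not the real lists.

```python
import pandas as pd

# Made-up miniature inputs
state = pd.DataFrame({"scientificName": ["Aa aa", "Bb bb"],
                      "status": ["Endangered", "Vulnerable"]})
inat = pd.DataFrame({"scientificName": ["Bb bb", "Cc cc"],
                     "status_id": ["1", "2"]})

# indicator=True adds a _merge column: 'both' or 'left_only' for a left merge
merged = state.merge(inat, how="left", on="scientificName", indicator=True)
matched = merged[merged["_merge"] == "both"]         # step 1: candidates for UPDATE
unmatched = merged[merged["_merge"] == "left_only"]  # step 2: candidates for ADD

# step 3, the other direction: iNaturalist statuses absent from the state list
leftover = inat[~inat["scientificName"].isin(state["scientificName"])]

print(matched["scientificName"].tolist())
print(unmatched["scientificName"].tolist())
print(leftover["scientificName"].tolist())
```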


In [129]:
# join to see which lists already have a status in inaturalist based on scientificName
mergedstatuses = statelist[['scientificName','status','bionet_geoprivacy']].merge(inatstatuses[['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']],how="left",left_on='scientificName',right_on='scientificName',suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses


Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,geoprivacy,place_id,place_display_name
0,Anthochaera phrygia,Critically Endangered,obscured,,,,,,,,,,
1,Aprasia parapulchella,Vulnernable,obscured,152221,36957,708886,,30,ACT Government,vulnerable,obscured,12986,"Australian Capital Territory, AU"
54,Aprasia parapulchella,Vulnerable,obscured,152221,36957,708886,,30,ACT Government,vulnerable,obscured,12986,"Australian Capital Territory, AU"
81,Bettongia gaimardi,Conservation Dependent,obscured,,,,,,,,,,
2,Bettongia gaimardi,,obscured,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
50,Synemon plana,Endangered,obscured,,,,,,,,,,
51,Thesium australe,Vulnernable,obscured,,,,,,,,,,
80,Thesium australe,Vulnerable,obscured,,,,,,,,,,
53,Tympanocryptis lineata,Endangered,obscured,,,,,,,,,,


In [130]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
mergedstatuses['new_authority'] = "Nature Conservation Act 2014 (ACT)"
mergedstatuses['new_description'] = "Listed as Threatened - refer to https://www.environment.act.gov.au/nature-conservation/conservation-and-ecological-communities/threatened-species-factsheets"
mergedstatuses['new_url'] = "https://www.environment.act.gov.au/nature-conservation/conservation-and-ecological-communities/threatened-species-factsheets"
mergedstatuses['new_geoprivacy'] = "obscured"
mergedstatuses['new_place_id'] = '12986'  # Australian Capital Territory
mergedstatuses['new_username'] = 'peggydnew'
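# NOTE: the statuses are lower-cased before the lookup, so the capitalised key in iucnStatusMappings never matches
# and every row falls through to the 'Vulnerable' default (visible in the new_iucn_equivalent column of the output)
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary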
mergedstatuses['new_status'] = mergedstatuses['status'].fillna('Threatened')
mergedstatuses

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,place_id,place_display_name,new_authority,new_description,new_url,new_geoprivacy,new_place_id,new_username,new_iucn_equivalent,new_status
0,Anthochaera phrygia,Critically Endangered,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Critically Endangered
1,Aprasia parapulchella,Vulnernable,obscured,152221,36957,708886,,30,ACT Government,vulnerable,...,12986,"Australian Capital Territory, AU",Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Vulnernable
54,Aprasia parapulchella,Vulnerable,obscured,152221,36957,708886,,30,ACT Government,vulnerable,...,12986,"Australian Capital Territory, AU",Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Vulnerable
81,Bettongia gaimardi,Conservation Dependent,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Conservation Dependent
2,Bettongia gaimardi,,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Threatened
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
50,Synemon plana,Endangered,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Endangered
51,Thesium australe,Vulnernable,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Vulnernable
80,Thesium australe,Vulnerable,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Vulnerable
53,Tympanocryptis lineata,Endangered,obscured,,,,,,,,...,,,Nature Conservation Act 2014 (ACT),Listed as Threatened - refer to https://www.en...,https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Vulnerable,Endangered


## Updates

In [131]:
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
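updates = mergedstatuses[mergedstatuses['status_id'].notnull()].copy()  # copy so the new column does not touch mergedstatuses
updates = updates.sort_values('scientificName')  # sort_values returns a new frame, so keep the result
updates['action'] = 'UPDATE'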
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
1,UPDATE,Aprasia parapulchella,152221,36957,Vulnernable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
54,UPDATE,Aprasia parapulchella,152221,36957,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
4,UPDATE,Bossiaea grayi,152220,795624,Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
14,UPDATE,Delma impar,152227,36937,Vulnernable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
60,UPDATE,Delma impar,152227,36937,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
15,UPDATE,Euastacus armatus,152234,100611,Vulnernable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
61,UPDATE,Euastacus armatus,152234,100611,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
25,UPDATE,Lepidium ginninderrense,266735,1255190,Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
30,UPDATE,Maccullochella macquariensis,152226,85287,Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
31,UPDATE,Macquaria australasica,152235,51333,Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...


## No status in iNaturalist via straight scientificName match
The ACT records that didn't match up to a status in iNaturalist

In [132]:
# to add: those that have no iNaturalist status
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Anthochaera phrygia,Critically Endangered,obscured,,,,,,,,...,Chordata,Aves,Passeriformes,Meliphagidae,Anthochaera,phrygia,,2022-06-09T16:08:32Z,species,http://www.birds.cornell.edu/clementschecklist...
1,Bettongia gaimardi,Conservation Dependent,obscured,,,,,,,,...,Chordata,Mammalia,Diprotodontia,Potoroidae,Bettongia,gaimardi,,2020-07-20T02:39:44Z,species,http://www.catalogueoflife.org/annual-checklis...
2,Bettongia gaimardi,,obscured,,,,,,,,...,Chordata,Mammalia,Diprotodontia,Potoroidae,Bettongia,gaimardi,,2020-07-20T02:39:44Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Bidyanus bidyanus,Endangered,obscured,,,,,,,,...,Chordata,Actinopterygii,Centrarchiformes,Terapontidae,Bidyanus,bidyanus,,2019-11-24T03:53:09Z,species,http://www.fishbase.org
4,Botaurus poiciloptilus,Endangered,obscured,,,,,,,,...,Chordata,Aves,Pelecaniformes,Ardeidae,Botaurus,poiciloptilus,,2022-06-09T16:43:19Z,species,http://www.birdlife.org/datazone/speciesfactsh...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
60,Rutidosis leptorhynchoides,Endangered,obscured,,,,,,,,...,Tracheophyta,Magnoliopsida,Asterales,Asteraceae,Rutidosis,leptorhynchoides,,2021-05-11T02:06:04Z,species,http://eol.org/pages/5123658
61,Synemon plana,Endangered,obscured,,,,,,,,...,Arthropoda,Insecta,Lepidoptera,Castniidae,Synemon,plana,,2021-02-04T16:45:30Z,species,http://www.catalogueoflife.org/annual-checklis...
62,Thesium australe,Vulnernable,obscured,,,,,,,,...,Tracheophyta,Magnoliopsida,Santalales,Santalaceae,Thesium,australe,,2022-07-10T10:55:53Z,species,https://eol.org/pages/5609550
63,Thesium australe,Vulnerable,obscured,,,,,,,,...,Tracheophyta,Magnoliopsida,Santalales,Santalaceae,Thesium,australe,,2022-07-10T10:55:53Z,species,https://eol.org/pages/5609550


In [133]:
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,Anthochaera phrygia,Critically Endangered,obscured,,,,,,,,...,Chordata,Aves,Passeriformes,Meliphagidae,Anthochaera,phrygia,,2022-06-09T16:08:32Z,species,http://www.birds.cornell.edu/clementschecklist...
1,Bettongia gaimardi,Conservation Dependent,obscured,,,,,,,,...,Chordata,Mammalia,Diprotodontia,Potoroidae,Bettongia,gaimardi,,2020-07-20T02:39:44Z,species,http://www.catalogueoflife.org/annual-checklis...
2,Bettongia gaimardi,,obscured,,,,,,,,...,Chordata,Mammalia,Diprotodontia,Potoroidae,Bettongia,gaimardi,,2020-07-20T02:39:44Z,species,http://www.catalogueoflife.org/annual-checklis...
3,Bidyanus bidyanus,Endangered,obscured,,,,,,,,...,Chordata,Actinopterygii,Centrarchiformes,Terapontidae,Bidyanus,bidyanus,,2019-11-24T03:53:09Z,species,http://www.fishbase.org
4,Botaurus poiciloptilus,Endangered,obscured,,,,,,,,...,Chordata,Aves,Pelecaniformes,Ardeidae,Botaurus,poiciloptilus,,2022-06-09T16:43:19Z,species,http://www.birdlife.org/datazone/speciesfactsh...
5,Caladenia actensis,Critically Endangered,obscured,,,,,,,,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Caladenia,actensis,,2022-08-18T08:16:27Z,species,
6,Calyptorhynchus lathami lathami,Vulnernable,obscured,,,,,,,,...,Chordata,Aves,Psittaciformes,Cacatuidae,Calyptorhynchus,lathami,lathami,2018-12-19T06:38:49Z,subspecies,
7,Calyptorhynchus lathami lathami,Vulnerable,obscured,,,,,,,,...,Chordata,Aves,Psittaciformes,Cacatuidae,Calyptorhynchus,lathami,lathami,2018-12-19T06:38:49Z,subspecies,
8,Climacteris picumnus victoriae,Vulnernable,obscured,,,,,,,,...,Chordata,Aves,Passeriformes,Climacteridae,Climacteris,picumnus,victoriae,2021-11-24T01:01:34Z,subspecies,http://www.birds.cornell.edu/clementschecklist...
9,Climacteris picumnus victoriae,Vulnerable,obscured,,,,,,,,...,Chordata,Aves,Passeriformes,Climacteridae,Climacteris,picumnus,victoriae,2021-11-24T01:01:34Z,subspecies,http://www.birds.cornell.edu/clementschecklist...


In [134]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = noinatstatus[noinatstatus['id'].notna()].copy()
additions = additions.sort_values(['scientificName'])  # sort_values returns a new frame, so keep the result
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                  'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
0,ADD,Anthochaera phrygia,,144707,Critically Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
1,ADD,Bettongia gaimardi,,42996,Conservation Dependent,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
2,ADD,Bettongia gaimardi,,42996,Threatened,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
3,ADD,Bidyanus bidyanus,,95759,Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
4,ADD,Botaurus poiciloptilus,,5032,Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
5,ADD,Caladenia actensis,,1127611,Critically Endangered,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
6,ADD,Calyptorhynchus lathami lathami,,720267,Vulnernable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
7,ADD,Calyptorhynchus lathami lathami,,720267,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
8,ADD,Climacteris picumnus victoriae,,713108,Vulnernable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...
9,ADD,Climacteris picumnus victoriae,,713108,Vulnerable,Vulnerable,Nature Conservation Act 2014 (ACT),https://www.environment.act.gov.au/nature-cons...,obscured,12986,peggydnew,Listed as Threatened - refer to https://www.en...


In [136]:
# write these to the file
pd.concat([updates,additions]).to_csv(sourcedir + "act.csv", index=False)

In [137]:
# what didn't match to a taxon?
unknownToInat = noinatstatus[noinatstatus['id'].isna()]
unknownToInat

Unnamed: 0,scientificName,status,bionet_geoprivacy,status_id,taxon_id,user_id,description,iucn,authority,status_inat,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
10,Corunastylis ectopa,Critically Endangered,obscured,,,,,,,,...,,,,,,,,,,
13,Dasyurus maculatus maculatus,Vulnernable,obscured,,,,,,,,...,,,,,,,,,,
14,Dasyurus maculatus maculatus,Vulnerable,obscured,,,,,,,,...,,,,,,,,,,
18,Gadopsis bispinosus,Vulnernable,obscured,,,,,,,,...,,,,,,,,,,
19,Gadopsis bispinosus,Vulnerable,obscured,,,,,,,,...,,,,,,,,,,
20,Gentiana baeuerlenii,Endangered,obscured,,,,,,,,...,,,,,,,,,,
31,Litoria aurea,Vulnerable,obscured,,,,,,,,...,,,,,,,,,,
32,Litoria aurea,Vulnernable,obscured,,,,,,,,...,,,,,,,,,,
34,Litoria raniformis,Vulnernable,obscured,,,,,,,,...,,,,,,,,,,
35,Litoria raniformis,Vulnerable,obscured,,,,,,,,...,,,,,,,,,,
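
For names like these, one option is a second lookup on a synonym. The sketch below assumes the `synonym` column of the sensitive list has been carried through to the unmatched rows, which the notebook does not currently do. The taxon id `999` is a placeholder, not a real iNaturalist id.

```python
import pandas as pd

# Unmatched row with its synonym (the pairing comes from the sensitive list shown earlier)
unknown = pd.DataFrame({
    "scientificName": ["Tympanocryptis pinguicolla"],
    "synonym": ["Tympanocryptis lineata"],
})
# Stand-in for the iNaturalist taxonomy; "999" is a placeholder id
taxa = pd.DataFrame({"id": ["999"], "scientificName": ["Tympanocryptis lineata"]})

# retry the match on the synonym; the taxonomy's name column gets the _inat suffix
retry = unknown.merge(taxa, how="left", left_on="synonym",
                      right_on="scientificName", suffixes=(None, "_inat"))
resolved = retry[retry["id"].notna()]
print(resolved[["scientificName", "scientificName_inat", "id"]])
```

Rows that remain unresolved after this step would still need a manual search in iNaturalist.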


### Are there any that need to be removed?
- ACT sensitive list count: 108
- ACT iNaturalist statuses count: 18
- updates to iNaturalist status: 17
- additional iNaturalist statuses: 53
- ACT statuses we can't find a taxon match for in iNaturalist: 12
- total: 208 (explainable via the various genus/section entries that we matched to in the taxonomy)

iNaturalist statuses left over: 18 - 17 = 1 that may need checking

In [138]:
# iNaturalist statuses whose taxon_id is not among the updates
inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]


Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
1076,152230,39441,708886,12986,16649,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Varanus,rosenbergi,,2021-07-28T09:18:05Z,Varanus rosenbergi,species,http://www.iucnredlist.org/apps/redlist/detail...,,,
1086,152232,568018,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Maratus,calcitrans,,2021-08-26T01:12:23Z,Maratus calcitrans,species,,,,
2417,152223,74547,708886,12986,16649,ACT Government,vulnerable,https://lists.ala.org.au/speciesListItem/list/...,,,...,Mastacomys,fuscus,,2020-07-20T02:56:32Z,Mastacomys fuscus,species,http://www.iucnredlist.org/details/18563/0,,,
1080,152225,761022,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Keyacris,scurra,,2018-08-08T08:18:31Z,Keyacris scurra,species,http://orthoptera.speciesfile.org,,,
1085,152231,781565,708886,12986,16649,ACT Government,rare,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Cooraboorama,canberrae,,2018-09-27T04:49:50Z,Cooraboorama canberrae,species,http://www.catalogueoflife.org/annual-checklis...,,,
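
The leftover statuses could then be split into removal candidates (added by us) and flag candidates (added by others). This is a sketch on made-up rows. It assumes `user_id` is stored as the plain string `708886`, and each candidate would still need checking before anything is removed.

```python
import pandas as pd

OUR_USER_ID = "708886"  # assumes user_id is stored as this plain string (no '.0' suffix)

# Made-up rows standing in for inatstatuses
inat = pd.DataFrame({
    "status_id": ["1", "2", "3"],
    "taxon_id": ["10", "20", "30"],
    "user_id": ["708886", "708886", "999"],
})
handled_taxa = {"10"}  # taxon_ids already covered by the updates

leftover = inat[~inat["taxon_id"].isin(handled_taxa)]
ours = leftover["user_id"] == OUR_USER_ID

removal_candidates = leftover[ours].assign(action="REMOVE")  # statuses we added ourselves
flag_candidates = leftover[~ours]                            # statuses added by other users

print(removal_candidates)
print(flag_candidates)
```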
