# iNaturalist status updates by state

Using `inat-aust-status-taxa.csv`, the file produced by `collate-status-taxa.ipynb`, generate lists of updates to iNaturalist conservation statuses.

**Next steps:**
For each state, establish the changes that need to be made:
    a. new - species that appear in the state lists but do not have a status in iNaturalist (new template)
    b. updates - changes to information which we added previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - statuses which we added previously (user_id = 708886) and which are now incorrect (update template, action='REMOVE')
    d. flags - are there any statuses added by other users that need to be flagged?
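
The four categories above can be sketched as a single classification pass over a merged table. This is a minimal illustration on toy data, not the method this notebook settles on: the column names `state_status` and `inat_status`, the helper `classify_change`, and the sample rows are all assumptions made for the example. Only the user id 708886 comes from the steps above.

```python
import pandas as pd

OUR_USER_ID = "708886"  # the account that added our statuses (from the steps above)

# Toy state list and toy iNaturalist statuses (hypothetical rows for illustration)
state = pd.DataFrame({
    "scientificName": ["Aus bus", "Cus dus", "Eus fus"],
    "state_status": ["Vulnerable", "Endangered", "Near Threatened"],
})
inat = pd.DataFrame({
    "scientificName": ["Cus dus", "Eus fus", "Gus hus", "Ius jus"],
    "user_id": ["708886", "12345", "708886", "12345"],
    "inat_status": ["Vulnerable", "Near Threatened", "Endangered", "Endangered"],
})

# An outer merge keeps names that appear in only one of the two lists
merged = state.merge(inat, how="outer", on="scientificName", indicator=True)

def classify_change(row):
    """Hypothetical helper: label each row with the action it might need."""
    ours = row["user_id"] == OUR_USER_ID
    if row["_merge"] == "left_only":
        return "NEW"      # in the state list, no status in iNaturalist
    if row["_merge"] == "right_only":
        return "REMOVE" if ours else "FLAG"  # not on the state list
    if row["state_status"].lower() != row["inat_status"].lower():
        return "UPDATE" if ours else "FLAG"  # statuses disagree
    return "OK"

merged["action"] = merged.apply(classify_change, axis=1)
print(merged[["scientificName", "user_id", "action"]])
```

Rows labelled `FLAG` are only candidates: whether another user's status really needs flagging still requires a manual check.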

In [1]:
import pandas as pd

# projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"


# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
3,166416,1033183,3669610,6825,,NSW Office of Environment & Heritage,EN,https://www.environment.nsw.gov.au/threateneds...,,obscured,...,Eidothea,hardeniana,,2021-02-22T07:21:17Z,Eidothea hardeniana,species,,,,
4,180721,1247288,222137,6825,,NSW Threatened Species Scientific Committee,vu,https://www.environment.nsw.gov.au/topics/anim...,,obscured,...,Pomaderris,bodalla,,2021-08-27T06:18:35Z,Pomaderris bodalla,species,https://eol.org/pages/49432063,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3448,165697,1182117,3669610,73684,,Australian Government,CR,http://www.environment.gov.au/biodiversity/thr...,,,...,Achyranthes,margaretarum,,2021-02-12T19:39:39Z,Achyranthes margaretarum,species,http://plantsoftheworldonline.org/taxon/urn:ls...,,,
3449,130966,508985,,,,,critically endangered,http://environment.gov.au/cgi-bin/sprat/public...,,obscured,...,Lichenostomus,melanops,cassidix,2022-06-11T03:51:49Z,Lichenostomus melanops cassidix,subspecies,http://www.birds.cornell.edu/clementschecklist...,,,
3450,161226,50744,516268,,,DAWE Species Profile and Threats Database,CR,https://www.environment.gov.au/cgi-bin/sprat/p...,,obscured,...,,,,,Stiphodon allen,,,Opal Cling-Goby,False,[]
3451,162783,924263,764897,,,"Department of Biodiversity, Conservation and A...",EN,https://www.dpaw.wa.gov.au/images/documents/pl...,,obscured,...,Reedia,spathacea,,2020-09-27T12:27:42Z,Reedia spathacea,species,,,,


## Queensland

In [2]:
def filter_state_statuses(stateregex: str, urlregex: str):
    """Return the statuses whose place, authority or url matches the state."""
    # One boolean mask per column; a row is kept if any of them matches.
    # The file was read with na_filter=False, so every cell is a string.
    mask = (taxastatus['place_display_name'].str.contains(stateregex)
            | taxastatus['place_name'].str.contains(stateregex)
            | taxastatus['url'].str.contains(urlregex)
            | taxastatus['authority'].str.contains(stateregex))
    statedf = taxastatus[mask].drop_duplicates()
    return statedf.sort_values(['taxon_id','user_id'])

In [3]:
qldstatuses = filter_state_statuses("Qld|QLD|Queensland|QUEENSLAND|QL",".qld.")
qldstatuses

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
532,223098,1003032,3669610,7308,,Queensland Nature Conservation Act 1992,CE,https://apps.des.qld.gov.au/species-search/det...,,obscured,...,Antrophyum,austroqueenslandicum,,2021-10-01T10:37:36Z,Antrophyum austroqueenslandicum,species,,,,
131,169944,1004434,58320,7308,,Queensland Nature Conservation Act 1992,Least concern,https://apps.des.qld.gov.au/species-search/det...,Listed as 'confidential' by the Queensland Dep...,obscured,...,Oberonia,crateriformis,,2021-07-18T09:05:24Z,Oberonia crateriformis,species,http://www.catalogueoflife.org/annual-checklis...,,,
1473,152757,100615,708886,7308,16653,QLD DEHP,endangered,https://data.qld.gov.au/dataset/conservation-s...,,obscured,...,Euastacus,bindal,,2020-05-28T05:44:48Z,Euastacus bindal,species,http://www.iucnredlist.org/apps/redlist/details,,,
1407,152668,100635,708886,7308,16653,QLD DEHP,endangered,https://data.qld.gov.au/dataset/conservation-s...,,obscured,...,Euastacus,jagara,,2020-05-28T05:43:40Z,Euastacus jagara,species,http://www.iucnredlist.org/apps/redlist/details,,,
1267,152461,100638,708886,7308,16653,QLD DEHP,endangered,https://data.qld.gov.au/dataset/conservation-s...,,obscured,...,Euastacus,maidae,,2020-05-28T05:06:01Z,Euastacus maidae,species,http://www.iucnredlist.org/apps/redlist/details,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
835,262047,977629,3669610,7308,,Queensland Nature Conservation Act 1992,NT,https://www.data.qld.gov.au/dataset/conservati...,,open,...,Melaleuca,formosa,,2022-09-05T10:47:52Z,Melaleuca formosa,species,,,,
2615,161807,993333,702203,7308,,Nature Conservation Act 1992,VU,https://apps.des.qld.gov.au/species-search/det...,,,...,Cupaniopsis,tomentella,,2020-08-30T17:12:00Z,Cupaniopsis tomentella,species,https://eol.org/pages/5629346,,,
2616,161808,993333,702203,6744,,Environment Protection and Biodiversity Conser...,VU,https://apps.des.qld.gov.au/species-search/det...,,,...,Cupaniopsis,tomentella,,2020-08-30T17:12:00Z,Cupaniopsis tomentella,species,https://eol.org/pages/5629346,,,
23,167723,993605,3669610,7308,,QLD DEHP,NT,https://www.data.qld.gov.au/dataset/conservati...,,obscured,...,Acianthus,amplexicaulis,,2021-10-05T08:48:02Z,Acianthus amplexicaulis,species,http://www.catalogueoflife.org/annual-checklis...,,,
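
One thing worth checking in the call above: in a regular expression an unescaped `.` matches any character, so `.qld.` also matches URLs where `qld` is not a separate domain component. The sketch below uses made-up URLs to compare the two patterns. Whether the stricter pattern is wanted here is a judgment call, since it may change which rows are returned.

```python
import pandas as pd

# Made-up URLs for illustration only
urls = pd.Series([
    "https://apps.des.qld.gov.au/species-search/",
    "https://example.org/aqldb/record/1",
])

loose = urls.str.contains(".qld.")       # '.' matches any character
strict = urls.str.contains(r"\.qld\.")   # literal dots only

print(loose.tolist())   # [True, True]
print(strict.tolist())  # [True, False]
```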


In [4]:
qldsensitive = pd.read_csv(listdir + "sensitive-lists/QLD-sensitive.csv")  # Qld sensitive list
qldsensitive['scientificName'] = qldsensitive['scientificName'].str.replace('subsp. ', '', regex=False)
qldsensitive

Unnamed: 0,taxonID,kingdom,class,family,scientificName,vernacularName,scientificNameAuthorship,sourceStatus,Significant,status,Endemicity,EPBC Status
0,969,Animalia,Mammalia,Rhinolophidae,Rhinolophus philippinensis,greater large-eared horseshoe bat,"Waterhouse, 1843",E,Y,Endangered,Regional Endemic,Vulnerable
1,1376,Animalia,Aves,Estrildidae,Chloebia gouldiae,Gouldian finch,"(Gould, 1844)",E,Y,Endangered,Intranational,Endangered
2,1378,Animalia,Aves,Estrildidae,Erythrura trichroa,blue-faced parrot-finch,"(Kittlitz, 1835)",NT,Y,Near Threatened,Not Endemic to Australia,
3,1370,Animalia,Aves,Estrildidae,Neochmia phaeton evangelinae,crimson finch (white-bellied subspecies),"(Hombron & Jacquinot, 1841)",E,Y,Endangered,Regional Endemic,Endangered
4,1365,Animalia,Aves,Estrildidae,Poephila cincta cincta,black-throated finch (white-rumped subspecies),"Gould, 1837",E,Y,Endangered,Intranational,Endangered
...,...,...,...,...,...,...,...,...,...,...,...,...
950,11699,Plantae,Equisetopsida,Thelypteridaceae,Pneumatopteris costata,,(Brack.) Holttum,NT,Y,Near Threatened,Regional Endemic,
951,11700,Plantae,Equisetopsida,Thelypteridaceae,Pneumatopteris pennigera,lime fern,(G.Forst.) Holttum,E,Y,Endangered,Not Endemic to Australia,
952,16042,Plantae,Equisetopsida,Thelypteridaceae,Thelypteris confluens,,(Thunb.) C.V.Morton,V,Y,Vulnerable,Not Endemic to Australia,
953,8185,Plantae,Equisetopsida,Proteaceae,Macadamia jansenii,,C.L.Gross & P.H.Weston,CR,Y,Critically Endangered,Queensland Endemic,Endangered


In [5]:
# left join: keep every taxon on the Qld sensitive list, with its iNaturalist status where one exists
bothlists = (
    qldsensitive[['taxonID','scientificName','status']]
    .merge(qldstatuses[['id','scientificName','user_id','authority','status','geoprivacy']],
           how="left", on='scientificName')
    .sort_values(['scientificName'])
)
bothlists

Unnamed: 0,taxonID,scientificName,status_x,id,user_id,authority,status_y,geoprivacy
127,9655,Acianthus,Least concern,,,,,
128,7974,Acianthus borealis,Special least concern,,,,,
129,13477,Acianthus caudatus,Special least concern,152636,708886,QLD DEHP,endangered,obscured
130,14086,Acianthus exsertus,Special least concern,152637,708886,QLD DEHP,LC,
131,14087,Acianthus fornicatus,Special least concern,,,,,
...,...,...,...,...,...,...,...,...
126,8539,Wodyetia bifurcata,Vulnerable,152450,708886,QLD DEHP,endangered,obscured
55,22659,Wollumbinia belli,Vulnerable,,,,,
778,21765,Zeuxine,,,,,,
779,40091,Zeuxine attenuata,Least concern,,,,,


## Notes about conservation status inheritance in iNaturalist:
Adding a conservation status for a higher level taxon affects observations of all the species in this taxon. Please do not add statuses for taxa that contain species that have no status because that will incorrectly obscure coordinates for observations of those species.

### Example:
Genus Acriopsis - https://www.inaturalist.org/taxa/425476-Acriopsis

The inherited species records are not in the export and can't be edited, but they do appear on the species pages.
The change needed is to set _Acriopsis emarginata_ to Vulnerable. The other species will remain least concern.


In [6]:
eg = pd.DataFrame(columns=('TaxonID', 'Name', 'Status','Rank','iNat species page','iNat edit taxon','Export'))
eg.loc[1]=['425476','Acriopsis','LC/obscured','genus','No','Yes','Yes']
eg.loc[2]=['1141144','Acriopsis emarginata','LC/obscured','species','Yes','No','No']
eg.loc[3]=['425475','Acriopsis liliifolia','LC/obscured','species','Yes','No','No']
eg.loc[4]=['427833','Acriopsis ridleyi','LC/obscured','species','Yes','No','No']
eg.loc[5]=['1037999','Acriopsis indica','LC/obscured','species','Yes','No','No']
eg

Unnamed: 0,TaxonID,Name,Status,Rank,iNat species page,iNat edit taxon,Export
1,425476,Acriopsis,LC/obscured,genus,No,Yes,Yes
2,1141144,Acriopsis emarginata,LC/obscured,species,Yes,No,No
3,425475,Acriopsis liliifolia,LC/obscured,species,Yes,No,No
4,427833,Acriopsis ridleyi,LC/obscured,species,Yes,No,No
5,1037999,Acriopsis indica,LC/obscured,species,Yes,No,No



## New Entries

In [7]:
# to add: rows with no status `id` have no existing iNaturalist status, so they are new taxa - 532!!
additions = bothlists[bothlists['id'].isnull()]
additions


Unnamed: 0,taxonID,scientificName,status_x,id,user_id,authority,status_y,geoprivacy
127,9655,Acianthus,Least concern,,,,,
128,7974,Acianthus borealis,Special least concern,,,,,
131,14087,Acianthus fornicatus,Special least concern,,,,,
132,8545,Acianthus ledwardii,Special least concern,,,,,
136,31909,Acriopsis emarginata,Vulnerable,,,,,
...,...,...,...,...,...,...,...,...
775,31261,Vanilla planifolia,,,,,,
777,30794,Vrydagzynea grayi,Endangered,,,,,
55,22659,Wollumbinia belli,Vulnerable,,,,,
778,21765,Zeuxine,,,,,,


In [8]:
# this needs to go into a file containing these:
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id

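import zipfile

# the taxonomy archive is expected to have been downloaded from this url into sourcedir already
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

# read taxa.csv straight out of the archive; the context managers close both handles
with zipfile.ZipFile(sourcedir + filename) as z:
    with z.open('taxa.csv') as from_archive:
        inattaxa = pd.read_csv(from_archive,dtype=str)
inattaxa.head(3)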


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


In [9]:
# try to match the taxon name to something in inaturalist
additions = additions.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
additions

Unnamed: 0,taxonID_x,scientificName,status_x,id_x,user_id,authority,status_y,geoprivacy,id_y,taxonID_y,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,9655,Acianthus,Least concern,,,,,,202580,https://www.inaturalist.org/taxa/202580,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Acianthus,,,2021-04-17T03:13:28Z,genus,http://www.catalogueoflife.org/annual-checklis...
1,7974,Acianthus borealis,Special least concern,,,,,,,,...,,,,,,,,,,
2,14087,Acianthus fornicatus,Special least concern,,,,,,202579,https://www.inaturalist.org/taxa/202579,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Acianthus,fornicatus,,2021-05-20T01:27:37Z,species,http://www.catalogueoflife.org/annual-checklis...
3,8545,Acianthus ledwardii,Special least concern,,,,,,1239753,https://www.inaturalist.org/taxa/1239753,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Acianthus,ledwardii,,2021-05-18T04:43:56Z,species,http://www.catalogueoflife.org/annual-checklis...
4,31909,Acriopsis emarginata,Vulnerable,,,,,,1141144,https://www.inaturalist.org/taxa/1141144,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Acriopsis,emarginata,,2020-11-16T22:24:21Z,species,http://www.catalogueoflife.org/annual-checklis...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
532,31261,Vanilla planifolia,,,,,,,61393,https://www.inaturalist.org/taxa/61393,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Vanilla,planifolia,,2019-11-23T05:28:46Z,species,http://www.catalogueoflife.org/annual-checklis...
533,30794,Vrydagzynea grayi,Endangered,,,,,,,,...,,,,,,,,,,
534,22659,Wollumbinia belli,Vulnerable,,,,,,,,...,,,,,,,,,,
535,21765,Zeuxine,,,,,,,124430,https://www.inaturalist.org/taxa/124430,...,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Zeuxine,,,2022-10-17T14:37:53Z,genus,http://www.ubio.org/browser/details.php?nameba...
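
Because a status on a genus is inherited by every species in it (see the notes on inheritance earlier), it may be worth separating genus-level rows such as Acianthus and Zeuxine from the species-level ones before exporting. A minimal sketch, using a hypothetical toy frame with the same `taxonRank` column as the merged `additions` table; the set of ranks treated as species-level is an assumption.

```python
import pandas as pd

# Toy stand-in for the merged additions table (a few values based on the output above)
toy_additions = pd.DataFrame({
    "scientificName": ["Acianthus", "Acianthus fornicatus", "Zeuxine"],
    "status_x": ["Least concern", "Special least concern", None],
    "taxonRank": ["genus", "species", "genus"],
})

# Assumed list of ranks that can safely carry their own status
species_ranks = {"species", "subspecies", "variety", "form", "hybrid"}

# Anything above species level is set aside for manual review rather than exported
is_species_level = toy_additions["taxonRank"].isin(species_ranks)
needs_review = toy_additions[~is_species_level]
safe_to_add = toy_additions[is_species_level]

print(needs_review["scientificName"].tolist())
print(safe_to_add["scientificName"].tolist())
```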


In [10]:
# what didn't match?
unknownToInat = additions[additions['taxonID_y'].isna()]
unknownToInat

Unnamed: 0,taxonID_x,scientificName,status_x,id_x,user_id,authority,status_y,geoprivacy,id_y,taxonID_y,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
1,7974,Acianthus borealis,Special least concern,,,,,,,,...,,,,,,,,,,
5,40762,Alsophila baileyana,Least concern,,,,,,,,...,,,,,,,,,,
7,41582,Alsophila exilis,Endangered,,,,,,,,...,,,,,,,,,,
21,30917,Aponogeton cuneatus,Special least concern,,,,,,,,...,,,,,,,,,,
22,21902,Aponogeton elongatus elongatus,Near Threatened,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
529,11357,Trichoglottis australiensis,Vulnerable,,,,,,,,...,,,,,,,,,,
531,40770,Vanda whiteana,Least concern,,,,,,,,...,,,,,,,,,,
533,30794,Vrydagzynea grayi,Endangered,,,,,,,,...,,,,,,,,,,
534,22659,Wollumbinia belli,Vulnerable,,,,,,,,...,,,,,,,,,,
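
Names that fail an exact match may simply be spelled or classified differently in iNaturalist. One possible way to shortlist candidates for manual checking is fuzzy matching with the standard library's `difflib`. This is an assumption about a useful next step, not something this notebook does; the candidate list below, including the spelling variant, is invented for the example.

```python
import difflib

# Two names from the unmatched list above
unmatched = ["Vrydagzynea grayi", "Wollumbinia belli"]

# Made-up candidate list; "Vrydagzynea grayii" is an invented spelling variant
inat_names = ["Vrydagzynea grayii", "Zeuxine oblonga", "Acianthus fornicatus"]

def suggest_matches(name, candidates, cutoff=0.85):
    """Hypothetical helper: return close spellings of `name` for a person to review."""
    return difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)

for name in unmatched:
    print(name, "->", suggest_matches(name, inat_names))
```

Suggestions like these should be reviewed by hand rather than accepted automatically, since a near-identical spelling can still be a different taxon.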


In [26]:
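# note: `'Extinct in the Wild' and 'Extinct'` evaluates to just 'Extinct', so the items must be comma-separated
iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild', 'Extinct'}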
qldsensitive.groupby(['status'])['status'].count()

status
Critically Endangered     35
Endangered                78
Extinct in the wild        7
Least concern            143
Near Threatened           63
Special least concern    361
Vulnerable               139
Name: status, dtype: int64

In [59]:
# lower-cased Queensland status -> IUCN equivalent
iucnStatusMappings = {
    'critically endangered': 'Critically Endangered',
    'vulnerable':'Vulnerable',
    'not evaluated':'Not Evaluated',
    'data deficient':'Data Deficient',
    'least concern':'Least Concern',
    'special least concern':'Least Concern',
    'near threatened':'Near Threatened',
    'endangered':'Endangered',
    'extinct in the wild':'Extinct in the Wild',
    'extinct':'Extinct'
}

def map_to_iucn_status(status):
    # missing or unmapped statuses fall back to 'Confidential'
    if pd.isna(status):
        return 'Confidential'
    return iucnStatusMappings.get(status.lower().strip(), 'Confidential')

sample = ' speCial least concERn'
print(sample + " maps to : " + map_to_iucn_status(sample))


 speCial least concERn maps to : Least Concern
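
The function above works on one string at a time. To apply the same mapping to a whole column, one option is to normalise the column and use `Series.map`. The sketch below assumes a small toy column and repeats a subset of the mapping so that it runs on its own.

```python
import pandas as pd

# Subset of the mapping above, repeated so the example is self-contained
mapping = {
    'least concern': 'Least Concern',
    'special least concern': 'Least Concern',
    'vulnerable': 'Vulnerable',
    'extinct in the wild': 'Extinct in the Wild',
}

# Toy column, including a missing value
statuses = pd.Series([' speCial least concERn', 'Vulnerable', None, 'Extinct in the wild'])

# Lower-case and trim, look each value up, and fall back to 'Confidential'
# for missing or unmapped values, mirroring the function's default
iucn = statuses.str.lower().str.strip().map(mapping).fillna('Confidential')
print(iucn.tolist())
```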


In [23]:
#qldplaces = qldstatuses[['place_id','place_name','place_display_name']].drop_duplicates().sort_values(by='place_display_name')
#qldplaces
qldstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()

place_id  place_name             place_display_name           
                                                                    5
144315    Brisbane City          Brisbane City                      2
153119    South East Queensland  South East Queensland, QL, AU      1
18870     Cairns - Pt B          Cairns - Pt B, QL, AU              1
19232     Yarrabah               Yarrabah, QL, AU                   1
6744      Australia              Australia                         12
7308      Queensland             Queensland, AU                   631
Name: place_id, dtype: int64

In [63]:
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
additionsexport = additions[additions['taxonID_y'].notna()]
additionsexport = additionsexport.rename(columns={'scientificName':'Taxon_Name',
                                                  'status_x':'Status',
                                                  'id_y':'taxon_id',
                                                  'authority':'Authority',
                                                  'geoprivacy':'Taxon_Geoprivacy',
                                                  'taxonID_x':'wildnetID'})
additionsexport = additionsexport[['Taxon_Name','Status','Authority','Taxon_Geoprivacy','taxon_id','wildnetID']]
additionsexport['Authority'] = "Qld Department of Environment and Science"

additionsexport['Description'] = "Listed as Confidential - refer to https://www.data.qld.gov.au/dataset/queensland-confidential-species"
additionsexport['url'] = "https://apps.des.qld.gov.au/species-search/details/?id=" + additionsexport['wildnetID'].astype(str)
additionsexport['Taxon_Geoprivacy'] = "obscured"
additionsexport['iNaturalist_Place_ID'] = '7308'  # Queensland, AU
additionsexport['Username'] = 'peggydnew'
# map_to_iucn_status expects a single string; passing the whole Series raised
# "ValueError: The truth value of a Series is ambiguous", so apply it element-wise
additionsexport['IUCN_equivalent'] = additionsexport['Status'].apply(map_to_iucn_status)
additionsexport

In [64]:
additionsexport.to_csv(sourcedir + "qld-add.csv",index=False)
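
The template header quoted in the comments earlier lists ten columns in a fixed order. A sketch of how an export could be checked and reordered against that header before writing; the toy row, the `example_user` username and the placeholder description are assumptions for illustration.

```python
import pandas as pd

# Column order taken from the template header comment in this notebook
template_cols = ["Taxon_Name", "Status", "Authority", "IUCN_equivalent", "Description",
                 "iNaturalist_Place_ID", "url", "Taxon_Geoprivacy", "Username", "taxon_id"]

# Toy export frame with one placeholder row and an extra working column (wildnetID)
toy_export = pd.DataFrame([{
    "Taxon_Name": "Acianthus fornicatus", "Status": "Special least concern",
    "taxon_id": "202579", "wildnetID": "14087",
    "Authority": "Qld Department of Environment and Science",
    "Taxon_Geoprivacy": "obscured", "iNaturalist_Place_ID": "7308",
    "Username": "example_user",
    "url": "https://apps.des.qld.gov.au/species-search/details/?id=14087",
    "Description": "placeholder", "IUCN_equivalent": "Least Concern",
}])

# Fail early if any template column is absent
missing = [c for c in template_cols if c not in toy_export.columns]
if missing:
    raise ValueError(f"export is missing template columns: {missing}")

# Keep only the template columns, in template order
toy_export = toy_export[template_cols]
print(toy_export.columns.tolist())
```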