# iNaturalist status updates by state - VIC

Using the file produced in the collate-status-taxa.ipynb: `inat-aust-status-taxa.csv`, generate lists to update iNaturalist statuses

## Prep - common to all states
1. Read in the inaturalist statuses & filter out this state
2. Read in the inaturalist taxa list
3. Read in the state sensitive and conservation list
4. Attempt to match the state statuses to an IUCN equivalent
5. Determine the best placeID to use for this state

**Next steps:**
Establish the changes that need to be made. Read in the sensitive and conservation list for each state.
    a. new - any new species that appear in the state lists but do not have a status in inaturalist (new template)
    b. updates - any changes to information which was added by us previously (user_id = 708886) (update template, action='UPDATE')
    c. removals - any statuses which were added by us previously (user_id = 708886) list which are incorrect (update template, action='REMOVE')
    d. flags - are there any statuses by other users that need to be flagged?

### 1. iNaturalist statuses

In [105]:
import pandas as pd

projectdir = "/Users/oco115/PycharmProjects/authoritative-lists/" # basedir for this gh project
# projectdir = "/Users/new330/IdeaProjects/authoritative-lists/" # basedir for this gh project
sourcedir = projectdir + "source-data/inaturalist-statuses/"
listdir = projectdir + "current-lists/"

# read in the statuses
taxastatus = pd.read_csv(sourcedir + "inat-aust-status-taxa.csv", encoding='UTF-8',na_filter=False,dtype=str) ## Read inaturalist conservation statuses file
taxastatus.head(3)

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,166449,38493,1138587,7830,,Flora and Fauna Guarantee Act 1988,CR,,,obscured,...,Eulamprus,kosciuskoi,,2021-03-01T10:35:01Z,Eulamprus kosciuskoi,species,http://reptile-database.reptarium.cz/search.ph...,,,
1,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
2,234789,918383,702203,7308,,Atlas of Living Australia,LC,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2022-01-08T03:30:36Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [106]:
def filter_state_statuses(stateregex: str, urlregex: str):
    authoritydf = taxastatus['authority'].drop_duplicates().sort_values()
    authoritydf = authoritydf[pd.Series(authoritydf).str.contains(stateregex)]
    urldf = taxastatus['url'].drop_duplicates().sort_values()
    urldf = urldf[pd.Series(urldf).str.contains(urlregex)]
    placedisplaydf = taxastatus['place_display_name'].drop_duplicates().sort_values()
    placedisplaydf = placedisplaydf[pd.Series(placedisplaydf).str.contains(stateregex)]
    placedf = taxastatus['place_name'].drop_duplicates().sort_values()
    placedf = placedf[pd.Series(placedf).str.contains(stateregex)]
    # concat all and remove duplicates
    statedf = pd.concat([taxastatus.apply(lambda row: row[taxastatus['place_display_name'].isin(placedisplaydf)]),
                         taxastatus.apply(lambda row: row[taxastatus['place_name'].isin(placedf)]),
                         taxastatus.apply(lambda row: row[taxastatus['url'].isin(urldf)]),
                         taxastatus.apply(
                             lambda row: row[taxastatus['authority'].isin(authoritydf)])]).drop_duplicates()
    return statedf.sort_values(['taxon_id', 'user_id'])

inatstatuses = filter_state_statuses(" VIC |Victoria|VICTORIA|Vic","vic.gov.au")
inatstatuses.rename(columns={'id':'status_id','id_y':'taxon_id_y'},inplace=True)
inatstatuses

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
159,264604,100611,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Threatened,https://www.environment.vic.gov.au/conserving-...,,open,...,Euastacus,armatus,,2022-06-06T16:36:21Z,Euastacus armatus,species,http://www.iucnredlist.org/apps/redlist/details,,,
158,264603,100616,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Euastacus,bispinosus,,2022-06-06T16:26:39Z,Euastacus bispinosus,species,http://www.iucnredlist.org/apps/redlist/details,,,
2371,153834,100619,708886,7830,16656,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euastacus,claytoni,,2020-05-28T05:05:59Z,Euastacus claytoni,species,http://www.iucnredlist.org/apps/redlist/details,,,
2388,153867,100620,708886,7830,16656,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Euastacus,crassus,,2020-05-28T05:04:27Z,Euastacus crassus,species,http://www.iucnredlist.org/apps/redlist/details,,,
3316,265501,100657,3249428,7830,,Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Euastacus,yanga,,2022-06-14T09:17:17Z,Euastacus yanga,species,http://www.iucnredlist.org/apps/redlist/details,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2865,153813,99966,708886,7830,16656,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Engaeus,sternalis,,2022-06-10T13:58:03Z,Engaeus sternalis,species,http://www.iucnredlist.org/apps/redlist/details,,,
2386,153863,99967,708886,7830,16656,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Engaeus,strictifrons,,2020-05-28T05:03:37Z,Engaeus strictifrons,species,http://www.iucnredlist.org/apps/redlist/details,,,
163,264608,99969,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Engaeus,tuberculatus,,2022-07-02T08:00:10Z,Engaeus tuberculatus,species,http://www.iucnredlist.org/apps/redlist/details,,,
2484,153828,99970,708886,7830,16656,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Engaeus,urostrictus,,2022-07-18T14:12:03Z,Engaeus urostrictus,species,http://www.iucnredlist.org/apps/redlist/details,,,


### 2. iNaturalist taxonomy

In [107]:
# Output files contain these fields
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# so we need to match species from the state lists to the inat taxa to get the taxon_id

import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

z=zipfile.ZipFile(sourcedir + filename)

with z.open('taxa.csv') as from_archive:
    inattaxa = pd.read_csv(from_archive,dtype=str)
z.close()
inattaxa.head(3)


Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2022-12-27T07:33:16Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...


### 3. State lists

Get the ALA Conservation and Sensitive lists


In [108]:
# %%script echo skipping # comment this line to download dataset from lists.ala.org.au the web and save locally
import sys
import os
sys.path.append(os.path.abspath(projectdir + "source-code/includes"))
import list_functions as lf

sensitivelist = lf.download_ala_list("https://lists-test.ala.org.au/ws/speciesListItems/dr18669?max=10000&includeKVP=true")
sensitivelist = lf.kvp_to_columns(sensitivelist)
sensitivelist.to_csv(sourcedir + "vic-ala-sensitive.csv", index=False)

conservationlist = lf.download_ala_list("https://lists-test.ala.org.au/ws/speciesListItems/dr655?max=10000&includeKVP=true")
conservationlist = lf.kvp_to_columns(conservationlist)
conservationlist.to_csv(sourcedir + "vic-ala-conservation.csv", index=False)

In [109]:
# Read sensitive list data
sensitivelist = pd.read_csv(sourcedir + "vic-ala-sensitive.csv", dtype=str)
sensitivelist['vba_geoprivacy'] = 'obscured'
sensitivelist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,taxonID,scientificNameAuthority,primaryDiscipline,speciesGroup,ffgactstatus,vicadvisorystatus,restrictedFlag,modified,extractDate,status,sourceStatus,epbcactStatus,vba_geoprivacy
0,2803999,Engaeus australis,Freshwater Crayfish Or Yabby,Engaeus australis,https://biodiversity.org.au/afd/taxa/feada41f-...,dr18669,"[{'key': 'taxonID', 'value': '1686'}, {'key': ...",1686,"Riek, 1969",Aquatic fauna,"Mussels, decapod crustacea",Critically Endangered,Vulnerable,rest,2013-12-18,2023-01-16,Critically Endangered,Critically Endangered,,obscured
1,2804013,Engaeus fultoni,Otway Burrowing Crayfish,Engaeus fultoni,https://biodiversity.org.au/afd/taxa/7994c955-...,dr18669,"[{'key': 'taxonID', 'value': '1674'}, {'key': ...",1674,"Smith & Schuster, 1913",Aquatic fauna,"Mussels, decapod crustacea",Vulnerable,Vulnerable,rest,2013-12-18,2023-01-16,Vulnerable,Vulnerable,,obscured
2,2804088,Engaeus mallacoota,Mallacoota Burrowing Crayfish,Engaeus mallacoota,https://biodiversity.org.au/afd/taxa/bf6f5d52-...,dr18669,"[{'key': 'taxonID', 'value': '1694'}, {'key': ...",1694,"Horwitz, 1990",Aquatic fauna,"Mussels, decapod crustacea",Critically Endangered,Vulnerable,rest,2013-12-17,2023-01-16,Critically Endangered,Critically Endangered,,obscured
3,2804052,Engaeus phyllocercus,Narracan Burrowing Crayfish,Engaeus phyllocercus,https://biodiversity.org.au/afd/taxa/bb2b1f80-...,dr18669,"[{'key': 'taxonID', 'value': '1695'}, {'key': ...",1695,"Smith & Schuster, 1913",Aquatic fauna,"Mussels, decapod crustacea",Endangered,Endangered,rest,2012-11-07,2023-01-16,Endangered,Endangered,,obscured
4,2803985,Engaeus rostrogaleatus,Strzelecki Burrowing Crayfish,Engaeus rostrogaleatus,https://biodiversity.org.au/afd/taxa/cd66d8b6-...,dr18669,"[{'key': 'taxonID', 'value': '1683'}, {'key': ...",1683,"Horwitz, 1990",Aquatic fauna,"Mussels, decapod crustacea",Endangered,Endangered,rest,2012-11-07,2023-01-16,Endangered,Endangered,,obscured
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
131,2804096,Synamphisopus ambiguus,Phreatoic Isopod,Synamphisopus ambiguus,https://biodiversity.org.au/afd/taxa/bc3b9067-...,dr18669,"[{'key': 'taxonID', 'value': '75168'}, {'key':...",75168,"(Sheard, 1936)",Terrestrial fauna,Invertebrates,Vulnerable,Vulnerable,rest,2010-09-16,2023-01-16,Vulnerable,Vulnerable,,obscured
132,2804078,Synamphisopus doegi,Phreatoic Isopod,Synamphisopus doegi,https://biodiversity.org.au/afd/taxa/fdb51ee6-...,dr18669,"[{'key': 'taxonID', 'value': '75169'}, {'key':...",75169,"Wilson & Keable, 2002",Terrestrial fauna,Invertebrates,Vulnerable,Vulnerable,rest,2012-11-20,2023-01-16,Vulnerable,Vulnerable,,obscured
133,2804004,Varanus rosenbergi,Heath Monitor,Varanus rosenbergi,https://biodiversity.org.au/afd/taxa/a01a6bb4-...,dr18669,"[{'key': 'taxonID', 'value': '12287'}, {'key':...",12287,,Terrestrial fauna,Reptiles,Critically Endangered,Endangered,rest,2020-04-14,2023-01-16,Critically Endangered,Critically Endangered,,obscured
134,2804077,Vermicella annulata,Bandy Bandy,Vermicella annulata,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...,dr18669,"[{'key': 'taxonID', 'value': '12734'}, {'key':...",12734,,Terrestrial fauna,Reptiles,Endangered,Vulnerable,rest,2018-08-03,2023-01-16,Endangered,Endangered,,obscured


In [110]:
conservationlist = pd.read_csv(sourcedir + "vic-ala-conservation.csv", dtype=str)
conservationlist['vba_geoprivacy'] = conservationlist['restrictedFlag'].apply(lambda x: 'obscured' if -pd.isnull(x) else 'open')
conservationlist

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid,kvpValues,taxonID,scientificNameAuthority,primaryDiscipline,...,ffgactstatus,vicadvisorystatus,modified,extractDate,status,sourceStatus,epbcactStatus,restrictedFlag,establishmentMeans,vba_geoprivacy
0,2803160,Ambassis agassizii,Agassiz's Glassfish,Ambassis agassizii,https://biodiversity.org.au/afd/taxa/b0ff773c-...,dr655,"[{'key': 'taxonID', 'value': '4864'}, {'key': ...",4864,"Steindachner, 1867",Aquatic fauna,...,Extinct,Regionally extinct,2013-04-04,2023-01-16,Extinct,Extinct,,,,obscured
1,2803307,Bidyanus bidyanus,Silver Perch,Bidyanus bidyanus,https://biodiversity.org.au/afd/taxa/05866f31-...,dr655,"[{'key': 'taxonID', 'value': '528544'}, {'key'...",528544,"(Mitchell, 1838)",Aquatic fauna,...,Endangered,Vulnerable,2016-05-24,2023-01-16,Endangered,Endangered,Critically Endangered,,,obscured
2,2802393,Chelodina expansa,Broad-shelled Turtle,Chelodina (Macrochelodina) expansa,https://biodiversity.org.au/afd/taxa/fc7d0724-...,dr655,"[{'key': 'taxonID', 'value': '5133'}, {'key': ...",5133,"Gray, 1857",Aquatic fauna,...,Endangered,Endangered,2014-11-20,2023-01-16,Endangered,Endangered,,,,obscured
3,2803025,Craterocephalus fluviatilis,Murray Hardyhead,Craterocephalus fluviatilis,https://biodiversity.org.au/afd/taxa/50568ccf-...,dr655,"[{'key': 'taxonID', 'value': '4784'}, {'key': ...",4784,"McCulloch, 1912",Aquatic fauna,...,Critically Endangered,Critically endangered,2013-04-04,2023-01-16,Critically Endangered,Critically Endangered,Endangered,,,obscured
4,2802906,Emydura macquarii,Southern River Turtles,Emydura macquarii,https://biodiversity.org.au/afd/taxa/39c22a1e-...,dr655,"[{'key': 'taxonID', 'value': '5135'}, {'key': ...",5135,,Aquatic fauna,...,Critically Endangered,Vulnerable,2013-07-02,2023-01-16,Critically Endangered,Critically Endangered,,,,obscured
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1994,2802846,Varanus varius,Lace Monitor,Varanus varius,https://biodiversity.org.au/afd/taxa/6338346a-...,dr655,"[{'key': 'taxonID', 'value': '12283'}, {'key':...",12283,,Terrestrial fauna,...,Endangered,Endangered,2013-04-29,2023-01-16,Endangered,Endangered,,,,obscured
1995,2803060,Vermicella annulata,Bandy Bandy,Vermicella annulata,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...,dr655,"[{'key': 'taxonID', 'value': '12734'}, {'key':...",12734,,Terrestrial fauna,...,Endangered,Vulnerable,2018-08-03,2023-01-16,Endangered,Endangered,,rest,,open
1996,2803837,Victaphanta compacta,Otway Black Snail,Victaphanta compacta,https://biodiversity.org.au/afd/taxa/e9582432-...,dr655,"[{'key': 'taxonID', 'value': '15050'}, {'key':...",15050,"(Cox & Hedley, 1912)",Terrestrial fauna,...,Endangered,Endangered,2010-12-02,2023-01-16,Endangered,Endangered,,rest,,open
1997,2802531,Xenus cinereus,Terek Sandpiper,Xenus cinereus,https://biodiversity.org.au/afd/taxa/4090ad27-...,dr655,"[{'key': 'taxonID', 'value': '10160'}, {'key':...",10160,,Terrestrial fauna,...,Endangered,Endangered,2010-12-02,2023-01-16,Endangered,Endangered,,,,obscured


In [111]:
# join them in a way that works for inat (eg sensitive list, geoprivacy = 'obscured'
statelist = pd.concat([sensitivelist[['taxonID', 'name', 'status', 'vba_geoprivacy', 'lsid']],
                       conservationlist[['taxonID', 'name', 'status', 'vba_geoprivacy', 'lsid']]]).drop_duplicates()
# retrieve binomial and trinomial names from GBIF
parsednames = lf.gbifparse(statelist)
parsednames.to_csv(sourcedir + "vic-gbif.csv", index=False)
statelist = statelist.merge(parsednames[['scientificName','canonicalName','canonicalNameComplete','type','rankMarker']],how="left",left_on="name",right_on="scientificName")
numfullstatelist = len(statelist.index)
statelist = statelist.rename(columns={'taxonID':'vba_taxonID', 'name':'vba_name','status':'vba_status'})
statelist['vba_scientificName'] = statelist['canonicalName']
statelist

Unnamed: 0,vba_taxonID,vba_name,vba_status,vba_geoprivacy,lsid,scientificName,canonicalName,canonicalNameComplete,type,rankMarker,vba_scientificName
0,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...,Engaeus australis,Engaeus australis,Engaeus australis,SCIENTIFIC,sp.,Engaeus australis
1,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...,Engaeus australis,Engaeus australis,Engaeus australis,SCIENTIFIC,sp.,Engaeus australis
2,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...,Engaeus fultoni,Engaeus fultoni,Engaeus fultoni,SCIENTIFIC,sp.,Engaeus fultoni
3,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...,Engaeus fultoni,Engaeus fultoni,Engaeus fultoni,SCIENTIFIC,sp.,Engaeus fultoni
4,1694,Engaeus mallacoota,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf6f5d52-...,Engaeus mallacoota,Engaeus mallacoota,Engaeus mallacoota,SCIENTIFIC,sp.,Engaeus mallacoota
...,...,...,...,...,...,...,...,...,...,...,...
2378,12734,Vermicella annulata,Endangered,open,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...,Vermicella annulata,Vermicella annulata,Vermicella annulata,SCIENTIFIC,sp.,Vermicella annulata
2379,15050,Victaphanta compacta,Endangered,open,https://biodiversity.org.au/afd/taxa/e9582432-...,Victaphanta compacta,Victaphanta compacta,Victaphanta compacta,SCIENTIFIC,sp.,Victaphanta compacta
2380,15050,Victaphanta compacta,Endangered,open,https://biodiversity.org.au/afd/taxa/e9582432-...,Victaphanta compacta,Victaphanta compacta,Victaphanta compacta,SCIENTIFIC,sp.,Victaphanta compacta
2381,10160,Xenus cinereus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/4090ad27-...,Xenus cinereus,Xenus cinereus,Xenus cinereus,SCIENTIFIC,sp.,Xenus cinereus


In [112]:
# Identify records that won't comply with iNaturalist species names
noncomply = statelist[statelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID']) ]
noncomply

Unnamed: 0,vba_taxonID,vba_name,vba_status,vba_geoprivacy,lsid,scientificName,canonicalName,canonicalNameComplete,type,rankMarker,vba_scientificName
132,505589,Caladenia sp. aff. fragrantissima (Central Vic...,Critically Endangered,obscured,ALA_DR490_93,Caladenia sp. aff. fragrantissima (Central Vic...,Caladenia spec.,Caladenia spec.,INFORMAL,sp.,Caladenia spec.
133,505589,Caladenia sp. aff. fragrantissima (Central Vic...,Critically Endangered,obscured,ALA_DR490_93,Caladenia sp. aff. fragrantissima (Central Vic...,Caladenia spec.,Caladenia spec.,INFORMAL,sp.,Caladenia spec.
134,505431,Caladenia sp. aff. venusta (Kilsyth South),Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5139...,Caladenia sp. aff. venusta (Kilsyth South),Caladenia spec.,Caladenia spec.,INFORMAL,sp.,Caladenia spec.
135,505431,Caladenia sp. aff. venusta (Kilsyth South),Critically Endangered,obscured,https://id.biodiversity.org.au/taxon/apni/5139...,Caladenia sp. aff. venusta (Kilsyth South),Caladenia spec.,Caladenia spec.,INFORMAL,sp.,Caladenia spec.
314,903498,Galaxias sp. 14,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/c2bcc474-...,Galaxias sp. 14,Galaxias sp.14,Galaxias sp.14,INFORMAL,sp.,Galaxias sp.14
331,903041,Nannoperca sp. 1,Vulnerable,obscured,ALA_DR655_1698,Nannoperca sp. 1,Nannoperca sp.1,Nannoperca sp.1,INFORMAL,sp.,Nannoperca sp.1
522,503699,Arthropodium sp. 1 (robust glaucous),Endangered,obscured,ALA_DR655_657,Arthropodium sp. 1 (robust glaucous),Arthropodium sp.1(robust-glaucous),Arthropodium sp.1(robust-glaucous),INFORMAL,sp.,Arthropodium sp.1(robust-glaucous)
538,504122,Astrotricha asperifolia subsp. 2,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2911958,Astrotricha asperifolia subsp. 2,Astrotricha asperifolia subsp.,Astrotricha asperifolia subsp.,INFORMAL,subsp.,Astrotricha asperifolia subsp.
540,505604,Astrotricha linearis subsp. 1,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2895901,Astrotricha linearis subsp. 1,Astrotricha linearis subsp.,Astrotricha linearis subsp.,INFORMAL,subsp.,Astrotricha linearis subsp.
541,505605,Astrotricha linearis subsp. 2,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2916765,Astrotricha linearis subsp. 2,Astrotricha linearis subsp.,Astrotricha linearis subsp.,INFORMAL,subsp.,Astrotricha linearis subsp.


In [113]:
# remove records that do not comply
statelist = statelist[~statelist['type'].isin(['INFORMAL','CULTIVAR','HYBRID', 'BLACKLISTED']) ]
statelist = pd.DataFrame(statelist[['vba_taxonID','vba_scientificName','vba_status','vba_geoprivacy','lsid']]).drop_duplicates()
statelist

Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,vba_geoprivacy,lsid
0,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...
1,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...
2,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...
3,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...
4,1694,Engaeus mallacoota,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf6f5d52-...
...,...,...,...,...,...
2378,12734,Vermicella annulata,Endangered,open,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...
2379,15050,Victaphanta compacta,Endangered,open,https://biodiversity.org.au/afd/taxa/e9582432-...
2380,15050,Victaphanta compacta,Endangered,open,https://biodiversity.org.au/afd/taxa/e9582432-...
2381,10160,Xenus cinereus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/4090ad27-...


In [114]:
parsednames['type'].unique()

array(['SCIENTIFIC', 'INFORMAL', 'HYBRID'], dtype=object)

In [115]:
# check for duplicates with conflicting information
dupinformation = statelist.groupby('vba_taxonID').filter(lambda x: len(x) > 1)#.sort('size',ascending=False)
dupinformation

Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,vba_geoprivacy,lsid
0,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...
1,1686,Engaeus australis,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/feada41f-...
2,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...
3,1674,Engaeus fultoni,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/7994c955-...
4,1694,Engaeus mallacoota,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf6f5d52-...
...,...,...,...,...,...
2375,12287,Varanus rosenbergi,Critically Endangered,open,https://biodiversity.org.au/afd/taxa/a01a6bb4-...
2377,12734,Vermicella annulata,Endangered,open,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...
2378,12734,Vermicella annulata,Endangered,open,https://biodiversity.org.au/afd/taxa/4c2e7ce4-...
2379,15050,Victaphanta compacta,Endangered,open,https://biodiversity.org.au/afd/taxa/e9582432-...


### 4. Equivalent IUCN statuses

In [116]:
iucn_statuses = {'Not Evaluated', 'Data Deficient', 'Least Concern', 'Near Threatened', 'Vulnerable', 'Endangered', 'Critically Endangered', 'Extinct in the Wild','Extinct'}
statelist.groupby(['vba_status'])['vba_status'].count()

vba_status
Conservation Dependent                 3
Critically Endangered                735
Endangered                          1168
Endangered (Extinct in Victoria)       1
Extinct                               54
Threatened                             4
Vulnerable                           359
Name: vba_status, dtype: int64

In [117]:
# these will be used to populate the iucn_equivalent field
iucnStatusMappings = {
    'conservation dependent': 'Vulnerable',
    'endangered (extinct in victoria)': 'Extinct',
    'threatened':'Vulnerable',
    'least concern':'Least Concern',
    'special least concern':'Least Concern',
    'critically endangered': 'Critically Endangered',
    'endangered': 'Endangered',
    'extinct': 'Extinct',
    'vulnerable': 'Vulnerable'
}

### 5. Determine best place ID to use

In [118]:
inatstatuses.groupby(['place_id','place_name','place_display_name'])['place_id'].count()
# looks like 6827 - note for extract


place_id  place_name    place_display_name
117993    Vic Offshore  Vic Offshore             1
6744      Australia     Australia                2
7830      Victoria      Victoria, AU          1073
Name: place_id, dtype: int64

## Merge iNaturalist statuses with State lists on scientificName

1. Match - updates, even if the statuses are the same we'll update the links and values anyway
2. No match - statuses to be added (additions)
   1.1 No match and no taxnomy - search for synonyms
   1.2 No match
3. Merge the other direction to see if there are deletes?


In [119]:
# join to see which lists already have a status in inaturalist based on scientificName
mergedstatuses = statelist[['vba_taxonID','vba_scientificName','vba_status','vba_geoprivacy','lsid']].merge(inatstatuses[['status_id','scientificName','taxon_id','user_id','description','iucn','authority','status','geoprivacy','place_id','place_display_name']],how="left",left_on='vba_scientificName',right_on='scientificName',suffixes=(None,'_inat')).sort_values(['scientificName'])
mergedstatuses


Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,vba_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,iucn,authority,status,geoprivacy,place_id,place_display_name
399,502094,Abrodictyum caudatum,Endangered,obscured,https://id.biodiversity.org.au/node/apni/7402200,264614,Abrodictyum caudatum,451374,3249428,,40,Victoria Flora and Fauna Guarantee Act 1988,Endangered,open,7830,"Victoria, AU"
400,500001,Abrotanella nivigena,Critically Endangered,obscured,https://id.biodiversity.org.au/node/apni/2900512,170090,Abrotanella nivigena,323722,527710,,50,Flora and Fauna Guarantee Act 1988,Critically Endangered,,7830,"Victoria, AU"
402,504199,Abutilon malvifolium,Critically Endangered,obscured,https://id.biodiversity.org.au/node/apni/2887438,264615,Abutilon malvifolium,323737,3249428,,50,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,open,7830,"Victoria, AU"
403,500003,Abutilon otocarpum,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2892894,264617,Abutilon otocarpum,323731,3249428,,40,Victoria Flora and Fauna Guarantee Act 1988,Endangered,open,7830,"Victoria, AU"
405,500009,Acacia alpina,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2907301,264618,Acacia alpina,139887,3249428,,40,Victoria Flora and Fauna Guarantee Act 1988,Endangered,open,7830,"Victoria, AU"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2314,10138,Thinornis cucullatus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/1ebf8ec6-...,,,,,,,,,,,
2316,75139,Trapezites luteus luteus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/fcc2ac7b-...,,,,,,,,,,,
2323,12922,Tympanocryptis pinguicolla,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/5bceebc1-...,,,,,,,,,,,
2325,10253,Tyto tenebricosa,Endangered,obscured,https://biodiversity.org.au/afd/taxa/645b287c-...,,,,,,,,,,,


In [120]:
# prepare the export fields, common to New template and Update template
# new statuses
# Taxon_Name,Status,Authority,IUCN_equivalent,Description,iNaturalist_Place_ID,url,Taxon_Geoprivacy,Username,taxon_id
# updates
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username
# url is either a florabase url or bie page
biesearchurl = "https://bie.ala.org.au/species/" # eg + "https://id.biodiversity.org.au/node/apni/2894366"
mergedstatuses['new_url'] =  biesearchurl + mergedstatuses['lsid']
# biesearchurl = "https://bie.ala.org.au/species/" # eg + "https://id.biodiversity.org.au/node/apni/2894366"
mergedstatuses['new_description'] = "Listed as Threatened - refer to https://www.deeca.vic.gov.au/"
mergedstatuses['new_authority'] = "Victorian Department of Energy, Environment and Climate Action "
mergedstatuses.rename(columns={'vba_geoprivacy':'new_geoprivacy'},inplace=True)
mergedstatuses['new_place_id'] = '7830'  # Victoria, AU
mergedstatuses['new_username'] = 'peggydnew'
mergedstatuses['new_iucn_equivalent'] = mergedstatuses['status'].str.lower().str.strip().map(iucnStatusMappings).fillna('Vulnerable') # map to dictionary
mergedstatuses['new_status'] = mergedstatuses['vba_status'].fillna('Sensitive')
mergedstatuses

Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,new_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,...,geoprivacy,place_id,place_display_name,new_url,new_description,new_authority,new_place_id,new_username,new_iucn_equivalent,new_status
399,502094,Abrodictyum caudatum,Endangered,obscured,https://id.biodiversity.org.au/node/apni/7402200,264614,Abrodictyum caudatum,451374,3249428,,...,open,7830,"Victoria, AU",https://id.biodiversity.org.au/node/apni/7402200,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Endangered
400,500001,Abrotanella nivigena,Critically Endangered,obscured,https://id.biodiversity.org.au/node/apni/2900512,170090,Abrotanella nivigena,323722,527710,,...,,7830,"Victoria, AU",https://id.biodiversity.org.au/node/apni/2900512,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Critically Endangered
402,504199,Abutilon malvifolium,Critically Endangered,obscured,https://id.biodiversity.org.au/node/apni/2887438,264615,Abutilon malvifolium,323737,3249428,,...,open,7830,"Victoria, AU",https://id.biodiversity.org.au/node/apni/2887438,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Critically Endangered
403,500003,Abutilon otocarpum,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2892894,264617,Abutilon otocarpum,323731,3249428,,...,open,7830,"Victoria, AU",https://id.biodiversity.org.au/node/apni/2892894,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Endangered
405,500009,Acacia alpina,Endangered,obscured,https://id.biodiversity.org.au/node/apni/2907301,264618,Acacia alpina,139887,3249428,,...,open,7830,"Victoria, AU",https://id.biodiversity.org.au/node/apni/2907301,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Endangered
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2314,10138,Thinornis cucullatus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/1ebf8ec6-...,,,,,,...,,,,https://biodiversity.org.au/afd/taxa/1ebf8ec6-...,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Vulnerable
2316,75139,Trapezites luteus luteus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/fcc2ac7b-...,,,,,,...,,,,https://biodiversity.org.au/afd/taxa/fcc2ac7b-...,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Endangered
2323,12922,Tympanocryptis pinguicolla,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/5bceebc1-...,,,,,,...,,,,https://biodiversity.org.au/afd/taxa/5bceebc1-...,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Critically Endangered
2325,10253,Tyto tenebricosa,Endangered,obscured,https://biodiversity.org.au/afd/taxa/645b287c-...,,,,,,...,,,,https://biodiversity.org.au/afd/taxa/645b287c-...,Listed as Threatened - refer to https://www.de...,"Victorian Department of Energy, Environment an...",7830,peggydnew,Vulnerable,Endangered


## Updates

In [121]:
# those that need to be updated - we found a status
mergedstatuses[mergedstatuses['status_id'].notnull()][['vba_scientificName','vba_status','status_id','taxon_id','status','new_geoprivacy','geoprivacy','authority','user_id']]

Unnamed: 0,vba_scientificName,vba_status,status_id,taxon_id,status,new_geoprivacy,geoprivacy,authority,user_id
399,Abrodictyum caudatum,Endangered,264614,451374,Endangered,obscured,open,Victoria Flora and Fauna Guarantee Act 1988,3249428
400,Abrotanella nivigena,Critically Endangered,170090,323722,Critically Endangered,obscured,,Flora and Fauna Guarantee Act 1988,527710
402,Abutilon malvifolium,Critically Endangered,264615,323737,Critically Endangered,obscured,open,Victoria Flora and Fauna Guarantee Act 1988,3249428
403,Abutilon otocarpum,Endangered,264617,323731,Endangered,obscured,open,Victoria Flora and Fauna Guarantee Act 1988,3249428
405,Acacia alpina,Endangered,264618,139887,Endangered,obscured,open,Victoria Flora and Fauna Guarantee Act 1988,3249428
...,...,...,...,...,...,...,...,...,...
2021,Zieria cytisoides,Endangered,265458,700296,Endangered,obscured,open,Flora and Fauna Guarantee Act 1988,3249428
2022,Zieria littoralis,Critically Endangered,264760,896657,Critically Endangered,obscured,open,Victoria Flora and Fauna Guarantee Act 1988,3249428
2023,Zieria oreocena,Endangered,265459,1092447,Endangered,obscured,open,Flora and Fauna Guarantee Act 1988,3249428
2024,Zieria robusta,Endangered,265460,973465,Endangered,obscured,open,Flora and Fauna Guarantee Act 1988,3249428


In [122]:
# updates - create the data frame
# action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
updates = pd.DataFrame(mergedstatuses[mergedstatuses['status_id'].notnull()])
updates.sort_values('scientificName')
updates['action'] = 'UPDATE'
#updates.loc[:,'action'] = 'UPDATE'
updates = updates[['action','scientificName','status_id','taxon_id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
updates.columns = updates.columns.str.replace("new_", "", regex=True)
updates = updates.rename(columns={'scientificName':'taxon_name',
                                  'status_id':'id'})
updates

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description
399,UPDATE,Abrodictyum caudatum,264614,451374,Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/7402200,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
400,UPDATE,Abrotanella nivigena,170090,323722,Critically Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2900512,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
402,UPDATE,Abutilon malvifolium,264615,323737,Critically Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2887438,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
403,UPDATE,Abutilon otocarpum,264617,323731,Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2892894,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
405,UPDATE,Acacia alpina,264618,139887,Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2907301,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
...,...,...,...,...,...,...,...,...,...,...,...,...
2021,UPDATE,Zieria cytisoides,265458,700296,Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2901401,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
2022,UPDATE,Zieria littoralis,264760,896657,Critically Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2888662,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
2023,UPDATE,Zieria oreocena,265459,1092447,Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2899095,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...
2024,UPDATE,Zieria robusta,265460,973465,Endangered,Vulnerable,"Victorian Department of Energy, Environment an...",https://id.biodiversity.org.au/node/apni/2896419,obscured,7830,peggydnew,Listed as Threatened - refer to https://www.de...


## No status in iNaturalist via straight scientificName match
The WA records that didn't match up to a status in iNaturalist

In [123]:
# to add: those that have no inaturalist status - 532!!
noinatstatus = mergedstatuses[mergedstatuses['status_id'].isnull()]
# try to match the taxon name to something in inaturalist
noinatstatus = noinatstatus.merge(inattaxa, how="left", left_on="scientificName",right_on="scientificName")
noinatstatus

Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,new_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,1633,Euastacus bidawalus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/e5f3cb27-...,,,,,,...,,,,,,,,,,
1,1633,Euastacus bidawalus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/e5f3cb27-...,,,,,,...,,,,,,,,,,
2,1467,Austrogammarus haasei,Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf06e830-...,,,,,,...,,,,,,,,,,
3,1467,Austrogammarus haasei,Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf06e830-...,,,,,,...,,,,,,,,,,
4,75160,Colubotelson joyneri,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/1fb2623d-...,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1002,10138,Thinornis cucullatus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/1ebf8ec6-...,,,,,,...,,,,,,,,,,
1003,75139,Trapezites luteus luteus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/fcc2ac7b-...,,,,,,...,,,,,,,,,,
1004,12922,Tympanocryptis pinguicolla,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/5bceebc1-...,,,,,,...,,,,,,,,,,
1005,10253,Tyto tenebricosa,Endangered,obscured,https://biodiversity.org.au/afd/taxa/645b287c-...,,,,,,...,,,,,,,,,,


In [124]:
noinatstatus[noinatstatus['id'].notna()] # there's no status but there is a matching inat taxon (id is the taxon id)
# note: "Dendrobium" matches to both genus and section

Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,new_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references


In [125]:
# there's no status but there is a matching inat taxon (id is the taxon id)
additions = pd.DataFrame(noinatstatus[noinatstatus['id'].notna()])
additions['scientificName'] = additions['vba_scientificName']
#additions['new_status'] = additions['wa_status']
additions.sort_values(['scientificName'])
additions['action'] = 'ADD'
additions = additions[['action','scientificName','status_id','id','new_status','new_iucn_equivalent','new_authority','new_url','new_geoprivacy','new_place_id','new_username','new_description']]
additions.columns = additions.columns.str.replace("new_", "", regex=True)
additions = additions.rename(columns={'scientificName':'taxon_name',
                                      'id':'taxon_id',
                                  'status_id':'id'})
additions

Unnamed: 0,action,taxon_name,id,taxon_id,status,iucn_equivalent,authority,url,geoprivacy,place_id,username,description


In [126]:
all = pd.concat([updates,additions])
all.to_csv(sourcedir + "vic.csv", index=False )

In [127]:
# what didnt match to a taxon?
unknownToInat = noinatstatus[noinatstatus['id'].isna()]
unknownToInat

Unnamed: 0,vba_taxonID,vba_scientificName,vba_status,new_geoprivacy,lsid,status_id,scientificName,taxon_id,user_id,description,...,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,taxonRank,references
0,1633,Euastacus bidawalus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/e5f3cb27-...,,,,,,...,,,,,,,,,,
1,1633,Euastacus bidawalus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/e5f3cb27-...,,,,,,...,,,,,,,,,,
2,1467,Austrogammarus haasei,Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf06e830-...,,,,,,...,,,,,,,,,,
3,1467,Austrogammarus haasei,Endangered,obscured,https://biodiversity.org.au/afd/taxa/bf06e830-...,,,,,,...,,,,,,,,,,
4,75160,Colubotelson joyneri,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/1fb2623d-...,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1002,10138,Thinornis cucullatus,Vulnerable,obscured,https://biodiversity.org.au/afd/taxa/1ebf8ec6-...,,,,,,...,,,,,,,,,,
1003,75139,Trapezites luteus luteus,Endangered,obscured,https://biodiversity.org.au/afd/taxa/fcc2ac7b-...,,,,,,...,,,,,,,,,,
1004,12922,Tympanocryptis pinguicolla,Critically Endangered,obscured,https://biodiversity.org.au/afd/taxa/5bceebc1-...,,,,,,...,,,,,,,,,,
1005,10253,Tyto tenebricosa,Endangered,obscured,https://biodiversity.org.au/afd/taxa/645b287c-...,,,,,,...,,,,,,,,,,


In [128]:
noinatstatus[noinatstatus['id'].isna()].groupby('vba_status').size()

vba_status
Conservation Dependent      2
Critically Endangered     325
Endangered                391
Extinct                    51
Threatened                  2
Vulnerable                229
dtype: int64

### are there any that need to be removed?
qld list count: 2517
qld inat statuses count: 653

updates to inat status: 570
additional inat status: 1355
qld statuses we can't find a taxon match for in iNaturalist: 606
total: 2531 (explainable via the various genus/section entries that we matched to in the taxonomy)

inat statuses left over: 653-570=83 that may need checking against the above

In [129]:
# inat statuses that aren't in added or updated
notaddedupdated = inatstatuses[~inatstatuses['taxon_id'].isin(updates['taxon_id'])]
notaddedupdated

Unnamed: 0,status_id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
2837,264792,1038965,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,,,,,Boronia anemonifolia variabilis,,,,False,[1426173]
705,264648,1064159,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Alsophila,leichhardtiana,,2022-06-07T16:03:11Z,Alsophila leichhardtiana,species,http://plantsoftheworldonline.org/,,,
60,168028,1084244,702203,7830,,Rare Plants of Victoria,CR,http://www.viridans.com/RAREPL/oncecommon.htm,,,...,Pimelea,spinescens,,2021-05-11T01:21:06Z,Pimelea spinescens,species,,,,
3157,265697,1115629,702203,7830,,Atlas of Living Australia,EN,https://bie.ala.org.au/species/https://id.biod...,,obscured,...,Pterostylis,×,,2022-07-03T07:38:09Z,Pterostylis × toveyana,hybrid,https://bie.ala.org.au/species/https://id.biod...,,,
868,162244,1127952,708886,7830,16656.0,VIC Government,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Suta,spectabilis,,2020-09-11T20:34:41Z,Suta spectabilis,species,,,,
2686,264754,1170290,3249428,7830,,Victoria Flora and Fauna Guarantee Act 1988,Critically Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Thelymitra,x,,2022-12-13T05:34:46Z,Thelymitra x merraniae,hybrid,https://vicflora.rbg.vic.gov.au/flora/taxon/a0...,,,
3306,266736,1348449,702203,7830,,FFG Threatened List,CR,https://www.environment.vic.gov.au/__data/asse...,,,...,Cranfillia,deltoides,,2022-01-07T04:56:03Z,Cranfillia deltoides,species,https://www.nzpcn.org.nz/flora/species/cranfil...,,,
2698,153818,19251,708886,7830,16656.0,Victoria Flora and Fauna Guarantee Act 1988,Vulnerable,https://www.environment.vic.gov.au/conserving-...,,obscured,...,Polytelis,anthopeplus,,2022-06-11T01:21:17Z,Polytelis anthopeplus,species,http://www.birdlife.org/datazone/speciesfactsh...,,,
3374,265543,33842,3249428,7830,,Flora and Fauna Guarantee Act 1988,Endangered,https://www.environment.vic.gov.au/conserving-...,,open,...,Rhynchoedura,ornata,,2022-06-14T11:02:11Z,Rhynchoedura ornata,species,http://reptile-database.reptarium.cz/search.ph...,,,
2660,164661,353855,702203,7830,,Victoria,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Calamagrostis,quadriseta,,2020-12-08T19:17:33Z,Calamagrostis quadriseta,species,http://www.catalogueoflife.org/annual-checklis...,,,


In [130]:
# Stats
numsensitive = len(sensitivelist.index)
numconservation = len(conservationlist.index)
numupdates  = len(updates.index)
numadditions  = len(additions.index)
numnoinatstatus = len(noinatstatus.index)
numunknownToInat = len(unknownToInat.index)
numnotaddedupdated = len(notaddedupdated.index)
numnoncomply = len(noncomply.index)
numcomply = len(statelist.index)
numdupinfo = len(dupinformation.index)
d = {'Sensitive': [numsensitive],
    'Conservation': [numconservation],
    'Statelist merge': [numfullstatelist],
    'Species iNat Comply' : [numcomply],
    'Species iNat non-Comply': [numnoncomply],
    'Duplicate Information': [numdupinfo],
    'Updates': [numupdates],
    'Additions': [numadditions],
    'Not added updated': [numnotaddedupdated],
    'No Inat Status': [numnoinatstatus],
    'Unknown to Inat': [numunknownToInat]}

statsdf = pd.DataFrame(data=d)
statsdf

Unnamed: 0,Sensitive,Conservation,Statelist merge,Species iNat Comply,Species iNat non-Comply,Duplicate Information,Updates,Additions,Not added updated,No Inat Status,Unknown to Inat
0,136,1999,2383,2336,47,488,1330,0,26,1007,1007
