# Collate iNaturalist taxon status data

**Steps covered in this notebook:**
1. Extract all the current statuses for Australian jurisdictions from the export (e.g. `AU` or `Australia` in the place name, `.gov.au` in the URL, or user 708886)
2. Resolve the taxon name for each using the iNaturalist Taxon DwCA at https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip (~350MB) or the iNat taxon API (if the taxon is inactive in iNaturalist)

File output: `inat-aust-status-taxa.csv`, containing Australian conservation statuses with their resolved taxon names.

**Next steps:**
State by state, establish the changes that need to be made:
* a. new - any species that appear in the state lists but do not have a status in iNaturalist (ADD)
* b. updates - any changes to statuses (update template, action='UPDATE')
* c. removals - any statuses that we added previously (user_id = 708886) and that are incorrect (update template, action='REMOVE')
* d. flags - any statuses added by other users that need to be flagged
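
The four kinds of change could be worked out with an outer merge between a state list and the iNaturalist statuses. The sketch below is only an illustration: the column names `state_status` and `action`, the `FLAG` label, the placeholder species names and the rule used for flagging are all assumptions, not part of the update template.

```python
import pandas as pd

# Hypothetical inputs: placeholder names and statuses, for illustration only.
state_list = pd.DataFrame({
    "scientificName": ["Species A", "Species B", "Species C"],
    "state_status": ["Endangered", "Vulnerable", "Endangered"],
})
inat_statuses = pd.DataFrame({
    "scientificName": ["Species B", "Species C", "Species D", "Species E"],
    "status": ["Vulnerable", "Vulnerable", "Endangered", "Endangered"],
    "user_id": ["708886", "708886", "708886", "12345"],
})

def classify_changes(state_list, inat_statuses, our_user_id="708886"):
    """Label each taxon ADD, UPDATE, REMOVE, FLAG or NONE."""
    merged = pd.merge(state_list, inat_statuses, how="outer",
                      on="scientificName", indicator=True)
    in_state_only = merged["_merge"] == "left_only"
    in_both = merged["_merge"] == "both"
    in_inat_only = merged["_merge"] == "right_only"
    differs = merged["state_status"].str.lower() != merged["status"].str.lower()
    ours = merged["user_id"] == our_user_id

    merged["action"] = "NONE"
    merged.loc[in_state_only, "action"] = "ADD"           # a. new
    merged.loc[in_both & differs, "action"] = "UPDATE"    # b. updates
    merged.loc[in_inat_only & ours, "action"] = "REMOVE"  # c. removals
    merged.loc[in_inat_only & ~ours, "action"] = "FLAG"   # d. flags (assumed rule)
    return merged.drop(columns="_merge")

changes = classify_changes(state_list, inat_statuses)
print(changes[["scientificName", "action"]])
```

Matching on `scientificName` is a simplification; name differences between a state list and iNaturalist would need reconciling first.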

## 1. Read in the iNaturalist Conservation Statuses export
Find the Australian statuses by querying:
* place names and display place names containing `AU`, `Australia` etc
* urls containing the string `.gov.au`
* records with user id 708886 (Peggy, who submitted the last round of statuses)
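
The three criteria can also be combined into a single boolean mask. The sketch below is an alternative, not the approach this notebook uses: the helper name `australian_mask` and the sample rows are made up, and the place test is deliberately stricter than a bare `AU` match, which also picks up names such as `Great Barrier Island, AU, NZ` (apparently a New Zealand place).

```python
import pandas as pd

def australian_mask(df, our_user_id="708886"):
    """Boolean mask of rows matching any of the three criteria."""
    def looks_australian(col):
        # Name mentions Australia (any case) or ends with the ", AU" country suffix.
        return (col.str.lower().str.contains("australia", regex=False)
                | col.str.endswith(", AU"))

    by_place = (looks_australian(df["place_name"])
                | looks_australian(df["place_display_name"]))
    by_url = df["url"].str.contains(".gov.au", regex=False)  # literal match, not a regex
    by_user = df["user_id"] == our_user_id
    return by_place | by_url | by_user

# Illustrative rows; empty strings stand in for missing values, as with na_filter=False.
sample = pd.DataFrame({
    "place_name": ["Queensland", "Great Barrier Island", "Ontario", "", "",
                   "South Australia, marine waters"],
    "place_display_name": ["Queensland, AU", "Great Barrier Island, AU, NZ",
                           "Ontario, CA", "", "", "South Australia, marine waters"],
    "url": ["", "", "https://explorer.natureserve.org/",
            "http://www.environment.gov.au/epbc", "", ""],
    "user_id": ["1", "2", "3", "4", "708886", "5"],
})
print(sample[australian_mask(sample)])
```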

In [12]:
import os
import pandas as pd
projectdir = os.path.dirname(os.getcwd())  # parent directory of the current working directory
dataindir = os.path.join(projectdir, "data", "in", "")  # trailing "" keeps the final path separator
dataoutdir = os.path.join(projectdir, "data", "out", "")
df = pd.read_csv(dataindir + "inaturalist-australia-9/inaturalist-australia-9-conservation_statuses.csv", encoding='UTF-8', na_filter=False, dtype=str)
df

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,iucn,created_at,updated_at,place_name,place_display_name
0,210936,78570,,,,IUCN Red List,LC,https://www.iucnredlist.org/species/109716665/...,,,10,2021-09-28 11:12:51.830044,2021-12-21 04:53:02.568719,,
1,228106,219800,119123,6883,,NatureServe,S2,https://explorer.natureserve.org/,,,40,2021-12-16 19:59:11.692913,2021-12-21 13:14:46.38026,Ontario,"Ontario, CA"
2,224080,239472,51061,50,,NatureServe,S2S3,https://explorer.natureserve.org/Taxon/ELEMENT...,,,0,2021-11-24 18:14:30.91353,2021-11-24 18:14:30.91353,Nevada,"Nevada, US"
3,228110,219828,119123,6834,,NatureServe,S2S4,https://explorer.natureserve.org/,,,30,2021-12-16 19:59:12.175319,2021-12-21 13:18:11.648863,Alberta,"Alberta, CA"
4,223126,155851,,6815,,Tatzpiteva,VU,http://tatzpiteva.org.il/,,obscured,30,2021-10-03 01:25:45.177631,2021-10-03 01:25:45.177631,Israel,Israel
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
269077,262347,35912,2059107,10581,,LPO Pays de la Loire,CR,http://www.pays-de-la-loire.developpement-dura...,,open,50,2022-03-18 13:45:53.32241,2023-11-30 10:26:38.484467,Pays de la Loire,"Pays de la Loire, FR"
269078,284899,26888,2059107,10581,,LPO Pays de la Loire,VU,https://cdnfiles1.biolovision.net/www.faune-an...,Découverte récemment sur l’île d’Yeu en Vendée...,open,30,2023-11-30 10:33:27.810836,2023-11-30 10:33:27.810836,Pays de la Loire,"Pays de la Loire, FR"
269079,284904,39976,2059107,10581,,,NE,,,open,0,2023-11-30 10:45:35.322235,2023-11-30 10:45:35.322235,Pays de la Loire,"Pays de la Loire, FR"
269080,284962,488343,2059107,6753,,UICN Comité Français,NT,https://inpn.mnhn.fr/docs/LR_FCE/Dossier_press...,,open,20,2023-12-05 11:35:07.775826,2023-12-05 11:35:07.775826,France,France


In [7]:
# list of unique Australian place display names
placedisplaydf = df['place_display_name'].drop_duplicates().sort_values()
placedisplaydf = placedisplaydf[placedisplaydf.str.contains("AU|Australia|AUSTRALIA")]
placedisplaydf

779                               Australia
84944     Australia Exclusive Economic Zone
237563     Australian Capital Territory, AU
256396                Cairns - Pt B, QL, AU
174004         Great Barrier Island, AU, NZ
82078          Hobsons Bay - Altona, VI, AU
136436         Lower Eyre Peninsula, SA, AU
490                     New South Wales, AU
353                  Northern Territory, AU
799                          Queensland, AU
175763                  Rottnest Island, AU
3977                    South Australia, AU
174885       South Australia, marine waters
248336        South East Queensland, QL, AU
69569                          Tasmania, AU
992                            Victoria, AU
777                   Western Australia, AU
256395                     Yarrabah, QL, AU
Name: place_display_name, dtype: object

In [9]:
# list of unique Australian place names
placedf = df['place_name'].drop_duplicates().sort_values()
placedf = placedf[placedf.str.contains("AU|Australia|AUSTRALIA")]
placedf

779                               Australia
84944     Australia Exclusive Economic Zone
237563         Australian Capital Territory
3977                        South Australia
174885       South Australia, marine waters
777                       Western Australia
Name: place_name, dtype: object

In [10]:
# list of unique Australian Government urls
urldf = df['url'].drop_duplicates().sort_values()
urldf = urldf[urldf.str.contains(".gov.au", regex=False)]  # literal match: "." is a wildcard in a regex
urldf

248834     https://www.environment.gov.au/epbc/about/epb...
5851      http://environment.gov.au/cgi-bin/sprat/public...
214467    http://environment.gov.au/cgi-bin/sprat/public...
2438      http://www.environment.gov.au/biodiversity/thr...
175194    http://www.environment.gov.au/biodiversity/thr...
                                ...                        
4765      https://www.environment.vic.gov.au/__data/asse...
992       https://www.environment.vic.gov.au/conserving-...
4820      https://www.legislation.qld.gov.au/view/html/i...
76459     https://www.legislation.sa.gov.au/LZ/C/A/NATIO...
268574    https://www.legislation.sa.gov.au/lz?path=%2FC...
Name: url, Length: 5262, dtype: object

In [11]:
# keep the rows matching any of the filters above, plus any records created by us (user id=708886), and remove duplicates
dfaus = pd.concat([df[df['place_display_name'].isin(placedisplaydf)],
                   df[df['place_name'].isin(placedf)],
                   df[df['url'].isin(urldf)],
                   df[df['user_id'] == '708886']]).drop_duplicates()
dfaus.sort_values(['taxon_id','user_id'])

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,iucn,created_at,updated_at,place_name,place_display_name
106517,268186,100080,1138587,6744,,,Not listed,,,open,10,2022-10-25 05:20:17.532504,2022-10-25 05:20:17.532504,Australia,Australia
101049,268138,100118,1138587,6744,,,Not listed,,,open,10,2022-10-25 03:32:38.344862,2022-10-25 03:32:38.344862,Australia,Australia
104990,268168,100127,1138587,6744,,,Not listed,,,open,10,2022-10-25 03:58:15.923985,2022-10-25 03:58:15.923985,Australia,Australia
257147,271765,1002188,708886,7308,,Qld Department of Environment and Science,Critically Endangered,https://apps.des.qld.gov.au/species-search/det...,,obscured,30,2023-02-09 20:05:05.626344,2023-02-09 20:05:05.626344,Queensland,"Queensland, AU"
256530,271143,1002207,708886,7308,,Qld Department of Environment and Science,Special least concern,https://apps.des.qld.gov.au/species-search/det...,,open,30,2023-02-09 20:04:04.4041,2023-02-09 20:04:04.4041,Queensland,"Queensland, AU"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
259659,264607,99971,3249428,7830,,"Victorian Department of Energy, Environment an...",Endangered,https://bie.ala.org.au/species/https://biodive...,,open,40,2022-06-06 17:00:58.905798,2023-02-09 20:09:32.699567,Victoria,"Victoria, AU"
259041,273522,99972,708886,6829,,Threatened Species Protection Act 1995,Vulnerable,https://bie.ala.org.au/species/https://biodive...,,open,30,2023-02-09 20:08:14.660235,2023-11-09 07:18:29.894362,Tasmania,"Tasmania, AU"
260913,153613,99973,708886,6827,16654,"WA Deparment of Biodiversity, Conservation and...",Critically Endangered,https://bie.ala.org.au/species/https://biodive...,,obscured,50,2019-07-23 00:12:11.19598,2023-02-09 20:12:22.866063,Western Australia,"Western Australia, AU"
238963,153742,99974,708886,6827,16654,"WA Deparment of Biodiversity, Conservation and...",Endangered,https://bie.ala.org.au/species/https://biodive...,,obscured,40,2019-07-23 00:12:41.014669,2023-02-09 20:12:23.102495,Western Australia,"Western Australia, AU"


## 2. Retrieve taxon info from iNaturalist
The export above contains only the taxon identifier (`taxon_id`). Retrieve the full taxon name and classification from the iNaturalist taxonomy archive.

In [14]:
%%script echo skipping # comment this line to download dataset from the web and save locally - add inaturalist-taxonomy.dwca.zip to .gitignore

# save the file to the source data directory
import requests

url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

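# stream the download (~350MB) to disk rather than holding it all in memory
with requests.get(url, stream=True) as r:
    r.raise_for_status()  # stop here if the download failed
    with open(dataindir + filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            f.write(chunk)
# reminder: add inaturalist-taxonomy.dwca.zip to .gitignore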

In [15]:
# open the file in the source data directory and read the taxa.csv file
import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

with zipfile.ZipFile(dataindir + filename) as z:
    with z.open('taxa.csv') as from_archive:
        inattaxa = pd.read_csv(from_archive)

inattaxa.head(10)

Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2023-04-18T02:04:33Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...
3,4,https://www.inaturalist.org/taxa/4,https://www.inaturalist.org/taxa/4,https://www.inaturalist.org/taxa/3,Animalia,Chordata,Aves,Gruiformes,,,,,2019-10-19T15:14:18Z,Gruiformes,order,http://www.catalogueoflife.org/annual-checklis...
4,5,https://www.inaturalist.org/taxa/5,https://www.inaturalist.org/taxa/5,https://www.inaturalist.org/taxa/4,Animalia,Chordata,Aves,Gruiformes,Aramidae,,,,2022-03-24T16:38:28Z,Aramidae,family,http://www.birdlife.org/datazone/speciessearch...
5,6,https://www.inaturalist.org/taxa/6,https://www.inaturalist.org/taxa/6,https://www.inaturalist.org/taxa/5,Animalia,Chordata,Aves,Gruiformes,Aramidae,Aramus,,,2020-02-11T06:43:19Z,Aramus,genus,http://www.birdlife.org/datazone/speciessearch...
6,7,https://www.inaturalist.org/taxa/7,https://www.inaturalist.org/taxa/7,https://www.inaturalist.org/taxa/6,Animalia,Chordata,Aves,Gruiformes,Aramidae,Aramus,guarauna,,2022-03-30T18:35:55Z,Aramus guarauna,species,http://www.birdlife.org/datazone/speciesfactsh...
7,12,https://www.inaturalist.org/taxa/12,https://www.inaturalist.org/taxa/12,https://www.inaturalist.org/taxa/71262,Animalia,Chordata,Aves,Cariamiformes,Cariamidae,,,,2022-03-24T16:37:54Z,Cariamidae,family,http://www.birdlife.org/datazone/speciessearch...
8,13,https://www.inaturalist.org/taxa/13,https://www.inaturalist.org/taxa/13,https://www.inaturalist.org/taxa/12,Animalia,Chordata,Aves,Cariamiformes,Cariamidae,Cariama,,,2018-12-19T08:58:24Z,Cariama,genus,http://www.birdlife.org/datazone/speciessearch...
9,14,https://www.inaturalist.org/taxa/14,https://www.inaturalist.org/taxa/14,https://www.inaturalist.org/taxa/13,Animalia,Chordata,Aves,Cariamiformes,Cariamidae,Cariama,cristata,,2021-07-06T02:04:43Z,Cariama cristata,species,http://www.birdlife.org/datazone/speciesfactsh...


In [16]:
len(inattaxa) # it's quite big

1277616

In [17]:
# left join to keep just the taxa that have statuses we're interested in
austtaxaids = dfaus['taxon_id'].drop_duplicates()
inattaxa['id'] = inattaxa['id'].astype(str)
inataustaxa = pd.merge(austtaxaids, inattaxa, how="left", left_on='taxon_id', right_on='id')
inataustaxa.sort_values('id')

Unnamed: 0,taxon_id,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
269,100080,100080,https://www.inaturalist.org/taxa/100080,https://www.inaturalist.org/taxa/100080,https://www.inaturalist.org/taxa/49497,Animalia,Chordata,Actinopterygii,Perciformes,Epinephelidae,Epinephelus,bleekeri,,2019-11-24T03:20:37Z,Epinephelus bleekeri,species,http://www.fishbase.org
229,100118,100118,https://www.inaturalist.org/taxa/100118,https://www.inaturalist.org/taxa/100118,https://www.inaturalist.org/taxa/49497,Animalia,Chordata,Actinopterygii,Perciformes,Epinephelidae,Epinephelus,malabaricus,,2019-11-23T03:14:55Z,Epinephelus malabaricus,species,http://www.fishbase.org
254,100127,100127,https://www.inaturalist.org/taxa/100127,https://www.inaturalist.org/taxa/100127,https://www.inaturalist.org/taxa/49497,Animalia,Chordata,Actinopterygii,Perciformes,Epinephelidae,Epinephelus,polyphekadion,,2019-02-06T23:51:50Z,Epinephelus polyphekadion,species,http://www.fishbase.org
3388,1002188,1002188,https://www.inaturalist.org/taxa/1002188,https://www.inaturalist.org/taxa/1002188,https://www.inaturalist.org/taxa/60445,Plantae,Tracheophyta,Polypodiopsida,Hymenophyllales,Hymenophyllaceae,Hymenophyllum,whitei,,2020-01-11T06:34:24Z,Hymenophyllum whitei,species,
2906,1002207,1002207,https://www.inaturalist.org/taxa/1002207,https://www.inaturalist.org/taxa/1002207,https://www.inaturalist.org/taxa/142600,Plantae,Tracheophyta,Polypodiopsida,Schizaeales,Schizaeaceae,Actinostachys,wagneri,,2020-01-11T06:34:46Z,Actinostachys wagneri,species,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7037,1076724,,,,,,,,,,,,,,,,
7038,1251810,,,,,,,,,,,,,,,,
7062,1251860,,,,,,,,,,,,,,,,
7087,1076433,,,,,,,,,,,,,,,,


## Retrieve inactive taxa names from API

In [18]:
# looking at those that didn't match in the left join - Inactive or problem taxon - need to go to the API to get the detail
unmatchedtaxonids = inataustaxa[inataustaxa['id'].isna()]['taxon_id']
unmatchedtaxonids

7       1262199
30       508987
67      1264437
75      1264442
81       370219
         ...   
7037    1076724
7038    1251810
7062    1251860
7087    1076433
8122      50744
Name: taxon_id, Length: 164, dtype: object

In [20]:
# %%script echo skipping # uncomment this line to skip the API calls and reuse the saved inat-aust-inactive-taxon.csv

# go to the API to retrieve taxon names
import requests
from time import sleep

apiurlbase = "https://api.inaturalist.org/v1/taxa/"
taxonlist = []
for i, unmatchedid in enumerate(unmatchedtaxonids, start=1):
    print(str(i) + " " + unmatchedid)  # progress counter
    response = requests.get(apiurlbase + str(unmatchedid))
    payload = response.json()
    if payload['total_results'] == 1:
        r = payload['results'][0]
        taxonlist.append({'taxon_id': unmatchedid,
                          'name': r['name'],
                          'preferred_common_name': r.get('preferred_common_name', ""),  # not every taxon has one
                          'is_active': r['is_active'],
                          'observation_count': r['observations_count'],
                          'current_synonymous_taxon_ids': r['current_synonymous_taxon_ids']})
    else:
        # covers zero results as well as more than one
        print("Warning: taxon_id " + unmatchedid + " returned " + str(payload['total_results']) + " results from iNaturalist (expected 1)")
    sleep(1)  # pause between requests

pd.DataFrame(taxonlist).to_csv(dataindir + "inat-aust-inactive-taxon.csv",index = False)

1 1262199
2 508987
3 1264437
4 1264442
5 370219
6 1202976
7 1061113
8 116845
9 538087
10 1064159
11 4075
12 319393
13 1094414
14 954413
15 586067
16 144487
17 937255
18 937256
19 937266
20 405825
21 1136642
22 1275718
23 869830
24 145357
25 403648
26 491475
27 899197
28 138100
29 32164
30 340741
31 654262
32 158801
33 37279
34 140695
35 109170
36 602508
37 162565
38 634645
39 369372
40 770039
41 770117
42 83595
43 770200
44 339972
45 770008
46 769948
47 40743
48 770189
49 45206
50 427051
51 769981
52 770032
53 769943
54 425785
55 851155
56 145672
57 770132
58 541617
59 103669
60 4072
61 369374
62 851221
63 770186
64 103668
65 369320
66 109527
67 769954
68 770111
69 25242
70 145456
71 770035
72 318740
73 208141
74 42959
75 323867
76 1091300
77 1038965
78 525439
79 1132352
80 733329
81 460983
82 560655
83 534404
84 323944
85 19245
86 1251731
87 4812
88 19010
89 4785
90 369278
91 1074504
92 1230875
93 1276243
94 1108936
95 5315
96 42886
97 42887
98 870447
99 323753
100 441245
101 506956
1

In [21]:
inactivetaxa = pd.read_csv(dataindir+"inat-aust-inactive-taxon.csv", dtype=str)
inactivetaxa

Unnamed: 0,taxon_id,name,preferred_common_name,is_active,observation_count,current_synonymous_taxon_ids
0,1262199,Caleana dixonii,Sandplain Duck Orchid,False,0,[]
1,508987,Diomedea epomophora,Royal Albatross,False,0,"[4124, 1506470]"
2,1264437,Arctophoca forsteri,New Zealand Fur Seal,False,0,[41752]
3,1264442,Arctophoca tropicalis,Subantarctic Fur Seal,False,0,[41753]
4,370219,Geodorum densiflorum,Shepherd's Crook Orchid,False,1,[1453516]
...,...,...,...,...,...,...
159,1076724,Eucalyptus lateritica,,False,0,[1472798]
160,1251810,Eucalyptus leprophloia,,False,0,[1472799]
161,1251860,Eucalyptus pruiniramis,,False,0,[1472826]
162,1076433,Eucalyptus erectifolia,,False,0,[1472783]


The taxa above are marked as inactive in iNaturalist (`is_active` is False). Most have no observations, and many list a replacement taxon in `current_synonymous_taxon_ids`. Collate them into the taxa list, and merge the name into the scientificName field for later use.
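
Because the file is read back with `dtype=str`, the `current_synonymous_taxon_ids` column holds text such as `[41752]` rather than a list. A minimal sketch of how it could be parsed when looking for a replacement taxon is shown below; the helper names are made up for illustration and the notebook itself does not do this.

```python
import ast

def parse_synonym_ids(value):
    """Turn text such as "[4124, 1506470]" into a list of id strings."""
    if not isinstance(value, str) or not value.strip():
        return []  # missing or empty cell
    return [str(taxon_id) for taxon_id in ast.literal_eval(value)]

def single_replacement(value):
    """Return the replacement id only when it is unambiguous, else None."""
    ids = parse_synonym_ids(value)
    return ids[0] if len(ids) == 1 else None

print(parse_synonym_ids("[4124, 1506470]"))
print(single_replacement("[41752]"))
print(single_replacement("[]"))
```

Taxa with no synonym, or with more than one, would still need a manual decision.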


In [22]:
# collate the status and the taxon info into a single file to use for the state work
# scientificName is 1. the API name field if the id field is null, or 2. otherwise the existing scientificName
alltaxa = pd.merge(inataustaxa, inactivetaxa, how="left", on="taxon_id")
alltaxa['scientificName'] = alltaxa['scientificName'].where(alltaxa['id'].notna(), alltaxa['name'])
alltaxa = alltaxa.drop(['name','observation_count'],axis=1)
alltaxa


Unnamed: 0,taxon_id,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,918383,918383,https://www.inaturalist.org/taxa/918383,https://www.inaturalist.org/taxa/918383,https://www.inaturalist.org/taxa/430819,Plantae,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Chiloschista,phyllorhiza,,2023-09-27T02:32:53Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
1,1247288,1247288,https://www.inaturalist.org/taxa/1247288,https://www.inaturalist.org/taxa/1247288,https://www.inaturalist.org/taxa/184926,Plantae,Tracheophyta,Magnoliopsida,Rosales,Rhamnaceae,Pomaderris,bodalla,,2021-08-27T06:18:35Z,Pomaderris bodalla,species,https://eol.org/pages/49432063,,,
2,1448721,1448721,https://www.inaturalist.org/taxa/1448721,https://www.inaturalist.org/taxa/1448721,https://www.inaturalist.org/taxa/1448501,Plantae,Tracheophyta,Magnoliopsida,Gentianales,Apocynaceae,Leichhardtia,glandulifera,,2023-02-12T02:51:48Z,Leichhardtia glandulifera,species,https://powo.science.kew.org/taxon/urn:lsid:ip...,,,
3,1120831,1120831,https://www.inaturalist.org/taxa/1120831,https://www.inaturalist.org/taxa/1120831,https://www.inaturalist.org/taxa/576108,Plantae,Tracheophyta,Magnoliopsida,Oxalidales,Cunoniaceae,Acrophyllum,australe,,2021-03-14T08:51:19Z,Acrophyllum australe,species,http://plantsoftheworldonline.org/,,,
4,577809,577809,https://www.inaturalist.org/taxa/577809,https://www.inaturalist.org/taxa/577809,https://www.inaturalist.org/taxa/543990,Plantae,Tracheophyta,Magnoliopsida,Myrtales,Myrtaceae,Lenwebbia,prominens,,2022-07-05T21:45:25Z,Lenwebbia prominens,species,http://www.catalogueoflife.org/annual-checklis...,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8127,779693,779693,https://www.inaturalist.org/taxa/779693,https://www.inaturalist.org/taxa/779693,https://www.inaturalist.org/taxa/421546,Animalia,Arthropoda,Arachnida,Araneae,Theraphosidae,Selenocosmia,crassipes,,2020-12-20T02:39:27Z,Selenocosmia crassipes,species,,,,
8128,1244660,1244660,https://www.inaturalist.org/taxa/1244660,https://www.inaturalist.org/taxa/1244660,https://www.inaturalist.org/taxa/92930,Plantae,Tracheophyta,Magnoliopsida,Sapindales,Rutaceae,Zieria,gymnocarpa,,2021-07-28T04:47:17Z,Zieria gymnocarpa,species,,,,
8129,1467834,1467834,https://www.inaturalist.org/taxa/1467834,https://www.inaturalist.org/taxa/1467834,https://www.inaturalist.org/taxa/83598,Plantae,Tracheophyta,Magnoliopsida,Dilleniales,Dilleniaceae,Hibbertia,charlesii,,2023-05-15T23:10:33Z,Hibbertia charlesii,species,,,,
8130,1504913,1504913,https://www.inaturalist.org/taxa/1504913,https://www.inaturalist.org/taxa/1504913,https://www.inaturalist.org/taxa/324014,Plantae,Tracheophyta,Liliopsida,Asparagales,Asphodelaceae,Caesia,arcuata,,2023-10-26T00:54:40Z,Caesia arcuata,species,https://powo.science.kew.org/taxon/urn:lsid:ip...,,,


In [23]:
# now merge back with the original australian statuses and save locally
taxastatus = pd.merge(dfaus,alltaxa,how="left", on="taxon_id")
#taxastatus
#taxastatus.drop(['id_y'])
taxastatus = taxastatus.rename(columns={'id_x':'id'})
taxastatus

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2023-09-27T02:32:53Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
1,180721,1247288,222137,6825,,New South Wales Office of Environment and Heri...,Vulnerable,https://www.environment.nsw.gov.au/threateneds...,,open,...,Pomaderris,bodalla,,2021-08-27T06:18:35Z,Pomaderris bodalla,species,https://eol.org/pages/49432063,,,
2,276823,1448721,708886,6827,16654,WA Department of Environment and Convservation,endangered,https://lists.ala.org.au/speciesListItem/list/...,,obscured,...,Leichhardtia,glandulifera,,2023-02-12T02:51:48Z,Leichhardtia glandulifera,species,https://powo.science.kew.org/taxon/urn:lsid:ip...,,,
3,166610,1120831,3669610,6744,,Australian Government,VU,http://www.environment.gov.au/epbc,,,...,Acrophyllum,australe,,2021-03-14T08:51:19Z,Acrophyllum australe,species,http://plantsoftheworldonline.org/,,,
4,164339,577809,58320,7308,,Qld Department of Environment and Science,Near Threatened,https://apps.des.qld.gov.au/species-search/det...,,,...,Lenwebbia,prominens,,2022-07-05T21:45:25Z,Lenwebbia prominens,species,http://www.catalogueoflife.org/annual-checklis...,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9796,169061,1244660,409010,,,Qld Department of Environment and Science,Critically Endangered,https://apps.des.qld.gov.au/species-search/det...,,,...,Zieria,gymnocarpa,,2021-07-28T04:47:17Z,Zieria gymnocarpa,species,,,,
9797,278759,1467834,624851,,,"Western Australian Herbarium, Biodiversity an...",Priority Two,https://florabase.dpaw.wa.gov.au/browse/profil...,"Poorly known, some records from Conservation a...",obscured,...,Hibbertia,charlesii,,2023-05-15T23:10:33Z,Hibbertia charlesii,species,,,,
9798,280488,1251881,84719,,,New South Wales Office of Environment and Heri...,Endangered,https://www.environment.nsw.gov.au/threateneds...,,obscured,...,Eucalyptus,pachycalyx,banyabba,2021-07-28T03:30:28Z,Eucalyptus pachycalyx banyabba,subspecies,https://eol.org/pages/50186382,,,
9799,283824,1504913,1138587,,,Conservation Codes for Western Australian Flora,Priority 1,https://florabase.dbca.wa.gov.au/browse/profil...,,obscured,...,Caesia,arcuata,,2023-10-26T00:54:40Z,Caesia arcuata,species,https://powo.science.kew.org/taxon/urn:lsid:ip...,,,


In [24]:
taxastatus.to_csv(dataindir + "inat-aust-status-taxa.csv", index=False)

In [25]:
taxastatus.groupby(['taxonRank'])['taxonRank'].count()

taxonRank
complex          2
form             1
genus            2
hybrid          20
species       8636
subspecies     790
variety        148
Name: taxonRank, dtype: int64

## Notes about conservation status inheritance in iNaturalist
Adding a conservation status for a higher level taxon affects observations of all the species in this taxon. Please do not add statuses for taxa that contain species that have no status because that will incorrectly obscure coordinates for observations of those species.
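
Statuses attached to taxa ranked above species could be listed for review with a filter on `taxonRank`. This is a sketch under a stated assumption: the set of ranks treated as "species or below" is a guess based on the ranks present in this export, and the function name is made up.

```python
import pandas as pd

# Assumption: these ranks do not pass a status down to other species.
SPECIES_OR_BELOW = {"species", "hybrid", "subspecies", "variety", "form"}

def statuses_above_species(taxastatus):
    """Return status rows whose taxon is ranked above species."""
    rank = taxastatus["taxonRank"].str.lower()
    # Rows with no rank (e.g. inactive taxa resolved via the API) are left out.
    return taxastatus[rank.notna() & ~rank.isin(SPECIES_OR_BELOW)]

# Two illustrative rows taken from the Acriopsis example.
sample = pd.DataFrame({
    "taxon_id": ["425476", "1141144"],
    "scientificName": ["Acriopsis", "Acriopsis emarginata"],
    "taxonRank": ["genus", "species"],
})
to_review = statuses_above_species(sample)
print(to_review)
```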

### Example:
Genus Acriopsis - https://www.inaturalist.org/taxa/425476-Acriopsis

The inherited species records are not in the export and can't be edited, but they do appear on the species pages.
The change required is to set _Acriopsis emarginata_ to Vulnerable. The remaining species stay Least Concern.

In [26]:
eg = pd.DataFrame(columns=('TaxonID', 'Name', 'Status','Rank','iNat species page','iNat edit taxon','Export'))
eg.loc[1]=['425476','Acriopsis','LC/obscured','genus','No','Yes','Yes']
eg.loc[2]=['1141144','Acriopsis emarginata','LC/obscured','species','Yes','No','No']
eg.loc[3]=['425475','Acriopsis liliifolia','LC/obscured','species','Yes','No','No']
eg.loc[4]=['427833','Acriopsis ridleyi','LC/obscured','species','Yes','No','No']
eg.loc[5]=['1037999','Acriopsis indica','LC/obscured','species','Yes','No','No']
eg

Unnamed: 0,TaxonID,Name,Status,Rank,iNat species page,iNat edit taxon,Export
1,425476,Acriopsis,LC/obscured,genus,No,Yes,Yes
2,1141144,Acriopsis emarginata,LC/obscured,species,Yes,No,No
3,425475,Acriopsis liliifolia,LC/obscured,species,Yes,No,No
4,427833,Acriopsis ridleyi,LC/obscured,species,Yes,No,No
5,1037999,Acriopsis indica,LC/obscured,species,Yes,No,No
