# Collate iNaturalist taxon status data

**Steps covered in this notebook:**
1. Extract all current statuses for Australian jurisdictions from the export (e.g. `AU` or `Australia` in the place name, `.gov.au` in the URL, or user 708886)
2. Resolve the taxon name for each using the iNaturalist Taxon DwCA at https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip (~350MB) or the iNat taxon API (if the taxon is inactive in iNaturalist)

File output: `inat-aust-status-taxa.csv`, containing Australian conservation statuses with resolved taxon names.

**Next steps:**
State by state, establish the changes that need to be made:
    a. new - any species that appear in the state lists but do not have a status in iNaturalist (ADD)
    b. updates - any changes to existing statuses (update template, action='UPDATE')
    c. removals - any statuses previously added by us (user_id = 708886) that are now incorrect (update template, action='REMOVE')
    d. flags - any statuses added by other users that need to be flagged
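
The state-by-state comparison outlined above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `state_list` frame, the made-up taxon names and the `classify_changes` helper are hypothetical, and the real state lists will have their own column names and matching rules.

```python
import pandas as pd

OUR_USER_ID = "708886"  # user id quoted in this notebook


def classify_changes(inat: pd.DataFrame, state_list: pd.DataFrame) -> pd.DataFrame:
    """Label each taxon ADD, UPDATE, REMOVE, FLAG or NONE (hypothetical helper)."""
    merged = pd.merge(state_list, inat, how="outer", on="scientificName",
                      suffixes=("_state", "_inat"))

    def action(row):
        if pd.isna(row["status_inat"]):
            return "ADD"      # on the state list, no status in iNaturalist
        if pd.isna(row["status_state"]):
            # in iNaturalist but not on the state list
            return "REMOVE" if row["user_id"] == OUR_USER_ID else "FLAG"
        if row["status_state"] != row["status_inat"]:
            return "UPDATE"   # status differs between the two sources
        return "NONE"

    merged["action"] = merged.apply(action, axis=1)
    return merged


# Made-up rows for illustration only
inat = pd.DataFrame({
    "scientificName": ["Taxon a", "Taxon b", "Taxon c", "Taxon e"],
    "status": ["Vulnerable", "Endangered", "Endangered", "Vulnerable"],
    "user_id": ["708886", "708886", "12345", "708886"],
})
state_list = pd.DataFrame({
    "scientificName": ["Taxon a", "Taxon b", "Taxon d"],
    "status": ["Vulnerable", "Critically Endangered", "Vulnerable"],
})
changes = classify_changes(inat, state_list)
print(changes[["scientificName", "action"]])
```

Matching on `scientificName` alone is a simplification; synonyms and inactive taxa would need extra handling.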

## 1. Read in the iNaturalist Conservation Statuses export
Find the Australian statuses by querying:
* place names and display place names containing `AU`, `Australia` etc
* urls containing the string `.gov.au`
* records with user id 708886 (Peggy, who submitted the last round of statuses)
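
As a rough sketch of how these three criteria combine, the toy example below builds one boolean mask per criterion and ORs them together. The sample rows are invented for illustration; only the column names and the user id come from the export.

```python
import pandas as pd

# Toy rows standing in for the export (values are illustrative only)
sample = pd.DataFrame({
    "place_name": ["Queensland", "Ontario", "", ""],
    "place_display_name": ["Queensland, AU", "Ontario, CA", "", ""],
    "url": ["", "https://explorer.natureserve.org/",
            "http://environment.gov.au/", ""],
    "user_id": ["1", "2", "3", "708886"],
})

# One boolean mask per criterion
is_aus_place = (
    sample["place_display_name"].str.contains("AU|Australia")
    | sample["place_name"].str.contains("Australia", regex=False)
)
is_gov_url = sample["url"].str.contains(".gov.au", regex=False)
is_our_user = sample["user_id"] == "708886"

# A row is kept if it satisfies any criterion
aus = sample[is_aus_place | is_gov_url | is_our_user]
print(aus)
```

Because the masks are combined with `|`, a row that satisfies several criteria still appears only once.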

In [5]:
import os
import pandas as pd
projectdir = os.path.dirname(os.getcwd()) # get the parent directory of the current directory
dataindir = projectdir + "/data/in/"
dataoutdir = projectdir + "/data/out/"
df = pd.read_csv(dataindir + "inaturalist-australia-9/inaturalist-australia-9-conservation_statuses.csv", encoding='UTF-8', na_filter=False, dtype=str)
df

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,iucn,created_at,updated_at,place_name,place_display_name
0,210936,78570,,,,IUCN Red List,LC,https://www.iucnredlist.org/species/109716665/...,,,10,2021-09-28 11:12:51.830044,2021-12-21 04:53:02.568719,,
1,228106,219800,119123,6883,,NatureServe,S2,https://explorer.natureserve.org/,,,40,2021-12-16 19:59:11.692913,2021-12-21 13:14:46.38026,Ontario,"Ontario, CA"
2,224080,239472,51061,50,,NatureServe,S2S3,https://explorer.natureserve.org/Taxon/ELEMENT...,,,0,2021-11-24 18:14:30.91353,2021-11-24 18:14:30.91353,Nevada,"Nevada, US"
3,228110,219828,119123,6834,,NatureServe,S2S4,https://explorer.natureserve.org/,,,30,2021-12-16 19:59:12.175319,2021-12-21 13:18:11.648863,Alberta,"Alberta, CA"
4,294175,1433836,4350079,7236,,BAFU,NT,https://www.wsl.ch/map_fungi/search?taxon=32642,,open,20,2024-10-03 23:22:32.375474,2024-10-03 23:22:32.375474,Switzerland,Switzerland
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
282266,298475,807416,4320069,10572,,Inventaire national du patrimoine naturel (INPN),EN,https://inpn.mnhn.fr/espece/cd_nom/112293/tab/...,,open,40,2024-12-02 07:37:00.68641,2024-12-02 07:37:00.68641,Bourgogne,"Bourgogne, FR"
282267,298480,1249122,7739024,,,"Espírito Santo, São Paulo, Rio de Janeiro",EN,https://proflora.jbrj.gov.br/html/Ischnosiphon...,,obscured,40,2024-12-02 16:08:05.561432,2024-12-02 16:16:38.386908,,
282268,298481,1219541,7739024,130979,,BANCO DE DATOS DE BIODIVERSIDAD DE CANARIAS,NE,https://www.biodiversidadcanarias.es/biota/esp...,,obscured,0,2024-12-02 17:26:55.708176,2024-12-02 17:26:55.708176,Canary Islands Archipelago,"Canary Islands Archipelago, ES"
282269,294211,160210,887173,,,NatureServe,G1,https://explorer.natureserve.org/Taxon/ELEMENT...,,obscured,40,2024-10-04 17:46:00.77622,2024-12-02 23:10:38.292209,,


In [6]:
# list of unique Aust place display names
# note: the "AU" pattern also catches non-Australian places such as "Great Barrier Island, AU, NZ"
placedisplaydf = df['place_display_name'].drop_duplicates().sort_values()
placedisplaydf = placedisplaydf[placedisplaydf.str.contains("AU|Australia|AUSTRALIA")]
placedisplaydf

96                                Australia
85278     Australia Exclusive Economic Zone
188        Australian Capital Territory, AU
256627                Cairns - Pt B, QL, AU
174033         Great Barrier Island, AU, NZ
82086          Hobsons Bay - Altona, VI, AU
270773             Lord Howe Island, NS, AU
136457         Lower Eyre Peninsula, SA, AU
492                     New South Wales, AU
354                  Northern Territory, AU
802                          Queensland, AU
2810                    South Australia, AU
174926       South Australia, marine waters
248521        South East Queensland, QL, AU
33968                          Tasmania, AU
995                            Victoria, AU
779                   Western Australia, AU
281854                     Yarrabah, QL, AU
Name: place_display_name, dtype: object

In [9]:
# list of unique Aust place names
placedf = df['place_name'].drop_duplicates().sort_values()
placedf = placedf[placedf.str.contains(", AU|Australia|AUSTRALIA")]
placedf

96                                Australia
85278     Australia Exclusive Economic Zone
188            Australian Capital Territory
2810                        South Australia
174926       South Australia, marine waters
779                       Western Australia
Name: place_name, dtype: object

In [10]:
# list of unique Aust Govt urls
urldf = df['url'].drop_duplicates().sort_values()
# regex=False so that "." is matched literally rather than as "any character"
urldf = urldf[urldf.str.contains(".gov.au", regex=False)]
urldf

249018     https://www.environment.gov.au/epbc/about/epb...
5906      http://environment.gov.au/cgi-bin/sprat/public...
214589    http://environment.gov.au/cgi-bin/sprat/public...
2447      http://www.environment.gov.au/biodiversity/thr...
175253    http://www.environment.gov.au/biodiversity/thr...
                                ...                        
995       https://www.environment.vic.gov.au/conserving-...
4850      https://www.legislation.qld.gov.au/view/html/i...
76468     https://www.legislation.sa.gov.au/LZ/C/A/NATIO...
268873    https://www.legislation.sa.gov.au/lz?path=%2FC...
270581    https://www.threatenedspecieslink.tas.gov.au/P...
Name: url, Length: 5340, dtype: object
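
One detail worth keeping in mind when filtering URLs: pandas `str.contains` treats its pattern as a regular expression by default, in which `.` matches any character. The small sketch below, using a made-up second URL, shows the difference between a regex match and a literal substring match.

```python
import pandas as pd

urls = pd.Series([
    "https://www.environment.gov.au/",  # genuine .gov.au host
    "https://example.org/xgovyau",      # made-up URL with no literal ".gov.au"
])

as_regex = urls.str.contains(".gov.au")                 # "." matches any character
as_literal = urls.str.contains(".gov.au", regex=False)  # plain substring match

print(as_regex.tolist(), as_literal.tolist())
```

In practice the regex form rarely produces false positives on real URLs, but the literal form states the intent more precisely.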

In [11]:
# keep the rows matching any of the three lists above, plus any records created by us (user id = 708886), then remove duplicates
dfaus = pd.concat([df[df['place_display_name'].isin(placedisplaydf)],
                   df[df['place_name'].isin(placedf)],
                   df[df['url'].isin(urldf)],
                   df[df['user_id'] == '708886']]).drop_duplicates()
dfaus.sort_values(['taxon_id','user_id'])

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,iucn,created_at,updated_at,place_name,place_display_name
106543,268186,100080,1138587,6744,,,Not listed,,,open,10,2022-10-25 05:20:17.532504,2022-10-25 05:20:17.532504,Australia,Australia
101073,268138,100118,1138587,6744,,,Not listed,,,open,10,2022-10-25 03:32:38.344862,2022-10-25 03:32:38.344862,Australia,Australia
105016,268168,100127,1138587,6744,,,Not listed,,,open,10,2022-10-25 03:58:15.923985,2022-10-25 03:58:15.923985,Australia,Australia
257382,271765,1002188,708886,7308,,Qld Department of Environment and Science,Critically Endangered,https://apps.des.qld.gov.au/species-search/det...,,obscured,30,2023-02-09 20:05:05.626344,2023-02-09 20:05:05.626344,Queensland,"Queensland, AU"
256764,271143,1002207,708886,7308,,Qld Department of Environment and Science,Special least concern,https://apps.des.qld.gov.au/species-search/det...,,open,30,2023-02-09 20:04:04.4041,2023-02-09 20:04:04.4041,Queensland,"Queensland, AU"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
259909,264607,99971,3249428,7830,,"Victorian Department of Energy, Environment an...",Endangered,https://bie.ala.org.au/species/https://biodive...,,open,40,2022-06-06 17:00:58.905798,2023-02-09 20:09:32.699567,Victoria,"Victoria, AU"
259287,273522,99972,708886,6829,,Threatened Species Protection Act 1995,Vulnerable,https://bie.ala.org.au/species/https://biodive...,,open,30,2023-02-09 20:08:14.660235,2023-11-09 07:18:29.894362,Tasmania,"Tasmania, AU"
261170,153613,99973,708886,6827,16654,"WA Deparment of Biodiversity, Conservation and...",Critically Endangered,https://bie.ala.org.au/species/https://biodive...,,obscured,50,2019-07-23 00:12:11.19598,2023-02-09 20:12:22.866063,Western Australia,"Western Australia, AU"
239129,153742,99974,708886,6827,16654,"WA Deparment of Biodiversity, Conservation and...",Endangered,https://bie.ala.org.au/species/https://biodive...,,obscured,40,2019-07-23 00:12:41.014669,2023-02-09 20:12:23.102495,Western Australia,"Western Australia, AU"


## 2. Retrieve taxon info from iNaturalist
The export above contains only the taxon identifier. Retrieve the full taxon name and classification from the iNaturalist taxonomy.

In [14]:
%%script echo skipping # comment this line to download dataset from the web and save locally - add inaturalist-taxonomy.dwca.zip to .gitignore

# save the file to the source data directory
import requests

url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

r = requests.get(url)
r.raise_for_status()  # fail early rather than saving an error page as the zip
with open(dataindir + filename, 'wb') as f:
    f.write(r.content)
# reminder: add inaturalist-taxonomy.dwca.zip to .gitignore
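
Since the archive is around 350MB, it may be preferable to stream it to disk in chunks instead of holding the whole response in memory. The sketch below is one way to do that; it swaps in the standard-library `urllib` in place of `requests`, and `download_to_file` is a hypothetical helper name.

```python
import shutil
import urllib.request
from pathlib import Path


def download_to_file(url: str, dest: Path, chunk_size: int = 1024 * 1024) -> int:
    """Stream url to dest in chunks and return the number of bytes on disk."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        # copyfileobj reads and writes chunk_size bytes at a time
        shutil.copyfileobj(response, out, length=chunk_size)
    return dest.stat().st_size
```

Usage would look like `download_to_file(url, Path(dataindir) / filename)`, with `url`, `dataindir` and `filename` as defined in the cell above.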

In [12]:
# open the file in the source data directory and read the taxa.csv file
import zipfile
url = "https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip"
filename = url.split("/")[-1]

# context managers close the archive even if reading fails
with zipfile.ZipFile(dataindir + filename) as z:
    with z.open('taxa.csv') as from_archive:
        inattaxa = pd.read_csv(from_archive)

inattaxa.head(10)

Unnamed: 0,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
0,1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/1,https://www.inaturalist.org/taxa/48460,Animalia,,,,,,,,2021-11-02T06:05:44Z,Animalia,kingdom,http://www.catalogueoflife.org/annual-checklis...
1,2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/2,https://www.inaturalist.org/taxa/1,Animalia,Chordata,,,,,,,2021-11-23T00:40:18Z,Chordata,phylum,http://www.catalogueoflife.org/annual-checklis...
2,3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/3,https://www.inaturalist.org/taxa/355675,Animalia,Chordata,Aves,,,,,,2023-04-18T02:04:33Z,Aves,class,http://www.catalogueoflife.org/annual-checklis...
3,4,https://www.inaturalist.org/taxa/4,https://www.inaturalist.org/taxa/4,https://www.inaturalist.org/taxa/3,Animalia,Chordata,Aves,Gruiformes,,,,,2019-10-19T15:14:18Z,Gruiformes,order,http://www.catalogueoflife.org/annual-checklis...
4,5,https://www.inaturalist.org/taxa/5,https://www.inaturalist.org/taxa/5,https://www.inaturalist.org/taxa/4,Animalia,Chordata,Aves,Gruiformes,Aramidae,,,,2022-03-24T16:38:28Z,Aramidae,family,http://www.birdlife.org/datazone/speciessearch...
5,6,https://www.inaturalist.org/taxa/6,https://www.inaturalist.org/taxa/6,https://www.inaturalist.org/taxa/5,Animalia,Chordata,Aves,Gruiformes,Aramidae,Aramus,,,2020-02-11T06:43:19Z,Aramus,genus,http://www.birdlife.org/datazone/speciessearch...
6,7,https://www.inaturalist.org/taxa/7,https://www.inaturalist.org/taxa/7,https://www.inaturalist.org/taxa/6,Animalia,Chordata,Aves,Gruiformes,Aramidae,Aramus,guarauna,,2022-03-30T18:35:55Z,Aramus guarauna,species,http://www.birdlife.org/datazone/speciesfactsh...
7,12,https://www.inaturalist.org/taxa/12,https://www.inaturalist.org/taxa/12,https://www.inaturalist.org/taxa/71262,Animalia,Chordata,Aves,Cariamiformes,Cariamidae,,,,2022-03-24T16:37:54Z,Cariamidae,family,http://www.birdlife.org/datazone/speciessearch...
8,13,https://www.inaturalist.org/taxa/13,https://www.inaturalist.org/taxa/13,https://www.inaturalist.org/taxa/12,Animalia,Chordata,Aves,Cariamiformes,Cariamidae,Cariama,,,2018-12-19T08:58:24Z,Cariama,genus,http://www.birdlife.org/datazone/speciessearch...
9,14,https://www.inaturalist.org/taxa/14,https://www.inaturalist.org/taxa/14,https://www.inaturalist.org/taxa/13,Animalia,Chordata,Aves,Cariamiformes,Cariamidae,Cariama,cristata,,2021-07-06T02:04:43Z,Cariama cristata,species,http://www.birdlife.org/datazone/speciesfactsh...


In [13]:
len(inattaxa) # it's quite big

1344422

In [14]:
# left join to keep just the taxa that have statuses we're interested in
# (id is cast to str because taxon_id was read from the export as str)
austtaxaids = dfaus['taxon_id'].drop_duplicates()
inattaxa['id'] = inattaxa['id'].astype(str)
inataustaxa = pd.merge(austtaxaids, inattaxa, how="left", left_on='taxon_id', right_on='id')
inataustaxa.sort_values('id')

Unnamed: 0,taxon_id,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references
302,100080,100080,https://www.inaturalist.org/taxa/100080,https://www.inaturalist.org/taxa/100080,https://www.inaturalist.org/taxa/49497,Animalia,Chordata,Actinopterygii,Perciformes,Epinephelidae,Epinephelus,bleekeri,,2019-11-24T03:20:37Z,Epinephelus bleekeri,species,http://www.fishbase.org
262,100118,100118,https://www.inaturalist.org/taxa/100118,https://www.inaturalist.org/taxa/100118,https://www.inaturalist.org/taxa/49497,Animalia,Chordata,Actinopterygii,Perciformes,Epinephelidae,Epinephelus,malabaricus,,2019-11-23T03:14:55Z,Epinephelus malabaricus,species,http://www.fishbase.org
287,100127,100127,https://www.inaturalist.org/taxa/100127,https://www.inaturalist.org/taxa/100127,https://www.inaturalist.org/taxa/49497,Animalia,Chordata,Actinopterygii,Perciformes,Epinephelidae,Epinephelus,polyphekadion,,2019-02-06T23:51:50Z,Epinephelus polyphekadion,species,http://www.fishbase.org
3402,1002188,1002188,https://www.inaturalist.org/taxa/1002188,https://www.inaturalist.org/taxa/1002188,https://www.inaturalist.org/taxa/60445,Plantae,Tracheophyta,Polypodiopsida,Hymenophyllales,Hymenophyllaceae,Hymenophyllum,whitei,,2020-01-11T06:34:24Z,Hymenophyllum whitei,species,
2944,1002207,1002207,https://www.inaturalist.org/taxa/1002207,https://www.inaturalist.org/taxa/1002207,https://www.inaturalist.org/taxa/142600,Plantae,Tracheophyta,Polypodiopsida,Schizaeales,Schizaeaceae,Actinostachys,wagneri,,2024-06-08T16:08:26Z,Actinostachys wagneri,species,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8151,1289564,,,,,,,,,,,,,,,,
8164,904699,,,,,,,,,,,,,,,,
8172,1556562,,,,,,,,,,,,,,,,
8297,123102,,,,,,,,,,,,,,,,


## Retrieve inactive taxa names from API

In [15]:
# taxa that didn't match in the left join are inactive or problem taxa - need to go to the API to get the detail
unmatchedtaxonids = inataustaxa[inataustaxa['id'].isna()]['taxon_id']
unmatchedtaxonids

8       1262199
14       993605
28       634763
71      1264437
79      1264442
         ...   
8151    1289564
8164     904699
8172    1556562
8297     123102
8300      50744
Name: taxon_id, Length: 224, dtype: object
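
An alternative way to pick out the unmatched rows, sketched here on toy data, is to ask `merge` to label each row with `indicator=True`. The id `999999` is made up; the two taxon names are taken from the `taxa.csv` preview earlier in this notebook.

```python
import pandas as pd

wanted = pd.DataFrame({"taxon_id": ["7", "14", "999999"]})  # "999999" is a made-up id
taxa = pd.DataFrame({
    "id": ["7", "14"],
    "scientificName": ["Aramus guarauna", "Cariama cristata"],
})

joined = pd.merge(
    wanted, taxa, how="left", left_on="taxon_id", right_on="id",
    indicator=True,          # adds a _merge column: "both" or "left_only"
    validate="many_to_one",  # raises if an id appears twice in taxa
)
unmatched = joined.loc[joined["_merge"] == "left_only", "taxon_id"].tolist()
print(unmatched)
```

The `validate` argument is optional; it is included as a guard against duplicate ids silently multiplying rows.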

In [16]:
# %%script echo skipping # uncomment this line to skip the API calls

# go to the API to retrieve taxon names
import requests
import json
from time import sleep

apiurlbase = "https://api.inaturalist.org/v1/taxa/"
taxonlist = []
i = 1
for unmatchedid in unmatchedtaxonids:
    print(str(i) + " " + unmatchedid)
    apiurl = apiurlbase + str(unmatchedid)
    response = requests.request("GET", apiurl)
    payload = json.loads(response.text)
    if (payload['total_results'] == 1):
        r = payload['results'][0]
        try:
            common_name = r['preferred_common_name']
        except KeyError:
            common_name = ""
        taxonlist.append({'taxon_id': unmatchedid,
                      'name':r['name'],
                      'preferred_common_name':common_name,
                      'is_active':r['is_active'],
                      'observation_count':r['observations_count'],'current_synonymous_taxon_ids':r['current_synonymous_taxon_ids']})
    else:
        print("Warning: taxon_id " + unmatchedid + " did not return exactly one result from iNaturalist")
    sleep(1)
    i+=1

pd.DataFrame(taxonlist).to_csv(dataindir + "inat-aust-inactive-taxon.csv",index = False)

1 1262199
2 993605
3 634763
4 1264437
5 1264442
6 370219
7 1202976
8 892273
9 334722
10 1522400
11 1522401
12 1061113
13 1522416
14 1522427
15 116845
16 850736
17 538087
18 1064159
19 4075
20 319393
21 1094414
22 954413
23 586067
24 937255
25 937256
26 937266
27 508987
28 405825
29 966091
30 1136642
31 1275718
32 1535105
33 869830
34 145357
35 403648
36 491475
37 899197
38 138100
39 32164
40 654262
41 158801
42 37279
43 140695
44 109170
45 602508
46 162565
47 634645
48 144454
49 369372
50 770039
51 770117
52 83595
53 770200
54 339972
55 770008
56 769948
57 770189
58 45206
59 427051
60 42773
61 769981
62 508990
63 770032
64 769943
65 425785
66 851155
67 145672
68 880603
69 770132
70 541617
71 103669
72 4072
73 369374
74 851221
75 40199
76 770186
77 103668
78 369320
79 109527
80 769954
81 770111
82 25242
83 145456
84 770035
85 318740
86 208141
87 42959
88 323867
89 1091300
90 1170290
91 1038965
92 525439
93 1132352
94 413471
95 733329
96 460983
97 560655
98 534404
99 5003
100 1248896
101
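
The response handling in the loop above can be separated from the network call, which makes it easier to check in isolation. The sketch below assumes the payload shape used in that loop (`total_results` and `results`); `parse_taxon_payload` is a hypothetical helper, and the sample payload is hand-written from one row of the saved results, not a real API response.

```python
from typing import Optional


def parse_taxon_payload(taxon_id: str, payload: dict) -> Optional[dict]:
    """Return one flat record, or None unless exactly one result came back."""
    if payload.get("total_results") != 1:
        return None
    r = payload["results"][0]
    return {
        "taxon_id": taxon_id,
        "name": r["name"],
        # not every taxon has a common name
        "preferred_common_name": r.get("preferred_common_name", ""),
        "is_active": r["is_active"],
        "observation_count": r["observations_count"],
        "current_synonymous_taxon_ids": r["current_synonymous_taxon_ids"],
    }


# Hand-written payload for illustration
sample_payload = {
    "total_results": 1,
    "results": [{
        "name": "Arctophoca forsteri",
        "preferred_common_name": "New Zealand Fur Seal",
        "is_active": False,
        "observations_count": 0,
        "current_synonymous_taxon_ids": [41752],
    }],
}
record = parse_taxon_payload("1264437", sample_payload)
print(record)
```

Returning `None` for anything other than exactly one result lets the caller decide whether to warn, retry or skip.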

In [17]:
inactivetaxa = pd.read_csv(dataindir+"inat-aust-inactive-taxon.csv", dtype=str)
inactivetaxa

Unnamed: 0,taxon_id,name,preferred_common_name,is_active,observation_count,current_synonymous_taxon_ids
0,1262199,Caleana dixonii,Sandplain Duck Orchid,False,0,[]
1,993605,Acianthus amplexicaulis,,False,0,[1580371]
2,634763,Rutidosis leptorhynchoides,button wrinklewort,False,0,[1255168]
3,1264437,Arctophoca forsteri,New Zealand Fur Seal,False,0,[41752]
4,1264442,Arctophoca tropicalis,Subantarctic Fur Seal,False,0,[41753]
...,...,...,...,...,...,...
219,1289564,Parvipsitta porphyrocephala,Purple-crowned Lorikeet,False,0,[1584102]
220,904699,Zeugodacus cucurbitae,Melon Fly,False,0,[448876]
221,1556562,Carcharhinus spallanzani,Spottail Shark,False,0,[96767]
222,123102,Turdus poliocephalus erythropleurus,Christmas Island Thrush,False,0,[1574874]


The taxa above are typically marked as inactive in iNaturalist (`is_active` is False) and have no observations. Collate them into the taxa list, and copy the API name into the scientificName field for later use.
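
The fallback rule can be sketched on toy data with `Series.where`, which keeps the existing value where a condition holds and takes a replacement elsewhere. The two rows reuse names that appear in this notebook; the frame itself is illustrative.

```python
import pandas as pd

demo = pd.DataFrame({
    "id": ["46311", None],                 # None: no match in the DwCA
    "scientificName": ["Dugong dugon", None],
    "name": [None, "Caleana dixonii"],     # name retrieved from the API
})

# Keep scientificName where id is present, otherwise fall back to name
demo["scientificName"] = demo["scientificName"].where(demo["id"].notna(), demo["name"])
print(demo["scientificName"].tolist())
```

This is a vectorised equivalent of a row-wise `apply` with an if/else on `id`.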


In [18]:
# collate the status and the taxon info into a single file to use for the state work
# scientificName: use the API name field where id is null (no DwCA match), otherwise keep the DwCA scientificName
alltaxa = pd.merge(inataustaxa, inactivetaxa, how="left", on="taxon_id")
alltaxa['scientificName'] = alltaxa.apply(lambda x: x['name'] if pd.isnull(x['id']) else x['scientificName'],axis=1)
alltaxa = alltaxa.drop(['name','observation_count'],axis=1)
alltaxa


Unnamed: 0,taxon_id,id,taxonID,identifier,parentNameUsageID,kingdom,phylum,class,order,family,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,46311,46311,https://www.inaturalist.org/taxa/46311,https://www.inaturalist.org/taxa/46311,https://www.inaturalist.org/taxa/46310,Animalia,Chordata,Mammalia,Sirenia,Dugongidae,Dugong,dugon,,2020-05-04T14:42:25Z,Dugong dugon,species,http://www.catalogueoflife.org/annual-checklis...,,,
1,517069,517069,https://www.inaturalist.org/taxa/517069,https://www.inaturalist.org/taxa/517069,https://www.inaturalist.org/taxa/517030,Animalia,Chordata,Amphibia,Anura,Hylidae,Ranoidea,aurea,,2023-09-23T05:53:18Z,Ranoidea aurea,species,http://research.amnh.org/vz/herpetology/amphibia/,,,
2,918383,918383,https://www.inaturalist.org/taxa/918383,https://www.inaturalist.org/taxa/918383,https://www.inaturalist.org/taxa/430819,Plantae,Tracheophyta,Liliopsida,Asparagales,Orchidaceae,Chiloschista,phyllorhiza,,2023-09-27T02:32:53Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
3,1247288,1247288,https://www.inaturalist.org/taxa/1247288,https://www.inaturalist.org/taxa/1247288,https://www.inaturalist.org/taxa/184926,Plantae,Tracheophyta,Magnoliopsida,Rosales,Rhamnaceae,Pomaderris,bodalla,,2021-08-27T06:18:35Z,Pomaderris bodalla,species,https://eol.org/pages/49432063,,,
4,1448721,1448721,https://www.inaturalist.org/taxa/1448721,https://www.inaturalist.org/taxa/1448721,https://www.inaturalist.org/taxa/1448501,Plantae,Tracheophyta,Magnoliopsida,Gentianales,Apocynaceae,Leichhardtia,glandulifera,,2023-02-12T02:51:48Z,Leichhardtia glandulifera,species,https://powo.science.kew.org/taxon/urn:lsid:ip...,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8305,1467834,1467834,https://www.inaturalist.org/taxa/1467834,https://www.inaturalist.org/taxa/1467834,https://www.inaturalist.org/taxa/83598,Plantae,Tracheophyta,Magnoliopsida,Dilleniales,Dilleniaceae,Hibbertia,charlesii,,2023-05-15T23:10:33Z,Hibbertia charlesii,species,,,,
8306,402778,402778,https://www.inaturalist.org/taxa/402778,https://www.inaturalist.org/taxa/402778,https://www.inaturalist.org/taxa/379024,Plantae,Tracheophyta,Magnoliopsida,Santalales,Loranthaceae,Ileostylus,micranthus,,2023-06-16T06:48:05Z,Ileostylus micranthus,species,,,,
8307,552264,552264,https://www.inaturalist.org/taxa/552264,https://www.inaturalist.org/taxa/552264,https://www.inaturalist.org/taxa/552260,Animalia,Chordata,Reptilia,Squamata,Scincidae,Carinascincus,orocryptus,,2018-11-18T00:17:16Z,Carinascincus orocryptus,species,,,,
8308,1574874,1574874,https://www.inaturalist.org/taxa/1574874,https://www.inaturalist.org/taxa/1574874,https://www.inaturalist.org/taxa/12705,Animalia,Chordata,Aves,Passeriformes,Turdidae,Turdus,erythropleurus,,2024-12-16T19:18:57Z,Turdus erythropleurus,species,,,,


In [19]:
# now merge back with the original Australian statuses and save locally
taxastatus = pd.merge(dfaus, alltaxa, how="left", on="taxon_id")
# both frames have an id column, so pandas suffixes them: id_x is the status id, id_y the taxon id
taxastatus = taxastatus.rename(columns={'id_x':'id'})
taxastatus

Unnamed: 0,id,taxon_id,user_id,place_id,source_id,authority,status,url,description,geoprivacy,...,genus,specificEpithet,infraspecificEpithet,modified,scientificName,taxonRank,references,preferred_common_name,is_active,current_synonymous_taxon_ids
0,285391,46311,1138587,6744,,,Not listed,,,open,...,Dugong,dugon,,2020-05-04T14:42:25Z,Dugong dugon,species,http://www.catalogueoflife.org/annual-checklis...,,,
1,285328,517069,1138587,12986,,Nature Conservation Act 2014 (ACT),VU,,,obscured,...,Ranoidea,aurea,,2023-09-23T05:53:18Z,Ranoidea aurea,species,http://research.amnh.org/vz/herpetology/amphibia/,,,
2,234788,918383,702203,9994,,Atlas of Living Australia,NT,https://bie.ala.org.au/species/https://id.biod...,,,...,Chiloschista,phyllorhiza,,2023-09-27T02:32:53Z,Chiloschista phyllorhiza,species,http://www.catalogueoflife.org/annual-checklis...,,,
3,285396,46311,1138587,9994,,,Not listed,,,open,...,Dugong,dugon,,2020-05-04T14:42:25Z,Dugong dugon,species,http://www.catalogueoflife.org/annual-checklis...,,,
4,180721,1247288,222137,6825,,New South Wales Office of Environment and Heri...,Vulnerable,https://www.environment.nsw.gov.au/threateneds...,,open,...,Pomaderris,bodalla,,2021-08-27T06:18:35Z,Pomaderris bodalla,species,https://eol.org/pages/49432063,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10067,285267,402778,1138587,7333,,Environment Protection and Biodiversity Conser...,VU,https://www.environment.gov.au/cgi-bin/sprat/p...,,open,...,Ileostylus,micranthus,,2023-06-16T06:48:05Z,Ileostylus micranthus,species,,,,
10068,159365,40179,425992,144315,,Brisbane City Council,Significant,https://www.brisbane.qld.gov.au/sites/default/...,,open,...,Phascogale,tapoatafa,,2023-06-06T12:22:39Z,Phascogale tapoatafa,species,http://www.catalogueoflife.org/annual-checklis...,,,
10069,290896,552264,702203,,,Autralian Government EPBC Act,EN,https://www.environment.gov.au/cgi-bin/sprat/p...,Endangered,open,...,Carinascincus,orocryptus,,2018-11-18T00:17:16Z,Carinascincus orocryptus,species,,,,
10070,294806,1574874,2366151,7616,,Environmental Protection and Biodiversity Cons...,Endangered,http://www.environment.gov.au/cgi-bin/sprat/pu...,,obscured,...,Turdus,erythropleurus,,2024-12-16T19:18:57Z,Turdus erythropleurus,species,,,,


In [20]:
taxastatus.to_csv(dataindir + "inat-aust-status-taxa.csv", index=False)

In [21]:
taxastatus.groupby(['taxonRank'])['taxonRank'].count()

taxonRank
complex          2
form             1
genus            2
hybrid          21
species       8814
subspecies     808
variety        142
Name: taxonRank, dtype: int64

## Notes about conservation status inheritance in iNaturalist:
Adding a conservation status for a higher level taxon affects observations of all the species in this taxon. Please do not add statuses for taxa that contain species that have no status because that will incorrectly obscure coordinates for observations of those species.
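
One way to spot statuses that may trigger this inheritance is to list those attached to ranks above species. The sketch below is a minimal illustration: the set of ranks treated as "species or below" is an assumption, and the two-row frame stands in for the collated status table.

```python
import pandas as pd

# Ranks treated here as "species or below"; this set is an assumption
SPECIES_OR_BELOW = {"species", "subspecies", "variety", "form", "hybrid"}

statuses = pd.DataFrame({
    "scientificName": ["Acriopsis", "Acriopsis emarginata"],
    "taxonRank": ["genus", "species"],
})

# Statuses on higher ranks are the ones inherited by every contained species
higher = statuses[~statuses["taxonRank"].isin(SPECIES_OR_BELOW)]
print(higher)
```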

### Example:
Genus Acriopsis - https://www.inaturalist.org/taxa/425476-Acriopsis

The inherited species records are not in the export and can't be edited, but they do appear on the species pages.
The required change is to set _Acriopsis emarginata_ to Vulnerable; the other species remain Least Concern.

In [22]:
eg = pd.DataFrame(columns=('TaxonID', 'Name', 'Status','Rank','iNat species page','iNat edit taxon','Export'))
eg.loc[1]=['425476','Acriopsis','LC/obscured','genus','No','Yes','Yes']
eg.loc[2]=['1141144','Acriopsis emarginata','LC/obscured','species','Yes','No','No']
eg.loc[3]=['425475','Acriopsis liliifolia','LC/obscured','species','Yes','No','No']
eg.loc[4]=['427833','Acriopsis ridleyi','LC/obscured','species','Yes','No','No']
eg.loc[5]=['1037999','Acriopsis indica','LC/obscured','species','Yes','No','No']
eg

Unnamed: 0,TaxonID,Name,Status,Rank,iNat species page,iNat edit taxon,Export
1,425476,Acriopsis,LC/obscured,genus,No,Yes,Yes
2,1141144,Acriopsis emarginata,LC/obscured,species,Yes,No,No
3,425475,Acriopsis liliifolia,LC/obscured,species,Yes,No,No
4,427833,Acriopsis ridleyi,LC/obscured,species,Yes,No,No
5,1037999,Acriopsis indica,LC/obscured,species,Yes,No,No
