# EPBC Conservation List

List in ALA Species List tool: __[EPBC Act Threatened Species (dr656)](https://lists.ala.org.au/speciesListItem/list/dr656)__

This dataset contains information about species of national environmental significance as listed in the Environment Protection and Biodiversity Conservation Act 1999 (EPBC Act). Data provided includes:
- species names and threatened status
- indicative occurrence within each state, territory and marine area
- links to further information in the Species Profile and Threats Database (SPRAT).

## Documentation
* __[Threatened Species and Ecological Communities of National Environmental Significance (on data.gov.au)](https://data.gov.au/data/dataset/threatened-species-state-lists)__
* __[Dataset landing page](https://data.gov.au/data/dataset/threatened-species-state-lists/resource/78401dce-1f40-49d3-92c4-3713d6e34974)__
* __[Direct data download](https://data.gov.au/data/dataset/ae652011-f39e-4c6c-91b8-1dc2d2dfee8f/resource/78401dce-1f40-49d3-92c4-3713d6e34974/download/20221005spcs.csv)__
* __[API](https://data.gov.au/data/api/1/util/snippet/api_info.html?resource_id=78401dce-1f40-49d3-92c4-3713d6e34974)__

## Setup

In [48]:
import pandas as pd
#projectDir = "/Users/oco115/PycharmProjects/authoritative-lists/"
projectDir = "/Users/new330/IdeaProjects/authoritative-lists/"
sourceDataDir = projectDir + "source-data/EPBCA/"
processedDataDir = projectDir + "current-lists/conservation-lists/"

listURL = "https://data.gov.au/data/dataset/ae652011-f39e-4c6c-91b8-1dc2d2dfee8f/resource/78401dce-1f40-49d3-92c4-3713d6e34974/download/20221005spcs.csv"
filename = "20221005spcs.csv"
listToolURL = "https://lists.ala.org.au/speciesListItem/list/dr656"

## Download file
Download the dataset from data.gov.au, or skip the download cell and read the cached copy in `sourceDataDir`.

In [37]:
%%script echo skipping # comment this line to download dataset from API

import urllib.request, json
import certifi
import ssl

with urllib.request.urlopen(listURL, context=ssl.create_default_context(cafile=certifi.where())) as response:
    data = pd.read_csv(response)
data.to_csv(sourceDataDir + filename, encoding="UTF-8", index=False)

skipping # comment this line to download dataset from API
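
The download-or-cache choice above can also be written as a small helper instead of a cell magic. The following is a minimal sketch, not the notebook's own method: `read_cached_or_download` and its `force` flag are hypothetical names, and it lets `pd.read_csv` fetch the URL with default certificate handling rather than the explicit `certifi` SSL context used above.

```python
from pathlib import Path

import pandas as pd


def read_cached_or_download(url: str, cache_path, force: bool = False) -> pd.DataFrame:
    """Return the CSV stored at cache_path, fetching it from url first if needed.

    Hypothetical helper: the name and the `force` flag are illustrative only.
    """
    cache_path = Path(cache_path)
    if force or not cache_path.exists():
        # pd.read_csv accepts a local path or an http(s) URL
        data = pd.read_csv(url)
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        data.to_csv(cache_path, encoding="UTF-8", index=False)
    # Always read from the cache so later cells see exactly what is on disk
    return pd.read_csv(cache_path)
```

With the variables from the setup cell, this could be called as `read_cached_or_download(listURL, sourceDataDir + filename)`, passing `force=True` when a fresh copy is wanted.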


In [38]:
epbcreport = pd.read_csv(sourceDataDir + filename,skiprows=0)
epbcreport.head(5)

Unnamed: 0,Scientific Name,Common Name,Current Scientific Name,Threatened status,ACT,NSW,NT,QLD,SA,TAS,...,Profile,Date extracted,NSL Name,Family,Genus,Species,Infraspecific Rank,Infraspecies,Species Author,Infraspecies Author
0,Abutilon julianae,Norfolk Island Abutilon,-,Critically Endangered,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/56543,Malvaceae,Abutilon,julianae,-,-,Endl.,-
1,Acacia ammophila,-,-,Vulnerable,-,-,-,Yes,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58340,Fabaceae,Acacia,ammophila,-,-,Pedley,-
2,Acacia anomala,"Grass Wattle, Chittering Grass Wattle",-,Vulnerable,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58556,Fabaceae,Acacia,anomala,-,-,C.A.Gardner ex Court,-
3,Acacia aphylla,Leafless Rock Wattle,-,Vulnerable,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58565,Fabaceae,Acacia,aphylla,-,-,Maslin,-
4,Acacia aprica,Blunt Wattle,-,Endangered,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/149911,Fabaceae,Acacia,aprica,-,-,Maslin A.R.Chapman,-


## Check values

In [39]:
epbcreport.groupby("Kingdom").size().sort_values(ascending=False)

Kingdom
Plantae     1411
Animalia     562
dtype: int64

In [40]:
epbcreport.groupby("Threatened status").size().sort_values(ascending=False)

Threatened status
Vulnerable                793
Endangered                748
Critically Endangered     319
Extinct                   104
Conservation Dependent      8
Extinct in the wild         1
dtype: int64
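
Beyond eyeballing the counts, the status values can be compared against the categories that are expected. This is a small sketch under stated assumptions: the expected set is copied from the output above, `unexpected_values` is a hypothetical helper, and the sample series is toy data with one deliberately misspelled status.

```python
import pandas as pd

# Categories taken from the "Threatened status" counts shown above
EXPECTED_STATUSES = {
    "Vulnerable", "Endangered", "Critically Endangered",
    "Extinct", "Conservation Dependent", "Extinct in the wild",
}


def unexpected_values(values: pd.Series, expected: set) -> list:
    """Return the sorted distinct non-null values that are not in `expected`."""
    return sorted(set(values.dropna()) - expected)


# Toy data for illustration only
sample = pd.Series(["Vulnerable", "Endangered", "Vulnerabel", None])
print(unexpected_values(sample, EXPECTED_STATUSES))  # prints ['Vulnerabel']
```

In the notebook the same check could be run as `unexpected_values(epbcreport["Threatened status"], EXPECTED_STATUSES)`; an empty list means no surprises.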

## Clean up column headers
* Darwin Core terms column mappings
* Remove multiple spaces from column names
* Remove () characters from column names

In [41]:
epbcreport = epbcreport.rename(columns=
                               {'Scientific Name': 'scientificName',
                                'Common Name': 'vernacularName',
                                'Threatened status': 'status',
                                'Kingdom': 'kingdom',
                                'Phylum': 'phylum',
                                'Class': 'class',
                                'Order': 'order',
                                'Family': 'family',
                                'Genus': 'genus',
                                'Species': 'species',
                                'ACT':'Australian Capital Territory',
                                'NSW': 'New South Wales',
                                'NT': 'Northern Territory',
                                'QLD': 'Queensland',
                                'SA': 'South Australia',
                                'TAS': 'Tasmania',
                                'VIC': 'Victoria',
                                'WA': 'Western Australia',
                                'ACI': 'Ashmore and Cartier Islands',
                                'CKI': 'Cocos (Keeling) Islands',
                                'CI': 'Christmas Island',
                                'CSI': 'Coral Sea Islands',
                                'JBT': 'Jervis Bay Territory',
                                'NFI': 'Norfolk Island',
                                'HMI': 'Heard and McDonald Islands',
                                'AAT': 'Australian Antarctic Territory',
                                'CMA': 'Commonwealth Marine Area',
                                'Listed SPRAT TaxonID' : 'listed sprat taxonID',
                                'Current SPRAT TaxonID': 'current sprat taxonID',
                                'NSL Name': 'nsl name'
                                })

epbcreport['sourceStatus'] = epbcreport['status']

epbcreport.columns = epbcreport.columns.str.replace("  ", "", regex=True) # remove double spaces from column names
epbcreport.columns = epbcreport.columns.str.replace(r"[().: ]", " ", regex=True) # replace ( ) . : characters in column names with a space
epbcreport.columns = epbcreport.columns.str.strip()
epbcreport.head(5)

Unnamed: 0,scientificName,vernacularName,Current Scientific Name,status,Australian Capital Territory,New South Wales,Northern Territory,Queensland,South Australia,Tasmania,...,Date extracted,nsl name,family,genus,species,Infraspecific Rank,Infraspecies,Species Author,Infraspecies Author,sourceStatus
0,Abutilon julianae,Norfolk Island Abutilon,-,Critically Endangered,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/56543,Malvaceae,Abutilon,julianae,-,-,Endl.,-,Critically Endangered
1,Acacia ammophila,-,-,Vulnerable,-,-,-,Yes,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58340,Fabaceae,Acacia,ammophila,-,-,Pedley,-,Vulnerable
2,Acacia anomala,"Grass Wattle, Chittering Grass Wattle",-,Vulnerable,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58556,Fabaceae,Acacia,anomala,-,-,C.A.Gardner ex Court,-,Vulnerable
3,Acacia aphylla,Leafless Rock Wattle,-,Vulnerable,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58565,Fabaceae,Acacia,aphylla,-,-,Maslin,-,Vulnerable
4,Acacia aprica,Blunt Wattle,-,Endangered,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/149911,Fabaceae,Acacia,aprica,-,-,Maslin A.R.Chapman,-,Endangered
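
Replacing brackets with spaces can leave runs of spaces inside a name such as `Cocos (Keeling) Islands`. The sketch below shows one possible ordering that avoids this: replace the punctuation first, then collapse whitespace, then strip. `tidy_columns` is a hypothetical helper and the demo column names are illustrative, so treat it as an alternative to consider rather than what the cell above does.

```python
import pandas as pd


def tidy_columns(columns: pd.Index) -> pd.Index:
    """Replace ( ) . : with spaces, collapse repeated whitespace, trim the ends."""
    cleaned = columns.str.replace(r"[().:]", " ", regex=True)
    cleaned = cleaned.str.replace(r"\s+", " ", regex=True)
    return cleaned.str.strip()


# Illustrative column names only
demo = pd.DataFrame(columns=["Cocos (Keeling) Islands", "Listed  SPRAT TaxonID ", "status"])
demo.columns = tidy_columns(demo.columns)
print(list(demo.columns))
```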


## Data cleaning
* Keep only rows where status is not empty
* Convert dates to ISO format (YYYY-MM-DD)
* Replace the "-" placeholder in empty fields with an empty string

In [42]:
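# keep only rows that have a status; copy so the assignments below modify an independent frame
epbc = epbcreport[epbcreport["status"].notna()].copy()
# convert dates such as 2022-Oct-05 to ISO format (YYYY-MM-DD)
epbc['Date extracted'] = pd.to_datetime(epbc['Date extracted'])
epbc['Date extracted'] = epbc['Date extracted'].dt.strftime('%Y-%m-%d')
# blank out cells whose entire value is the "-" placeholder
epbc = epbc.replace('-','')
epbc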

Unnamed: 0,scientificName,vernacularName,Current Scientific Name,status,Australian Capital Territory,New South Wales,Northern Territory,Queensland,South Australia,Tasmania,...,Date extracted,nsl name,family,genus,species,Infraspecific Rank,Infraspecies,Species Author,Infraspecies Author,sourceStatus
0,Abutilon julianae,Norfolk Island Abutilon,,Critically Endangered,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/56543,Malvaceae,Abutilon,julianae,,,Endl.,,Critically Endangered
1,Acacia ammophila,,,Vulnerable,,,,Yes,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/58340,Fabaceae,Acacia,ammophila,,,Pedley,,Vulnerable
2,Acacia anomala,"Grass Wattle, Chittering Grass Wattle",,Vulnerable,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/58556,Fabaceae,Acacia,anomala,,,C.A.Gardner ex Court,,Vulnerable
3,Acacia aphylla,Leafless Rock Wattle,,Vulnerable,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/58565,Fabaceae,Acacia,aphylla,,,Maslin,,Vulnerable
4,Acacia aprica,Blunt Wattle,,Endangered,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/149911,Fabaceae,Acacia,aprica,,,Maslin A.R.Chapman,,Endangered
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1968,Zosterops albogularis,"White-chested White-eye, Norfolk Island Silvereye",,Extinct,,,,,,,...,2022-10-05,,Zosteropidae,Zosterops,albogularis,,,"Gould, 1837",,Extinct
1969,Zosterops strenuus,Robust White-eye,,Extinct,,Yes,,,,,...,2022-10-05,,Zosteropidae,Zosterops,strenuus,,,"Gould, 1855",,Extinct
1970,Zyzomys maini,"Arnhem Rock-rat, Arnhem Land Rock-rat, Kodjperr",,Vulnerable,,,Yes,,,,...,2022-10-05,,Muridae,Zyzomys,maini,,,"Kitchener,1989",,Vulnerable
1971,Zyzomys palatalis,"Carpentarian Rock-rat, Aywalirroomoo",,Endangered,,,Yes,,,,...,2022-10-05,,Muridae,Zyzomys,palatalis,,,"Kitchener,1989",,Endangered
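
The three cleaning steps can be checked on a tiny frame before trusting them on the full list. This is a sketch on toy data: the second row is an invented placeholder with no status, and the explicit `format="%Y-%b-%d"` is an assumption about the source date format based on the `2022-Oct-05` values shown earlier.

```python
import pandas as pd

# Toy rows for illustration; the second row has no status and should be dropped
raw = pd.DataFrame({
    "scientificName": ["Acacia ammophila", "Placeholder species"],
    "vernacularName": ["-", "-"],
    "status": ["Vulnerable", None],
    "Date extracted": ["2022-Oct-05", "2022-Oct-05"],
})

# 1. keep rows with a status (copy to avoid modifying a view of `raw`)
clean = raw[raw["status"].notna()].copy()
# 2. parse the source date format and write it back as YYYY-MM-DD
parsed = pd.to_datetime(clean["Date extracted"], format="%Y-%b-%d")
clean["Date extracted"] = parsed.dt.strftime("%Y-%m-%d")
# 3. replace cells that are exactly "-" with an empty string
clean = clean.replace("-", "")

print(clean)
```

Note that `replace("-", "")` only matches whole-cell values, so the hyphens inside the reformatted dates are left alone.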


## Check values

In [45]:
epbc.groupby("kingdom").size().sort_values(ascending=False)

kingdom
Plantae     1411
Animalia     562
dtype: int64

In [47]:
epbc.groupby("status").size().sort_values(ascending=False)

status
Vulnerable                793
Endangered                748
Critically Endangered     319
Extinct                   104
Conservation Dependent      8
Extinct in the wild         1
dtype: int64

## Write to CSV

In [None]:
epbc.to_csv(processedDataDir + "EPBC-conservation.csv",encoding="UTF-8",index=False)
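
One thing to keep in mind when re-reading the written file: the cleaning step stores empty strings, and `pd.read_csv` turns empty fields into `NaN` by default. The sketch below illustrates this with an in-memory buffer and toy rows. It is not part of the original workflow, only a way to see the effect of the `keep_default_na` parameter.

```python
import io

import pandas as pd

# Toy rows for illustration; the first has an empty common name
frame = pd.DataFrame({
    "scientificName": ["Acacia ammophila", "Acacia aprica"],
    "vernacularName": ["", "Blunt Wattle"],
})

buffer = io.StringIO()
frame.to_csv(buffer, index=False)

# Default parsing: the empty field comes back as NaN
buffer.seek(0)
default_read = pd.read_csv(buffer)

# keep_default_na=False: the empty field comes back as an empty string
buffer.seek(0)
literal_read = pd.read_csv(buffer, keep_default_na=False)

print(default_read["vernacularName"].isna().sum())
print((literal_read["vernacularName"] == "").sum())
```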

# List check instructions

1. Load the list above into the lists-test tool for this data resource.
2. Run the reports below to compare the test list with production. Send the change log CSV for checking and correct any issues.
3. Save the production list into the `historical-lists` directory by running the final code cell below.
4. Load the list into production.


In [55]:
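# these imports are repeated here because the download cell above may have been skipped
import urllib.request, json
import certifi
import ssl

def download_ala_list(url: str):
    # fetch the list items as JSON and flatten them into a DataFrame
    with urllib.request.urlopen(url, context=ssl.create_default_context(cafile=certifi.where())) as response:
        data = json.loads(response.read().decode())
    return pd.json_normalize(data)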

# download the prod list
prodListUrl = "https://lists.ala.org.au/ws/speciesListItems/dr656?max=10000"
prodList = download_ala_list(prodListUrl)
prodList

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid
0,4075727,Abutilon julianae,Norfolk Island Abutilon,Abutilon julianae,https://id.biodiversity.org.au/node/apni/2900707,dr656
1,4074225,Acacia ammophila,,Acacia ammophila,https://id.biodiversity.org.au/node/apni/2899480,dr656
2,4074678,Acacia anomala,Grass Wattle,Acacia anomala,https://id.biodiversity.org.au/node/apni/2914483,dr656
3,4074871,Acacia aphylla,Leafless Rock Wattle,Acacia aphylla,https://id.biodiversity.org.au/node/apni/2913504,dr656
4,4074515,Acacia aprica,Blunt Wattle,Acacia aprica,https://id.biodiversity.org.au/node/apni/2903843,dr656
...,...,...,...,...,...,...
1953,4074076,Zosterops albogularis,Norfolk Island Silvereye,Zosterops albogularis,https://biodiversity.org.au/afd/taxa/a1fe2952-...,dr656
1954,4074179,Zosterops strenuus,Robust White-eye,Zosterops strenuus,https://biodiversity.org.au/afd/taxa/942564ee-...,dr656
1955,4074434,Zyzomys maini,Arnhem Rock-rat,Zyzomys maini,https://biodiversity.org.au/afd/taxa/3f638397-...,dr656
1956,4074755,Zyzomys palatalis,Carpentarian Rock-rat,Zyzomys palatalis,https://biodiversity.org.au/afd/taxa/54aa72cc-...,dr656
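
The flattening step can be tried offline on a hand-written payload shaped like the rows above. This is a sketch under assumptions: the two records are typed in for illustration (with the `lsid` field left out), `hit_row_limit` is a hypothetical helper, and it assumes the `max` query parameter caps the number of items returned, so a result as large as `max` may be incomplete.

```python
import json

import pandas as pd

# Two records shaped like the output above, written by hand for illustration
payload = json.dumps([
    {"id": 4075727, "name": "Abutilon julianae", "commonName": "Norfolk Island Abutilon",
     "scientificName": "Abutilon julianae", "dataResourceUid": "dr656"},
    {"id": 4074225, "name": "Acacia ammophila", "commonName": None,
     "scientificName": "Acacia ammophila", "dataResourceUid": "dr656"},
])


def hit_row_limit(frame: pd.DataFrame, max_rows: int) -> bool:
    """True when the frame is as large as the requested maximum, so it may be truncated."""
    return len(frame) >= max_rows


# A JSON array of objects becomes one DataFrame row per object
items = pd.json_normalize(json.loads(payload))
print(items[["id", "name", "commonName"]])
print(hit_row_limit(items, 10000))
```

Applied to the notebook, `hit_row_limit(prodList, 10000)` would flag a list that has grown to the `max=10000` used in the URL.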


In [56]:
testListUrl = "https://lists-test.ala.org.au/ws/speciesListItems/dr656?max=10000"
testList = download_ala_list(testListUrl)
testList

Unnamed: 0,id,name,commonName,scientificName,lsid,dataResourceUid
0,2751018,Abutilon julianae,Norfolk Island Abutilon,Abutilon julianae,https://id.biodiversity.org.au/node/apni/2900707,dr656
1,2752107,Acacia ammophila,,Acacia ammophila,https://id.biodiversity.org.au/node/apni/2899480,dr656
2,2750998,Acacia anomala,Grass Wattle,Acacia anomala,https://id.biodiversity.org.au/node/apni/2914483,dr656
3,2752597,Acacia aphylla,Leafless Rock Wattle,Acacia aphylla,https://id.biodiversity.org.au/node/apni/2913504,dr656
4,2751819,Acacia aprica,Blunt Wattle,Acacia aprica,https://id.biodiversity.org.au/node/apni/2903843,dr656
...,...,...,...,...,...,...
1968,2751451,Zosterops albogularis,Norfolk Island Silvereye,Zosterops albogularis,https://biodiversity.org.au/afd/taxa/a1fe2952-...,dr656
1969,2751295,Zosterops strenuus,Robust White-eye,Zosterops strenuus,https://biodiversity.org.au/afd/taxa/942564ee-...,dr656
1970,2750796,Zyzomys maini,Arnhem Rock-rat,Zyzomys maini,https://biodiversity.org.au/afd/taxa/3f638397-...,dr656
1971,2750811,Zyzomys palatalis,Carpentarian Rock-rat,Zyzomys palatalis,https://biodiversity.org.au/afd/taxa/54aa72cc-...,dr656


In [78]:
# new names
testvsprod = pd.merge(testList,prodList,how='left',on='name')
testvsprod = testvsprod[testvsprod['scientificName_y'].isna()][['name','commonName_x','scientificName_x']]
testvsprod['status'] = 'added'
# old names
prodvstest = pd.merge(prodList,testList,how='left',on='name')
prodvstest = prodvstest[prodvstest['scientificName_y'].isna()][['name','commonName_y','scientificName_y']]
prodvstest['status'] = 'removed'
# union and display in alphabetical order
changelist = pd.concat([testvsprod,prodvstest])
changelist[['status','name','scientificName_x','scientificName_y','commonName_x','commonName_y']].sort_values('name')

Unnamed: 0,status,name,scientificName_x,scientificName_y,commonName_x,commonName_y
82,added,Acanthiza pusilla archibaldi,Acanthiza pusilla archibaldi,,King Island Brown Thornbill,
82,removed,Acanthiza pusilla magnirostris,,,,
139,added,Androcalva perkinsiana,Androcalva perkinsiana,,,
142,removed,Angianthus phyllocalymmeus,,,,
149,removed,Anstisia alba,,,,
...,...,...,...,...,...,...
1895,removed,Vincetoxicum forsteri,,,,
1896,removed,Vincetoxicum rupicola,,,,
1897,removed,Vincetoxicum woollsii,,,,
1923,removed,Zanda baudinii,,,,
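
The added/removed comparison can also be done in a single pass with an outer merge and the `indicator` option of `pd.merge`. The following is a sketch of that alternative, not the notebook's method: `build_changelog` and the `_test`/`_prod` suffixes are hypothetical names, and the frames are toy data using names from the output above. One difference is that removed rows keep their production details in the `_prod` columns.

```python
import pandas as pd


def build_changelog(test_list: pd.DataFrame, prod_list: pd.DataFrame, key: str = "name") -> pd.DataFrame:
    """Rows only in test_list are 'added'; rows only in prod_list are 'removed'."""
    merged = pd.merge(test_list, prod_list, how="outer", on=key,
                      suffixes=("_test", "_prod"), indicator=True)
    # _merge is 'left_only', 'right_only' or 'both'; drop the unchanged names
    changes = merged[merged["_merge"] != "both"].copy()
    labels = {"left_only": "added", "right_only": "removed"}
    changes["status"] = changes["_merge"].astype(str).map(labels)
    return changes.drop(columns="_merge").sort_values(key).reset_index(drop=True)


# Toy frames for illustration only
test_demo = pd.DataFrame({
    "name": ["Acacia aprica", "Acanthiza pusilla archibaldi"],
    "commonName": ["Blunt Wattle", "King Island Brown Thornbill"],
})
prod_demo = pd.DataFrame({
    "name": ["Acacia aprica", "Acanthiza pusilla magnirostris"],
    "commonName": ["Blunt Wattle", None],
})

print(build_changelog(test_demo, prod_demo))
```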


In [80]:
# save the prod list to the historical lists directory
prodList.to_csv(projectDir + "historical-lists/conservation/EPBC_Act_Threatened_Species.csv",encoding="UTF-8",index=False)
# save the change log
changelist.to_csv(projectDir + "analysis/change-log/202211-EPBC-conservation.csv",encoding="UTF-8",index=False)