# EPBC Conservation List

### List Identifiers
* EPBC Conservation list: __[dr656](https://lists.ala.org.au/speciesListItem/list/dr656)__ (and [dr656 in test](https://lists-test.ala.org.au/speciesListItem/list/dr656))
* BONN Conservation list: __[dr18987](https://lists.ala.org.au/speciesListItem/list/dr18987)__
* CAMBA Conservation list: __[dr18989](https://lists.ala.org.au/speciesListItem/list/dr18989)__
* JAMBA Conservation list: __[dr18988](https://lists.ala.org.au/speciesListItem/list/dr18988)__
* ROKAMBA Conservation list: __[dr18990](https://lists.ala.org.au/speciesListItem/list/dr18990)__

### Collections
* EPBC:  __[dr656](https://collections.ala.org.au/public/show/dr656)__
* BONN:  __[dr18987](https://collections.ala.org.au/public/show/dr18987)__
* CAMBA:  __[dr18989](https://collections.ala.org.au/public/show/dr18989)__
* JAMBA:  __[dr18988](https://collections.ala.org.au/public/show/dr18988)__
* ROKAMBA:  __[dr18990](https://collections.ala.org.au/public/show/dr18990)__

## Source Data
* __[Threatened Species and Ecological Communities of National Environmental Significance (on data.gov.au)](https://data.gov.au/data/dataset/threatened-species-state-lists)__
* __[Dataset landing page](https://data.gov.au/data/dataset/threatened-species-state-lists/resource/78401dce-1f40-49d3-92c4-3713d6e34974)__
* __[Direct data download](https://data.gov.au/data/dataset/ae652011-f39e-4c6c-91b8-1dc2d2dfee8f/resource/78401dce-1f40-49d3-92c4-3713d6e34974/download/20221005spcs.csv)__
* __[API](https://data.gov.au/data/api/1/util/snippet/api_info.html?resource_id=78401dce-1f40-49d3-92c4-3713d6e34974)__


**Metadata URL**
https://data.gov.au/data/dataset/threatened-species-state-lists

**Metadata Description**

**EPBC**

This dataset contains information about species of national environmental significance as listed in the Environment Protection and Biodiversity Conservation Act 1999 (EPBC Act). Data provided includes:
- species names and threatened status
- indicative occurrence within each state, territory and marine area
- links to further information in the Species Profile and Threats Database (SPRAT).

Threatened species currently listed under the Environment Protection and Biodiversity Conservation Act 1999 (EPBC Act), sourced from the Department of Agriculture, Water and the Environment's Species Profiles and Threats database (SPRAT). Data includes:
- Status of EPBC Act threatened, migratory, marine and cetacean species.
- Available species recovery documents for EPBC Act species (recovery plan and conservation advice).
- Species' taxonomy.
- Status of State and Territory Government threatened species (indicative only). Please verify with relevant State and Territory Government agencies.
- Species presence within States, mainland Territories, external Territories and the Commonwealth marine area (indicative only).

The report presents the current taxonomic understanding of a species. Every effort is made to maintain the accuracy of data that is used in this list. If a species is listed more than once on a statutory list (as a result of taxonomic change), it will be given more than one status in the status column (i.e. "status 1, status 2"). The status and presence of State and Territory Government threatened species should be verified with relevant agencies. Further details available at: https://www.environment.gov.au/cgi-bin/sprat/public/sprat.pl
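
The metadata above notes that a species listed more than once can carry several statuses in a single field (e.g. "status 1, status 2"). The following is a minimal sketch of how such a value could be split before counting or mapping statuses. The helper name `split_statuses` and the sample values are illustrative assumptions and are not part of this notebook or of `list_functions`.

```python
def split_statuses(raw):
    """Split a comma-separated status field into a list of trimmed statuses."""
    if raw is None:
        return []
    return [part.strip() for part in str(raw).split(",") if part.strip()]


# Hypothetical sample values, not taken from the dataset
single = split_statuses("Vulnerable")
double = split_statuses("Endangered, Vulnerable")
print(single)
print(double)
```

Whether multi-status rows should be split, or kept as one combined value, depends on how the list tool is expected to display them.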

**BONN**

The Bonn Convention appendices identify migratory taxa at and below the species level, as well as some whole families in Appendix II. The convention definition of migratory species is ‘the entire population or any geographically separate part of the population of any species or lower taxon of wild animals, a significant proportion of whose members cyclically and predictably cross one or more national jurisdictional boundaries’. This definition has been adopted in the EPBC Act. The family listings in Appendix II of the Bonn Convention do not explicitly include all members of each family and, for the purposes of the EPBC Act, the family listings include only those species which are native to Australia and are known to be cyclical and predictable migrants into and out of Australia. The taxonomy of families used in the Bonn Convention appendices is not regularly updated.

The following codes are used in the Bonn Convention appendices:
- A1: species listed explicitly in Appendix 1
- A2S: species listed explicitly in Appendix 2
- A2H: species is member of a family listed in Appendix 2
- A2S*: species listed as a result of a recent taxonomic revision of a species listed explicitly in Appendix 2

More information is available at https://www.environment.gov.au/cgi-bin/sprat/public/publicshowmigratory.pl
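
The appendix codes described above lend themselves to a small lookup table. This sketch only restates the four codes from the description. The names `BONN_CODES` and `describe_bonn_code` are hypothetical, and the BONN list itself is not processed in this notebook.

```python
# Code meanings copied from the Bonn Convention description above
BONN_CODES = {
    "A1": "species listed explicitly in Appendix 1",
    "A2S": "species listed explicitly in Appendix 2",
    "A2H": "species is member of a family listed in Appendix 2",
    "A2S*": "species listed as a result of a recent taxonomic revision "
            "of a species listed explicitly in Appendix 2",
}


def describe_bonn_code(code):
    """Return the meaning of a Bonn appendix code, or 'unknown code'."""
    return BONN_CODES.get(code.strip().upper(), "unknown code")


print(describe_bonn_code("A2H"))
```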

**CAMBA**

This list of migratory species established under section 209 of the EPBC Act includes migratory species included in annexes established under the China-Australia Migratory Bird Agreement (CAMBA).

**JAMBA**

This list of migratory species established under section 209 of the EPBC Act includes migratory species included in annexes established under the Japan-Australia Migratory Bird Agreement (JAMBA).

**ROKAMBA**

This list of migratory species established under section 209 of the EPBC Act includes native, migratory species identified in a list established under, or an instrument made under, an international agreement approved by the Minister, such as the Republic of Korea-Australia Migratory Bird Agreement (ROKAMBA). See http://www.austlii.edu.au/au/other/dfat/treaties/2007/24.html



## Setup

In [6]:
import pandas as pd
import os
import sys
projectDir = "/Users/oco115/PycharmProjects/authoritative-lists/"
# projectDir = "/Users/new330/IdeaProjects/authoritative-lists/"
sys.path.append(os.path.abspath(projectDir + "source-code/includes"))
import list_functions as lf

sourceDataDir = projectDir + "source-data/EPBCA/"
processedDataDir = projectDir + "current-lists/conservation-lists/"

listURL = "https://data.gov.au/data/dataset/ae652011-f39e-4c6c-91b8-1dc2d2dfee8f/resource/78401dce-1f40-49d3-92c4-3713d6e34974/download/20221005spcs.csv"
filename = "20221005spcs.csv"
listToolURL = "https://lists.ala.org.au/speciesListItem/list/dr656"

## Download file
... or read from cached version

In [37]:
%%script echo skipping # comment this line to download dataset from API

import urllib.request, json
import certifi
import ssl

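# open the remote CSV with a certifi-backed SSL context and read it into a DataFrame
with urllib.request.urlopen(listURL,context=ssl.create_default_context(cafile=certifi.where())) as url:
    data = pd.read_csv(url)
# cache a local copy so later runs can skip the download
data.to_csv(sourceDataDir + filename,encoding="UTF-8",index=False)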

skipping # comment this line to download dataset from API
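
The cell above is toggled by hand between downloading and reading the cached copy. As an alternative, the choice could be made automatically from whether the cached file exists. The sketch below shows that pattern with a stand-in fetch function instead of a real network call. `load_cached_or_fetch` and `fake_fetch` are hypothetical names, and the sample CSV text is made up for the demonstration.

```python
import os
import tempfile


def load_cached_or_fetch(path, fetch):
    """Return the text stored at `path`, calling `fetch()` only if the file is missing."""
    if not os.path.exists(path):
        text = fetch()
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Stand-in for the real download; records how often it is called
calls = []


def fake_fetch():
    calls.append(1)
    return "Scientific Name,Threatened status\nAcacia aprica,Endangered\n"


cache_path = os.path.join(tempfile.mkdtemp(), "cached.csv")
first = load_cached_or_fetch(cache_path, fake_fetch)   # file missing: fetches and caches
second = load_cached_or_fetch(cache_path, fake_fetch)  # file present: reads the cache
print(len(calls))
```

A manual toggle has the advantage that a stale cache is never refreshed by accident, so this is a design choice rather than a correction.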


In [91]:
epbcreport = pd.read_csv(sourceDataDir + filename,skiprows=0)
epbcreport.head(5)

Unnamed: 0,Scientific Name,Common Name,Current Scientific Name,Threatened status,ACT,NSW,NT,QLD,SA,TAS,...,Profile,Date extracted,NSL Name,Family,Genus,Species,Infraspecific Rank,Infraspecies,Species Author,Infraspecies Author
0,Abutilon julianae,Norfolk Island Abutilon,-,Critically Endangered,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/56543,Malvaceae,Abutilon,julianae,-,-,Endl.,-
1,Acacia ammophila,-,-,Vulnerable,-,-,-,Yes,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58340,Fabaceae,Acacia,ammophila,-,-,Pedley,-
2,Acacia anomala,"Grass Wattle, Chittering Grass Wattle",-,Vulnerable,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58556,Fabaceae,Acacia,anomala,-,-,C.A.Gardner ex Court,-
3,Acacia aphylla,Leafless Rock Wattle,-,Vulnerable,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58565,Fabaceae,Acacia,aphylla,-,-,Maslin,-
4,Acacia aprica,Blunt Wattle,-,Endangered,-,-,-,-,-,-,...,http://www.environment.gov.au/cgi-bin/sprat/pu...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/149911,Fabaceae,Acacia,aprica,-,-,Maslin A.R.Chapman,-


## Check values

In [92]:
epbcreport.groupby("Kingdom").size().sort_values(ascending=False)

Kingdom
Plantae     1411
Animalia     562
dtype: int64

In [93]:
epbcreport.groupby("Threatened status").size().sort_values(ascending=False)

Threatened status
Vulnerable                793
Endangered                748
Critically Endangered     319
Extinct                   104
Conservation Dependent      8
Extinct in the wild         1
dtype: int64

## Clean up column headers
* Darwin Core terms column mappings
* Remove multiple spaces from column names
* Remove () characters from column names

In [94]:
epbcreport = epbcreport.rename(columns=
                               {'Scientific Name': 'scientificName',
                                'Common Name': 'vernacularName',
                                'Threatened status': 'status',
                                'Kingdom': 'kingdom',
                                'Phylum': 'phylum',
                                'Class': 'class',
                                'Order': 'order',
                                'Family': 'family',
                                'Genus': 'genus',
                                'Species': 'species',
                                'ACT':'Australian Capital Territory',
                                'NSW': 'New South Wales',
                                'NT': 'Northern Territory',
                                'QLD': 'Queensland',
                                'SA': 'South Australia',
                                'TAS': 'Tasmania',
                                'VIC': 'Victoria',
                                'WA': 'Western Australia',
                                'ACI': 'Ashmore and Cartier Islands',
                                'CKI': 'Cocos (Keeling) Islands',
                                'CI': 'Christmas Island',
                                'CSI': 'Coral Sea Islands',
                                'JBT': 'Jervis Bay Territory',
                                'NFI': 'Norfolk Island',
                                'HMI': 'Heard and McDonald Islands',
                                'AAT': 'Australian Antarctic Territory',
                                'CMA': 'Commonwealth Marine Area',
                                'Listed SPRAT TaxonID' : 'Listed Sprat TaxonID',
                                'Current SPRAT TaxonID': 'Current Sprat TaxonID',
                                'NSL Name': 'Nsl Name'
                                })

epbcreport['sourceStatus'] = epbcreport['status']

epbcreport.columns = epbcreport.columns.str.replace("  ", "", regex=True) # remove double spaces from column names
epbcreport.columns = epbcreport.columns.str.replace(r"[().: ]", " ", regex=True) # replace ( ) . : characters in column names with spaces
epbcreport.columns = epbcreport.columns.str.strip()
epbcreport.head(5)

Unnamed: 0,scientificName,vernacularName,Current Scientific Name,status,Australian Capital Territory,New South Wales,Northern Territory,Queensland,South Australia,Tasmania,...,Date extracted,Nsl Name,family,genus,species,Infraspecific Rank,Infraspecies,Species Author,Infraspecies Author,sourceStatus
0,Abutilon julianae,Norfolk Island Abutilon,-,Critically Endangered,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/56543,Malvaceae,Abutilon,julianae,-,-,Endl.,-,Critically Endangered
1,Acacia ammophila,-,-,Vulnerable,-,-,-,Yes,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58340,Fabaceae,Acacia,ammophila,-,-,Pedley,-,Vulnerable
2,Acacia anomala,"Grass Wattle, Chittering Grass Wattle",-,Vulnerable,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58556,Fabaceae,Acacia,anomala,-,-,C.A.Gardner ex Court,-,Vulnerable
3,Acacia aphylla,Leafless Rock Wattle,-,Vulnerable,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/58565,Fabaceae,Acacia,aphylla,-,-,Maslin,-,Vulnerable
4,Acacia aprica,Blunt Wattle,-,Endangered,-,-,-,-,-,-,...,2022-Oct-05,https://id.biodiversity.org.au/name/apni/149911,Fabaceae,Acacia,aprica,-,-,Maslin A.R.Chapman,-,Endangered
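
For comparison, here is a small standalone sketch of one way to normalise header text: replace the punctuation first, then collapse any runs of whitespace into a single space. This ordering differs from the cell above, which removes double spaces before the punctuation is replaced, so the two can give different results for names such as `Cocos (Keeling) Islands`. The function name `clean_column_name` and the sample headers are assumptions for illustration.

```python
import re


def clean_column_name(name):
    """Replace ( ) . : with spaces, collapse whitespace runs, and trim the ends."""
    name = re.sub(r"[().:]", " ", name)  # punctuation becomes a space
    name = re.sub(r"\s+", " ", name)     # runs of whitespace become one space
    return name.strip()


# Hypothetical sample headers
headers = ["Cocos (Keeling) Islands", "Date  extracted", " Nsl Name: "]
cleaned_headers = [clean_column_name(h) for h in headers]
print(cleaned_headers)
```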


## Data cleaning
* status is not empty
* correct date formats
* remove "-" from empty fields

In [95]:
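# keep only rows with a status; copy so the assignments below do not act on a view
epbc = epbcreport[epbcreport["status"].notna()].copy()
# convert dates such as 2022-Oct-05 to ISO format (YYYY-MM-DD)
epbc['Date extracted'] = pd.to_datetime(epbc['Date extracted'])
epbc['Date extracted'] = epbc['Date extracted'].dt.strftime('%Y-%m-%d')
# blank out cells whose entire value is the "-" placeholder
epbc = epbc.replace('-','')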
epbc

Unnamed: 0,scientificName,vernacularName,Current Scientific Name,status,Australian Capital Territory,New South Wales,Northern Territory,Queensland,South Australia,Tasmania,...,Date extracted,Nsl Name,family,genus,species,Infraspecific Rank,Infraspecies,Species Author,Infraspecies Author,sourceStatus
0,Abutilon julianae,Norfolk Island Abutilon,,Critically Endangered,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/56543,Malvaceae,Abutilon,julianae,,,Endl.,,Critically Endangered
1,Acacia ammophila,,,Vulnerable,,,,Yes,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/58340,Fabaceae,Acacia,ammophila,,,Pedley,,Vulnerable
2,Acacia anomala,"Grass Wattle, Chittering Grass Wattle",,Vulnerable,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/58556,Fabaceae,Acacia,anomala,,,C.A.Gardner ex Court,,Vulnerable
3,Acacia aphylla,Leafless Rock Wattle,,Vulnerable,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/58565,Fabaceae,Acacia,aphylla,,,Maslin,,Vulnerable
4,Acacia aprica,Blunt Wattle,,Endangered,,,,,,,...,2022-10-05,https://id.biodiversity.org.au/name/apni/149911,Fabaceae,Acacia,aprica,,,Maslin A.R.Chapman,,Endangered
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1968,Zosterops albogularis,"White-chested White-eye, Norfolk Island Silvereye",,Extinct,,,,,,,...,2022-10-05,,Zosteropidae,Zosterops,albogularis,,,"Gould, 1837",,Extinct
1969,Zosterops strenuus,Robust White-eye,,Extinct,,Yes,,,,,...,2022-10-05,,Zosteropidae,Zosterops,strenuus,,,"Gould, 1855",,Extinct
1970,Zyzomys maini,"Arnhem Rock-rat, Arnhem Land Rock-rat, Kodjperr",,Vulnerable,,,Yes,,,,...,2022-10-05,,Muridae,Zyzomys,maini,,,"Kitchener,1989",,Vulnerable
1971,Zyzomys palatalis,"Carpentarian Rock-rat, Aywalirroomoo",,Endangered,,,Yes,,,,...,2022-10-05,,Muridae,Zyzomys,palatalis,,,"Kitchener,1989",,Endangered
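
The three cleaning steps can be checked on a tiny hand-made frame before running them on the full report. This sketch assumes the source date format is `%Y-%b-%d` (as in `2022-Oct-05`) and passes it explicitly. The sample rows are invented for the demonstration, and the variable names `sample` and `cleaned` are not used elsewhere in the notebook.

```python
import pandas as pd

# Invented rows: one with a "-" placeholder, one without a status
sample = pd.DataFrame({
    "scientificName": ["Acacia aprica", "Acacia ammophila", "Zyzomys maini"],
    "vernacularName": ["Blunt Wattle", "-", "Arnhem Rock-rat"],
    "status": ["Endangered", "Vulnerable", None],
    "Date extracted": ["2022-Oct-05", "2022-Oct-05", "2022-Oct-05"],
})

# 1. keep rows with a status (copy so later assignments are safe)
cleaned = sample[sample["status"].notna()].copy()
# 2. parse the dates with an explicit format and write them back as YYYY-MM-DD
cleaned["Date extracted"] = pd.to_datetime(
    cleaned["Date extracted"], format="%Y-%b-%d"
).dt.strftime("%Y-%m-%d")
# 3. blank out cells that consist only of "-"; hyphens inside other values are kept
cleaned = cleaned.replace("-", "")
print(cleaned)
```

Because `DataFrame.replace` matches whole cell values by default, the hyphens inside the reformatted dates are left untouched.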


## Check values

In [96]:
epbc.groupby("kingdom").size().sort_values(ascending=False)

kingdom
Plantae     1411
Animalia     562
dtype: int64

In [97]:
epbc.groupby("status").size().sort_values(ascending=False)

status
Vulnerable                793
Endangered                748
Critically Endangered     319
Extinct                   104
Conservation Dependent      8
Extinct in the wild         1
dtype: int64
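
The counts above can also be checked programmatically rather than by eye. The sketch below uses the figures from the outputs in this notebook to confirm that the kingdom and status totals agree, and shows a helper that flags any status outside the expected vocabulary. `EXPECTED_STATUSES` and `unexpected_statuses` are hypothetical names, and the expected set is simply the six values seen in this extract.

```python
# Counts copied from the notebook outputs above
kingdom_counts = {"Plantae": 1411, "Animalia": 562}
status_counts = {
    "Vulnerable": 793,
    "Endangered": 748,
    "Critically Endangered": 319,
    "Extinct": 104,
    "Conservation Dependent": 8,
    "Extinct in the wild": 1,
}

# Assumption: these six values are the full vocabulary for this extract
EXPECTED_STATUSES = set(status_counts)


def unexpected_statuses(statuses):
    """Return the sorted statuses that are not in the expected vocabulary."""
    return sorted(set(statuses) - EXPECTED_STATUSES)


totals_match = sum(kingdom_counts.values()) == sum(status_counts.values())
print(totals_match)
print(unexpected_statuses(["Vulnerable", "vulnerable"]))
```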

## Write to CSV

In [98]:
epbc.to_csv(processedDataDir + "EPBC-conservation.csv",encoding="UTF-8",index=False)

## Manual list check

**Instructions**
1. Load the list above into the lists-test tool for this data resource.
2. Unskip the code below and run the reports to compare the test list against production. Send the change log CSV (written to `analysis/change-log/`) for checking and correct any issues.
3. Save the production list into the `historical-lists` directory by running the download code section below.
4. Load the list from `current-lists` into production.

### Conservation List - Download old and new and compare

In [None]:
# %%script echo skipping # comment this line to run this code

import datetime
monthStr = datetime.datetime.now().strftime('%Y%m')
ltype = "C"
# conservation
filename = "EPBC-conservation.csv"
prodListUrl = "https://lists.ala.org.au/ws/speciesListItems/" + "dr656" + "?max=10000&includeKVP=true"
testListUrl = "https://lists-test.ala.org.au/ws/speciesListItems/" + "dr656" + "?max=10000&includeKVP=true"
changelist = lf.get_changelist(testListUrl, prodListUrl, ltype)
# save the lists locally
changelist.to_csv(projectDir + "analysis/change-log/" + monthStr + "-" + filename, encoding="UTF-8", index=False)

changelist
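
`lf.get_changelist` lives in the project's `list_functions` module and its implementation is not shown in this notebook. As a rough illustration of the kind of comparison involved, the sketch below labels names that appear only in the test list as `added` and names that appear only in production as `removed`, using an outer merge with `indicator=True`. The function `simple_changelist`, the `name` key and the sample frames are assumptions for the demonstration and may not match what `get_changelist` actually returns.

```python
import pandas as pd


def simple_changelist(test_list, prod_list, key="name"):
    """Label keys found only in test as 'added' and only in prod as 'removed'."""
    merged = pd.merge(
        test_list[[key]].drop_duplicates(),
        prod_list[[key]].drop_duplicates(),
        how="outer",
        on=key,
        indicator=True,  # adds a _merge column: left_only, right_only or both
    )
    changes = merged[merged["_merge"] != "both"].copy()
    changes["status"] = changes["_merge"].astype(str).map(
        {"left_only": "added", "right_only": "removed"}
    )
    return changes[[key, "status"]].sort_values(key).reset_index(drop=True)


# Invented sample lists
test_df = pd.DataFrame({"name": ["Acacia aprica", "Zyzomys maini", "Abutilon julianae"]})
prod_df = pd.DataFrame({"name": ["Acacia aprica", "Zosterops strenuus"]})
demo_changes = simple_changelist(test_df, prod_df)
print(demo_changes)
```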

### Old code in next cell requires check against new code above

In [None]:
#### Old code requires check against new

# def download_ala_list(url:str):
#  with urllib.request.urlopen(url,context=ssl.create_default_context(cafile=certifi.where())) as url:
#   data = json.loads(url.read().decode())
#   data = pd.json_normalize(data)
#   return data
#
# # download the prod list
# prodListUrl = "https://lists.ala.org.au/ws/speciesListItems/dr656?max=10000"
# prodList = download_ala_list(prodListUrl)
# testListUrl = "https://lists-test.ala.org.au/ws/speciesListItems/dr656?max=10000"
# testList = download_ala_list(testListUrl)
#
# # new names
# testvsprod = pd.merge(testList,prodList,how='left',on='name')
# testvsprod = testvsprod[testvsprod['scientificName_y'].isna()][['name','commonName_x','scientificName_x']]
# testvsprod['status'] = 'added'
# # old names
# prodvstest = pd.merge(prodList,testList,how='left',on='name')
# prodvstest = prodvstest[prodvstest['scientificName_y'].isna()][['name','commonName_y','scientificName_y']]
# prodvstest['status'] = 'removed'
# # union and display in alphabetical order
# changelist = pd.concat([testvsprod,prodvstest])
#
# # save the prod list to the historical lists directory
# prodList.to_csv(projectDir + "historical-lists/conservation/EPBC_Act_Threatened_Species.csv",encoding="UTF-8",index=False)
# # save the change log
# changelist.to_csv(projectDir + "analysis/change-log/202211-EPBC-conservation.csv",encoding="UTF-8",index=False)


In [None]:
# changelist[['status','name','scientificName_x','commonName_x']].sort_values('name')

### Download Production lists to Historical Lists directory

In [None]:
prodList = lf.download_ala_list(prodListUrl)  # save the prod list to the historical lists directory
prodList = lf.kvp_to_columns(prodList)
prodList.to_csv(projectDir + "historical-lists/conservation/" + filename, encoding="UTF-8", index=False)
print('Finished downloading historical list')