# QLD Conservation Status List

This notebook downloads the Qld lists from the [Qld Government Open Data Portal](https://data.qld.gov.au) and formats them in Darwin Core for ingestion into the ALA Lists tool.
It will save original lists to the `source-data/QLD` directory, process the lists and save them to `current-lists`.

## Lists in the ALA Species List tool
* Conservation list: __[dr652](https://lists.ala.org.au/speciesListItem/list/dr652)__ ([dr652 in test](https://lists-test.ala.org.au/speciesListItem/list/dr652))
* Collection: __[dr652](https://collections.ala.org.au/public/show/dr652)__

## Source Data
Queensland Nature Conservation Act 1992

* __[Conservation](https://apps.des.qld.gov.au/data-sets/wildlife/wildnet/species.csv)__
* __[Species Codes](https://apps.des.qld.gov.au/data-sets/wildlife/wildnet/species-status-codes.csv)__

**Metadata Description**

**Conservation:** The list of taxa from the Department of Environment and Science’s WildNet database with their classification codes under the Nature Conservation Act 1992: Extinct (EX), Extinct in the wild (PE), Critically Endangered (CR), Endangered (E), Vulnerable (V), Near threatened (NT), Least concern (C), Special least concern (SL) and International (I). It is published weekly as the Conservation status of Queensland wildlife dataset in the Queensland Government Data portal.

**Metadata URL**
* Qld Species (Open Data Portal) https://www.data.qld.gov.au/dataset/conservation-status-of-queensland-wildlife
* Queensland Confidential Species (Open Data Portal) https://www.data.qld.gov.au/dataset/queensland-confidential-species
* Qld Species codes https://www.data.qld.gov.au/dataset/conservation-status-of-queensland-wildlife/resource/6344ea93-cadf-4e0c-9ff4-12dfb18d5f14

In [1]:
import datetime

import pandas as pd
import requests
import io
from ftfy import fix_encoding
import urllib.request, json
import certifi
import ssl
import os
import sys

#projectDir = "/Users/oco115/PycharmProjects/authoritative-lists/"
projectDir = "/Users/new330/IdeaProjects/authoritative-lists/"
sys.path.append(os.path.abspath(projectDir + "source-code/includes"))
import list_functions as lf

sourceDataDir = projectDir + "source-data/QLD/"
statusDir = projectDir + "source-data/status-codes/"
processedDataDir = projectDir + "current-lists/"
state = 'QLD'
monthStr = datetime.datetime.now().strftime('%Y%m')
codesurl =  "https://apps.des.qld.gov.au/data-sets/wildlife/wildnet/species-status-codes.csv"
listurl = "https://apps.des.qld.gov.au/data-sets/wildlife/wildnet/species.csv"

testListDruid = "dr652"
prodListDruid = "dr652"

## Download the raw files from data.qld.gov.au
The files are saved locally to the `source-data/QLD` directory.

In [2]:
# %%script echo skipping # comment this line to download dataset from API

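# Status codes: download, repair any mis-encoded text, then save a local copy
response = requests.get(codesurl)
response.raise_for_status()  # stop here if the download failed
rtext = fix_encoding(response.text)
speciescodes = pd.read_csv(io.StringIO(rtext))
speciescodes.to_csv(sourceDataDir + "species-status-codes.csv", index=False)

# Conservation list: same steps for the species file
response = requests.get(listurl)
response.raise_for_status()  # stop here if the download failed
rtext = fix_encoding(response.text)
conservationlist = pd.read_csv(io.StringIO(rtext))
conservationlist.to_csv(sourceDataDir + "species.csv", index=False)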
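
Before relying on a fresh download, it can help to confirm that the columns used later in this notebook are still present. The following is a minimal sketch, not part of the original workflow: `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names, and the sample text stands in for the downloaded `species.csv`.

```python
import io

import pandas as pd

# Columns the later cells rely on (taken from the rename step in this notebook).
REQUIRED_COLUMNS = ["Taxon_Id", "Scientific_name", "Common_name", "Family", "NCA_status"]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Tiny stand-in for the downloaded species.csv text (one illustrative row).
sample_text = (
    "Taxon_Id,Scientific_name,Common_name,Family,NCA_status\n"
    "706,Adelotus brevis,tusked frog,Limnodynastidae,V\n"
)
sample = pd.read_csv(io.StringIO(sample_text))
print(missing_columns(sample))  # prints []
```

An empty result means the file has the expected shape; a non-empty list names the columns that would break the processing cells.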

## Standardise Status Codes
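Make minimal changes to the Qld Nature Conservation Act code descriptions so that they are consistent with other states: drop the trailing " wildlife" and capitalise "Critically Endangered" and "Near Threatened".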

In [22]:
speciescodes = pd.read_csv(sourceDataDir + "species-status-codes.csv")
ncastatuscodes = speciescodes[speciescodes['Field'] == "NCA_status"][['Code', 'Code_description']]
ncastatuscodes['Code_description'] = ncastatuscodes['Code_description'].str.replace(" wildlife", "")
ncastatuscodes.loc[
    ncastatuscodes['Code_description'] == "Critically endangered", 'Code_description'] = "Critically Endangered"
ncastatuscodes.loc[ncastatuscodes['Code_description'] == "Near threatened", 'Code_description'] = "Near Threatened"
ncastatuscodes

Unnamed: 0,Code,Code_description
20,C,Least concern
21,CR,Critically Endangered
22,E,Endangered
23,EX,Extinct
24,I,International
25,NT,Near Threatened
26,P,Prohibited
27,PE,Extinct in the wild
28,SL,Special least concern
29,V,Vulnerable
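
The same standardisation can be expressed as a single lookup, which may be easier to extend if more descriptions need renaming. This is a sketch under assumptions: `RENAMES` and `standardise` are hypothetical names, and the raw descriptions below are made up in the style the cell above implies (a trailing " wildlife"), not copied from the source file.

```python
import pandas as pd

# Exact-match renames applied after the " wildlife" suffix is removed.
RENAMES = {
    "Critically endangered": "Critically Endangered",
    "Near threatened": "Near Threatened",
}


def standardise(descriptions):
    """Strip the ' wildlife' suffix, then apply the exact-match renames."""
    stripped = descriptions.str.replace(" wildlife", "", regex=False)
    return stripped.replace(RENAMES)


# Illustrative raw descriptions (assumed wording, for demonstration only).
raw = pd.Series([
    "Least concern wildlife",
    "Critically endangered wildlife",
    "Near threatened wildlife",
    "Vulnerable wildlife",
])
print(standardise(raw).tolist())
```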


## Conservation List
* Join to the codes to expand the code descriptions.
* Change the field names to `sourceStatus` and `status` as required by the ALA's conservation list processing.
* Remove taxa whose status is **Least concern** or **Special least concern**, and taxa with **no status**
* Expand the endemicity and EPBC status codes
* Map sourceStatus to status

In [32]:
conservationlist = pd.read_csv(sourceDataDir + "species.csv")
conservationlist = pd.merge(conservationlist,ncastatuscodes,left_on=['NCA_status'],right_on=['Code'],how="left")

conservationlist = conservationlist.rename(columns=
{   'Scientific_name':'scientificName',
    'Common_name': 'vernacularName',
    'Taxon_author':'scientificNameAuthorship',
    'Family': 'family',
    'NCA_status':'sourceStatus',
    'Code_description':'status',
    'Taxon_Id':'WildNetTaxonID'
})

conservationlist['taxonID'] = 'https://apps.des.qld.gov.au/species-search/details/?id=' + conservationlist['WildNetTaxonID'].astype(str)

# remove unwanted rows
conservationlist = conservationlist[(conservationlist['status'].notna())]
conservationlist = conservationlist[~conservationlist['status'].str.contains('Special least concern', case=False)]
conservationlist = conservationlist[~conservationlist['status'].str.contains('Least concern', case=False)]
# remove unwanted cols
conservationlist = conservationlist.loc[:, ['scientificName', 'vernacularName', 'family', 'WildNetTaxonID','taxonID','status', 'sourceStatus']]
# write to file
conservationlist.to_csv(processedDataDir + 'conservation-lists/QLD-' + prodListDruid + '-conservation.csv',encoding="UTF-8",index=False)
conservationlist

Unnamed: 0,scientificName,vernacularName,family,WildNetTaxonID,taxonID,status,sourceStatus
0,Adelotus brevis,tusked frog,Limnodynastidae,706,https://apps.des.qld.gov.au/species-search/det...,Vulnerable,V
15,Philoria kundagungan,red-and-yellow mountainfrog,Limnodynastidae,687,https://apps.des.qld.gov.au/species-search/det...,Endangered,E
19,Assa darlingtoni,pouched frog,Myobatrachidae,693,https://apps.des.qld.gov.au/species-search/det...,Vulnerable,V
25,Crinia tinnula,wallum froglet,Myobatrachidae,686,https://apps.des.qld.gov.au/species-search/det...,Vulnerable,V
29,Mixophyes fleayi,Fleay's barred frog,Myobatrachidae,675,https://apps.des.qld.gov.au/species-search/det...,Endangered,E
...,...,...,...,...,...,...,...
21824,Macrozamia serpentina,,Zamiaceae,27445,https://apps.des.qld.gov.au/species-search/det...,Endangered,E
21825,Macrozamia viridis,,Zamiaceae,6482,https://apps.des.qld.gov.au/species-search/det...,Endangered,E
21830,Alpinia hylandii,,Zingiberaceae,8948,https://apps.des.qld.gov.au/species-search/det...,Near Threatened,NT
21833,Amomum queenslandicum,,Zingiberaceae,8949,https://apps.des.qld.gov.au/species-search/det...,Vulnerable,V
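
The join-and-filter logic can be checked on a few rows before running it on the full list. The sketch below uses toy data and exact matching with `isin` rather than the substring matching used in the cell above; that swap is a design choice of this example, not the notebook's method, and `EXCLUDED` is a hypothetical name.

```python
import pandas as pd

# Toy species rows: one listed taxon, two least-concern taxa, one with no status.
species = pd.DataFrame({
    "Scientific_name": ["Adelotus brevis", "Taxon b", "Taxon c", "Taxon d"],
    "NCA_status": ["V", "C", "SL", None],
})
codes = pd.DataFrame({
    "Code": ["V", "C", "SL"],
    "Code_description": ["Vulnerable", "Least concern", "Special least concern"],
})

# Left join keeps every species row; unmatched codes get a missing description.
merged = species.merge(codes, left_on="NCA_status", right_on="Code", how="left")
merged = merged.rename(columns={"NCA_status": "sourceStatus", "Code_description": "status"})

# Drop rows with no status and rows whose status is in the excluded set.
EXCLUDED = {"Least concern", "Special least concern"}
kept = merged[merged["status"].notna() & ~merged["status"].isin(EXCLUDED)]
print(kept[["Scientific_name", "sourceStatus", "status"]])
```

Exact matching avoids accidentally dropping a future status whose description merely contains the words "least concern".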


In [27]:
conservationlist['status'].unique()

array(['Vulnerable', 'Endangered', 'Extinct in the wild',
       'Near Threatened', 'Critically Endangered', 'Extinct'],
      dtype=object)
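
The manual inspection above could also be turned into an automatic check. A minimal sketch, assuming the six statuses shown in the output are the only ones expected in the processed list; `EXPECTED_STATUSES` and `unexpected_statuses` are hypothetical names.

```python
import pandas as pd

# Statuses expected after processing (taken from the output above).
EXPECTED_STATUSES = {
    "Extinct",
    "Extinct in the wild",
    "Critically Endangered",
    "Endangered",
    "Vulnerable",
    "Near Threatened",
}


def unexpected_statuses(statuses):
    """Return, sorted, any non-missing status values outside the expected set."""
    return sorted(set(statuses.dropna()) - EXPECTED_STATUSES)


# A lower-case "threatened" slips past the standardisation and is flagged.
sample = pd.Series(["Vulnerable", "Endangered", "Near threatened"])
print(unexpected_statuses(sample))  # prints ['Near threatened']
```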

### Change Logs
Upload the file to the test environment before running the cell below, which compares it to the list in production.
- check record counts old vs new and verify count in change log
- send to domain experts for verification

In [33]:
ltype = "C"
changeDir = "Monitoring/Change-logs/"
filename = "QLD-conservation.csv"
changelist = lf.get_changelist(testListDruid, prodListDruid, ltype)
changelist.to_csv(projectDir + changeDir + monthStr + "-" + filename, encoding="UTF-8", index=False)
changelist

get_changelist: Test -  dr652 Prod -  dr652
download_ala_list:  https://lists.ala.org.au/ws/speciesListItems/dr652?max=10000&includeKVP=true
Index(['id', 'name', 'commonName', 'scientificName', 'lsid', 'dataResourceUid',
       'kvpValues'],
      dtype='object')
download_ala_list:  https://lists-test.ala.org.au/ws/speciesListItems/dr652?max=10000&includeKVP=true
Index(['id', 'name', 'commonName', 'scientificName', 'lsid', 'dataResourceUid',
       'kvpValues'],
      dtype='object')


Unnamed: 0,name,scientificName_old,scientificName_new,commonName_old,commonName_new,status_old,status_new,listUpdate
773,Aggreflorum luehmannii,,,,,,Vulnerable,added
774,Aggreflorum pallidum,,,,,,Near Threatened,added
810,Gaudium venustum,,,,,,Vulnerable,added
896,Anacolosa australis,,Anacolosa,,,,Near Threatened,added
1170,Bergera crenulata,,Murraya,,,,Endangered,added
...,...,...,...,...,...,...,...,...
1315,Pimelea leptospermoides,,Pimelea leptospermoides,,,Near threatened,Near Threatened,status change
1318,Pimelea umbratica,,Pimelea umbratica,,,Near threatened,Near Threatened,status change
1321,Bubbia queenslandiana subsp. queenslandiana,,Bubbia queenslandiana subsp. queenslandiana,,Winterwood,Near threatened,Near Threatened,status change
1328,Macrozamia longispina,,Macrozamia longispina,,,Near threatened,Near Threatened,status change
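
To make the change log easier to verify by hand, the comparison can be reproduced on small inputs. The sketch below is not the implementation of `lf.get_changelist`, which is not shown here; it is one possible approach using an outer merge with `indicator=True`. The function name `sketch_changelist` and the "removed" label are assumptions of this example.

```python
import pandas as pd


def sketch_changelist(old, new):
    """Label taxa as added, removed or status change between two lists."""
    merged = old.merge(new, on="name", how="outer",
                       suffixes=("_old", "_new"), indicator=True)
    merged["listUpdate"] = "unchanged"
    merged.loc[merged["_merge"] == "right_only", "listUpdate"] = "added"
    merged.loc[merged["_merge"] == "left_only", "listUpdate"] = "removed"
    # A status change only applies to taxa present in both lists.
    in_both = merged["_merge"] == "both"
    changed = in_both & (merged["status_old"] != merged["status_new"])
    merged.loc[changed, "listUpdate"] = "status change"
    return merged[merged["listUpdate"] != "unchanged"].drop(columns="_merge")


# Toy production (old) and test (new) lists.
old = pd.DataFrame({
    "name": ["Pimelea umbratica", "Adelotus brevis", "Hypothetical taxon"],
    "status": ["Near threatened", "Vulnerable", "Endangered"],
})
new = pd.DataFrame({
    "name": ["Pimelea umbratica", "Adelotus brevis", "Aggreflorum luehmannii"],
    "status": ["Near Threatened", "Vulnerable", "Vulnerable"],
})
result = sketch_changelist(old, new)
print(result[["name", "status_old", "status_new", "listUpdate"]])
```

Note that a capitalisation-only difference such as "Near threatened" versus "Near Threatened" is reported as a status change, which matches the rows in the output above.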


In [12]:
# Download historical lists from Production
prodListUrl = "https://lists.ala.org.au/ws/speciesListItems/" + prodListDruid + "?max=10000&includeKVP=true"
prodList = lf.download_ala_specieslist(prodListUrl)  # save the prod list to the historical lists directory
prodList = lf.kvp_to_columns(prodList)
prodList.to_csv(projectDir + "historical-lists/conservation/" + filename, encoding="UTF-8", index=False)
print('Finished downloading conservation historical list')

download_ala_list:  https://lists.ala.org.au/ws/speciesListItems/dr652?max=10000&includeKVP=true
Index(['id', 'name', 'commonName', 'scientificName', 'lsid', 'dataResourceUid',
       'kvpValues'],
      dtype='object')
Finished downloading conservation historical list
