# Summary report exported as a workbook

Author: [José R. Ferrer-Paris](https://github.com/jrfep)

This notebook:
- Reads information from the database,
- Exports the list of records as a flat CSV file, and
- Creates a workbook with:
    - Authoring information (`About`)
    - List of sites and visits (`Sites`)
    - Records for all samples (`Samples`)
    - Species list matched against BioNET (`Species`)
    - Vocabularies

The outputs of this notebook are available as a dataset record at:

> Ferrer-Paris, José R.; Keith, D A (2024). Fire Ecology Traits for Plants: Database exports. figshare. Dataset. https://doi.org/10.6084/m9.figshare.24125088.v1

## Setup

This section covers the basic setup for the project.

### Import modules

In [1]:
# work with paths in operating system
from pathlib import Path
import os
import sys
# datetime support
import datetime

# work with xlsx workbooks
import openpyxl
from openpyxl import Workbook
from openpyxl.worksheet.table import Table, TableStyleInfo
from openpyxl.styles import Alignment, PatternFill, Border, Font # Side, Alignment, Protection,
from openpyxl.formatting import Rule
from openpyxl.styles.differential import DifferentialStyle
from openpyxl.worksheet.datavalidation import DataValidation
from openpyxl.comments import Comment
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.utils import get_column_letter

# For database connection
from configparser import ConfigParser
import psycopg2
from psycopg2.extras import DictCursor

# Pandas for calculations
import pandas as pd
# Pyprojroot for easier handling of working directory
import pyprojroot


### Define paths for input and output

In [2]:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))
sys.path.append(str(repodir))
inputdir = repodir / "data" / "output-report"
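
For readers who do not have `pyprojroot` installed, the root lookup can be approximated with the standard library. This is only a sketch of the idea (walk up the directory tree until a `.git` folder is found). The function name `find_repo_root` is hypothetical and not part of this project or of `pyprojroot`.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, marker: str = ".git") -> Optional[Path]:
    """Walk upwards from `start` until a directory containing `marker` is found.

    Hypothetical helper for illustration; returns None if no parent matches.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        # only directories count, mirroring the has_dir(".git") criterion
        if (candidate / marker).is_dir():
            return candidate
    return None
```

Calling `find_repo_root(Path.cwd())` from any subfolder of the repository should then give a path that can be used in place of `repodir`.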

### Load own functions
Load functions from the `lib` folder: one to read the database credentials, one to execute database queries, and three to extract data from the reference description string.

In [3]:
from lib.parseparams import read_dbparams
from lib.firevegdb import dbquery
import lib.firevegxport as fvx

## Read information from database

### Database connection parameters
Database credentials are stored in a `database.ini` file.

In [4]:
dbparams = read_dbparams(repodir / 'secrets' / 'database.ini', section='aws-lght-sl')
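
The implementation of `read_dbparams` lives in `lib/parseparams.py` and is not shown here. As a rough illustration of what such a reader might do, the sketch below parses one section of an INI file into a dictionary with `ConfigParser`. The function name `read_params_sketch` and the key names in the usage note are assumptions, not the project's actual code.

```python
from configparser import ConfigParser
from pathlib import Path


def read_params_sketch(filename, section):
    """Return the key/value pairs of one INI section as a plain dict.

    Hypothetical stand-in for illustration only.
    """
    parser = ConfigParser()
    # ConfigParser.read() silently skips missing files, so check its result
    if not parser.read(filename):
        raise FileNotFoundError(f"Could not read {filename}")
    if not parser.has_section(section):
        raise KeyError(f"Section {section!r} not found in {filename}")
    # all values are returned as strings
    return dict(parser.items(section))
```

With a file containing a section such as `[example-section]` and keys like `host`, `dbname` and `port`, the returned dictionary could be passed as keyword arguments to a database connection function.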

### Database queries

In [5]:
qrystr= """
SELECT visit_id, visit_date, sample_nr, species, species_code::int, "currentScientificNameCode" as bionet_code,
resprout_organ, seedbank,
adults_unburnt, resprouts_live, resprouts_died, resprouts_kill, resprouts_reproductive,
recruits_live, recruits_reproductive, recruits_died,
scorch, life_stage, comments
FROM form.quadrat_samples
LEFT JOIN species.caps
ON species_code::text="speciesCode_Synonym"
ORDER BY visit_id,visit_date,sample_nr,species_code
;
"""
res = dbquery(qrystr, dbparams)
df = pd.DataFrame(res,columns=['visit_id', 'visit_date', 'sample_nr', 'species', 'species_code','bionet_code','resprout_organ', 'seedbank',
'adults_unburnt', 'resprouts_live', 'resprouts_died', 'resprouts_kill', 'resprouts_reproductive',
'recruits_live', 'recruits_reproductive', 'recruits_died',
'scorch', 'life_stage', 'comments'])

In [6]:
#df

In [7]:
qrystr="""
WITH fh AS (SELECT site_label,MAX(latest_date) as latest_fire
FROM form.field_site
LEFT JOIN form.fire_history USING (site_label)
GROUP BY site_label)
SELECT s.site_label, location_description, elevation, ST_X(ST_Transform(geom,4326)) as longitude, ST_Y(ST_Transform(geom,4326)) as latitude,
visit_date, visit_description, userkey, givennames, surname, observerlist, survey_name, latest_fire
FROM form.field_visit 
LEFT JOIN form.field_site s on visit_id=s.site_label 
LEFT JOIN form.observerid ON mainobserver=userkey
LEFT JOIN fh ON visit_id=fh.site_label
ORDER BY survey_name,site_label,visit_date;
"""

sitelist = dbquery(qrystr, dbparams)

In [8]:
qrystr= """
SELECT distinct species, species_code::int, 
"speciesID" as species_id, "taxonRank", family, 
"scientificName",
"scientificNameAuthorship",
"vernacularName", 
"isCurrent",
"currentScientificNameCode" as sppcode,
"currentScientificName",
"sortOrder"
FROM form.quadrat_samples
LEFT JOIN species.caps
ON species_code::text="speciesCode_Synonym"
ORDER BY "sortOrder"
"""

spplist = dbquery(qrystr, dbparams)
sppdf = pd.DataFrame(spplist,columns=[ 'species', 'species code',
                                   'species id', 'taxon rank', 'family',
                                  'scientific name','authorship','vernacular name',
                                   'current','current code','current name', 
                                   'sort order'])

In [9]:
sppdf

| | species | species code | species id | taxon rank | family | scientific name | authorship | vernacular name | current | current code | current name | sort order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Pseuderanthemum variabile | 1010.0 | 9013.0 | Species | Acanthaceae | Pseuderanthemum variabile | (R.Br.) Radlk. | Pastel Flower | true | 1010 | Pseuderanthemum variabile | 5698.0 |
| 1 | Sambucus australasica | 1953.0 | 3692.0 | Species | Adoxaceae | Sambucus australasica | (Lindl.) Fritsch | Native Elderberry | true | 1953 | Sambucus australasica | 5723.0 |
| 2 | Tetragonia tetragonoides | 1040.0 | 3407.0 | Species | Aizoaceae | Tetragonia tetragonoides | (Pall.) Kuntze | New Zealand Spinach | false | 11185 | Tetragonia tetragonioides | 5804.0 |
| 3 | Akania bidwillii | 8978.0 | 10372.0 | Species | Akaniaceae | Akania bidwillii | (Hogg) Mabb. | Turnipwood | true | 8978 | Akania bidwillii | 5814.0 |
| 4 | Ptilotus exaltatus var. exaltatus | 6599.0 | 8650.0 | Variety | Amaranthaceae | Ptilotus exaltatus var. exaltatus | Nees ex Lehm. | Tall Mulla Mulla | true | 6599 | Ptilotus exaltatus var. exaltatus | 5902.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1109 | Vittadinia spp. | | | | | | | | | | | |
| 1110 | Vittadinia spp. | | | | | | | | | | | |
| 1111 | Waitzia spp. | | | | | | | | | | | |
| 1112 | Yellow daisy | | | | | | | | | | | |


## Export to CSV

In [10]:
df.to_csv(inputdir / "fireveg-field-records.csv")
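
Note that `DataFrame.to_csv` writes the row index as an extra, unnamed first column unless `index=False` is passed. The toy example below (made-up values, not project data) shows the difference in the header line; whether the index should be kept in the exported file is a choice for the maintainers.

```python
import io

import pandas as pd

# toy data frame with made-up values, for illustration only
toy = pd.DataFrame({"visit_id": ["SiteA", "SiteB"], "sample_nr": [1, 2]})

# default behaviour: the index becomes an unnamed first column
with_index = io.StringIO()
toy.to_csv(with_index)

# index=False: only the data columns are written
without_index = io.StringIO()
toy.to_csv(without_index, index=False)

header_with_index = with_index.getvalue().splitlines()[0]
header_without_index = without_index.getvalue().splitlines()[0]
print(repr(header_with_index))     # ',visit_id,sample_nr'
print(repr(header_without_index))  # 'visit_id,sample_nr'
```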

## Create workbook

### Styles
Define styles to be used across the workbook

In [11]:
cent_align=Alignment(horizontal='center', vertical='center', wrap_text=False)
wrap_align=Alignment(horizontal='left', vertical='top', wrap_text=True)

fontSmall = Font(size=9)


sheet_colors = {"intro": "1072BA" , "summary": "5AFF5A", "default":"505050", "addentry": "20CA82"}

table_style={"Instructions":TableStyleInfo(name="TableStyleMedium9", showFirstColumn=True, showLastColumn=False, 
                                           showRowStripes=True, showColumnStripes=False),
             "Contributor": TableStyleInfo(name="TableStyleMedium18", showFirstColumn=True,
                       showLastColumn=False, showRowStripes=False, showColumnStripes=False),
             "Lists": TableStyleInfo(name="TableStyleMedium14", showFirstColumn=True,
                       showLastColumn=False, showRowStripes=False, showColumnStripes=False),
             "Info":  TableStyleInfo(name="TableStyleMedium18", showFirstColumn=True,
                       showLastColumn=False, showRowStripes=False, showColumnStripes=False),
             "Vocabularies": TableStyleInfo(name="TableStyleMedium14", showFirstColumn=True,
                       showLastColumn=False, showRowStripes=False, showColumnStripes=False),
             "Entry": TableStyleInfo(name="TableStyleMedium18", showFirstColumn=False,
                       showLastColumn=False, showRowStripes=False, showColumnStripes=False)

             }




Helper functions:

In [12]:
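def addComment(value):
    """Return a cell comment authored by 'JRFP' with a fixed box size."""
    comment = Comment(value, 'JRFP')
    # width and height of the comment box
    comment.width = 150
    comment.height = 70
    return comment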

### Create workbook and worksheets

In [13]:
wb = Workbook()

In [14]:
wsheets = (
    {"title": "About", "colWidths":[("A",90),("B",40)], "tabColor":"intro","active":True},
    {"title": "Sites", "colWidths":[(("A","B"),20),
                                    (("D"),20),
                                    (("C","G","I","J"),45),
                                    (("E","F","H"),12),
                                    ], "tabColor":"summary"},
     {"title": "Samples", "colWidths":[(("A","B","E","F"),12),
                                       (("C"),7),
                                       (("G","H"),20),
                                       (("D","Q"),45),
                                       (("I","J","K","L","M","N","O","P"),6)], "tabColor":"summary"},
   {"title": "Species", "colWidths":[(("A","E"),12),
                                     (("B","I"),45),
                                     (("C","D",),30),
                                     (("F","G","H"),6)], "tabColor":"summary"},
    {"title": "Vocabularies", "colWidths":[("A",12),("B",30),("C",70)], "tabColor":"default"},
   
    )
for item in wsheets:
    if "active" in item.keys():
        ws = wb.active
        ws.title = item['title']
    else:
        ws = wb.create_sheet(item['title'])
    for k in item['colWidths']:
        for j in k[0]:
            ws.column_dimensions[j].width = k[1]
    ws.sheet_properties.tabColor = sheet_colors[item["tabColor"]]
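
The `colWidths` entries mix two forms: a single column letter (`("A",90)`) and a tuple of letters (`(("A","B"),20)`). The inner loop handles both because iterating over a string yields its characters and iterating over a tuple yields its items. The sketch below isolates that expansion step without `openpyxl`; the function name `expand_col_widths` is hypothetical.

```python
def expand_col_widths(col_widths):
    """Flatten [(columns, width), ...] into a {column_letter: width} dict."""
    widths = {}
    for columns, width in col_widths:
        # a str yields its characters, a tuple yields its items
        for letter in columns:
            widths[letter] = width
    return widths


spec = [(("A", "B"), 20), (("D"), 20), ("C", 45)]
print(expand_col_widths(spec))  # {'A': 20, 'B': 20, 'D': 20, 'C': 45}
```

One consequence of this design: a multi-letter column such as `"AA"` must be wrapped in a tuple (`("AA",)`), otherwise it is split into two separate `"A"` entries.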


### `About` worksheet

In [15]:
ws = wb["About"]

info = ("Fire Ecology Traits for Plants",
        "Version 1.00 (April 2022)",
        "This data export reflects the status of the database on the %s" % datetime.date.today().strftime('%d %b %Y'),
        "Developed by José R. Ferrer-Paris and David Keith",
        "Centre for Ecosystem Science / University of New South Wales",
       "Please cite this work as:",
        "Ferrer-Paris, J. R. and Keith, D. A. (2024) Fire Ecology Traits for Plants: Database export. figshare. DOI: 10.6084/m9.figshare.24125088", 
        #"DISCLAIMER:",
        #"DATA IS NOT READY FOR FINAL USE OR CRITICAL APPLICATIONS AND YOU SHOULD NOT DISTRIBUTE THIS DATA."
        )

k = 1
for row in info:
    ws.cell(k,1,value=row)
    ws.cell(k,1).alignment=wrap_align
    k=k+1
    
ws.cell(1,1).style='Title'
ws.cell(5,1).hyperlink='https://www.unsw.edu.au/research/ecosystem'
ws.cell(5,1).style='Hyperlink'

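# Disclaimer formatting, disabled together with the disclaimer lines in `info` above;
# otherwise row 9 ("This work has been supported by:") would pick up the red font
#ws.cell(8,1).font=Font(color="FF0000", bold=True,italic=False) 
#ws.cell(9,1).font=Font(color="FF0000", italic=True) 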


supporters = ({'institution':"University of New South Wales",'url':"https://www.unsw.edu.au/"},
              {'institution':"NSW Bushfire Research Hub",'url':"https://www.bushfirehub.org/"},
              {'institution':"NESP Threatened Species Recovery Hub",'url':"https://www.nespthreatenedspecies.edu.au/"},
              {'institution':"NSW Department of Planning & Environment",'url':"https://www.planning.nsw.gov.au/"})

k=k+2
ws.cell(k-1,1,value="This work has been supported by:")
for item in supporters:
    cell=ws.cell(k,1)
    cell.value=item['institution']
    cell.hyperlink=item['url']
    cell.style = "Hyperlink"
    k=k+1

k=k+2
description = (
              "Taxonomic nomenclature following BioNET (data export from February 2022)",
              "Data in the report is summarised based on BioNET fields 'currentScientificName' and 'currentScientificNameCode'",
              #"For general description of the traits, please refer to the 'Trait description' sheet",
              #"Vocabularies for categorical traits are available in the 'Vocabularies' sheet",
              #"For categorical traits the values in the 'Summary' sheet show the different values reported in the literature records separated by slashes.",
              # "If more than one category has been reported, the values are ordered from higher to lower 'weight', categories receiving less than 10% weight are in round brackets, categories with less than 5% in square brackets",
              #"The default weight is calculated by multiplying the number of times a value is reported (nr. of records) with the weight given to each record (default to 1), and divided by the weight of all records for a given species.",
              #"Default weights  overridden by expert advice to the administrator will be marked, with justification given in the Notes column of the output.",
              #"An asterisk (*) in a trait cell indicates a potential data entry error or uncertainty in the assignment of a trait category or value.",
              #"'Import/Entry sources' refer to references that were imported directly using automated scripts or manual entry. These include: 1) Primary observations of traits from published research or reports; and 2) Compilations of data (e.g. databases, spreadsheets, published reviews) that include two or more sources of primary observations.",
              #"'Indirect sources' refer to references that were cited in Import/Entry sources, where the latter are compilations of multiple primary sources (see Import/Entry sources). Information from indirect sources may have been modified when it was incorporated into those compilations. The original source of primary trait observations has not yet been verified prior to import into this database. When the primary source is reviewed and the trait values are verified, these records will be attributed to the primary source as 'Import/Entry sources'.",
              #"Some sheets are protected to avoid accidental changes, but they are not password protected. If you need to filter and reorder entries in the table, please unprotect the sheet first.",
              )

for row in description:
    ws.cell(k,1,value=row)
    ws.cell(k,1).alignment=wrap_align
    k=k+1
    
ws.protection.sheet = True

### `Species` worksheet

In [16]:
ws = wb["Species"]
ws.append(['Family','Scientific name (as entered)','Authorship','Vernacular name','Taxon rank', 
           'CAPS code', 'BioNET id', 
           'Current code','Current name (according to BioNET)',])

k=2
for row in spplist:
    if row['isCurrent']=='true':
        clr="000000"
    else:
        clr="F10000"
        ws.cell(k,8,value=row["sppcode"])
        ws.cell(k,9,value=row["currentScientificName"])
        ws.cell(k,9).font = Font(italic=True, color="0000F2")

    
    ws.cell(k,1,value=row['family'])
    ws.cell(k,2,value=row['species'])
    
    if row["scientificName"] is None:
        ws.cell(k,9,value='No match with BioNET data')
    else:
        if row['species']!=row["scientificName"]:
            ws.cell(k,2).comment = addComment("probably misspelling of %s" % row["scientificName"])
    
    ws.cell(k,2).font = Font(italic=True, color=clr)
                
    ws.cell(k,3,value=row["scientificNameAuthorship"])
    ws.cell(k,4,value=row["vernacularName"])
    ws.cell(k,5,value=row['taxonRank'])
    ws.cell(k,5).font = Font( color=clr)
    ws.cell(k,6,value=row['species_code'])
    ws.cell(k,7,value=row['species_id'])
  
    for j in (6,7,8):
        ws.cell(k,j).alignment=cent_align
    k=k+1

tab = Table(displayName="SpeciesList", ref="A{}:I{}".format(1,ws.max_row))
tab.tableStyleInfo = table_style["Info"]
ws.add_table(tab)
ws.protection.sheet = True
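
The formatting decisions in the loop above depend on three conditions per row: whether the name matched a BioNET record, whether the matched name is current, and whether the entered name differs from the matched one. The sketch below separates these checks from the cell formatting. The function name `species_flags` and the example rows are hypothetical; the comparison with the string `'true'` follows the notebook.

```python
def species_flags(row):
    """Summarise how an entered species name relates to the BioNET match."""
    matched = row.get("scientificName") is not None
    return {
        # False leads to 'No match with BioNET data' in column I
        "matched": matched,
        # False leads to the red font and the current code/name in columns H and I
        "current": row.get("isCurrent") == "true",
        # True leads to the 'probably misspelling of ...' comment
        "spelling_differs": matched and row["species"] != row["scientificName"],
    }


# made-up rows for illustration
current_row = {"species": "Genus one", "scientificName": "Genus one", "isCurrent": "true"}
synonym_row = {"species": "Genus old", "scientificName": "Genus old", "isCurrent": "false"}
unmatched_row = {"species": "Yellow daisy", "scientificName": None, "isCurrent": None}
```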

In [17]:
spplist[10]['isCurrent']=='true'

True

### `Samples` worksheet

Data from all samples

In [18]:
ws = wb["Samples"]
ws.cell(row=1,column=1,value='Sample information')
ws.cell(row=1,column=4,value='Species information')
ws.cell(row=1,column=7,value='Species response')
ws.cell(row=1,column=9,value='Adults/Juveniles')
ws.cell(row=1,column=14,value='Recruits')
for k in (1,4,7,9,14):
    ws.cell(1,k).alignment=cent_align
    ws.cell(1,k).font  = Font(bold=True, color="110110")
ws.merge_cells('A1:C1')
ws.merge_cells('D1:F1')
ws.merge_cells('G1:H1')
ws.merge_cells('I1:M1')
ws.merge_cells('N1:P1')

ws.append(['Site label','Visit Date', 'Subplot', 'Scientific name', 'CAPS Code', 'BioNET Code',
           'Resprout organ', 'Seedbank', 'L','U','R','D','K', 'L','R','D', 'Comments'])

ws["I2"].comment = addComment('# Live')
ws["J2"].comment = addComment('# Live unburnt')
ws["K2"].comment = addComment('# Reproductive')
ws["L2"].comment = addComment('# Resprouting & died after fire')
ws["M2"].comment = addComment('# Killed in fire')

for j in range(1,18):
    ws.cell(2,j).alignment=cent_align
    ws.cell(2,j).font  = Font(bold=True, color="000000")
    
k=3
for row in res:
    notes = list()
    if row['life_stage'] is not None:
        notes.append('Life stage: %s' % row['life_stage'])
    if row['scorch'] is not None:
        notes.append('Scorch: %s' % row['scorch'])
        
    # comments may be missing (None) for some records
    for rc in (row['comments'] or []):
        if rc.find("Imported from") < 0:
            notes.append(rc)
    ws.append([
        row['visit_id'],
        row['visit_date'],
        row['sample_nr'],
        row['species'] ,
        row['species_code'],
        row['bionet_code'],
        row['resprout_organ'], 
        row['seedbank'],
        row['resprouts_live'], 
        row['adults_unburnt'], 
        row['resprouts_reproductive'],
        row['resprouts_died'], 
        row['resprouts_kill'], 
        row['recruits_live'], 
        row['recruits_reproductive'], 
        row['recruits_died'],
        ' // '.join(notes)
    ])
    ws.cell(k,4).font  = Font(italic=True, color="110110")
    for j in (1,2,3,5,6,9,10,11,12,13,14,15,16):
        ws.cell(k,j).alignment=cent_align
    for j in (17,):
        ws.cell(k,j).alignment=wrap_align
    k=k+1

#tab = Table(displayName="SpeciesInSamples", ref="A{}:Q{}".format(2,ws.max_row))
#tab.tableStyleInfo = table_style["Info"]
#ws.add_table(tab)
ws.protection.sheet = True
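
The `Comments` column combines the life stage, the scorch value and any free-text comments that are not import notes. A minimal sketch of that step as a standalone function is given below, assuming `comments` is a list of strings or `None`. The function name `build_notes` and the example values are hypothetical.

```python
def build_notes(life_stage, scorch, comments, separator=" // "):
    """Join life stage, scorch and free-text comments into one string."""
    notes = []
    if life_stage is not None:
        notes.append("Life stage: %s" % life_stage)
    if scorch is not None:
        notes.append("Scorch: %s" % scorch)
    for comment in comments or []:
        # skip the automatic notes left by the import scripts
        if "Imported from" not in comment:
            notes.append(comment)
    return separator.join(notes)


print(build_notes("adult", None, ["Imported from workbook X", "near creek"]))
# Life stage: adult // near creek
```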

### `Sites` worksheet

In [19]:
time_since_fire=sitelist[10]['visit_date'] - sitelist[10]['latest_fire']
"%s" % time_since_fire
time_since_fire.days
sitelist[10]

['AlpAsh_40',
 'c. 2 km down Elliot Way, 50m below road roughly opposite side track',
 Decimal('1105'),
 148.3740887726979,
 -35.895680906884955,
 datetime.date(2021, 4, 16),
 None,
 12,
 'Jackie',
 'Miles',
 ['Jackie Miles', ' Gen Wright', ' Michael Doherty'],
 'KNP AlpAsh',
 datetime.date(2020, 1, 31)]

In [20]:
ws = wb["Sites"]

ws.append(['Survey','Site label','Location','Coordinates (WGS84)','Elevation', 
           'Visit date','Visit description', 
           'Main observer','All observers','Time since last fire (days)'])

k=2
for row in sitelist:
    if row['latest_fire'] is not None:
        time_since_fire=row['visit_date'] - row['latest_fire']
        days_since_fire=time_since_fire.days
        if days_since_fire > 365:
            # integer division, so that years and remaining days add up to the total
            time_since_fire = "%s years and %s days" % (days_since_fire // 365, days_since_fire % 365)
        elif days_since_fire < 0:
            time_since_fire = "ERROR: mismatching dates" 
        else :
            time_since_fire = "%s days" % days_since_fire
    else:
        time_since_fire=None
    if row['observerlist'] is not None:
        allauthors="; ".join(row['observerlist'])
    else:
        allauthors=None
    
    if row['longitude'] is not None:
        coords="%0.3f %0.3f" % (row['longitude'],row['latitude'])
    else:
        coords=None
    ws.append([
        row['survey_name'],
        row['site_label'],
        row['location_description'],
        coords,
        row['elevation'],
         
        row['visit_date'], 
        row['visit_description'],
         "%s, %s" % (row['surname'],row['givennames']),
        allauthors, 
        time_since_fire
    ])

tab = Table(displayName="SiteList", ref="A{}:J{}".format(1,ws.max_row))
tab.tableStyleInfo = table_style["Info"]
ws.add_table(tab)
ws.protection.sheet = True
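
The "Time since last fire" label can be sketched as a small function that is easy to check on its own. The version below treats a year as 365 days and uses `divmod`, so that the years and the remaining days always add up to the total number of days. The function name `describe_time_since_fire` is hypothetical; the example dates are those of the site shown in the previous cell output.

```python
import datetime


def describe_time_since_fire(visit_date, latest_fire):
    """Format the interval between the latest fire and the visit as text."""
    if latest_fire is None:
        return None
    days = (visit_date - latest_fire).days
    if days < 0:
        return "ERROR: mismatching dates"
    if days > 365:
        # 365-day years are an approximation that ignores leap years
        years, remainder = divmod(days, 365)
        return "%s years and %s days" % (years, remainder)
    return "%s days" % days


label = describe_time_since_fire(datetime.date(2021, 4, 16), datetime.date(2020, 1, 31))
print(label)  # 1 years and 76 days
```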

### `Vocabularies` worksheet

The `Vocabularies` sheet is created above, but this notebook does not write any content to it.

### Save workbook

In [21]:
wb.save(inputdir / "fireveg-field-report-model.xlsx")