# Fireveg DB imports -- import field work forms

Author: [José R. Ferrer-Paris](https://github.com/jrfep)

Date: February 2022, updated 19 August 2024

This Jupyter Notebook includes [Python](https://www.python.org) code to:
- Read data from spreadsheets with field-work data
- Create records for data import into the database
- Insert or update records in the database

This notebook deals with the third step, which is importing the information for each quadrat sample. 

**Please note:**
<div class="alert alert-warning">
    This repository contains code that is intended for internal project management and is documented for the sake of reproducibility.<br/>
    🛂 Only users contributing directly to the project have access to the credentials for data download/upload. 
</div>

## Set-up
### Load libraries 

In [1]:
import openpyxl
from pathlib import Path
import os,sys
from datetime import datetime
from configparser import ConfigParser
import psycopg2
from psycopg2.extras import DictCursor
from psycopg2.extensions import AsIs
import pandas as pd
import pyprojroot

### Define paths for input and output

In [2]:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))
sys.path.append(str(repodir))

Define path to workbooks

In [3]:
inputdir = repodir / "data" / "input-field-form"

### Load own functions

Load functions from the `lib` folder. We will use one function to read database credentials and another for batch inserts and updates:

In [4]:
from lib.parseparams import read_dbparams
from lib.firevegdb import batch_upsert, dbquery, validate_and_update_site_records
import lib.fireveg as fv
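
For readers unfamiliar with the term, an "upsert" inserts a row or, when a row with the same key already exists, updates it. The sketch below only builds the SQL text for PostgreSQL's `INSERT ... ON CONFLICT ... DO UPDATE`. The function name `build_upsert_sql` and the example record are hypothetical, and the project's `batch_upsert` may well be implemented differently.

```python
def build_upsert_sql(table, record, keycol):
    """Build a parameterised PostgreSQL upsert statement for one record.

    Hypothetical helper for illustration; not the project's `batch_upsert`.
    Table and column names are assumed to come from trusted code, not user input.
    """
    columns = list(record)
    # columns that are not part of the key are overwritten on conflict
    updates = [c for c in columns if c not in keycol]
    sql = "INSERT INTO {} ({}) VALUES ({}) ON CONFLICT ({}) ".format(
        table,
        ", ".join(columns),
        ", ".join("%({})s".format(c) for c in columns),
        ", ".join(keycol),
    )
    if updates:
        sql += "DO UPDATE SET " + ", ".join(
            "{0} = EXCLUDED.{0}".format(c) for c in updates)
    else:
        sql += "DO NOTHING"
    return sql

# made-up record, for illustration only
example_record = {"visit_id": "SiteA", "visit_date": "2021-01-01",
                  "sample_nr": 1, "species": "Example species"}
upsert_sql = build_upsert_sql("form.quadrat_samples", example_record,
                              ("visit_id", "visit_date", "sample_nr"))
print(upsert_sql)
```

The `%(name)s` placeholders follow the named-parameter style accepted by `psycopg2`, so the values travel separately from the SQL text. `ON CONFLICT` requires a unique constraint or index on the key columns.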

### Database credentials

🤫 We use a folder named "secrets" to keep the credentials for connection to different services (database credentials, API keys, etc.). We list this folder in our `.gitignore` so that its contents are not tracked by git and not exposed. Future users need to copy the contents of this folder manually.

We read database credentials stored in a `database.ini` file using our own `read_dbparams` function.

In [5]:
dbparams = read_dbparams(repodir / 'secrets' / 'database.ini', 
                         section='fireveg-db-v1.1')
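
As an illustration of what such a helper could look like, the sketch below parses one section of an INI file with the standard-library `ConfigParser`. The function name `read_ini_section`, the section name, the keys and the values are all made up for the example. The actual `read_dbparams` is defined in the `lib.parseparams` module and may differ.

```python
from configparser import ConfigParser

def read_ini_section(ini_text, section):
    """Return the key/value pairs of one INI section as a dict (hypothetical helper)."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError("Section {} not found".format(section))
    return dict(parser.items(section))

# made-up content standing in for a real `database.ini`
example_ini = """
[example-db]
host = localhost
database = exampledb
user = exampleuser
"""

example_params = read_ini_section(example_ini, "example-db")
print(example_params)
```

A dictionary of this kind can be unpacked into a connection call, which is why credentials are usually kept one section per database.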

## Get updated vocabularies from database

In [6]:
qry = "SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='resprout_organ_vocabulary';"
valid_organ_list = dbquery(qry, dbparams)
organ_vocab = [item for t in valid_organ_list for item in t]


In [7]:
organ_vocab

['Epicormic',
 'Apical',
 'Lignotuber',
 'Basal',
 'Tuber',
 'Tussock',
 'Short rhizome',
 'Long rhizome or root sucker',
 'Stolon',
 'None',
 'Other']

In [8]:
qry = "SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='seedbank_vocabulary';"
valid_seedbank_list = dbquery(qry, dbparams)
seedbank_vocab = [item for t in valid_seedbank_list for item in t]


In [9]:
seedbank_vocab

['Soil-persistent', 'Transient', 'Canopy', 'Non-canopy', 'Other']
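
These vocabularies are later passed to the import functions, presumably so that spreadsheet entries can be checked against the database enums. How `create_quadrat_sample_record` does this is not shown in this notebook. The sketch below is one possible check, with a hypothetical function name `match_vocabulary` and a case-insensitive comparison chosen purely for illustration.

```python
def match_vocabulary(value, vocabulary):
    """Return the vocabulary term matching `value` (ignoring case and outer spaces), or None."""
    if value is None:
        return None
    # map lower-cased terms back to their canonical spelling
    lookup = {term.lower(): term for term in vocabulary}
    return lookup.get(str(value).strip().lower())

seedbank_terms = ['Soil-persistent', 'Transient', 'Canopy', 'Non-canopy', 'Other']
print(match_vocabulary(' canopy ', seedbank_terms))
print(match_vocabulary('soil', seedbank_terms))
```

The first call returns the canonical term `Canopy`; the second returns `None` because `soil` is not an exact vocabulary term, so abbreviations used in the field forms would need an explicit mapping.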

## Read workbooks
Each spreadsheet has a slightly different structure, so these scripts have to be adapted for each case.

### List of workbooks/spreadsheets in directory

In [10]:
os.listdir(inputdir)

['UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx',
 'PlantFireTraitData_2011-2018_Import_AdditionalSiteInfo.xlsx',
 'UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx',
 'SthnNSWRF_data_bionet2.xlsx',
 'UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx',
 'PlantFireTraitData_2011-2018_Import.xlsx',
 '.ipynb_checkpoints',
 'UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx',
 'UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton.xlsx',
 'UNSW_VegFireResponse_KNP AlpAsh.xlsx',
 'UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx',
 'RobertsonRF_data_bionet2.xlsx',
 'Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm']

In [11]:
valid_files = ['SthnNSWRF_data_bionet2.xlsx',
               'UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx',
               'UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx',
               'UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx',
               'UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx',
               'UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx',
               'RobertsonRF_data_bionet2.xlsx',
               'Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm']

Here we create an index of worksheets and column headers (first two rows) for each file:

In [12]:
wbindex=dict()
for workbook_name in valid_files:
    inputfile=inputdir / workbook_name
    # using data_only=True to get the calculated cell values
    wb = openpyxl.load_workbook(inputfile,data_only=True)
    wbindex[workbook_name]=dict()
    for ws in wb.worksheets:
        # ws.title is the public attribute holding the worksheet name
        header_rows=[list(),list()]
        # note: range(1, ws.max_column) stops one short of the last column
        for k in range(1,ws.max_column):
            header_rows[0].append(ws.cell(row=1,column=k).value)
            header_rows[1].append(ws.cell(row=2,column=k).value)
        wbindex[workbook_name][ws.title]=header_rows
        

### Functions to read records and upload to database

The functions in `lib.fireveg` work on one item (row) of the target workbook/worksheet at a time.

#### Wrapping all steps together
The following function calls `import_records_from_workbook`, `create_field_sample_record`, `validate_and_update_site_records`, and `create_quadrat_sample_record` to process data from a workbook into records, which are then imported into the database using `batch_upsert`.

In [13]:
def read_and_import_species_data(filepath,workbook,worksheet,col_dictionary,valid_seedbank,valid_organ):
    # NOTE: relies on the global `dbparams` defined above
    # Step 1: build field sample (visit) records and validate them against the database
    quadrats = fv.import_records_from_workbook(filepath, workbook, worksheet, col_dictionary,
                                       fv.create_field_sample_record)
    valid_visits = validate_and_update_site_records(quadrats,dbparams)
    
    # Step 2: build quadrat sample records, using the validated visits as lookup
    records=fv.import_records_from_workbook(filepath, workbook, worksheet, col_dictionary,
                                         fv.create_quadrat_sample_record,
                                         lookup=valid_visits, valid_seedbank=valid_seedbank, valid_organ=valid_organ)
    valid_records=list()
    invalid_records=list()
    for record in records:
        if 'replicate_nr' in record:
            replicate_nr = record['replicate_nr']
        elif 'fixed_replicate_nr' in record:
            # the fixed value is stored in the column dictionary, not in the worksheet
            replicate_nr = col_dictionary['fixed_replicate_nr']
        else:
            replicate_nr = None
        
        # Step 3: a record is valid only if it matches exactly one validated visit
        if 'visit_date' in record:
            found=[n for n in valid_visits
                   if n['visit_id'] == record['visit_id'] and n['visit_date'] == record['visit_date']]
        elif 'replicate_nr' in record:
            found=[n for n in valid_visits
                   if n['visit_id'] == record['visit_id'] and n['replicate_nr'] == replicate_nr]
        else:
            found=list()
        
        if len(found)==1:
            valid_records.append(record)
        else:
            invalid_records.append(record)

    print("%s valid records and %s invalid records" % (len(valid_records), len(invalid_records)))
    
    # Step 4: insert or update the valid records in the database
    batch_upsert(dbparams,table='form.quadrat_samples',records=valid_records,keycol=('visit_id','visit_date','sample_nr'),
             idx=None, execute=True)
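
The matching step inside the function accepts a record only when exactly one validated visit corresponds to it. The stand-alone sketch below reproduces that idea on made-up data. `split_by_unique_match` and all example values are hypothetical, and the sketch matches on `visit_id` and `visit_date` only.

```python
def split_by_unique_match(records, visits, keys=("visit_id", "visit_date")):
    """Split records into (valid, invalid): valid ones match exactly one visit on all keys."""
    valid, invalid = [], []
    for record in records:
        if all(k in record for k in keys):
            found = [v for v in visits if all(v.get(k) == record[k] for k in keys)]
        else:
            # records lacking one of the keys cannot be matched
            found = []
        (valid if len(found) == 1 else invalid).append(record)
    return valid, invalid

# made-up visits and records, for illustration only
example_visits = [{"visit_id": "SiteA", "visit_date": "2021-01-01"},
                  {"visit_id": "SiteB", "visit_date": "2021-02-01"}]
example_records = [{"visit_id": "SiteA", "visit_date": "2021-01-01", "species": "sp1"},
                   {"visit_id": "SiteC", "visit_date": "2021-03-01", "species": "sp2"},
                   {"visit_id": "SiteB", "species": "sp3"}]  # no date

valid, invalid = split_by_unique_match(example_records, example_visits)
print("%s valid records and %s invalid records" % (len(valid), len(invalid)))
```

Requiring exactly one match, rather than at least one, also rejects records that would be ambiguous between two visits.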


## Processing data from all workbooks

In the following section, I iterate through all the workbooks, adjusting the code for each case.

Here is the list of available workbooks:

In [14]:
wbindex.keys()

dict_keys(['SthnNSWRF_data_bionet2.xlsx', 'UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx', 'UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx', 'UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx', 'UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx', 'UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx', 'RobertsonRF_data_bionet2.xlsx', 'Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm'])

If we select one workbook, we can retrieve a list of column names that we will use in our column definitions for each function:
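
A compact way to produce such a listing is to pair the two header rows with `zip` and number them with `enumerate`. The sketch assumes the `[first_row, second_row]` layout stored in `wbindex`. The function name `format_column_index` and the header values are illustrative only.

```python
def format_column_index(cols):
    """Return one 'index :: header row 1 / header row 2' line per column."""
    first_row, second_row = cols
    return ["%s :: %s / %s" % (k, h1, h2)
            for k, (h1, h2) in enumerate(zip(first_row, second_row))]

# made-up headers in the same two-row layout
example_cols = [["Updated", "Site Number", None],
                ["Entry Order", "Site Number", "Replicate"]]
for line in format_column_index(example_cols):
    print(line)
```

Because `enumerate` starts at zero, the printed numbers can be copied directly into a column dictionary that uses zero-based positions.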

### Upland / Basalt

- 15 visits (older) without data
- most visits with 3 quadrats or samples
- around 30 to 50 spp per visit


In [15]:
filename='UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx'
worksheet='Floristics'
cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])+1):
    print("%s :: %s / %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: Updated 14/10/2019 / Entry Order
1 :: Site Number / Site Number
2 :: Replicate / Replicate
3 :: First Date / Date of sighting (dd/mm/yyyy hh:mm:ss).
4 :: Last Date / If more than 1 day (dd/mm/yyyy hh:mm:ss).
5 :: Sub plot / None
6 :: Type / Fauna (FA) or flora (FL).
7 :: Species code / Species code can be assigned by OEH, or see the reference worksheet.
8 :: Common Name / None
9 :: Scientific Name / ScientificName
10 :: Cover score / See reference worksheet for definitions
11 :: Abundance score / None
12 :: Stratum / See reference worksheet for definitions
13 :: Growth form / See reference worksheet for definitions
14 :: Height min / Flora only; height (in metres)
15 :: Height max / Flora only; height (in metres)
16 :: % Cover actual / None
17 :: Recovery organ / Recovery organ
18 :: Seedbank / Seedbank
19 :: None / Count of unburnt individuals
20 :: Abund actual / Count of resprouting individuals.
21 :: None / Count of fire-killed individuals
22 :: Number reproductive / None
23 :

In [16]:

col_dict={'visit_id':1, 'replicate_nr':2, 'date':3,
          'sample_nr':5, 'spcode':7, 'species':9,   
          'resprout_organ':17, 'seedbank':18,
          'adults_unburnt':19,'resprouts_live':20,'resprouts_kill':21,
          'resprouts_reproductive':22,'recruits_live':23, 'recruits_died':24, 'recruits_reproductive':25,
          'notes':31,'workbook':filename,'worksheet':worksheet}


read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)


Connecting to the PostgreSQL database...
SiteNo not found
0 rows updated
Database connection closed.
1590 valid records and 2 invalid records
Connecting to the PostgreSQL database...
1590 rows updated
Database connection closed.
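
The `col_dict` used above maps database field names to column positions, numbered as in the printed listing, plus two entries (`workbook`, `worksheet`) that record the data source. How `import_records_from_workbook` consumes it is not shown in this notebook. The sketch below is one plausible reading, using a hypothetical `row_to_record` function and a made-up row; entries such as `fixed_replicate_nr`, which hold a value rather than a position, would need separate handling.

```python
def row_to_record(row, col_dictionary):
    """Pick values from a row (sequence of cell values) using a field -> column mapping.

    Integer entries are treated as zero-based column positions; other entries
    (e.g. workbook and worksheet names) are copied as they are.
    """
    record = {}
    for field, col in col_dictionary.items():
        if isinstance(col, int):
            # skip positions beyond the end of a short row
            if col < len(row):
                record[field] = row[col]
        else:
            record[field] = col
    return record

# made-up row and dictionary, for illustration only
example_row = (12, "SiteA", 1, "2021-01-01", None, "Q1")
example_dict = {"visit_id": 1, "replicate_nr": 2, "date": 3, "sample_nr": 5,
                "notes": 31, "workbook": "example.xlsx", "worksheet": "Floristics"}
example_result = row_to_record(example_row, example_dict)
print(example_result)
```

Keeping the positions in a dictionary means that only this mapping changes from one workbook to the next, while the import function stays the same.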


### Southern NSW Rainforest

- Edited file: site labels UppClydeRF1, UppClydeRF2, UppClydeRF3 and UppClydeRF4 corrected to UppClyde1
- This validates all 250 records
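
A correction of this kind could also be applied in code instead of editing the spreadsheet, which leaves the original file untouched and documents the change. This is only a sketch: the mapping is taken from the note above, while `fix_site_label` is a hypothetical helper that is not part of `lib.fireveg`.

```python
# label corrections as described in the note above
site_label_corrections = {
    "UppClydeRF1": "UppClyde1",
    "UppClydeRF2": "UppClyde1",
    "UppClydeRF3": "UppClyde1",
    "UppClydeRF4": "UppClyde1",
}

def fix_site_label(label, corrections=site_label_corrections):
    """Return the corrected site label, or the label unchanged if no correction applies."""
    return corrections.get(label, label)

print(fix_site_label("UppClydeRF3"))
print(fix_site_label("SiteA"))  # hypothetical label without a correction
```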


In [17]:
cols=wbindex['SthnNSWRF_data_bionet2.xlsx']['Floristics']
for k in range(1,len(cols[0])):
    print("%s :: %s // %s" % (k-1,cols[0][k-1],cols[1][k-1]))

0 :: Updated 14/10/2019 // Entry Order
1 :: None // Site Number
2 :: None // Replicate
3 :: First Date // Date of sighting (dd/mm/yyyy hh:mm:ss).
4 :: Last Date // If more than 1 day (dd/mm/yyyy hh:mm:ss).
5 :: Sub plot // SubplotID
6 :: Type // Fauna (FA) or flora (FL).
7 :: Species code // Species code can be assigned by OEH, or see the reference worksheet.
8 :: Common Name // None
9 :: Scientific Name // None
10 :: Cover score // See reference worksheet for definitions
11 :: Abundance score // CV18A See reference worksheet for definitions
12 :: Stratum // See reference worksheet for definitions
13 :: Growth form // See reference worksheet for definitions
14 :: Height min // Flora only; height (in metres)
15 :: Height max // Flora only; height (in metres)
16 :: % Cover actual // None
17 :: Recovery organ // None
18 :: Seedbank // None
19 :: None // Count of unburnt individuals
20 :: Abund actual // Count of resprouting individuals.
21 :: None //  # resprouted & died post-fire
22 :: N

In [18]:
filename='SthnNSWRF_data_bionet2.xlsx'
worksheet='Floristics'
col_dict={'visit_id':1, 'sample_nr':5, 'replicate_nr':2,'species':9, 'spcode':7, 'date':3, 'resprout_organ':17, 'seedbank':18,
          'adults_unburnt':19,'resprouts_live':20,'resprouts_died':21,'resprouts_kill':22,
          'resprouts_reproductive':23,'recruits_live':24, 'recruits_died':25, 'recruits_reproductive':26,
          'notes':32,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)


Connecting to the PostgreSQL database...
UppClydeRF1 not found
UppClydeRF1 not found
UppClydeRF1 not found
UppClydeRF1 not found
UppClydeRF1 not found
UppClydeRF1 not found
UppClydeRF2 not found
UppClydeRF3 not found
UppClydeRF4 not found
UppClydeRF1 not found
0 rows updated
Database connection closed.
175 valid records and 75 invalid records
Connecting to the PostgreSQL database...
175 rows updated
Database connection closed.


### KNP Alpine Ash

AlpAsh26 is not in database

In [19]:
worksheet='Floristics'
filename='UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx'

cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])+1):
    print("%s :: %s / %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: Updated 14/10/2019 / Entry Order
1 :: None / Site Number
2 :: None / Replicate
3 :: First Date / Date of sighting (dd/mm/yyyy hh:mm:ss).
4 :: Last Date / If more than 1 day (dd/mm/yyyy hh:mm:ss).
5 :: Sub plot / SubplotID
6 :: Type / Fauna (FA) or flora (FL).
7 :: Species code / Species code can be assigned by OEH, or see the reference worksheet.
8 :: Common Name / None
9 :: Scientific Name / Scientific Name
10 :: Cover score / See reference worksheet for definitions
11 :: Abundance score / CV18A See reference worksheet for definitions
12 :: Stratum / See reference worksheet for definitions
13 :: Growth form / See reference worksheet for definitions
14 :: Height min / Flora only; height (in metres)
15 :: Height max / Flora only; height (in metres)
16 :: % Cover actual / None
17 :: Recovery organ / None
18 :: Seedbank / None
19 :: 0 / Count of unburnt individuals
20 :: Abund actual / Count of resprouting individuals.
21 :: 0 /  # resprouted & died post-fire
22 :: None / Count of fi

In [20]:
col_dict={'visit_id':1, 'replicate_nr':2, 'date':3,
          'sample_nr':5, 'spcode':7, 'species':9,   
          'resprout_organ':17, 'seedbank':18,
          'adults_unburnt':19,'resprouts_live':20,'resprouts_died':21,'resprouts_kill':22,
          'resprouts_reproductive':23,'recruits_live':24, 'recruits_died':25, 'recruits_reproductive':26,
          'notes':32,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)


Connecting to the PostgreSQL database...
0 rows updated
Database connection closed.
769 valid records and 1 invalid records
Connecting to the PostgreSQL database...
769 rows updated
Database connection closed.


### Alpine Bogs

- older format without replicate nr or date
- assuming replicate nr is fixed and corresponds to **second** replicate
- 20 samples per visit
- 20-50 spp per visit

In [21]:
filename='UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx'
worksheet='Floristics'
cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])+1):
    print("%s :: %s / %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: None / Site
1 :: Species responses / Subquadrat #
2 :: None / Label
3 :: Type / Fauna (FA) or flora (FL).
4 :: Species code / Species code can be assigned by OEH, or see the reference worksheet.
5 :: Common Name / Common name
6 :: None / Species (edits in red)
7 :: None / CAPS #
8 :: None / resprout organ (epicormic,ligno, crown, basal, tuber,rhiz,stol)
9 :: None / seedbank type (canopy, soil, transient, other(not canopy)
10 :: None / # Live unburnt (no response to fire)
11 :: Adults / # resprouted & live
12 :: None /  # resprouted & died post-fire
13 :: None / # killed in fire
14 :: None / #  reproductive
15 :: Recruits / # live
16 :: None / # died post-fire
17 :: None / #  reproductive
18 :: None / notes


In [22]:

col_dict={'visit_id':0, 'sample_nr':1, 'fixed_replicate_nr':2,'species':6, 'spcode':4,  'resprout_organ':8, 'seedbank':9,
          'adults_unburnt':10,'resprouts_live':11,'resprouts_died':12,'resprouts_kill':13,
          'resprouts_reproductive':14,'recruits_live':15, 'recruits_died':16, 'recruits_reproductive':17,
          'notes':18,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)

Connecting to the PostgreSQL database...
0 rows updated
Database connection closed.
1634 valid records and 1 invalid records
Connecting to the PostgreSQL database...
1634 rows updated
Database connection closed.


### Robertson Rainforest

- Only one visit to SASrf1 and one to SAS002B
- Four samples per visit, 37 - 46 species
- Edited file, corrected one entry (SASrf2) to SASrf1 

In [23]:
worksheet='Floristics'
filename='RobertsonRF_data_bionet2.xlsx'

cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])+1):
    print("%s :: %s / %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: Updated 14/10/2019 / Entry Order
1 :: None / Site Number
2 :: None / Replicate
3 :: First Date / Date of sighting (dd/mm/yyyy hh:mm:ss).
4 :: Last Date / If more than 1 day (dd/mm/yyyy hh:mm:ss).
5 :: Sub plot / SubplotID
6 :: Type / Fauna (FA) or flora (FL).
7 :: Species code / Species code can be assigned by OEH, or see the reference worksheet.
8 :: Common Name / None
9 :: Scientific Name / None
10 :: Cover score / See reference worksheet for definitions
11 :: Abundance score / CV18A See reference worksheet for definitions
12 :: Stratum / See reference worksheet for definitions
13 :: Growth form / See reference worksheet for definitions
14 :: Height min / Flora only; height (in metres)
15 :: Height max / Flora only; height (in metres)
16 :: % Cover actual / None
17 :: Recovery organ / None
18 :: Seedbank / None
19 :: None / Count of unburnt individuals
20 :: Abund actual / Count of resprouting individuals.
21 :: None /  # resprouted & died post-fire
22 :: None / Count of fire-ki

In [24]:
col_dict={'visit_id':1, 'replicate_nr':2, 'date':3,
          'sample_nr':5, 'spcode':7, 'species':9,   
          'resprout_organ':17, 'seedbank':18,
          'adults_unburnt':19,'resprouts_live':20,'resprouts_died':21,'resprouts_kill':22,
          'resprouts_reproductive':23,'recruits_live':24, 'recruits_died':25, 'recruits_reproductive':26,
          'notes':32,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)

Connecting to the PostgreSQL database...
SASrf2 not found
0 rows updated
Database connection closed.
223 valid records and 1 invalid records
Connecting to the PostgreSQL database...
223 rows updated
Database connection closed.


### Newnes

- 20 quadrats per visit
- Visit information incomplete in many cases (no date, unclear which replicate)
- between 4 and 37 species per visit

In [25]:
filename='Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm'
worksheet='Floristics'

cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])+1):
    print("%s :: %s / %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: None / Site
1 :: Species responses / Quadrat #
2 :: None / Label
3 :: None / Census#
4 :: None / Date
5 :: None / Species
6 :: None / CAPS #
7 :: None / resprout organ (epicormic,ligno, crown, basal, tuber,rhiz,stol)
8 :: None / seedbank type (canopy, soil, transient, other(not canopy)
9 :: None / # Live unburnt (no response to fire)
10 :: Adults / # resprouted & live
11 :: None /  # resprouted & died post-fire
12 :: None / # killed in fire
13 :: None / #  reproductive
14 :: Recruits / # live
15 :: None / # died post-fire
16 :: None / #  reproductive
17 :: None / notes
18 :: None / total live
19 :: None / fire survial
20 :: None / seedling/adult
21 :: None / notes2
22 :: None / live present
23 :: None / None
24 :: None / None
25 :: None / None
26 :: None / None
27 :: None / None
28 :: None / None
29 :: None / None
30 :: None / None


In [26]:
col_dict={'visit_id':0, 'replicate_nr':3, 'date':4,
          'sample_nr':1, 'spcode':6, 'species':5,   
          'resprout_organ':7, 'seedbank':8,
          'adults_unburnt':9,'resprouts_live':10,'resprouts_died':11,'resprouts_kill':12,
          'resprouts_reproductive':13,'recruits_live':14, 'recruits_died':15, 'recruits_reproductive':16,
          'notes':17,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)

Connecting to the PostgreSQL database...
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS1 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
record for BS2 is incomplete
re
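
Logs like the one above repeat one line per incomplete record. A per-site tally is easier to scan. The sketch below counts messages with `collections.Counter`, assuming the messages have been collected in a list; the notebook itself prints them directly, so `log_lines` and its contents are hypothetical.

```python
from collections import Counter

# made-up log lines in the same format as the output above
log_lines = (["record for BS1 is incomplete"] * 3
             + ["record for BS2 is incomplete"] * 2)

prefix, suffix = "record for ", " is incomplete"
# keep only the site label between the fixed prefix and suffix
incomplete_by_site = Counter(
    line[len(prefix):-len(suffix)]
    for line in log_lines
    if line.startswith(prefix) and line.endswith(suffix)
)
for site, n in incomplete_by_site.items():
    print("%s: %s incomplete records" % (site, n))
```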

### Yatteyattah

- sites SCCJB14 and MIL012B not found
- no information for SCCJB13 and SCCJB37-Near
- 4 visits with one sample per visit, 30 to 78 species

In [27]:
worksheet='Floristics'
filename='UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx'

cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])+1):
    print("%s :: %s / %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: Updated 14/10/2019 / Entry Order
1 :: None / Site Number
2 :: None / Replicate
3 :: First Date / Date of sighting (dd/mm/yyyy hh:mm:ss).
4 :: Last Date / If more than 1 day (dd/mm/yyyy hh:mm:ss).
5 :: Sub plot / SubplotID
6 :: Type / Fauna (FA) or flora (FL).
7 :: Species code / Species code can be assigned by OEH, or see the reference worksheet.
8 :: Common Name / None
9 :: Scientific Name / Species
10 :: Cover score / None
11 :: Abundance score / CV18A See reference worksheet for definitions
12 :: Abundance score / CV18A See reference worksheet for definitions
13 :: Stratum / See reference worksheet for definitions
14 :: Growth form / See reference worksheet for definitions
15 :: Height min / Flora only; height (in metres)
16 :: Height max / Flora only; height (in metres)
17 :: % Cover actual / None
18 :: Recovery organ / resprout organ (epicormic,ligno, crown, basal, tuber,rhiz,stol)
19 :: Seedbank / seedbank type (canopy, soil, transient, other(not canopy)
20 :: Adults / Count

In [28]:
col_dict={'visit_id':1, 'replicate_nr':2, 'date':3,
          'sample_nr':5, 'spcode':7, 'species':9,   
          'resprout_organ':18, 'seedbank':19,
          'adults_unburnt':20,'resprouts_live':21,'resprouts_died':22,'resprouts_kill':23,
          'resprouts_reproductive':24,'recruits_live':25, 'recruits_died':26, 'recruits_reproductive':27,
          'notes':33,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)


Connecting to the PostgreSQL database...
SCCJB14 not found
0 rows updated
Database connection closed.
612 valid records and 3 invalid records
Connecting to the PostgreSQL database...
612 rows updated
Database connection closed.


### Rainforest in NE NSW / SE Qld

- Original did not have visit_date or replicate nr.
- Edited file to add replicate nr (nr. 2 for BFEH_1_UNSW and BFEH_4_UNSW, nr 1 for all others)

Several updates to the script were needed to finally make it work.

In [29]:
filename='UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx'
worksheet='Floristics'

cols=wbindex[filename][worksheet]
for k in range(1,len(cols[0])):
    print("%s :: %s // %s" % (k-1,cols[0][k-1],cols[1][k-1]))


0 :: None // Site
1 :: Species responses // Subquadrat #
2 :: None // replicate
3 :: None // Label
4 :: Type // Fauna (FA) or flora (FL).
5 :: Species code // Species code can be assigned by OEH, or see the reference worksheet.
6 :: Common Name // Common name
7 :: None // Species (edits in red)
8 :: None // CAPS #
9 :: None // resprout organ (epicormic,ligno, crown, basal, tuber,rhiz,stol)
10 :: None // seedbank type (canopy, soil, transient, other(not canopy)
11 :: None // # Live unburnt (no response to fire)
12 :: Adults // # resprouted & live
13 :: None //  # resprouted & died post-fire
14 :: None // # killed in fire
15 :: None // #  reproductive
16 :: Recruits // # live
17 :: None // # died post-fire
18 :: None // #  reproductive


In [30]:
# does not have visit_date or replicate nr, 
# REMEMBER to add one column to Floristics table for the replicate nr
col_dict={'visit_id':0, 'sample_nr':1, 'replicate_nr':2,'species':7, 'spcode':5,  'resprout_organ':9, 'seedbank':10,
          'adults_unburnt':11,'resprouts_live':12,'resprouts_died':13,'resprouts_kill':14,
          'resprouts_reproductive':15,'recruits_live':16, 'recruits_died':17, 'recruits_reproductive':18,
          'notes':19,'workbook':filename,'worksheet':worksheet}

read_and_import_species_data(filepath=inputdir,
                             workbook=filename,
                             worksheet=worksheet,
                             col_dictionary=col_dict,
                             valid_seedbank=seedbank_vocab,
                             valid_organ=organ_vocab)


Connecting to the PostgreSQL database...
record for HERN_1_UNSW is incomplete
record for HERN_1_UNSW is incomplete
record for HERN_1_UNSW is incomplete
record for HERN_1_UNSW is incomplete
record for HERN_1_UNSW is incomplete
record for HERN_1_UNSW is incomplete
record for FTR_1_UNSW is incomplete
record for FTR_1_UNSW is incomplete
record for FTR_1_UNSW is incomplete
record for FTR_1_UNSW is incomplete
record for FTR_1_UNSW is incomplete
record for FTR_1_UNSW is incomplete
record for FTR_2_UNSW is incomplete
record for FTR_2_UNSW is incomplete
record for FTR_2_UNSW is incomplete
record for FTR_2_UNSW is incomplete
record for FTR_2_UNSW is incomplete
record for FTR_2_UNSW is incomplete
record for CH_1_UNSW is incomplete
record for CH_1_UNSW is incomplete
record for CH_1_UNSW is incomplete
record for CH_1_UNSW is incomplete
record for CH_1_UNSW is incomplete
record for CH_1_UNSW is incomplete
record for CH_2_UNSW is incomplete
record for CH_2_UNSW is incomplete
record for CH_2_UNSW is i

## That is it for now!

✅ Job done! 😎👌🔥

You can:
- go [back home](../Instructions-and-workflow.ipynb),
- continue navigating the repo on [GitHub](https://github.com/ces-unsw-edu-au/fireveg-db-exports),
- continue exploring the repo on [OSF](https://osf.io/h96q2/),
- visit the database at <http://fireecologyplants.net>.