# Fireveg DB imports -- import field work forms

Author: [José R. Ferrer-Paris](https://github.com/jrfep)

Date: February 2022, updated 19 August 2024

This Jupyter Notebook includes [Python](https://www.python.org) code to:
- Read data from spreadsheets with field-work data
- Create records for data import into the database
- Insert or update records in the database

This notebook adds fire intensity and vegetation structure data to the existing field visits.

**Please note:**
<div class="alert alert-warning">
    This repository contains code that is intended for internal project management and is documented for the sake of reproducibility.<br/>
    🛂 Only users contributing directly to the project have access to the credentials for data download/upload. 
</div>


## Set-up
Load libraries 

In [1]:
import openpyxl
from pathlib import Path
import os,sys
from datetime import datetime
from configparser import ConfigParser
import psycopg2
from psycopg2.extras import DictCursor
from psycopg2.extensions import AsIs
import re
#import postgis
import pandas as pd
#import copy

import pyprojroot

### Define paths for input and output

Define project directory using the `pyprojroot` functions, and add this to the execution path.

In [2]:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))
sys.path.append(str(repodir))

Define path to workbooks

In [3]:
inputdir = repodir / "data" / "input-field-form"

### Load own functions

Load functions from the `lib` folder. We use one function to read the database credentials, functions from module `fireveg` to read the data and create records, and functions from module `firevegdb` to execute the SQL insert or update queries.

In [4]:
from lib.parseparams import read_dbparams
from lib.firevegdb import dbquery, batch_upsert
import lib.fireveg as fv

### Database credentials

🤫 We use a folder named "secrets" to keep the credentials for connection to different services (database credentials, API keys, etc). We listed this folder in our `.gitignore` so that its contents are not tracked by git and not exposed. Future users need to copy the contents of this folder manually.

We read database credentials stored in a `database.ini` file using our own `read_dbparams` function.

In [5]:
dbparams = read_dbparams(repodir / 'secrets' / 'database.ini', 
                         section='fireveg-db-v1.1')
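
The code of `read_dbparams` is not shown in this notebook. As a rough sketch only, a function of this kind could be written with `ConfigParser` as below. The name `read_dbparams_sketch`, the keys in the file, and all values are made up for illustration and are not the project's actual implementation or credentials.

```python
from configparser import ConfigParser
from pathlib import Path
import tempfile


def read_dbparams_sketch(filename, section):
    """Hypothetical stand-in for lib.parseparams.read_dbparams."""
    parser = ConfigParser()
    # ConfigParser.read returns the list of files it managed to parse
    if not parser.read(str(filename)):
        raise FileNotFoundError(f"Cannot read {filename}")
    if not parser.has_section(section):
        raise KeyError(f"Section {section} not found in {filename}")
    # Return the key/value pairs of the section as a plain dictionary
    return dict(parser.items(section))


# Usage with a throw-away file and made-up values
with tempfile.TemporaryDirectory() as tmpdir:
    inifile = Path(tmpdir) / "database.ini"
    inifile.write_text(
        "[fireveg-db-v1.1]\n"
        "host=localhost\n"
        "database=example_db\n"
        "user=example_user\n"
    )
    example_params = read_dbparams_sketch(inifile, section="fireveg-db-v1.1")

print(example_params)
```

A dictionary of this shape can be unpacked into `psycopg2.connect(**dbparams)`, which is how `dbparams` is used at the end of this notebook.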

## Read workbooks
Each spreadsheet has a slightly different structure, so these scripts have to be adapted for each case.

### List of workbooks/spreadsheets in directory

In [8]:
avail_files = os.listdir(inputdir)
#avail_files

In [9]:
valid_files = ['SthnNSWRF_data_bionet2.xlsx',
               'UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx',
               'UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx',
               'UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx',
               'UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx',
               'UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx',
               'RobertsonRF_data_bionet2.xlsx',
               'Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm']

In [10]:
for ff in valid_files:
    print(ff in avail_files)

True
True
True
True
True
True
True
True
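
Printing one `True` per file works for a short list, but it does not say which file is missing. A small sketch of an alternative check, using a hypothetical helper `missing_files` and made-up file names:

```python
def missing_files(expected, available):
    """Return the expected file names that are absent from `available`."""
    available = set(available)
    return [name for name in expected if name not in available]


# Made-up file names for illustration
example_available = ["a.xlsx", "b.xlsx", "notes.txt"]
example_expected = ["a.xlsx", "b.xlsx", "c.xlsm"]
example_missing = missing_files(example_expected, example_available)
print(example_missing)
```

In this notebook the call would be `missing_files(valid_files, avail_files)`, and an empty list means all workbooks are in place.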


Here we create an index of worksheets and column headers for each file

In [11]:
wbindex=dict()
for workbook_name in valid_files:
    inputfile=inputdir / workbook_name
    # using data_only=True to get the calculated cell values
    wb = openpyxl.load_workbook(inputfile,data_only=True)
    wbindex[workbook_name]=dict()
    for ws in wb.worksheets:
        # first list holds the headers in row 1, second list the values in row 2
        wbindex[workbook_name][ws.title]=[list(),list()]
        # NOTE: range(1,ws.max_column) stops one column short of the last one;
        # use ws.max_column+1 to index every column
        for k in range(1,ws.max_column):
            wbindex[workbook_name][ws.title][0].append(ws.cell(row=1,column=k).value)
            wbindex[workbook_name][ws.title][1].append(ws.cell(row=2,column=k).value)
        

In [12]:
wbindex['SthnNSWRF_data_bionet2.xlsx'].keys()

dict_keys(['Site', 'Fire', 'Structure', 'Floristics', 'Reference', 'Info', 'Sheet1'])

### Fire intensity

We want to add the information on the fire intensity. This is recorded in the worksheet 'Fire' next to the fire history. We have to add a column to the worksheet for the date of the sampling (as the last column on the right), otherwise we won't be able to match this information to the sampling visit.

We will use the variables:
- 'scorch height' in m, 
- 'tree foliage biomass consumed' in %,
- 'shrub foliage biomass consumed' in %,
- 'ground foliage biomass consumed' in %,
- 'tree foliage scorch' in %,
- 'shrub foliage scorch' in %,
- 'herb foliage scorch' in %




#### Southern NSW Rainforest
Manual updates: add a visit_date column at the end of the table

In [13]:
filename='SthnNSWRF_data_bionet2.xlsx'
worksheet='Fire'
wbindex[filename][worksheet][0]

['Site',
 'Replicate',
 'Date of last fire dd/mm/yyyy',
 'Date of penultimate fire',
 'Date of earlier fire',
 'How date inferred1',
 'How date inferred2',
 'How date inferred3',
 'Ignition cause1',
 'Ignition cause2',
 'Ignition cause3',
 'Scorch hgt (m) min',
 'Scorch hgt (m) mas',
 'Scorch hgt (m) mode',
 '% Tree foliage scorch',
 "% Tree foliage c'sume",
 '% Shb foliage scorch',
 "% Shb foliage c'sume",
 '% Herb layer foliage scorch',
 "% Herb layer foliage c'sume",
 'Twig diam (mm) 1',
 'Twig diam (mm) 2',
 'Twig diam (mm) 3',
 'Twig diam (mm) 4',
 'Twig diam (mm) 5',
 'Twig diam (mm) 6',
 'Twig diam (mm) 7',
 'Twig diam (mm) 8',
 'Twig diam (mm) 9',
 'Twig diam (mm) 10',
 'Peat depth burnt (cm)',
 'Peat extent burnt %quad',
 'Peat extent unburnt %quad']

In [14]:
len(wbindex[filename][worksheet][0])

33

In [15]:
col_def={'visit_id':1, 'visit_date':34, 'scorch height':(14,12,13),
         'tree foliage biomass consumed':(16,), 
         'shrub foliage biomass consumed':(18,), 
         'ground foliage biomass consumed':(20,),
        'tree foliage scorch':(15,), 
         'shrub foliage scorch':(17,), 
         'herb foliage scorch':(19,)}

In [16]:
records=fv.read_fire_intensity(inputdir,filename,worksheet,col_def)

In [17]:
records[1]

{'visit_id': 'UppClyde1',
 'visit_date': datetime.date(2021, 11, 29),
 'measured_var': 'tree foliage biomass consumed',
 'units': '%',
 'best': 0}
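
The code of `fv.read_fire_intensity` is not shown here. Judging from `col_def` and the records it returns, each tuple seems to list 1-based column numbers in the order (best, lower, upper), or just (best,). The sketch below illustrates that reading with a hypothetical function `row_to_estimates` and a made-up row. The unit lookup and the skipping of empty cells are assumptions.

```python
from datetime import date

# Assumed units: metres for scorch height, percent for everything else
UNITS = {"scorch height": "m"}


def row_to_estimates(row, col_def):
    """Hypothetical sketch: turn one worksheet row into estimate records.

    `row` is a list of cell values and `col_def` uses 1-based column numbers.
    Tuples are read as (best,) or (best, lower, upper).
    """
    def cell(col):
        return row[col - 1]

    records = []
    for var, cols in col_def.items():
        if var in ("visit_id", "visit_date"):
            continue
        values = [cell(col) for col in cols]
        if values[0] is None:
            continue  # assumption: variables without a best estimate are skipped
        record = {
            "visit_id": cell(col_def["visit_id"]),
            "visit_date": cell(col_def["visit_date"]),
            "measured_var": var,
            "units": UNITS.get(var, "%"),
            "best": values[0],
        }
        if len(values) == 3:
            record["lower"], record["upper"] = values[1], values[2]
        records.append(record)
    return records


# Made-up row: site, date, scorch min, scorch max, scorch mode, tree foliage consumed
example_row = ["SiteA", date(2021, 2, 3), 2, 10, 5, 10]
example_col_def = {"visit_id": 1, "visit_date": 2,
                   "scorch height": (5, 3, 4),
                   "tree foliage biomass consumed": (6,)}
example_records = row_to_estimates(example_row, example_col_def)
print(example_records)
```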

In [18]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
35 rows updated
Database connection closed.
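
`batch_upsert` lives in `lib.firevegdb` and its SQL is not shown in this notebook. As an illustration of the general idea only, an upsert in PostgreSQL can be written with `INSERT ... ON CONFLICT ... DO UPDATE`. The helper `build_upsert_sql` below is hypothetical, and its behaviour when `idx` is `None` is an assumption that may differ from the real function.

```python
def build_upsert_sql(table, record, keycol, idx=None):
    """Hypothetical sketch of an upsert statement for one record.

    Identifiers are interpolated directly, so this is only suitable for
    trusted table and column names.
    """
    columns = list(record.keys())
    placeholders = ", ".join(["%s"] * len(columns))
    # Columns that are not part of the key are overwritten on conflict
    updates = [col for col in columns if col not in keycol]
    if idx is not None:
        target = f"ON CONSTRAINT {idx}"
    else:
        # Assumption: fall back to the key columns as conflict target
        target = "(" + ", ".join(keycol) + ")"
    if updates:
        action = "DO UPDATE SET " + ", ".join(
            f"{col} = EXCLUDED.{col}" for col in updates)
    else:
        action = "DO NOTHING"
    sql = (f"INSERT INTO {table} ({', '.join(columns)}) "
           f"VALUES ({placeholders}) "
           f"ON CONFLICT {target} {action}")
    return sql, [record[col] for col in columns]


example_sql, example_values = build_upsert_sql(
    "form.field_visit_vegetation_estimates",
    {"visit_id": "SiteA", "visit_date": "2021-02-03",
     "measured_var": "scorch height", "best": 5},
    keycol=("visit_id", "visit_date", "measured_var"),
    idx="field_visit_vegetation_estimates_pkey",
)
print(example_sql)
```

The statement and the list of values would then be passed together to `cursor.execute`, so that the values are never pasted into the SQL text.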


#### Upland Basalt
Added a column for `Visit date` from the `Site` worksheet, changed date format to match the day/month/year format

In [19]:
filename='UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx'
#wbindex[filename][worksheet][0]

In [20]:
col_def={'visit_id':1, 'visit_date':3, 'scorch height':(15,13,14),
         'tree foliage biomass consumed':(17,), 
         'shrub foliage biomass consumed':(19,), 
         'ground foliage biomass consumed':(21,),
        'tree foliage scorch':(16,), 
         'shrub foliage scorch':(18,), 
         'herb foliage scorch':(20,)}

In [21]:
records=fv.read_fire_intensity(inputdir,filename,worksheet,col_def)

In [22]:
records[0:2]

[{'visit_id': 'CRC09B7U',
  'visit_date': datetime.date(2021, 2, 3),
  'measured_var': 'scorch height',
  'units': 'm',
  'best': 5,
  'lower': 2,
  'upper': 10},
 {'visit_id': 'CRC09B7U',
  'visit_date': datetime.date(2021, 2, 3),
  'measured_var': 'tree foliage biomass consumed',
  'units': '%',
  'best': 10}]

In [23]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)



Connecting to the PostgreSQL database...
196 rows updated
Database connection closed.


This one also has twig diameter measurements:

In [24]:
col_def={'visit_id':1, 'visit_date':3, 'twig diameter':range(23,33)}

In [25]:
records=fv.read_twig_diameters(inputdir,filename,worksheet,col_def)

In [26]:
len(records)

109

In [27]:
records[0]

{'visit_id': 'CRC09B7U',
 'visit_date': datetime.date(2021, 2, 3),
 'measured_variable': 'twig diameter',
 'units': 'mm',
 'single_value': 0.45}
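
Each twig measurement becomes its own record, so one row of the worksheet can yield up to ten records. A minimal sketch of that unpacking, assuming empty cells are skipped; the function `row_to_twig_records` and the row are made up, and the real `fv.read_twig_diameters` may work differently.

```python
from datetime import date


def row_to_twig_records(row, col_def):
    """Hypothetical sketch: one record per non-empty twig diameter cell."""
    records = []
    for col in col_def["twig diameter"]:
        value = row[col - 1]  # col_def uses 1-based column numbers
        if value is None:
            continue  # assumption: empty cells produce no record
        records.append({
            "visit_id": row[col_def["visit_id"] - 1],
            "visit_date": row[col_def["visit_date"] - 1],
            "measured_variable": "twig diameter",
            "units": "mm",
            "single_value": value,
        })
    return records


# Made-up row with three twig columns, one of them empty
example_twig_row = ["SiteA", date(2021, 2, 3), 0.45, None, 0.6]
example_twig_def = {"visit_id": 1, "visit_date": 2, "twig diameter": range(3, 6)}
example_twigs = row_to_twig_records(example_twig_row, example_twig_def)
print(len(example_twigs))
```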

In [28]:
batch_upsert(dbparams,"form.field_visit_vegetation_raw_values",records,
             keycol=('visit_id','visit_date','measured_variable'), 
             idx=None,execute=True)


Connecting to the PostgreSQL database...
109 rows updated
Database connection closed.


#### NE NSW / SE Qld Rainforest
Added a column for date of sampling and changed date format

In [29]:
filename=valid_files[2]
print(filename)
worksheet='Fire'
wbindex[filename][worksheet][0]

UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx


['Site',
 'Replicate',
 'Date of samping',
 'Date of last fire dd/mm/yyyy',
 'Date of penultimate fire',
 'Date of earlier fire',
 'How date inferred1',
 'How date inferred2',
 'How date inferred3',
 'Ignition cause1',
 'Ignition cause2',
 'Ignition cause3',
 'Scorch hgt (m)',
 '% Tree foliage scorch',
 "% Tree foliage c'sume",
 '% Shb foliage scorch',
 "% Shb foliage c'sume",
 '% Herb layer foliage scorch',
 "% Herb layer foliage c'sume",
 'Twig diam (mm) 1',
 'Twig diam (mm) 2',
 'Twig diam (mm) 3',
 'Twig diam (mm) 4',
 'Twig diam (mm) 5',
 'Twig diam (mm) 6',
 'Twig diam (mm) 7',
 'Twig diam (mm) 8',
 'Twig diam (mm) 9',
 'Twig diam (mm) 10',
 'Peat depth burnt (cm)',
 'Peat extent burnt %quad']

In [30]:
col_def={'visit_id':1, 'visit_date':3, 'scorch height':(13,),
         'tree foliage biomass consumed':(15,), 
         'shrub foliage biomass consumed':(17,), 
         'ground foliage biomass consumed':(19,),
        'tree foliage scorch':(14,), 
         'shrub foliage scorch':(16,), 
         'herb foliage scorch':(18,),
        'peat extent burnt':(31,),
        'peat depth burnt':(30,)}

In [31]:
records=fv.read_fire_intensity(inputdir,filename,worksheet,col_def)

In [32]:
len(records)
#records[0:2]

153

In [33]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)

Connecting to the PostgreSQL database...
153 rows updated
Database connection closed.


In [34]:
col_def={'visit_id':1, 'visit_date':3, 'twig diameter':range(20,30)}
records=fv.read_twig_diameters(inputdir,filename,worksheet,col_def)

In [35]:
len(records)

160

In [36]:
batch_upsert(dbparams,"form.field_visit_vegetation_raw_values",records,
             keycol=('visit_id','visit_date','measured_variable'), 
             idx=None,execute=True)

Connecting to the PostgreSQL database...
160 rows updated
Database connection closed.


#### Yatteyattah
Added a column for the visit date and changed the date format.

In [37]:
filename='UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx'
wbindex[filename][worksheet][0]

['Site',
 'Date of last fire dd/mm/yyyy',
 'Date of penultimate fire',
 'Date of earlier fire',
 'How date inferred1',
 'How date inferred2',
 'How date inferred3',
 'Ignition cause1',
 'Ignition cause2',
 'Ignition cause3',
 'Scorch hgt (m)',
 '% Tree foliage scorch',
 "% Tree foliage c'sume",
 '% Shb foliage scorch',
 "% Shb foliage c'sume",
 '% Herb layer foliage scorch',
 "% Herb layer foliage c'sume",
 'Twig diam (mm) 1',
 'Twig diam (mm) 2',
 'Twig diam (mm) 3',
 'Twig diam (mm) 4',
 'Twig diam (mm) 5',
 'Twig diam (mm) 6',
 'Twig diam (mm) 7',
 'Twig diam (mm) 8',
 'Twig diam (mm) 9',
 'Twig diam (mm) 10',
 'Peat depth burnt (cm)',
 'Peat extent burnt %quad']

In [38]:
col_def={'visit_id':1, 'visit_date':2, 'scorch height':(12,),
         'tree foliage biomass consumed':(14,), 
         'shrub foliage biomass consumed':(16,), 
         'ground foliage biomass consumed':(18,),
        'tree foliage scorch':(13,), 
         'shrub foliage scorch':(15,), 
         'herb foliage scorch':(17,)}

In [39]:
col_def

{'visit_id': 1,
 'visit_date': 2,
 'scorch height': (12,),
 'tree foliage biomass consumed': (14,),
 'shrub foliage biomass consumed': (16,),
 'ground foliage biomass consumed': (18,),
 'tree foliage scorch': (13,),
 'shrub foliage scorch': (15,),
 'herb foliage scorch': (17,)}

In [40]:
records=fv.read_fire_intensity(inputdir,filename,worksheet,col_def)

We need to add some extra sites and visits first to avoid errors when importing these records. These additions should be reviewed later.

In [41]:
add_visits = [{'visit_id': 'SCCJB37-Near', 
 'visit_date': '2021-01-01',
 'replicate_nr': 3,
 'mainobserver': 12,
 'survey_name': 'Yatteyattah'
},{'visit_id': 'SCCJB', 
 'visit_date': '2021-01-01',
 'survey_name': 'Yatteyattah'
},{'visit_id': 'SZ23101', 
 'visit_date': '2021-01-04',
   'replicate_nr': 1,
 'mainobserver': 12,
 'survey_name': 'Yatteyattah'
},{'visit_id': 'MR_N2', 
 'visit_date': '2021-01-04',
   'replicate_nr': 1,
 'mainobserver': 12,
 'survey_name': 'Yatteyattah'
},{'visit_id': 'MR_N4', 
 'visit_date': '2021-01-04',
   'replicate_nr': 1,
 'mainobserver': 12,
 'survey_name': 'Yatteyattah'
},{'visit_id': 'SCCRR25', 
 'visit_date': '2021-01-04',
   'replicate_nr': 1,
 'mainobserver': 12,
 'survey_name': 'Yatteyattah'
},{'visit_id': 'MIL012B', 
 'visit_date': '2021-01-04',
   'replicate_nr': 1,
 'mainobserver': 12,
 'survey_name': 'Yatteyattah'
}]

add_sites = [{'site_label': 'SCCJB'},]

In [42]:
batch_upsert(dbparams,"form.field_site",add_sites,
             keycol=('site_label',), 
             idx=None,execute=True)
batch_upsert(dbparams,"form.field_visit",add_visits,
             keycol=('visit_id','visit_date'), 
             idx='field_visit_pkey',execute=True)


Connecting to the PostgreSQL database...
0 rows updated
Database connection closed.
Connecting to the PostgreSQL database...
7 rows updated
Database connection closed.


In [43]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)

Connecting to the PostgreSQL database...
49 rows updated
Database connection closed.


In [45]:
col_def={'visit_id':1, 'visit_date':2, 'twig diameter':range(19,29)}
records=fv.read_twig_diameters(inputdir,filename,worksheet,col_def)

In [46]:
len(records)

63

In [47]:
batch_upsert(dbparams,"form.field_visit_vegetation_raw_values",records,
             keycol=('visit_id','visit_date','measured_variable'), 
             idx=None,execute=True)

Connecting to the PostgreSQL database...
63 rows updated
Database connection closed.


#### Alpine Ash
Information about fire history and fire intensity is incomplete in this file.

In [48]:
valid_files[4]

'UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx'

#### Alpine Bogs

In [49]:
filename=valid_files[5]
print(filename)
worksheet='Fire'

UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx


In [50]:
col_def={'visit_id':1, 'visit_date':3, 
         'shrub foliage biomass consumed':(17,), 
         'ground foliage biomass consumed':(19,),
        'peat extent burnt':(31,),
        'peat depth burnt':(30,)}

In [51]:
records=fv.read_fire_intensity(inputdir,filename,worksheet,col_def)

In [52]:
len(records)

24

In [53]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
24 rows updated
Database connection closed.


In [54]:
col_def={'visit_id':1, 'visit_date':3, 'twig diameter':range(20,30)}
records=fv.read_twig_diameters(inputdir,filename,worksheet,col_def)

In [55]:
len(records)
batch_upsert(dbparams,"form.field_visit_vegetation_raw_values",records,
             keycol=('visit_id','visit_date','measured_variable'), 
             idx=None,execute=True)

Connecting to the PostgreSQL database...
60 rows updated
Database connection closed.


#### Robertson RF

In [56]:
filename=valid_files[6]
print(filename)


RobertsonRF_data_bionet2.xlsx


In [57]:
col_def={'visit_id':1, 'visit_date':3, 'scorch height':(15,13,14),
         'tree foliage biomass consumed':(17,), 
         'shrub foliage biomass consumed':(19,), 
         'ground foliage biomass consumed':(21,),
        'tree foliage scorch':(16,), 
         'shrub foliage scorch':(18,), 
         'herb foliage scorch':(20,),
         'peat extent burnt':(33,),
        'peat depth burnt':(32,)}

In [58]:
records=fv.read_fire_intensity(inputdir,filename,worksheet,col_def)

In [59]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
18 rows updated
Database connection closed.


In [60]:
col_def={'visit_id':1, 'visit_date':3, 'twig diameter':range(22,32)}
records=fv.read_twig_diameters(inputdir,filename,worksheet,col_def)

In [61]:
len(records)
batch_upsert(dbparams,"form.field_visit_vegetation_raw_values",records,
             keycol=('visit_id','visit_date','measured_variable'), 
             idx=None,execute=True)

Connecting to the PostgreSQL database...
20 rows updated
Database connection closed.


### Read valid vegetation classes from spreadsheet

David Keith provided this spreadsheet in April 2022 as a vocabulary for the vegetation classes and formations.

In [68]:
vegclass = pd.read_excel(repodir / "data" / "NSWmap_v3_key3.xlsx")

In [69]:
records=list()
for row in vegclass.values:
    records.append({'vegetation_formation':row[2],'vegetation_class':row[1],'mapunitno':row[0],'formnum':row[3]})
    
batch_upsert(dbparams,
             "vegetation.nsw_units",
             records,
             keycol=('vegetation_class','vegetation_formation'), 
             idx=None,execute=True)


Connecting to the PostgreSQL database...
107 rows updated
Database connection closed.


### Vegetation formation and structure
For each survey we import both the vegetation formation and class and, where available, the vegetation structure.


#### Southern Rainforest

Let's first add the vegetation formation information:

In [71]:
wbindex['SthnNSWRF_data_bionet2.xlsx']['Site'][0][40:]

['NSW TEC', 'variant', 'Vegetation formation', 'Vegegtation class']

In [72]:
filename='SthnNSWRF_data_bionet2.xlsx'
worksheet='Site'
records = fv.read_veg_classes(inputdir,filename,worksheet,{'visit_id':1,'visit_date':4,'vegetation_formation':43,'vegetation_class':44})
len(records)

5

In [73]:
batch_upsert(dbparams,"form.field_visit_veg_description",
             records,keycol=('visit_id','visit_date'), idx=None,execute=True)


Connecting to the PostgreSQL database...
5 rows updated
Database connection closed.


Now we can upload the vegetation structure data

In [74]:
filename='SthnNSWRF_data_bionet2.xlsx'
worksheet='Structure'
wbindex[filename][worksheet][0]

['SiteNo',
 'Replicate',
 'Date',
 'Stage',
 'Stratum',
 'LowerHeight',
 'UpperHeight',
 'ModalHeight',
 'PercentCover',
 'Dominant1',
 'Dominant2']

In [75]:
col_def={'visit_id':1, 'visit_date':3, 'stage':4, 'stratum':5, 'height':(8,6,7),'cover':(9,)}

In [77]:
records=fv.read_veg_structure(inputdir,filename,worksheet,col_def)

In [78]:
len(records)
records[36:38]

[{'visit_id': 'MaxwellsCk',
  'visit_date': datetime.date(2021, 12, 2),
  'comment': ['Stage: inferred prefire',
   'upper bound given as 20 but less than best estimate'],
  'measured_var': 'stratum T height',
  'best': 30,
  'lower': 12,
  'upper': 30},
 {'visit_id': 'MaxwellsCk',
  'visit_date': datetime.date(2021, 12, 2),
  'comment': ['Stage: inferred prefire'],
  'measured_var': 'stratum T cover',
  'best': 70}]

In [79]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
72 rows updated
Database connection closed.


#### KNP Alp Ash

In [80]:
filename='UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx'
worksheet='Site'
wbindex[filename][worksheet][0][40:48]


['NSW TEC',
 'variant',
 'Vegetation formation',
 'Vegegtation class',
 'NSW PCT',
 None,
 None,
 None]

In [82]:
records = fv.read_veg_classes(inputdir,filename,worksheet,
                              {'visit_id':1,'visit_date':4,'vegetation_formation':43,'vegetation_class':44})


In [83]:
len(records)

8

In [84]:
batch_upsert(dbparams,"form.field_visit_veg_description",
             records,keycol=('visit_id','visit_date'), idx=None,execute=True)


Connecting to the PostgreSQL database...
8 rows updated
Database connection closed.


In [86]:
worksheet='Structure'
col_def={'visit_id':1, 'visit_date':3, 'stage':4, 'stratum':5, 'height':(8,6,7),'cover':(9,)}
records=fv.read_veg_structure(inputdir,filename,worksheet,col_def)
len(records)

110

In [87]:
for record in records:
    print(record['measured_var'])

stratum T1 height
stratum T1 cover
stratum T2 height
stratum T2 cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T1 height
stratum T1 cover
stratum T2 height
stratum T2 cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T1 height
stratum T1 cover
stratum T2 height
stratum T2 cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T1 height
stratum T1 cover
stratum T2 height
stratum T2 cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T height
stratum T cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T height
stratum T cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T1 height
stratum T1 cover
stratum T2 height
stratum T2 cover
stratum M1 height
stratum M1 cover
stratum L1 height
stratum L1 cover
stratum T1 height
stratum T1 cover
stratum T2 height
stratum T2 cover
stratum M1 height
stratu

In [89]:
all_visits = dbquery("select distinct visit_id,visit_date FROM form.field_visit",
        dbparams)

In [90]:
valid_records=list()
for record in records:
    p=filter(lambda n: n['visit_id'] == record['visit_id'] and  n['visit_date'] == record['visit_date'], all_visits)
    found=list(p)
    if len(found) == 0:
        print(record)
    else:
        valid_records.append(record)

{'visit_id': 'AlpAsh_69', 'visit_date': datetime.date(2021, 4, 14), 'comment': ['Stage: inferred prefire'], 'measured_var': 'stratum T1 height', 'best': 40, 'lower': 40, 'upper': 40}
{'visit_id': 'AlpAsh_69', 'visit_date': datetime.date(2021, 4, 14), 'comment': ['Stage: inferred prefire'], 'measured_var': 'stratum T1 cover', 'best': 3}
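
The check above scans `all_visits` once for every record. A possible alternative, sketched here with made-up data and a hypothetical helper `split_by_known_visits`, builds a set of (visit_id, visit_date) pairs once and keeps the unmatched records for later review instead of only printing them.

```python
from datetime import date


def split_by_known_visits(records, known_visits):
    """Split records into those with and without a matching visit."""
    known = {(v["visit_id"], v["visit_date"]) for v in known_visits}
    valid, orphaned = [], []
    for record in records:
        key = (record["visit_id"], record["visit_date"])
        (valid if key in known else orphaned).append(record)
    return valid, orphaned


# Made-up visits and records for illustration
example_visits = [{"visit_id": "SiteA", "visit_date": date(2021, 4, 13)}]
example_input = [
    {"visit_id": "SiteA", "visit_date": date(2021, 4, 13), "best": 40},
    {"visit_id": "SiteB", "visit_date": date(2021, 4, 14), "best": 3},
]
example_valid, example_orphaned = split_by_known_visits(example_input, example_visits)
print(len(example_valid), len(example_orphaned))
```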


In [91]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",valid_records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
108 rows updated
Database connection closed.


#### Alpine bogs
Here we have information about vegetation formations.

In [92]:
filename='UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx'
worksheet='Site'

wbindex[filename][worksheet][0][40:48]

['NSW TEC',
 'variant',
 'Vegetation formation',
 'Vegegtation class',
 'NSW PCT',
 None,
 None,
 None]

In [93]:
records = fv.read_veg_classes(inputdir,filename,worksheet,{'visit_id':1,'visit_date':4,'vegetation_formation':43,'vegetation_class':44})


In [94]:
len(records)

6

In [95]:
batch_upsert(dbparams,"form.field_visit_veg_description",records,keycol=('visit_id','visit_date'), idx=None,execute=True)


Connecting to the PostgreSQL database...
6 rows updated
Database connection closed.


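The vegetation structure table in this workbook is in wide format, with one column per stage, stratum and measure, so it needs to be reformatted before it can be added to the database.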

In [98]:
worksheet='VegStructure'

In [102]:
wbindex[filename][worksheet][1]

['Site',
 'Replicate',
 'Prefire treehgt lower',
 'Prefire treehgt upper',
 'Prefire treehgt mode',
 'Prefire treecov',
 'Prefire shbhgt lower',
 'Prefire shbhgt upper',
 'Prefire shbhgt mode',
 'Prefire shbcov',
 'Prefire hrbhgt lower',
 'Prefire hrbhgt upper',
 'Prefire hrbhgt mode',
 'Prefire hrbcov',
 'Prefire moss hgt',
 'Prefire moss cover',
 'Postfire treehgt lower',
 'Postfire treehgt upper',
 'Postfire treehgt mode',
 'Postfire treecov',
 'Postfire shbhgt lower',
 'Postfire shbhgt upper',
 'Postfire shbhgt mode',
 'Postfire shbcov',
 'Postfire hrbhgt lower',
 'Postfire hrbhgt upper',
 'Postfire hrbhgt mode',
 'Postfire hrbcov',
 'postfire moss ht']
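
One way to approach the reformatting is to parse the column headers into stage, stratum and measure, and to group the values into one record per combination. The sketch below is only a starting point under stated assumptions: the function `wide_to_long` is hypothetical, the regular expression covers the regular headers only (the moss columns are ignored), and the mode is treated as the best estimate.

```python
import re

# Matches e.g. 'Prefire treehgt lower' or 'Postfire shbcov'
HEADER_RE = re.compile(
    r"^(?P<stage>Prefire|Postfire) (?P<stratum>tree|shb|hrb)"
    r"(?:hgt (?P<bound>lower|upper|mode)|(?P<cover>cov))$"
)
# Assumption: the modal height plays the role of the best estimate
BOUND_TO_FIELD = {"mode": "best", "lower": "lower", "upper": "upper"}


def wide_to_long(headers, row):
    """Hypothetical sketch: group one wide row into long-format records."""
    grouped = {}
    for header, value in zip(headers, row):
        match = HEADER_RE.match(str(header or ""))
        if match is None or value is None:
            continue  # skip unrecognised headers and empty cells
        measure = "cover" if match["cover"] else "height"
        field = "best" if measure == "cover" else BOUND_TO_FIELD[match["bound"]]
        key = (match["stage"], match["stratum"], measure)
        grouped.setdefault(key, {})[field] = value
    return [{"stage": stage, "stratum": stratum, "measure": measure, **values}
            for (stage, stratum, measure), values in grouped.items()]


# Made-up row for illustration
example_headers = ["Site", "Prefire treehgt lower", "Prefire treehgt upper",
                   "Prefire treehgt mode", "Prefire treecov", "Prefire moss hgt"]
example_wide_row = ["SiteA", 12, 30, 25, 70, 0.1]
example_long = wide_to_long(example_headers, example_wide_row)
print(example_long)
```

The site label and visit date would still need to be attached to each record before any upload.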

#### Newnes
Vegetation information is not present in the 'Site' worksheet, so it was not imported.

In [105]:
filename='Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm'
worksheet='Site'
wbindex[filename][worksheet][0]

['Site',
 'Easting',
 'Northing',
 'Valley',
 'Elev',
 'Undermined',
 'Fire interval',
 'Census',
 'Date',
 'Scorch hgt',
 'Shb foliage scorch',
 "Shb foliage c'sume",
 'Herb foliage scorch',
 "Herb foliage c'sume",
 'Twig diam mean',
 'Twig diam se',
 'Peat depth burnt',
 'Peat extent burnt',
 'Peat fire index',
 'Postfire treehgt lower',
 'Postfire treehgt upper',
 'Postfire treehgt mode',
 'Postfire treecov',
 'Prefire shbhgt lower',
 'Prefire shbhgt upper',
 'Prefire shbhgt mode',
 'Prefire shbcov',
 'Postfire shbhgt lower',
 'Postfire shbhgt upper',
 'Postfire shbhgt mode',
 'Postfire shbcov',
 'Prefire hrbhgt lower',
 'Prefire hrbhgt upper',
 'Prefire hrbhgt mode',
 'Prefire hrbcov',
 'Postfire hrbhgt lower',
 'Postfire hrbhgt upper',
 'Postfire hrbhgt mode',
 'Postfire hrbcov',
 'Biomass A',
 'Biomass B',
 'Biomass C',
 'Biomass D',
 'Biomass E',
 'Mean dry (60C) biomass (g)',
 'CV biomass',
 'Mean biomass (g/m2)',
 'Native spp richness',
 'Sediment depth (mm) 1',
 'Sediment dep

Vegetation structure data are available in a separate CSV file. We still need to locate the file and decide on the best approach for data input.

In [107]:
# newnes = pd.read_csv(inputdir / 'NewnesStruc.csv')
# newnes['Stratum']

#### Upland Basalt

In [108]:
filename='UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx'
worksheet='Site'

wbindex[filename][worksheet][0][0:8]

['Site',
 'Replicate',
 'Date of previous survey',
 'Observers (comma sep if >1)',
 'Date of sampling',
 'Prior Survey Date',
 None,
 None]

In [109]:
records = fv.read_veg_classes(inputdir,filename,worksheet,{'visit_id':1,'visit_date':3,'vegetation_formation':44,'vegetation_class':45})


In [110]:
records[10]

{'visit_id': 'MTW01TOU',
 'visit_date': datetime.date(2020, 12, 9),
 'vegetation_formation': 'Wet sclerophyll forests (Shrubby subformation)',
 'vegetation_class': 'Southern Escarpment Wet Sclerophyll Forests'}

In [111]:
batch_upsert(dbparams,"form.field_visit_veg_description",records,keycol=('visit_id','visit_date'), idx='field_visit_veg_description_pkey',execute=True)


Connecting to the PostgreSQL database...
28 rows updated
Database connection closed.


Manual updates: 
- date for MWL15 changed from 11/11/2021 to 11/11/2020
- date for MWL11 changed from 18/11/2021 to 18/11/2020

Problems:
- date for MWL11b does not match the visit date in the database, and it is not clear which one is correct

In [113]:
worksheet='Structure'
wbindex[filename][worksheet][0]

['SiteNo',
 'Replicate',
 'Date',
 'Stage',
 'Stratum',
 'LowerHeight',
 'UpperHeight',
 'ModalHeight',
 'PercentCover',
 'Dominant1',
 'Dominant2']

In [115]:
col_def={'visit_id':1, 'visit_date':3, 'stage':4, 'stratum':5, 'height':(8,6,7),'cover':(9,)}
records=fv.read_veg_structure(inputdir,filename,worksheet,col_def)
len(records)

186

In [116]:
valid_records=list()
for record in records:
    p=filter(lambda n: n['visit_id'] == record['visit_id'] and  n['visit_date'] == record['visit_date'], all_visits)
    found=list(p)
    if len(found) == 0:
        print(record)
    else:
        valid_records.append(record)

{'visit_id': 'MWL11b', 'visit_date': datetime.date(2021, 12, 2), 'comment': ['Stage: 1 year postfire'], 'measured_var': 'stratum T height', 'best': 25, 'lower': 20, 'upper': 25}
{'visit_id': 'MWL11b', 'visit_date': datetime.date(2021, 12, 2), 'comment': ['Stage: 1 year postfire'], 'measured_var': 'stratum T cover', 'best': 10}
{'visit_id': 'MWL11b', 'visit_date': datetime.date(2021, 12, 2), 'comment': ['Stage: 1 year postfire'], 'measured_var': 'stratum M1 height', 'best': 0.5, 'lower': 0.5, 'upper': 6}
{'visit_id': 'MWL11b', 'visit_date': datetime.date(2021, 12, 2), 'comment': ['Stage: 1 year postfire'], 'measured_var': 'stratum M1 cover', 'best': 2}
{'visit_id': 'MWL11b', 'visit_date': datetime.date(2021, 12, 2), 'comment': ['Stage: 1 year postfire'], 'measured_var': 'stratum L1 height', 'best': 1, 'lower': 0.1, 'upper': 1.5}
{'visit_id': 'MWL11b', 'visit_date': datetime.date(2021, 12, 2), 'comment': ['Stage: 1 year postfire'], 'measured_var': 'stratum L1 cover', 'best': 90}


In [117]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",
             valid_records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
180 rows updated
Database connection closed.


#### Yatteyattah

In [118]:
filename='UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx'
worksheet='Site'

wbindex[filename][worksheet][0][40:48]

['NSW TEC', 'variant', 'Vegetation formation', 'Vegegtation class']

In [119]:
records = fv.read_veg_classes(inputdir,filename,worksheet,{'visit_id':1,'visit_date':4,'vegetation_formation':43,'vegetation_class':44})
len(records)

7

In [120]:
batch_upsert(dbparams,"form.field_visit_veg_description",records,keycol=('visit_id','visit_date'), idx=None,execute=True)


Connecting to the PostgreSQL database...
7 rows updated
Database connection closed.


Vegetation structure

In [121]:
wbindex[filename].keys()

dict_keys(['Sample', 'Site', 'Environment', 'Fire', 'VegStructure', 'Floristics', 'Reference'])

In [122]:
worksheet='VegStructure'
wbindex[filename][worksheet][1]

['Site',
 'Prefire treehgt lower',
 'Prefire treehgt upper',
 'Prefire treehgt mode',
 'Prefire treecov',
 'Prefire shbhgt lower',
 'Prefire shbhgt upper',
 'Prefire shbhgt mode',
 'Prefire shbcov',
 'Prefire hrbhgt lower',
 'Prefire hrbhgt upper',
 'Prefire hrbhgt mode',
 'Prefire hrbcov',
 'Postfire treehgt lower',
 'Postfire treehgt upper',
 'Postfire treehgt mode',
 'Postfire treecov',
 'Postfire shbhgt lower',
 'Postfire shbhgt upper',
 'Postfire shbhgt mode',
 'Postfire shbcov',
 'Postfire hrbhgt lower',
 'Postfire hrbhgt upper',
 'Postfire hrbhgt mode']

#### Robertson RF

In [124]:
filename='RobertsonRF_data_bionet2.xlsx'
worksheet='Site'
wbindex[filename][worksheet][0][40:48]

['NSW TEC', 'variant', 'Vegetation formation', 'Vegegtation class']

In [125]:
records = fv.read_veg_classes(inputdir,filename,worksheet,{'visit_id':1,'visit_date':4,'vegetation_formation':43,'vegetation_class':44})
batch_upsert(dbparams,"form.field_visit_veg_description",records,keycol=('visit_id','visit_date'), idx='field_visit_veg_description_pkey',execute=True)


Connecting to the PostgreSQL database...
2 rows updated
Database connection closed.


Edited dates to match the visit dates (previous day)

In [126]:
print(filename)
wbindex[filename].keys()
worksheet='Structure'
wbindex[filename][worksheet][0]

RobertsonRF_data_bionet2.xlsx


['SiteNo',
 'Replicate',
 'Date',
 'Stage',
 'Stratum',
 'LowerHeight',
 'UpperHeight',
 'ModalHeight',
 'PercentCover',
 'Dominant1',
 'Dominant2']

In [127]:
col_def={'visit_id':1, 'visit_date':3, 'stage':4, 'stratum':5, 'height':(8,6,7),'cover':(9,)}
records=fv.read_veg_structure(inputdir,filename,worksheet,col_def)
len(records)

32

In [128]:
valid_records=list()
for record in records:
    p=filter(lambda n: n['visit_id'] == record['visit_id'] and  n['visit_date'] == record['visit_date'], all_visits)
    found=list(p)
    if len(found) == 0:
        print(record)
    else:
        valid_records.append(record)
print(len(valid_records))

32


In [129]:
batch_upsert(dbparams,"form.field_visit_vegetation_estimates",
             valid_records,
             keycol=('visit_id','visit_date','measured_var'), 
             idx='field_visit_vegetation_estimates_pkey',execute=True)


Connecting to the PostgreSQL database...
32 rows updated
Database connection closed.


#### NE NSW / SE Qld Rainforest
(or RMK)

In [131]:
filename='UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx'
worksheet='Site'
wbindex[filename][worksheet][0][0:8]

['Site',
 'Replicate',
 'Observers (comma sep if >1)',
 'Date of samping',
 'Survey Date Replicate 1',
 'Survey Date Replicate 2',
 'Survey Date Replicate 3',
 'Survey Date Replicate 4']

In [133]:
records = fv.read_veg_classes(inputdir,filename,worksheet,{'visit_id':1,'visit_date':4,'vegetation_formation':43,'vegetation_class':44})
len(records)

17

In [134]:
batch_upsert(dbparams,"form.field_visit_veg_description",records,keycol=('visit_id','visit_date'), idx='field_visit_veg_description_pkey',execute=True)


Connecting to the PostgreSQL database...
17 rows updated
Database connection closed.


In [135]:
wbindex[filename].keys()

dict_keys(['Site', 'Fire', 'VegStructure', 'Floristics', 'pivot', 'Reference'])

In [136]:
worksheet='VegStructure'
wbindex[filename][worksheet][1]

['Site',
 'Replicate',
 'Prefire treehgt lower',
 'Prefire treehgt upper',
 'Prefire treehgt mode',
 'Prefire treecov',
 'Prefire shbhgt lower',
 'Prefire shbhgt upper',
 'Prefire shbhgt mode',
 'Prefire shbcov',
 'Prefire hrbhgt lower',
 'Prefire hrbhgt upper',
 'Prefire hrbhgt mode',
 'Prefire hrbcov',
 'Postfire treehgt lower',
 'Postfire treehgt upper',
 'Postfire treehgt mode',
 'Postfire treecov',
 'Postfire shbhgt lower',
 'Postfire shbhgt upper',
 'Postfire shbhgt mode',
 'Postfire shbcov',
 'Postfire hrbhgt lower',
 'Postfire hrbhgt upper',
 'Postfire hrbhgt mode']

### Update Units

Update units for vegetation measurements.

In [138]:

upds = ["""
UPDATE form.field_visit_vegetation_estimates  
SET units='m' 
WHERE measured_var::text LIKE '%height' AND units IS NULL;
""","""
UPDATE form.field_visit_vegetation_estimates  
SET units='%' 
WHERE measured_var::text LIKE '%cover' AND units IS NULL;
"""]

conn = psycopg2.connect(**dbparams)
cur = conn.cursor(cursor_factory=DictCursor)

for upd in upds:
    cur.execute(upd)

cur.close()
conn.commit()
conn.close()
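
To see what the two statements do without touching the project database, the same logic can be tried on an in-memory SQLite table. This is only an illustration: SQLite stands in for PostgreSQL, the table `estimates` and its rows are made up, and the `::text` cast is dropped because it is PostgreSQL syntax.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE estimates (measured_var TEXT, units TEXT)")
conn_demo.executemany("INSERT INTO estimates VALUES (?, ?)", [
    ("stratum T height", None),    # units missing, name ends in 'height'
    ("stratum T cover", None),     # units missing, name ends in 'cover'
    ("scorch height", "m"),        # units already set, left untouched
    ("tree foliage scorch", "%"),  # matches neither pattern
])

# 'with' commits both updates together, or rolls back if one fails
with conn_demo:
    conn_demo.execute(
        "UPDATE estimates SET units='m' "
        "WHERE measured_var LIKE '%height' AND units IS NULL")
    conn_demo.execute(
        "UPDATE estimates SET units='%' "
        "WHERE measured_var LIKE '%cover' AND units IS NULL")

example_units = dict(conn_demo.execute("SELECT measured_var, units FROM estimates"))
conn_demo.close()
print(example_units)
```

Only rows whose units are still empty are changed, so running the updates again has no further effect.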

## That is it for now!

✅ Job done! 😎👌🔥

You can:
- go [back home](../Instructions-and-workflow.ipynb),
- continue navigating the repo on [GitHub](https://github.com/ces-unsw-edu-au/fireveg-db-exports),
- continue exploring the repo on [OSF](https://osf.io/h96q2/),
- visit the database at <http://fireecologyplants.net>.