# Read files for the Mallee Woodlands

This Excel workbook was prepared by Prof. David Keith, FAA, and imported in May 2023.

We need to adapt functions defined in the modules `fireveg` and `firevegdb` to:

- Read data from spreadsheets with field-work data
- Create records for data import into the database
- Insert or update records in the database

This dataset covers several sites (S2007/1, T2001/1, etc.). Each site has several subplots with different treatments (A, K, N, R, G, X1, X2, X3), and there are replicates for each site/subplot.

This Jupyter notebook runs through each step of the data import: field site and visit information first, then fire history records, and finally plant count data.

## Set-up
Load libraries 

In [1]:
import openpyxl
from pathlib import Path
import os
from datetime import datetime
from configparser import ConfigParser
import psycopg2
from psycopg2.extensions import AsIs
import pyprojroot
import re

Load functions from the `lib` folder. We will use one function to read the database credentials and one for batch inserts and updates:

In [2]:
from lib.parseparams import read_dbparams
from lib.firevegdb import batch_upsert
from lib.firevegdb import validate_and_update_site_records

import lib.fireveg as fv

Define path to workbooks

In [3]:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))
inputdir = repodir / "data" / "input-field-form"

Database credentials are stored in a `database.ini` file.

In [4]:
dbparams = read_dbparams(repodir / 'secrets' / 'database.ini', section='aws-lght-sl')
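
`read_dbparams` lives in `lib.parseparams` and is not shown in this notebook. The following is only a minimal sketch of what such a helper might look like, assuming a standard INI layout. The function name `read_dbparams_sketch` and the sample keys and values are hypothetical, not the real credentials file.

```python
from configparser import ConfigParser

def read_dbparams_sketch(ini_text, section):
    """Return the key/value pairs of one INI section as a dict."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError("Section %s not found" % section)
    return dict(parser.items(section))

# hypothetical file content (not the real credentials)
example_ini = """
[aws-lght-sl]
host = db.example.org
dbname = exampledb
user = reader
"""
params = read_dbparams_sketch(example_ini, "aws-lght-sl")
print(sorted(params))
```

A dictionary of this shape can be unpacked directly into the connection call, as the notebook does with `psycopg2.connect(**dbparams)`.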

## List of workbooks/spreadsheets in directory

Each spreadsheet has a slightly different structure, so these scripts have to be adapted for each case.

We use functions from module `fireveg` to read the data and create records, and functions from module `firevegdb` to execute the SQL insert or update query.


In [5]:
os.listdir(inputdir)

['UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton_revisedfields_Mar2022.xlsx',
 'PlantFireTraitData_2011-2018_Import_AdditionalSiteInfo.xlsx',
 'UNSW_VegFireResponse_KNP AlpAsh_firehistupdate.xlsx',
 'SthnNSWRF_data_bionet2.xlsx',
 'UNSWFireVegResponse_UplandBasalt_AlexThomsen+DK.xlsx',
 'PlantFireTraitData_2011-2018_Import.xlsx',
 '.ipynb_checkpoints',
 'UNSW_VegFireResponse_RMK_reformat_Sep2021a.xlsx',
 'UNSW_VegFireResponse_DataEntry_Yatteyattah all +DK +Milton.xlsx',
 'UNSW_VegFireResponse_KNP AlpAsh.xlsx',
 'UNSW_VegFireResponse_AlpineBogs_reformat_Sep2021.xlsx',
 'RobertsonRF_data_bionet2.xlsx',
 'Fire response quadrat survey Newnes Nov2020_DK_revised IDs+AllNovData.xlsm']

In [6]:
valid_files =  ['PlantFireTraitData_2011-2018_Import.xlsx',
             'PlantFireTraitData_2011-2018_Import_AdditionalSiteInfo.xlsx']


Here we create an index of worksheets and column headers for each file

In [7]:
wbindex=dict()
for workbook_name in valid_files:
    inputfile=inputdir / workbook_name
    # using data_only=True to get the calculated cell values
    wb = openpyxl.load_workbook(inputfile,data_only=True)
    wbindex[workbook_name]=dict()
    for ws in wb.worksheets:
        # ws.title is the public accessor for the worksheet name
        wbindex[workbook_name][ws.title]=list()
        # note: range(1, ws.max_column) stops one short of the last column
        for k in range(1,ws.max_column):
            wbindex[workbook_name][ws.title].append(ws.cell(row=1,column=k).value)
        

## Database queries

Database connection

In [8]:
# connect to the PostgreSQL server
print('Connecting to the PostgreSQL database...')
conn = psycopg2.connect(**dbparams)
cur = conn.cursor()

Connecting to the PostgreSQL database...


### Create new survey

In [9]:
updated_rows = 0
qry = "INSERT INTO form.surveys(survey_name) values ('Mallee Woodlands') ON CONFLICT DO NOTHING;"
cur.execute(qry)
if cur.rowcount > 0:
    updated_rows = cur.rowcount
else:
    print(qry)
conn.commit() 
print("%s rows updated" % (updated_rows))



INSERT INTO form.surveys(survey_name) values ('Mallee Woodlands') ON CONFLICT DO NOTHING;
0 rows updated
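
The output above reports 0 rows because the survey already existed and `ON CONFLICT DO NOTHING` skipped the insert. The sketch below reproduces that behaviour with the standard-library `sqlite3` module as a stand-in for PostgreSQL. This is an assumption made only so the example runs without a server; it needs SQLite 3.24 or later, and the table is a simplified, hypothetical version of `form.surveys`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# simplified stand-in table with a primary key on the survey name
cur.execute("CREATE TABLE surveys (survey_name TEXT PRIMARY KEY)")

qry = "INSERT INTO surveys(survey_name) VALUES ('Mallee Woodlands') ON CONFLICT DO NOTHING;"
rowcounts = []
for attempt in range(2):
    cur.execute(qry)
    # rowcount is 1 for the first insert and 0 once the row already exists
    rowcounts.append(cur.rowcount)
conn.commit()

cur.execute("SELECT COUNT(*) FROM surveys")
n_rows = cur.fetchone()[0]
conn.close()
print(rowcounts, n_rows)
```

Running the cell repeatedly is therefore safe: the table never holds more than one row per survey name.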


In [10]:
cur.execute("SELECT * FROM form.surveys;")
surveys = cur.fetchall()

In [11]:
surveys

[('TO BE CLASSIFIED',
  'Placeholder for field visits not yet assigned to a survey',
  'JR Ferrer-Paris'),
 ('UplandBasalt', 'Upland Basalt', None),
 ('Rainforests NSW-Qld', 'Rainforests NE NSW & SE Qld', None),
 ('NEWNES', 'Newnes plateau swamps', None),
 ('KNP AlpAsh', 'Kosciuszko NP Alpine Ash', None),
 ('SthnNSWRF', 'Southern NSW Rainforests', None),
 ('Alpine Bogs', 'Alpine Bogs', None),
 ('Robertson RF', 'Robertson RF', None),
 ('Yatteyattah', 'Yatteyattah', None),
 ('Mallee Woodlands', 'Mallee Woodlands', None)]

### Valid vocabularies

In [12]:
cur.execute("SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='resprout_organ_vocabulary';")
valid_organ_list = cur.fetchall()
organ_vocab = [item for t in valid_organ_list for item in t]

cur.execute("SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='seedbank_vocabulary';")
valid_seedbank_list = cur.fetchall()
seedbank_vocab = [item for t in valid_seedbank_list for item in t]
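
`fetchall()` returns a list of one-element tuples, which is why the cell above flattens the result with a nested comprehension. The sketch below shows that step plus one possible way to check a spreadsheet value against the vocabulary. The helper names and the example labels are hypothetical; the real labels come from the `pg_enum` query, and the validation inside `fireveg` may work differently.

```python
def flatten_rows(rows):
    """Turn a list of one-column tuples, as returned by fetchall(), into a flat list."""
    return [item for row in rows for item in row]

def check_term(value, vocabulary):
    """Return the normalised value if it is in the vocabulary, otherwise None."""
    if value is None:
        return None
    term = str(value).strip().lower()
    return term if term in vocabulary else None

# hypothetical rows standing in for the query result
rows = [("epicormic",), ("lignotuber",), ("none",)]
vocab = flatten_rows(rows)
print(vocab)
print(check_term(" Lignotuber ", vocab))
```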

### Close DB connection

In [13]:
cur.close()
if conn is not None:
    conn.close()
    print('Database connection closed.')

Database connection closed.


## Import data from each worksheet

In the following section, I iterate through the worksheets in the workbook, using functions defined in the `fireveg` and `firevegdb` modules.

Here is the list of available worksheets:

In [14]:
filename=valid_files[0]

wbindex[filename].keys()

dict_keys(['SiteData', 'FireEvents', 'PlantCounts'])

### Import site visits records into database

- 56 sites/visits in the period 2011 to 2018
- But we need to fix the site label to exclude the replicate number

The original list was incomplete, so we need to read two workbooks. We can retrieve the list of column names that we will use in our column definitions for each function:

In [15]:
cols=wbindex[valid_files[0]]['SiteData']
for k in range(1,len(cols)):
    print("%s :: %s" % (k-1,cols[k-1]))

0 :: Site_subplot_census
1 :: Site
2 :: Subplot
3 :: Replicate
4 :: Observers (comma sep if >1)
5 :: Date of samping
6 :: Survey Date Replicate 1
7 :: Survey Date Replicate 2
8 :: Survey Date Replicate 3
9 :: Survey Date Replicate 4
10 :: Survey Date Replicate 5
11 :: Survey Date Replicate 6
12 :: Location text
13 :: Zone
14 :: Easting
15 :: Northing
16 :: GPS Precision (m)
17 :: Latitude
18 :: Longitude
19 :: Layout & GPS marker position
20 :: 2nd ref point Zone
21 :: 2nd ref point Easting
22 :: 2nd ref point Northing
23 :: 2nd ref point Position of GPS
24 :: 3rd ref point Zone
25 :: 3rd ref point Easting
26 :: 3rd ref point Northing
27 :: 3rd ref point Position of GPS
28 :: 4th ref point Zone
29 :: 4th ref point Easting
30 :: 4th ref point Northing
31 :: 4th ref point Position of GPS
32 :: Total sample area (sq.m)
33 :: Subquadrat area (sq.m)
34 :: # subquadrats
35 :: Substrate
36 :: Notes
37 :: Slope
38 :: Aspect
39 :: Elevation
40 :: Disturbance notes
41 :: Cwth TEC
42 :: NSW TEC
4

In [16]:
cols=wbindex[valid_files[1]]['Sheet1']
for k in range(1,len(cols)):
    print("%s :: %s" % (k-1,cols[k-1]))

0 :: Site_subplot_census
1 :: Site
2 :: Subplot
3 :: Replicate
4 :: Observers (comma sep if >1)
5 :: Date of samping
6 :: Survey Date Replicate 1
7 :: Survey Date Replicate 2
8 :: Survey Date Replicate 3
9 :: Survey Date Replicate 4
10 :: Survey Date Replicate 5
11 :: Survey Date Replicate 6
12 :: Location text
13 :: Zone
14 :: Easting
15 :: Northing
16 :: GPS Precision (m)
17 :: Latitude
18 :: Longitude
19 :: Layout & GPS marker position
20 :: 2nd ref point Zone
21 :: 2nd ref point Easting
22 :: 2nd ref point Northing
23 :: 2nd ref point Position of GPS
24 :: 3rd ref point Zone
25 :: 3rd ref point Easting
26 :: 3rd ref point Northing
27 :: 3rd ref point Position of GPS
28 :: 4th ref point Zone
29 :: 4th ref point Easting
30 :: 4th ref point Northing
31 :: 4th ref point Position of GPS
32 :: Total sample area (sq.m)
33 :: Subquadrat area (sq.m)
34 :: # subquadrats
35 :: Substrate
36 :: Notes
37 :: Slope
38 :: Aspect
39 :: Elevation
40 :: Disturbance notes
41 :: Cwth TEC
42 :: NSW TEC
4

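The column dictionaries differ slightly between the two workbooks: `cdict` reads the site label from column 1 (`Site`), while `cdict2` reads it from column 0 (`Site_subplot_census`).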

In [17]:
cdict = {'site_label':1,'location_description':12, 'utm_zone':13,'xs':(14,), 'ys':(15,), 
        'gps_geom_description':19, 
         'visit_date':(5,), 'replicate_nr':3,'observerlist':4, 'survey':"Mallee Woodlands"}
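
Hard-coded column positions are easy to get wrong when two workbooks differ slightly. As a sketch of an alternative, the positions could be looked up from the header text. The function `index_by_header` is hypothetical and not part of `fireveg`; the headers are the first few listed for the `SiteData` worksheet.

```python
def index_by_header(headers, wanted):
    """Map record field names to 0-based column positions found by header text."""
    positions = {name: k for k, name in enumerate(headers)}
    # a KeyError here means the expected header is absent from the worksheet
    return {field: positions[header] for field, header in wanted.items()}

# first headers of the SiteData worksheet, spelled as in the workbook
headers = ["Site_subplot_census", "Site", "Subplot", "Replicate",
           "Observers (comma sep if >1)", "Date of samping"]
wanted = {"site_label": "Site", "replicate_nr": "Replicate",
          "observerlist": "Observers (comma sep if >1)"}
col_positions = index_by_header(headers, wanted)
print(col_positions)
```

A missing or renamed header then fails loudly instead of silently reading the wrong column.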

In [18]:
site_records = fv.import_records_from_workbook(filepath=inputdir,
                                               workbook='PlantFireTraitData_2011-2018_Import.xlsx',
                                                worksheet='SiteData',
                                                col_dictionary=cdict,
                                                create_record_function=fv.create_field_site_record)

In [19]:
cdict2 = {'site_label':0,'location_description':12, 'utm_zone':13,'xs':(14,), 'ys':(15,), 
        'gps_geom_description':19, 
         'visit_date':(5,), 'replicate_nr':3,'observerlist':4, 'survey':"Mallee Woodlands"}

In [20]:
more_site_records = fv.import_records_from_workbook(filepath=inputdir,
                                               workbook=valid_files[1],
                                                worksheet='Sheet1',
                                                col_dictionary=cdict2,
                                                create_record_function=fv.create_field_site_record)

In [21]:
site_records[1]

{'site_label': 'S2007/2',
 'location_description': 'Scotia Sanctuary, southwestern sector, West of Elliots Bore, edge of burnt area',
 'gps_geom_description': 'Centre point at intersection of A, K, R & N subplots with G subplot adjacent and X1-X3 separated and wrapped around G subplot',
 'geom': "ST_GeomFromText('POINT(505169 6318145)', 28354)"}
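
The `geom` value is a PostGIS expression rather than a plain string value. SRID 28354 is GDA94 / MGA zone 54, so one plausible way to build it from the `Zone`, `Easting` and `Northing` columns is sketched below. The function `point_geom` is hypothetical, and the assumption that all coordinates are in GDA94 is mine; how `fv.create_field_site_record` actually does this is not shown here.

```python
def point_geom(easting, northing, utm_zone):
    """Build the PostGIS expression for a point in GDA94 / MGA coordinates."""
    # EPSG:28348 to EPSG:28358 cover MGA zones 48 to 58 (assumes GDA94 datum)
    srid = 28300 + int(utm_zone)
    return "ST_GeomFromText('POINT(%s %s)', %s)" % (easting, northing, srid)

print(point_geom(505169, 6318145, 54))
```

The notebook imports `AsIs` from `psycopg2.extensions`, presumably so that an expression like this can be passed to the query unquoted.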

In [22]:
len(more_site_records)

42

In [23]:
batch_upsert(dbparams,"form.field_site",site_records,keycol=('site_label',), idx='field_site_pkey1',execute=True)

Connecting to the PostgreSQL database...
56 rows updated
Database connection closed.
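
`batch_upsert` is defined in `lib.firevegdb` and its internals are not shown in this notebook. As a rough sketch of the kind of statement an upsert on a named constraint needs, the hypothetical helper below composes the SQL text for one record. It is an illustration under my own assumptions, not the library's implementation.

```python
def build_upsert_sql(table, record, keycol, idx):
    """Compose an INSERT ... ON CONFLICT statement with %s placeholders."""
    columns = list(record.keys())
    placeholders = ", ".join(["%s"] * len(columns))
    sql = "INSERT INTO %s (%s) VALUES (%s)" % (table, ", ".join(columns), placeholders)
    # only non-key columns are overwritten when the row already exists
    updates = [c for c in columns if c not in keycol]
    if idx is None or not updates:
        return sql + " ON CONFLICT DO NOTHING"
    assignments = ", ".join("%s=EXCLUDED.%s" % (c, c) for c in updates)
    return sql + " ON CONFLICT ON CONSTRAINT %s DO UPDATE SET %s" % (idx, assignments)

example = {"site_label": "S2007/2", "location_description": "Scotia Sanctuary"}
upsert_sql = build_upsert_sql("form.field_site", example, ("site_label",), "field_site_pkey1")
print(upsert_sql)
```

Table and column names are interpolated directly here, which is only acceptable for trusted identifiers; the record values would still be passed separately as query parameters.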


In [24]:
batch_upsert(dbparams,"form.field_site",more_site_records,keycol=('site_label',), idx='field_site_pkey1',execute=True)

Connecting to the PostgreSQL database...
15 rows updated
Database connection closed.


Insert location and visit records based on the sample id. Open question: how do we transform the subplots into sample numbers?


In [28]:
visit_records = fv.import_records_from_workbook(filepath=inputdir,
                                          workbook='PlantFireTraitData_2011-2018_Import.xlsx',
                                            worksheet='SiteData',
                                            col_dictionary=cdict,
                                            create_record_function=fv.create_field_visit_record) 

In [29]:
visit_records[1]

{'visit_id': 'S2007/2',
 'visit_date': datetime.datetime(2013, 9, 24, 0, 0),
 'survey_name': 'Mallee Woodlands',
 'observerlist': ['David Keith'],
 'replicate_nr': 3}

In [34]:
obslist=list()
for record in visit_records:
    if 'observerlist' in record.keys():
        for observer in record['observerlist']:
            obslist.append(observer)
uniq_obs = set(obslist)

In [35]:
uniq_obs

{'Chris Simpson',
 'David Keith',
 'Freya Thomas',
 'Kate Giljohann',
 'Mark Tozer',
 'Recorders not recorded in Notebook?',
 'Renee Woodward'}

In [27]:
more_visit_records = fv.import_records_from_workbook(filepath=inputdir,
                                          workbook=valid_files[1],
                                            worksheet='Sheet1',
                                            col_dictionary=cdict2,
                                            create_record_function=fv.create_field_visit_record) 

In [28]:
more_visit_records[1]

{'visit_id': 'S2010/2',
 'visit_date': datetime.datetime(2011, 10, 7, 0, 0),
 'survey_name': 'Mallee Woodlands',
 'replicate_nr': 1}

In [29]:
batch_upsert(dbparams,"form.field_visit",visit_records,keycol=('visit_id','visit_date'), idx='field_visit_pkey2',execute=True)

Connecting to the PostgreSQL database...
53 rows updated
Database connection closed.


In [30]:
batch_upsert(dbparams,"form.field_visit",more_visit_records,keycol=('visit_id','visit_date'), idx='field_visit_pkey2',execute=True)

Connecting to the PostgreSQL database...
42 rows updated
Database connection closed.


### Import fire history records

This was provided by David in May 2023; check that it imports correctly.

In [31]:
worksheet = 'FireEvents'
#wbindex[filename][worksheet][0][0:13]
cols=wbindex[filename][worksheet]
for k in range(1,len(cols)):
    print("%s :: %s" % (k-1,cols[k-1]))

0 :: Site
1 :: Replicate
2 :: Date of last fire dd/mm/yyyy
3 :: Date of penultimate fire
4 :: Date of earlier fire
5 :: How date inferred1
6 :: How date inferred2
7 :: How date inferred3
8 :: Ignition cause1
9 :: Ignition cause2
10 :: Ignition cause3
11 :: Scorch hgt (m) min
12 :: Scorch hgt (m) mas
13 :: Scorch hgt (m) mode
14 :: % Tree foliage scorch
15 :: % Tree foliage c'sume
16 :: % Shb foliage scorch
17 :: % Shb foliage c'sume
18 :: % Herb layer foliage scorch
19 :: % Herb layer foliage c'sume
20 :: Twig diam (mm) 1
21 :: Twig diam (mm) 2
22 :: Twig diam (mm) 3
23 :: Twig diam (mm) 4
24 :: Twig diam (mm) 5
25 :: Twig diam (mm) 6
26 :: Twig diam (mm) 7
27 :: Twig diam (mm) 8
28 :: Twig diam (mm) 9
29 :: Twig diam (mm) 10
30 :: Peat depth burnt (cm)
31 :: Peat extent burnt %quad


In [32]:
col_dicts=[{'site_label':0,'fire_date':2,'how_inferred':5,'cause_of_ignition':8},
    {'site_label':0,'fire_date':3,'how_inferred':6,'cause_of_ignition':9},
    {'site_label':0,'fire_date':4,'how_inferred':7,'cause_of_ignition':10}]
fire_records = fv.import_records_from_workbook(inputdir, filename, worksheet, col_dicts, create_record_function=fv.create_fire_history_record)
len(fire_records)

833
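
Each worksheet row can describe up to three fire events (last, penultimate and earlier), which is why a list of three column dictionaries is passed. The sketch below illustrates how one row might expand into several records. `expand_fire_row` and the example row are hypothetical, and `fv.create_fire_history_record` may handle dates and missing values differently.

```python
def expand_fire_row(row, col_dicts):
    """Create one record per column dictionary, skipping events without a date."""
    records = []
    for cdict in col_dicts:
        fire_date = row[cdict["fire_date"]]
        if fire_date is None:
            continue
        record = {"site_label": row[cdict["site_label"]], "fire_date": fire_date}
        for field in ("how_inferred", "cause_of_ignition"):
            value = row[cdict[field]]
            if value is not None:
                record[field] = value
        records.append(record)
    return records

col_dicts = [{"site_label": 0, "fire_date": 2, "how_inferred": 5, "cause_of_ignition": 8},
             {"site_label": 0, "fire_date": 3, "how_inferred": 6, "cause_of_ignition": 9},
             {"site_label": 0, "fire_date": 4, "how_inferred": 7, "cause_of_ignition": 10}]
# hypothetical row: last and penultimate fire known, earlier fire missing
row = ["S2007/1_X1_3", 3, "2006-11-01", "1984-12-25", None,
       "Land manager records & pers obs", "Satellite imagery", None, None, None, None]
fire_events = expand_fire_row(row, col_dicts)
print(len(fire_events))
```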

In [33]:
fire_records[10]

{'site_label': 'S2007/1_X1_3',
 'fire_date': '2006-11-01',
 'earliest_date': datetime.date(2006, 11, 1),
 'latest_date': datetime.date(2006, 11, 1),
 'how_inferred': 'Land manager records & pers obs'}

We need to adjust the site label (remove the trailing subplot code and replicate number) and include all the missing site labels (sites with fire history but no visit recorded yet).

In [34]:
all_sites = list()
for record in fire_records:
    record['site_label']=re.sub("_[AKRNGX123]+_[0-9]$", "", record['site_label'])
    all_sites.append(record['site_label'])
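
The substitution above discards the subplot code and replicate number. Where those parts are needed later, a pattern with named groups can return all three. This is a sketch; the pattern and `split_label` are hypothetical, and unlike the cell above the pattern also accepts replicate numbers with more than one digit.

```python
import re

LABEL_PATTERN = re.compile(r"^(?P<site>.+)_(?P<subplot>[A-Z]+[0-9]?)_(?P<replicate>[0-9]+)$")

def split_label(label):
    """Split 'site_subplot_replicate' into its parts; return None if it does not match."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        return None
    return match.group("site"), match.group("subplot"), int(match.group("replicate"))

print(split_label("S2007/1_X1_3"))
print(split_label("T2017/5_RX_1"))
```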

In [35]:
add_site_records=list()
all_sites = set(all_sites)

for site in all_sites:
    add_site_records.append({'site_label':site})

In [36]:
len(add_site_records)

53

In [37]:
add_site_records[1:10]

[{'site_label': 'T1996/1'},
 {'site_label': 'T2001/4'},
 {'site_label': 'S2007/2'},
 {'site_label': 'T1997/2'},
 {'site_label': 'T2003/4'},
 {'site_label': 'T2000/4'},
 {'site_label': 'S2007/4'},
 {'site_label': 'S2010/4'},
 {'site_label': 'S2012/2'}]

In [38]:
batch_upsert(dbparams,"form.field_site",add_site_records,keycol=(), idx=None,execute=True)

Connecting to the PostgreSQL database...
0 rows updated
Database connection closed.


Now we can do the batch upsert of all the fire history records

In [39]:
batch_upsert(dbparams,"form.fire_history",fire_records,keycol=('site_label','fire_date'), idx='fire_history_pkey1',execute=True)

Connecting to the PostgreSQL database...
639 rows updated
Database connection closed.


### Import plant count data

In [40]:
worksheet = 'PlantCounts'
cols=wbindex[filename][worksheet]
for k in range(1,len(cols)):
    print("%s :: %s" % (k-1,cols[k-1]))

0 :: site_subplot_cen
1 :: Species_name
2 :: Recovery organ
3 :: Seedbank
4 :: Count of unburnt adlt individuals
5 :: Count of unburnt juv individuals
6 :: Count of resprouting juv individuals.
7 :: Count of resprouting adult individuals.
8 ::  # resprouted & died post-fire
9 :: Count of fire-killed juv individuals
10 :: Count of fire-killed adult individuals
11 :: #  reproductive pre-fire plants
12 :: Count of live postfire recruits
13 :: #  reproductive post-fire recruits
14 :: # recruits died post-fire
15 :: # reproductive recruits died post-fire
16 :: # live interfire recruits (>3yr postfire emerg
17 :: # live reproductive interfire recruits (>3yr postfire emerg
18 :: # deadinterfire recruits (>3yr postfire emerg)


In [41]:
col_dict={'visit_id':0, 'species':1,   
          'resprout_organ':2, 'seedbank':3,
          'adults_unburnt':4,
          'resprouts_live':6,
          'resprouts_kill':8,
          'resprouts_reproductive':7,
          'recruits_live':12, 
          'recruits_died':14, 
          'recruits_reproductive':13,
          'split_visit_id': True,
          'notes':19,'workbook':filename,'worksheet':worksheet}

In [42]:
quadrats = fv.import_records_from_workbook(inputdir, filename, worksheet, col_dict,
                                       fv.create_field_sample_record)

In [43]:
len(quadrats)

9051

In [44]:
quadrats[175]

{'visit_id': 'S2007/1', 'replicate_nr': 3, 'sample_nr': 'X2'}

In [45]:
samples = {"A":1,"K":2,"R":3,"N":4,"G":5,"X1":6,"X2":7,"X3":8,
           "AX":9, "KX":10, "RX":11, "NX":12,}
for record in quadrats:
    if record['sample_nr']:
        samplenr=record['sample_nr']
        record['sample_nr']=samples[samplenr]
        record['comment']=['Original sample code was %s' % samplenr]
    else:
        record['sample_nr']=99
        record['comment']=['Original sample code/nr was missing']
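
The loop above raises a `KeyError` if a worksheet contains a subplot code that is not in `samples`. A more tolerant variant is sketched below. `recode_sample` is hypothetical, and mapping unrecognised codes to 99 is my own assumption, chosen to match the placeholder already used for missing codes.

```python
SAMPLES = {"A": 1, "K": 2, "R": 3, "N": 4, "G": 5, "X1": 6, "X2": 7, "X3": 8,
           "AX": 9, "KX": 10, "RX": 11, "NX": 12}
MISSING = 99  # same placeholder number as in the cell above

def recode_sample(code):
    """Return (sample_nr, comment) for a subplot code, tolerating unknown codes."""
    if not code:
        return MISSING, "Original sample code/nr was missing"
    nr = SAMPLES.get(str(code).strip().upper())
    if nr is None:
        return MISSING, "Original sample code %s not recognised" % code
    return nr, "Original sample code was %s" % code

print(recode_sample("X2"))
```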


In [46]:
quadrats[175]

{'visit_id': 'S2007/1',
 'replicate_nr': 3,
 'sample_nr': 7,
 'comment': ['Original sample code was X2']}

Now check which ones are valid visit records (already present in the database)

In [47]:
new_conn = psycopg2.connect(**dbparams)

Manual fix for some missing visits

In [48]:
qrystr="""
INSERT INTO form.field_visit(visit_id,visit_date,replicate_nr,visit_description,survey_name) 
VALUES(%s,%s,%s,%s,%s)
ON CONFLICT DO NOTHING;
"""
   

In [49]:
# placeholder visits with unknown dates (manual fix for missing visits)
placeholder_visits = [
    ('T1998/CON1','2013-04-14',1,'visit date unknown, please replace placeholder date','Mallee Woodlands'),
    ('S2012/2','2014-01-10',1,'visit date unknown, please replace placeholder date','Mallee Woodlands'),
]
# reset the counter so it does not depend on earlier cells
updated_rows = 0
cur = new_conn.cursor()
for params in placeholder_visits:
    cur.execute(qrystr, params)
    if cur.rowcount > 0:
        updated_rows = updated_rows + cur.rowcount
new_conn.commit()
cur.close()
print("%s rows updated" % (updated_rows))

1 rows updated


In [50]:
valid_visits = validate_and_update_site_records(quadrats,useconn=new_conn)

record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for S2012/2 is incomplete
record for T1998/CON1 is incomplete
record for T1998/CON1 is incomplete
record for T1998/CON1 is incomplete
record for T1998/CON1 is incomplete
526 rows updated


In [51]:
new_conn.close()

In [52]:
len(valid_visits)
#len(quadrats)

91

In [53]:
valid_visits[5]


['K259', datetime.date(2018, 9, 27), 6]

In [54]:
records=fv.import_records_from_workbook(inputdir, filename, worksheet, col_dict,
                                         fv.create_quadrat_sample_record,
                                         lookup=valid_visits, 
                                        valid_seedbank=seedbank_vocab, 
                                        valid_organ=organ_vocab)

In [55]:
records[555]

{'visit_id': 'S2007/6',
 'sample_nr': 'X1',
 'species': 'Triodia scariosa',
 'visit_date': datetime.date(2013, 4, 11),
 'resprouts_live': 0,
 'resprouts_kill': 0,
 'resprouts_reproductive': 34,
 'comments': ['visit_id originally recorded as S2007/6_X1_3',
  'Imported from workbook PlantFireTraitData_2011-2018_Import.xlsx using python script',
  'Imported from spreadsheet PlantCounts',
  'matched by replicate nr 3, assuming date object',
  'resprout organ written as rhizome short',
  'seedbank written as soil persistent']}

In [56]:
samples = {"A":1,"K":2,"R":3,"N":4,"G":5,"X1":6,"X2":7,"X3":8,
           "AX":9, "KX":10, "RX":11, "NX":12,}
valid_records=list()
invalid_records=list()
for record in records:
    if 'visit_date' in record.keys():
        if record['sample_nr']:
            samplenr=record['sample_nr']
            record['sample_nr']=samples[samplenr]
            record['comments'].append('Original sample code was %s' % samplenr)
        else:
            record['sample_nr']=99
            record['comments'].append('Original sample code/nr was missing')
        valid_records.append(record)
    else:
        invalid_records.append(record)

print("%s valid records and %s invalid records" % (len(valid_records), len(invalid_records)))


8588 valid records and 469 invalid records


In [57]:
batch_upsert(dbparams,table='form.quadrat_samples',
             records=valid_records,
             keycol=('visit_id','visit_date','sample_nr'),
             idx=None, execute=True)

Connecting to the PostgreSQL database...
8588 rows updated
Database connection closed.


In [58]:
record

{'visit_id': 'T2017/5',
 'sample_nr': 11,
 'species': 'Triodia scariosa',
 'visit_date': datetime.date(2017, 3, 25),
 'adults_unburnt': 57,
 'comments': ['visit_id originally recorded as T2017/5_RX_1',
  'Imported from workbook PlantFireTraitData_2011-2018_Import.xlsx using python script',
  'Imported from spreadsheet PlantCounts',
  'matched by replicate nr 1, assuming date object',
  'resprout organ written as rhizome short',
  'seedbank written as soil persistent',
  'Original sample code was RX']}

Done, next steps:
- fill the species code column for all these species. Check the other notebooks for updating species list.
- Check the invalid records (site labels not in the database, why?)
- Once the missing site labels have been sorted we need to run again, any duplicates?

In [59]:
from tabulate import tabulate
from IPython.display import HTML, display


In [60]:
vids=list()
for record in invalid_records:
    vids.append((record['visit_id'],record['replicate_nr']))

table = tabulate(set(vids), tablefmt='html')

display(HTML(table))

0,1
T2017/2,1.0
,1.0
S2012/2,1.0
T2017/3,1.0
T1998/CON1,
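
The table lists which visit/replicate combinations failed, but not how many records each one accounts for. A small sketch with `collections.Counter` could add that. The helper and the sample records below are hypothetical; it uses `dict.get` so that records without a `replicate_nr` key do not raise an error.

```python
from collections import Counter

def count_by_visit(records):
    """Count records per (visit_id, replicate_nr), tolerating missing keys."""
    return Counter((r.get("visit_id"), r.get("replicate_nr")) for r in records)

# hypothetical subset of invalid records
sample_invalid = [{"visit_id": "S2012/2", "replicate_nr": 1},
                  {"visit_id": "S2012/2", "replicate_nr": 1},
                  {"visit_id": "T1998/CON1"}]
counts = count_by_visit(sample_invalid)
for key, n in counts.most_common():
    print(key, n)
```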


In [61]:
len(set(vids))

5

In case of error, if we need to delete these records and start again, we need to run these steps:

Manual edits to the list of observers

In [None]:
"""
INSERT INTO form.observerid(givennames,surname) 
VALUES ('Chris','Simpson'),('Freya','Thomas'),('Kate','Giljohann'),('Mark','Tozer'),('Renee','Woodward');
"""

In [None]:
"""
WITH A AS (
SELECT visit_id, visit_date, userkey, observerlist[1] AS obs1 
FROM form.field_visit 
LEFT JOIN form.observerid 
    ON observerlist[1]=givennames || ' ' || surname
WHERE survey_name='Mallee Woodlands' AND observerlist is not NULL
)
INSERT INTO form.field_visit(visit_id,visit_date,mainobserver) 
SELECT visit_id,visit_date,userkey FROM A
ON CONFLICT ON CONSTRAINT field_visit_pkey2 
    DO UPDATE SET mainobserver=EXCLUDED.mainobserver
"""