# Fireveg DB imports -- import field work forms

Author: [José R. Ferrer-Paris](https://github.com/jrfep)

Date: February 2022, updated 20 August 2024

This Jupyter Notebook includes [Python](https://www.python.org) code to update various fields in the records from the field samples.

**Please note:**
<div class="alert alert-warning">
    This repository contains code that is intended for internal project management and is documented for the sake of reproducibility.<br/>
    🛂 Only users contributing directly to the project have access to the credentials for data download/upload. 
</div>

## Set-up
### Load libraries 

In [1]:
import openpyxl
from pathlib import Path
import os,sys
from datetime import datetime
from configparser import ConfigParser
import psycopg2
from psycopg2.extensions import AsIs
import pyprojroot
import re
import pandas as pd


### Define paths for input and output

In [2]:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))
sys.path.append(str(repodir))
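To illustrate the idea behind locating the repository root, here is a minimal standard-library sketch that walks up the directory tree until it finds a `.git` folder. It only approximates what the `pyprojroot` call above does and is not that library's implementation; the helper name `find_repo_root` and the temporary directory layout are hypothetical.

```python
from pathlib import Path
import tempfile


def find_repo_root(start: Path, marker: str = ".git") -> Path:
    """Walk up from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"no parent of {start} contains a {marker!r} directory")


# Demonstration on a throwaway directory tree (not the real repository)
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".git").mkdir()
    nested = root / "notebooks" / "imports"
    nested.mkdir(parents=True)
    found = find_repo_root(nested)
    found_matches = (found == root)

print(found_matches)  # True
```

Appending the root to `sys.path`, as in the cell above, is what makes the `lib` folder importable from a notebook stored in a subfolder.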

### Load own functions

Load functions from the `lib` folder. We will use one function to read the database credentials and another for batch inserts and updates:

In [4]:
from lib.parseparams import read_dbparams
from lib.firevegdb import dbquery, batch_upsert

### Database credentials

🤫 We use a folder named "secrets" to keep the credentials for connecting to different services (database credentials, API keys, etc.). We list this folder in our `.gitignore` so that its contents are not tracked by git and not exposed. Future users need to copy the contents of this folder manually.

We read database credentials stored in a `database.ini` file using our own `read_dbparams` function.

In [5]:
dbparams = read_dbparams(repodir / 'secrets' / 'database.ini', 
                         section='fireveg-db-v1.1')
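The source of `read_dbparams` is not shown in this notebook. As a rough sketch of what such a helper might look like, the example below reads one section of an INI file into a dictionary with `ConfigParser`. The function name `read_dbparams_sketch` and all values in the example file are placeholders. The key names (`host`, `dbname`, `user`) are an assumption, chosen because the dictionary is later unpacked into `psycopg2.connect(**dbparams)`.

```python
from configparser import ConfigParser
from pathlib import Path
import tempfile


def read_dbparams_sketch(filename, section):
    """Return the key/value pairs of one section of an INI file as a dict."""
    parser = ConfigParser()
    if not parser.read(filename):
        raise FileNotFoundError(f"could not read {filename}")
    if not parser.has_section(section):
        raise KeyError(f"section {section!r} not found in {filename}")
    return dict(parser.items(section))


# Demonstration with made-up placeholder values (not real credentials)
with tempfile.TemporaryDirectory() as tmp:
    ini = Path(tmp) / "database.ini"
    ini.write_text(
        "[fireveg-db-v1.1]\n"
        "host = localhost\n"
        "dbname = example\n"
        "user = example_user\n"
    )
    params = read_dbparams_sketch(ini, section="fireveg-db-v1.1")

print(params)
# {'host': 'localhost', 'dbname': 'example', 'user': 'example_user'}
```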

## Update observer ids

Does every observer already have an observer id? List the current entries:

In [7]:
dbquery("select * from form.observerid",dbparams)

[[7, 'David', 'Keith'],
 [9, 'D.', 'Benson'],
 [10, 'L.', 'Watts,'],
 [11, 'T.', 'Manson'],
 [12, 'Jackie', 'Miles'],
 [13, 'Robert', 'Kooyman'],
 [8, 'Alexandria', 'Thomsen'],
 [14, 'Jedda', 'Lemmen']]

In [8]:
other_observers = [{'userkey': 1,
 'givennames': 'Chris',
 'surname': 'Simpson'},
 {'userkey': 2,
 'givennames': 'Freya',
 'surname': 'Thomas'},
 {'userkey': 3,
 'givennames': 'Kate',
 'surname': 'Giljohann'},
 {'userkey': 4,
 'givennames': 'Mark',
 'surname': 'Tozer'},
 {'userkey': 5,
 'givennames': 'Renee',
 'surname': 'Woodward'}]


In [9]:
batch_upsert(dbparams, 
             table='form.observerid',
             records=other_observers,
             keycol=['userkey',], 
             idx='observerid_pkey',
             execute = True)

Connecting to the PostgreSQL database...
5 rows updated
Database connection closed.
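The internals of `batch_upsert` are not shown here. One plausible shape for the statement it runs is an `INSERT ... ON CONFLICT ... DO UPDATE`, the same pattern used in the hand-written query further down. The sketch below only builds such a statement as a string; `build_upsert_sql` is a hypothetical helper, not the function in `lib.firevegdb`.

```python
def build_upsert_sql(table, record, keycol, idx):
    """Build an INSERT ... ON CONFLICT statement with named placeholders."""
    cols = list(record)
    col_list = ", ".join(cols)
    placeholders = ", ".join(f"%({c})s" for c in cols)
    # Key columns identify the row, so only the remaining columns are updated
    updates = ", ".join(f"{c}=EXCLUDED.{c}" for c in cols if c not in keycol)
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON CONFLICT ON CONSTRAINT {idx} DO UPDATE SET {updates}"
    )


record = {"userkey": 1, "givennames": "Chris", "surname": "Simpson"}
sql = build_upsert_sql("form.observerid", record,
                       keycol=["userkey"], idx="observerid_pkey")
print(sql)
```

The sketch interpolates table and column names directly, which is only acceptable for trusted, hard-coded identifiers. Values are left as `%(name)s` placeholders so that the database driver can quote them.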


Now we can try to update the field visit information with these keys:

In [11]:
print('Connecting to the PostgreSQL database...')
conn = psycopg2.connect(**dbparams)
cur = conn.cursor()
updated_rows=0

qry = """
WITH A AS (
SELECT visit_id, visit_date, userkey, observerlist[1] AS obs1 
FROM form.field_visit 
LEFT JOIN form.observerid 
    ON observerlist[1]=givennames || ' ' || surname
WHERE survey_name='Mallee Woodlands' AND observerlist is not NULL
)
INSERT INTO form.field_visit(visit_id,visit_date,mainobserver) 
SELECT visit_id,visit_date,userkey FROM A
ON CONFLICT ON CONSTRAINT field_visit_pkey 
    DO UPDATE SET mainobserver=EXCLUDED.mainobserver
"""
cur.execute(qry)
updated_rows = updated_rows + cur.rowcount 
conn.commit()        
cur.close()
print("%s rows updated" % (updated_rows))
conn.close()
print('Database connection closed.')

Connecting to the PostgreSQL database...
53 rows updated
Database connection closed.
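The join in the query above matches the first entry of `observerlist` against the concatenation of `givennames`, a space and `surname`. The match is exact, so any difference in spelling or punctuation leaves `userkey` as NULL. The sketch below mimics that logic in plain Python with three rows taken from the observer listing above. The visit observer lists are hypothetical, and the function `main_observer_key` is only an illustration.

```python
# Rows copied from the form.observerid listing: (userkey, givennames, surname)
observers = [
    (7, "David", "Keith"),
    (10, "L.", "Watts,"),
    (12, "Jackie", "Miles"),
]
key_by_name = {f"{given} {surname}": key for key, given, surname in observers}


def main_observer_key(observerlist):
    """Return the userkey of the first listed observer, or None if unmatched."""
    if not observerlist:
        return None
    # SQL arrays are 1-indexed, so observerlist[1] in SQL is index 0 here
    return key_by_name.get(observerlist[0])


print(main_observer_key(["David Keith", "Jackie Miles"]))  # 7
print(main_observer_key(["L. Watts"]))                     # None
```

In this sketch the surname stored as `'Watts,'` (with a trailing comma) would not match a visit that lists `'L. Watts'`, so it may be worth checking such entries before relying on the join.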


## Update information from comments

We need to correct:
- some common mistakes in resprout organ and seedbank type

We need to add this information to the database:

- count of fully scorched & resprouting individuals
- count of fully scorched & fire-killed individuals
- count of partially scorched & resprouting individuals
- count of partially scorched & fire-killed individuals

We also need to distinguish adults from non-adults.


### Read valid vocabularies

Check age and scorch vocabularies:

In [12]:
qry = "SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='scorch_vocabulary';"
scorch_list = dbquery(qry, dbparams)
scorch_vocab = [item for t in scorch_list for item in t]

In [13]:
qry = "SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='age_group';"
age_list = dbquery(qry, dbparams)
age_vocab = [item for t in age_list for item in t]

In [14]:
print(age_vocab)
print(scorch_vocab)

['adult', 'juvenile', 'other']
['Full canopy scorch', 'Partial scorch', 'Other']


Check seedbank and resprout organ vocabularies:

In [15]:
qry = "SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='seedbank_vocabulary';"
valid_seedbank_list = dbquery(qry, dbparams)
seedbank_vocab = [item for t in valid_seedbank_list for item in t]

In [16]:
qry = "SELECT enumlabel FROM pg_enum e LEFT JOIN pg_type t ON e.enumtypid=t.oid where typname='resprout_organ_vocabulary';"
valid_organ_list = dbquery(qry, dbparams)
organ_vocab = [item for t in valid_organ_list for item in t]

In [17]:
print(seedbank_vocab)
print(organ_vocab)

['Soil-persistent', 'Transient', 'Canopy', 'Non-canopy', 'Other']
['Epicormic', 'Apical', 'Lignotuber', 'Basal', 'Tuber', 'Tussock', 'Short rhizome', 'Long rhizome or root sucker', 'Stolon', 'None', 'Other']


### Update seedbank/resprout organ from comments

In [19]:
# connect to the PostgreSQL server
print('Connecting to the PostgreSQL database...')
conn = psycopg2.connect(**dbparams)
cur = conn.cursor()
updated_rows=0

Connecting to the PostgreSQL database...


In [20]:
qrystr="""
UPDATE form.quadrat_samples
SET seedbank=%s
WHERE
%s=ANY(comments)
AND seedbank is NULL
"""

In [21]:
mtchs = [
    ("Soil-persistent","seedbank written as persistent soil"),
    ("Soil-persistent","seedbank written as soil persistent"),
    ("Non-canopy","seedbank written as non canopy")
]

In [22]:
for mtch in mtchs:
    # pass the parameters directly; psycopg2 quotes them safely
    cur.execute(qrystr, mtch)
    updated_rows = updated_rows + cur.rowcount 
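To make the matching rule of the `UPDATE` statement explicit: `%s=ANY(comments)` is true only when one element of the `comments` array equals the phrase exactly, and `seedbank is NULL` restricts the update to rows that have no value yet. The in-memory sketch below reproduces that rule. The rows are invented for illustration and `apply_seedbank_fix` is a hypothetical stand-in for the SQL statement.

```python
def apply_seedbank_fix(rows, value, phrase):
    """Set seedbank on rows whose comments contain `phrase` as a whole element."""
    updated = 0
    for row in rows:
        # `phrase in list` tests element equality, like `= ANY(array)` in SQL
        if row["seedbank"] is None and phrase in (row["comments"] or []):
            row["seedbank"] = value
            updated += 1
    return updated


rows = [
    {"seedbank": None, "comments": ["seedbank written as non canopy"]},
    {"seedbank": "Transient", "comments": ["seedbank written as non canopy"]},
    {"seedbank": None, "comments": ["note: seedbank written as non canopy?"]},
    {"seedbank": None, "comments": None},
]
n = apply_seedbank_fix(rows, "Non-canopy", "seedbank written as non canopy")
print(n)  # 1
```

Only the first row changes: the second already has a value, the third contains the phrase inside a longer comment, and the fourth has no comments.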

In [23]:
qrystr="""
UPDATE form.quadrat_samples
SET resprout_organ=%s
WHERE
%s=ANY(comments)
AND resprout_organ is NULL
"""

In [24]:
mtchs = [
    ("Short rhizome","resprout organ written as rhizome short"),
    ("Basal","resprout organ written as basal stems"),
    ("Tussock","resprout organ written as none/tussock")
]
# ("???","resprout organ written as rhizome long"),
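Because `resprout_organ` is an enumerated type, a replacement value that is not in the vocabulary would make the `UPDATE` fail. A quick check like the one below can catch that before any query runs. This is an optional sketch, not part of the original workflow; the vocabulary is copied from the output printed earlier so that the block runs on its own.

```python
# Vocabulary as printed above for resprout_organ_vocabulary
organ_vocab = ['Epicormic', 'Apical', 'Lignotuber', 'Basal', 'Tuber', 'Tussock',
               'Short rhizome', 'Long rhizome or root sucker', 'Stolon',
               'None', 'Other']

mtchs = [
    ("Short rhizome", "resprout organ written as rhizome short"),
    ("Basal", "resprout organ written as basal stems"),
    ("Tussock", "resprout organ written as none/tussock"),
]

# Collect replacement values that are missing from the vocabulary
invalid = [value for value, _ in mtchs if value not in organ_vocab]
print(invalid)  # []
```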

In [25]:
for mtch in mtchs:
    # pass the parameters directly; psycopg2 quotes them safely
    cur.execute(qrystr, mtch)
    updated_rows = updated_rows + cur.rowcount 

In [26]:
conn.commit()        
cur.close()
print("%s rows updated" % (updated_rows))
conn.close()
print('Database connection closed.')

10379 rows updated
Database connection closed.


In [27]:
qry="""SELECT seedbank, resprout_organ, count(*) FROM form.quadrat_samples GROUP BY seedbank, resprout_organ;"""
res=dbquery(qry,dbparams)
res

[[None, 'None', 3],
 ['Non-canopy', 'Epicormic', 23],
 ['Non-canopy', 'None', 614],
 ['Soil-persistent', 'None', 6102],
 ['Canopy', 'Tussock', 1],
 ['Soil-persistent', 'Stolon', 133],
 ['Transient', 'Epicormic', 7],
 ['Transient', 'Stolon', 3],
 ['Transient', 'None', 164],
 ['Transient', None, 489],
 [None, 'Stolon', 2],
 ['Non-canopy', 'Basal', 1033],
 ['Transient', 'Lignotuber', 1],
 ['Soil-persistent', 'Tuber', 73],
 [None, None, 1560],
 ['Soil-persistent', 'Basal', 611],
 ['Soil-persistent', None, 1187],
 ['Transient', 'Apical', 38],
 ['Soil-persistent', 'Lignotuber', 495],
 ['Non-canopy', 'Short rhizome', 9],
 ['Transient', 'Basal', 32],
 [None, 'Basal', 1],
 ['Transient', 'Tuber', 122],
 ['Non-canopy', 'Tuber', 303],
 [None, 'Lignotuber', 2],
 [None, 'Tuber', 17],
 ['Non-canopy', 'Lignotuber', 60],
 ['Soil-persistent', 'Apical', 4],
 ['Soil-persistent', 'Epicormic', 4],
 ['Non-canopy', 'Tussock', 678],
 ['Canopy', 'None', 113],
 ['Canopy', 'Epicormic', 67],
 ['Non-canopy', None, 

### Add information about partial scorch

Connect to the database:

In [28]:
print('Connecting to the PostgreSQL database...')
conn = psycopg2.connect(**dbparams)
cur = conn.cursor()

Connecting to the PostgreSQL database...


Filter comments by keywords:

In [30]:
qry = """
WITH A AS (select record_id,unnest(comments) as note from form.quadrat_samples)
SELECT record_id,note FROM A WHERE note ilike '%partial%';
"""
cur.execute(qry)

records=cur.fetchall()
records

[(13024,
  '10 resprouters with dbh <10cm, 7+3 with dbh>10cm (one + two partially burnt), killed plants with dbh 5cm, 9.5cm & 16cm'),
 (13029, 'adult tree dbh 46 cm, partially burnt')]
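The query above first expands the `comments` array into one row per note (`unnest`) and then keeps the notes that contain the keyword, ignoring case (`ilike`). The same two steps can be sketched in Python as follows. The two matching notes are copied from the output above, the third record is hypothetical, and the helper names are illustrative only.

```python
samples = {
    13024: ["10 resprouters with dbh <10cm, 7+3 with dbh>10cm "
            "(one + two partially burnt), killed plants with dbh 5cm, 9.5cm & 16cm"],
    13029: ["adult tree dbh 46 cm, partially burnt"],
    99999: ["hypothetical note without the keyword"],
}


def unnest(samples):
    """Yield one (record_id, note) pair per comment, like SQL unnest()."""
    for record_id, comments in samples.items():
        for note in comments or []:
            yield record_id, note


def ilike_contains(note, keyword):
    """Case-insensitive substring test, like note ILIKE '%keyword%'."""
    return keyword.lower() in note.lower()


matches = [(rid, note) for rid, note in unnest(samples)
           if ilike_contains(note, "partial")]
print([rid for rid, _ in matches])  # [13024, 13029]
```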

We can run several updates:

In [31]:
updates_to_run = ["""
UPDATE form.quadrat_samples SET life_stage='juvenile' where record_id IN
(WITH A AS (select record_id,unnest(comments) as note from form.quadrat_samples)
SELECT record_id FROM A WHERE note ilike '%juvenil%');
""","""
UPDATE form.quadrat_samples SET life_stage='adult' where record_id IN
(WITH A AS (select record_id,unnest(comments) as note from form.quadrat_samples)
SELECT record_id FROM A WHERE note ilike '%adult %');
""","""
UPDATE form.quadrat_samples SET life_stage='other' where record_id IN
(WITH A AS (select record_id,unnest(comments) as note from form.quadrat_samples)
SELECT record_id FROM A WHERE note ilike '%sapling%');
""","""
UPDATE form.quadrat_samples SET scorch='Partial scorch' where record_id IN
(WITH A AS (select record_id,unnest(comments) as note from form.quadrat_samples)
SELECT record_id FROM A WHERE note ilike '%partial%');
"""]

for upd in updates_to_run:
    cur.execute(upd)
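The updates run in sequence, so when a record matches more than one `life_stage` keyword the later update overwrites the earlier one. Note also that the pattern `'%adult %'` includes a trailing space, so "adults" does not match. The sketch below replays the four rules on single records to show both effects. Only the first comment comes from the data above; the other two are hypothetical.

```python
# (field, value, keyword) in the same order as the SQL updates
rules = [
    ("life_stage", "juvenile", "juvenil"),
    ("life_stage", "adult", "adult "),
    ("life_stage", "other", "sapling"),
    ("scorch", "Partial scorch", "partial"),
]


def classify(comments):
    """Apply the keyword rules in order; later matches overwrite earlier ones."""
    result = {"life_stage": None, "scorch": None}
    for field, value, keyword in rules:
        if any(keyword in note.lower() for note in comments):
            result[field] = value
    return result


print(classify(["adult tree dbh 46 cm, partially burnt"]))
# {'life_stage': 'adult', 'scorch': 'Partial scorch'}
print(classify(["3 adults and 2 saplings"]))
# {'life_stage': 'other', 'scorch': None}
print(classify(["juvenile next to adult tree"]))
# {'life_stage': 'adult', 'scorch': None}
```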



Check how many records were updated:

In [32]:
qry= "select life_stage,scorch,count(*) from form.quadrat_samples group by life_stage,scorch;"
cur.execute(qry)
cur.fetchall()

[(None, None, 18390),
 ('juvenile', None, 4),
 ('adult', None, 1),
 (None, 'Partial scorch', 1),
 ('adult', 'Partial scorch', 1)]

In [33]:
conn.commit()
cur.close()
if conn is not None:
    conn.close()
    print('Database connection closed.')

Database connection closed.


## That is it for now!

✅ Job done! 😎👌🔥

You can:
- go [back home](../Instructions-and-workflow.ipynb),
- continue navigating the repo on [GitHub](https://github.com/ces-unsw-edu-au/fireveg-db-exports),
- continue exploring the repo on [OSF](https://osf.io/h96q2/),
- visit the database at <http://fireecologyplants.net>.