# Export list of references

Author: [José R. Ferrer-Paris](https://github.com/jrfep)

This Jupyter notebook creates an output workbook listing the references used in the `litrev` schema of the database.

## Set up
Load modules

In [1]:
# work with paths in operating system
from pathlib import Path
import os

# datetime support
import datetime
# For database connection
from configparser import ConfigParser
import psycopg2
from psycopg2.extras import DictCursor

# Pandas for calculations
import pandas as pd
# Regular expressions
import re
# Pyprojroot for easier handling of working directory
import pyprojroot

Load functions from the `lib` folder: one to read the database credentials, one to execute database queries, and three to extract data from the reference description string.

In [2]:
from lib.parseparams import read_dbparams
from lib.firevegdb import dbquery
from lib.firevegrefs import extract_year,extract_authors,extract_rest

Define the project directory using `pyprojroot` and list the contents of the output folder.

In [3]:
repodir = pyprojroot.find_root(pyprojroot.has_dir(".git"))
inputdir = repodir / "data" / "output-report"
os.listdir(inputdir)

['fireveg-db-references.xlsx']
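
For readers unfamiliar with `pyprojroot`, the root lookup can be pictured as walking up from a starting directory until a folder containing `.git` is found. The sketch below illustrates that idea with the standard library only. `find_repo_root` is a hypothetical helper, not part of `pyprojroot`, and the library's actual behaviour may differ in detail.

```python
from pathlib import Path


def find_repo_root(start: Path) -> Path:
    """Return `start` or its closest ancestor that contains a `.git` directory.

    Hypothetical helper for illustration; not the pyprojroot implementation.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent in turn.
    for candidate in (start, *start.parents):
        if (candidate / ".git").is_dir():
            return candidate
    raise FileNotFoundError(f"No .git directory found at or above {start}")
```

With such a helper, `find_repo_root(Path.cwd()) / "data" / "output-report"` would build the same kind of path as `inputdir` above, provided the notebook runs somewhere inside the repository.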

## Database query
Database connection and query using functions defined in the project library.
Database credentials are stored in a `database.ini` file.

In [4]:
dbparams = read_dbparams(repodir / 'secrets' / 'database.ini', section='aws-lght-sl')
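
The implementation of `read_dbparams` lives in the `lib.parseparams` module and is not shown in this notebook. As a rough sketch of what such a helper could look like, the hypothetical `read_ini_section` below reads one section of an INI file with `ConfigParser` and returns it as a dictionary. Which keys the section holds (for example `host`, `database`, `user`, `password`) is an assumption and depends on the actual `database.ini` file.

```python
from configparser import ConfigParser
from pathlib import Path


def read_ini_section(filename: Path, section: str) -> dict:
    """Return the key/value pairs of one section of an INI file.

    Hypothetical helper for illustration; not the project's read_dbparams.
    """
    parser = ConfigParser()
    # ConfigParser.read() silently skips missing files, so check its result.
    if not parser.read(filename):
        raise FileNotFoundError(f"Could not read {filename}")
    if not parser.has_section(section):
        raise KeyError(f"Section {section!r} not found in {filename}")
    return dict(parser.items(section))
```

Returning a plain dictionary keeps the credentials out of the notebook itself and lets the caller pass them on to the database connection as keyword arguments.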

In [5]:
ref_info = dbquery("SELECT * FROM litrev.ref_list ", dbparams)

## Create data frame and add columns
We convert the query result into a data frame and add three columns (`date`, `authors` and `ref_info`) with data extracted from the `ref_cite` string.

In [6]:
# Column names come from the keys of the first returned row
df = pd.DataFrame(ref_info, columns=ref_info[0].keys())

In [7]:
# Apply each extraction function to the ref_cite column
df['date'] = df['ref_cite'].apply(extract_year)
df['authors'] = df['ref_cite'].apply(extract_authors)
df['ref_info'] = df['ref_cite'].apply(extract_rest)

In [8]:
df

| | ref_code | ref_cite | alt_code | date | authors | ref_info |
|---|---|---|---|---|---|---|
| 0 | Peter Byrne Beerwah Qld. unpub. | Peter Byrne, Beerwah, Qld. (unpublished) | NSWFFRD-NFRR-ref-B | unpub. | Peter Byrne, Beerwah, Qld. | |
| 1 | Baird 1977 | Baird, A.M. (1977). Regeneration after fire in... | NSWFFRD-NFRR-ref-BA | 1977 | Baird, A.M. | . Regeneration after fire in King's Park, Pert... |
| 2 | Benson McDougall 1995 | Benson, D. and McDougall, L. (1995). Ecology o... | NSWFFRD-NFRR-ref-BB | 1995 | Benson, D. and McDougall, L. | . Ecology of Sydney plant species part 3: Dico... |
| 3 | Benson McDougall 1997 | Benson, D. and McDougall, L. (1997). Ecology o... | NSWFFRD-NFRR-ref-BD | 1997 | Benson, D. and McDougall, L. | . Ecology of Sydney plant species part 5: Dico... |
| 4 | Benson 1985 | Benson, D.H. (1985). Maturation periods for fi... | NSWFFRD-NFRR-ref-BE | 1985 | Benson, D.H. | . Maturation periods for fire sensitive shrub ... |
| ... | ... | ... | ... | ... | ... | ... |
| 304 | Baskin & Baskin 2014 | Baskin, C. and Baskin, J.M. (2014) Seeds: Ecol... | | 2014 | Baskin, C. and Baskin, J.M. | Seeds: Ecology, Biogeography, and Evolution o... |
| 305 | Vening etal 2017 | Vening etal 2017 Aust J Bot | | 2017 | Vening etal 2017 Aust J Bo | Vening etal 2017 Aust J Bot |
| 306 | Myerscough 1998 | Myerscough 1998 Cunninghamia | | 1998 | Myerscough 1998 Cunninghami | Myerscough 1998 Cunninghamia |
| 307 | Clarke et al 2000 | Clarke et al 2000 | | 200 | Clarke et al 200 | Clarke et al 2000 |
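
The last rows of this output look truncated (for example `200` and `Clarke et al 200` for `Clarke et al 2000`), which suggests that the extraction functions expect the year in parentheses. The functions in `lib.firevegrefs` are not shown here, so the following is only a sketch of one way to split a citation with a regular expression that accepts both `(1977)` and a bare `2000`. `split_citation` is a hypothetical name, and the pattern is an assumption about the citation formats in `ref_cite`.

```python
import re

# A four-digit year (optional letter suffix) or the word "unpublished",
# with or without surrounding parentheses. Assumed pattern, for illustration.
YEAR_PATTERN = re.compile(r"\(?\b(\d{4}[a-z]?|unpublished)\b\)?", re.IGNORECASE)


def split_citation(ref_cite: str) -> tuple:
    """Split a citation into (authors, year, rest) around the first year-like token."""
    match = YEAR_PATTERN.search(ref_cite)
    if match is None:
        # No year found: treat the whole string as the author part.
        return ref_cite.strip(), None, ""
    authors = ref_cite[:match.start()].strip()
    # Drop the separator (spaces and full stops) that follows the year.
    rest = ref_cite[match.end():].lstrip(" .").rstrip()
    return authors, match.group(1), rest


print(split_citation("Clarke et al 2000"))
# ('Clarke et al', '2000', '')
```

Unlike the output above, this sketch returns `unpublished` rather than `unpub.` for unpublished sources and strips the leading full stop from the remaining text, so it is not a drop-in replacement for the project functions.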


## Export to Excel workbook
Here we use the pandas `to_excel` method to save the results:

In [9]:
df.to_excel(inputdir / "fireveg-db-references.xlsx") 

# Done for today
