# ENVT3031 Practical 1: Accessing TERN Surveillance data with R

This tutorial is modified from the tutorial "'AusplotsR' Package and AusPlots Data Basics":

* Blanco-Martin, Bernardo (2019). Tutorial: "'AusplotsR' Package and AusPlots Data Basics". Terrestrial Ecology Research Network. Version 2019.06.0, June 2019. https://github.com/ternaustralia/TERN-Data-Skills/tree/master/EcosystemSurveillance_PlotData/AusPplots_BasicTutorial

`ausplotsR` is an R package for live extraction and preparation of TERN AusPlots ecosystem monitoring data. Through `ausplotsR`, users can: 
* Directly obtain plot-based data on vegetation and soils across Australia
* Preprocess these data into structures that facilitate the visualisation and analysis of AusPlots data

Data preprocessing includes the computation of:
* Species occurrence
* Vegetation fractional and single cover
* Growth form
* Basal area (see below for details)



### The `ausplotsR` package currently includes 6 functions:

* `get_ausplots`: Extracts AusPlots data in R. The starting point for any AusPlots data exploration and analysis in R.
* `species_table`: Generates species occurrence matrices using the chosen scoring method (i.e. presence/absence, cover, frequency, or IVI) from a data frame of individual raw intercept hits (generated from AusPlots data using the `get_ausplots` function).
* `fractional_cover`: Calculates fractional cover (i.e., the proportional cover of green vegetation, dead vegetation and bare substrate) from a data frame of individual raw intercept hits (generated from AusPlots data using the `get_ausplots` function).
* `growth_form_table`: Generates occurrence matrices for NVIS plant growth forms in plots using the chosen scoring method (i.e. presence/absence, percent cover, or species richness, which is the number of species assigned to a particular growth form) from a data frame of individual raw intercept hits (generated from AusPlots data using the `get_ausplots` function).
* `single_cover_value`: Calculates a total vegetation cover by height and/or growth form per site from a data frame of individual raw intercept hits (generated from AusPlots data using the `get_ausplots` function). In this function, cover can be subset to vegetation over a specified height and/or by plant growth forms. By default, vegetation cover is calculated per plot for tree growth forms of 5 metres or higher (i.e. forests).
* `basal_area`: Calculates basal area (or number of basal wedge hits) for each plot using the raw basal wedge data (generated from AusPlots data using the `get_ausplots` function).


# INSTALLING and LOADING `ausplotsR`
R libraries need to be installed and loaded before they can be used in the R environment. 
### Installing
However, we are running R within the ecocloud environment, which already has `ausplotsR` installed, so we do not have to install it in this instance. If you are trying to complete these exercises in a different computing environment, see the installation examples in the "'AusplotsR' Package and AusPlots Data Basics" tutorial (see link above).
### Loading 'ausplotsR'
All libraries that are used in R must be loaded before they are available. This is achieved through the simple command below.

In [None]:
## Load the package
library(ausplotsR)

Help on the `ausplotsR` package, and a vignette with a guide on how to use it, can be obtained with the code below (remove the leading `#` to run each line).

In [3]:
#help(ausplotsR)
#browseVignettes(package="ausplotsR")

# Part 1: OBTAIN & EXPLORE AusPlots DATA: `get_ausplots` function

The `get_ausplots` function extracts and compiles AusPlots data. 

Data of specific types, sites, geographical locations, and/or species can be requested via the function arguments.

*DATA TYPES:* Up to 8 different types of data can be obtained by setting the corresponding arguments to TRUE/FALSE:

  * `site_info`: Site summary data. Includes (among others): plot and visit details, landform data, geographic coordinates, and notes. Included by default. Site summary data are stored in the `site.info` data frame.
  * `structural_summaries`: Site vegetation structural summaries. Site vegetation structural summary data are stored in the `struct.summ` data frame.
  * `veg.vouchers`: Complete set of species records for the plot determined by a herbarium plus ID numbers for silica-dried tissue samples. Included by default. Vegetation vouchers data are stored in the `veg.vouch` data frame. 
  * `veg.PI`: Point Intercept (PI) data. Includes data on: substrate, plant species, growth form and height, etc at each of (typically) 1010 points per plot. Included by default. Vegetation point intercept data are stored in the `veg.PI` data frame.
  * `basal.wedge`: Basal Wedge Data Raw Hits. These data are required for the calculation of Basal Area by Species by Plot. Basal wedge data are stored in the `veg.basal` data frame.
  * `soil_subsites`: Information on what soil and soil metagenomics samples were taken at nine locations across the plot and their identification barcode numbers. Soil and soil metagenomics data are stored in the `soil.subsites` data frame. 
  * `soil_bulk_density`: Soil bulk density. Soil bulk density data are stored in the `soil.bulk` data frame.
  * `soil_character`: Soil characterisation and sample ID data at 10 cm increments to a depth of 1 m. Soil characterisation and sample ID data are stored in the `soil.char` data frame.

*SPATIAL FILTERING:* AusPlot data can be spatially subset via the `get_ausplots` function arguments in two ways:

  * `my.Plot_IDs`: Character vector with the plots IDs of specific AusPlots plots. 
  * `bounding_box`: Spatial filter for selecting AusPlots based on a rectangular box, in the format c(xmin, xmax, ymin, ymax). AusPlots spatial data are in longitude/latitude, so x is the longitude and y is the latitude of the box/extent object (e.g., c(120, 140, -30, -10)).
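
To make the bounding box order concrete, here is a minimal sketch in base R of what a rectangular spatial filter does. It does not call `get_ausplots`; the `plots` data frame and its coordinates are made up for illustration, and the filtering logic is an assumption about what a rectangular filter means, not the package's own code.

```r
# Hypothetical plot coordinates (made up for illustration; not real AusPlots sites)
plots <- data.frame(
  site      = c("A", "B", "C", "D"),
  longitude = c(138.6, 139.5, 138.3, 120.0),
  latitude  = c(-34.9, -34.9, -36.0, -20.0)
)

# Bounding box in the order c(xmin, xmax, ymin, ymax).
# Southern-hemisphere latitudes are negative, so ymin is the more
# negative (more southerly) value: -35.5 is smaller than -34.5.
bbox <- c(138.1, 139.1, -35.5, -34.5)

# Defensive check that the box is well formed before using it
stopifnot(bbox[1] < bbox[2], bbox[3] < bbox[4])

# Keep only the plots whose coordinates fall inside the box
inside <- plots$longitude >= bbox[1] & plots$longitude <= bbox[2] &
          plots$latitude  >= bbox[3] & plots$latitude  <= bbox[4]

plots_in_box <- plots[inside, ]
print(plots_in_box)
```

Only the hypothetical site "A" falls inside this box. Writing the two latitudes in the wrong order (larger value first) would fail the `stopifnot` check, which is a cheap way to catch a common mistake.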

## Example 1: In this example we download all the available data at three AusPlots sites

* The following code extracts all available data from the database for three sites in SA, Qld and the NT
* The code puts the extracted data into the 'list object' called 'AP.data'
* AP.data contains a series of data frames (fancy R tables) that we will explore in the rest of the practical

In [11]:
# Obtain the data ('site_info', 'veg.vouchers', and 'veg.PI' are retrieved by default)
AP.data = get_ausplots( my.Plot_IDs=c("SATFLB0004", "QDAMGD0022", "NTASTU0002"),
                        structural_summaries=TRUE, basal.wedge=TRUE,
                        soil_subsites=TRUE, soil_bulk_density=TRUE, soil_character=TRUE  )

User-supplied Plot_IDs located. 


* Explore retrieved data by running the subsequent cells and thinking about what is returned

In [12]:
# By typing class(AP.data) we are told that the data structure is a 'list'
class(AP.data)

In [15]:
# By running 'summary(AP.data)' we are given a summary of the list
summary(AP.data)

# You can see that there are 8 different data frames, plus a 'citation' element
# These data frames match the data types listed under Part 1 above

              Length Class      Mode     
site.info     43     data.frame list     
struct.summ   15     data.frame list     
soil.subsites 12     data.frame list     
soil.bulk     15     data.frame list     
soil.char     34     data.frame list     
veg.basal     10     data.frame list     
veg.vouch     12     data.frame list     
veg.PI        13     data.frame list     
citation       1     -none-     character

* We can use the dollar sign ($) to get data from a lower level in the list object

In [13]:
# The following code isolates one of the data frames in AP.data
AP.data.siteinfo = AP.data$site.info

In [None]:
# We can use the same code as above to examine the new object
# Notice that now the output is a data frame
class(AP.data.siteinfo)

In [None]:
# Data frames have rows and columns - just like an excel spreadsheet
# The rows in the data frame are each of the ausplot sites
# The columns in the data frame are the different properties at each of those sites
# When we ask for the summary of the data frame we are given all the column headings of the data frame 
summary(AP.data.siteinfo)

In [22]:
# We can also see how many plots there are in the data frame by asking how many rows it has
nrow(AP.data.siteinfo)
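
The exploration steps above can be tried without a database connection on a small stand-in object. The sketch below builds a toy list whose element names mimic those returned by `get_ausplots`; all values are hypothetical and only base R functions are used.

```r
# A toy list of data frames (hypothetical values; the element names mimic
# those returned by get_ausplots, but this is not real AusPlots data)
toy <- list(
  site.info = data.frame(site_unique = c("S1-1", "S2-1", "S3-1"),
                         latitude    = c(-34.9, -27.5, -12.4)),
  veg.vouch = data.frame(site_unique = c("S1-1", "S1-1", "S2-1"),
                         species     = c("sp. a", "sp. b", "sp. a")),
  citation  = "Hypothetical citation string"
)

names(toy)                                  # element names of the list
sapply(toy, class)                          # class of each element
site_info <- toy$site.info                  # '$' extracts one element by name
same <- identical(site_info, toy[["site.info"]])  # '[[ ]]' does the same job
dim(site_info)                              # number of rows and columns
names(site_info)                            # column headings
```

`names()` and `dim()` are often quicker to read than `summary()` when a data frame has many columns.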

* It is also useful to save data frames in another format, such as CSV
* Use the code below to save the new data frame to a CSV file

In [23]:
# Save an AusPlots derived Data Frame (generated for pre-processing), using 'write.csv'
# =====================================================================================

# Provide Path for Directory where data will be stored
file.path = "workspace"

# Create Name of the file to be stored
file.name.sting = "site.info" #############  TYPE HERE WHAT YOU WANT TO CALL YOUR FILE
# Add the "csv" extension
file.name = paste(file.name.sting,"csv",sep=".")

# Save the site information data to a CSV file (columns separated by commas)
write.csv(AP.data.siteinfo, paste(file.path, file.name, sep="/"))
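
As a self-contained check of the save step, the sketch below writes a toy data frame to a temporary folder and reads it back. The data values, the file name `toy_table` and the use of `tempdir()` are assumptions made so that the example runs anywhere; in the practical you would write to your own folder instead.

```r
# Toy data frame standing in for an AusPlots-derived table (hypothetical values)
toy_df <- data.frame(site_unique  = c("S1-1", "S2-1"),
                     percentCover = c(12.5, 40.0))

# Assumption: write to a temporary directory so the sketch runs anywhere
out_dir  <- tempdir()
out_file <- file.path(out_dir, paste("toy_table", "csv", sep = "."))

# row.names = FALSE stops R from adding an extra column of row numbers
write.csv(toy_df, out_file, row.names = FALSE)

# Read the file back to confirm that it was written as expected
check <- read.csv(out_file)
print(check)
```

The base R function `file.path()` joins folder and file names with the correct separator, which does the same job as `paste(..., sep="/")`.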

## Example 2: In this example we extract data for Adelaide and its surrounding area

* The following code extracts data for Adelaide (34.92866S  138.59863E) and its surrounding area
* This time the code puts the extracted data into the 'list object' called 'AP.data.AdelReg'

In [27]:
# 'site_info', 'veg.vouchers', and 'veg.PI' data retrieved for Adelaide (34.92866S  138.59863E) and its surrounding area
# Notice that we do not ask for structural_summaries, basal.wedge, soil_subsites, soil_bulk_density or soil_character like we did in the first example
# The bounding box is c(xmin, xmax, ymin, ymax), so the more southerly (more negative) latitude comes first
AP.data.AdelReg = get_ausplots(bounding_box=c(138.1, 139.1, -35.5, -34.5))

In [31]:
# Using the same commands that we used before explore the new output
class(AP.data.AdelReg)  
summary(AP.data.AdelReg)

In [32]:
# Make a separate data frame of veg.vouch 
AP.data.AdelReg.vegvouch = AP.data.AdelReg$veg.vouch

In [34]:
# Using the same commands that we used before explore the new output
class(AP.data.AdelReg.vegvouch)  
summary(AP.data.AdelReg.vegvouch)

In [35]:
# Save an AusPlots derived Data Frame (generated for pre-processing), using 'write.csv'
# =====================================================================================

# Provide Path for Directory where data will be stored
file.path = "workspace"

# Create Name of the file to be stored
file.name.sting = "veg.vouch" #############  TYPE HERE WHAT YOU WANT TO CALL YOUR FILE
# Add the "csv" extension
file.name = paste(file.name.sting,"csv",sep=".")

# Save the vegetation voucher data to a CSV file (columns separated by commas)
write.csv(AP.data.AdelReg.vegvouch, paste(file.path, file.name, sep="/"))

## Example 3A: Get data for a transect in South Australia

In [46]:
# make a list of sites
transect_list <- c('SATEYB0001', 'SATEYB0002', 'SATFLB0001', 'SATFLB0002', 'SATFLB0003', 'SATFLB0004', 'SATFLB0005', 'SATFLB0006', 
                   'SATFLB0007', 'SATFLB0008', 'SATFLB0009', 'SATFLB0010', 'SATFLB0011', 'SATFLB0012', 'SATFLB0013', 'SATFLB0014', 
                   'SATFLB0015', 'SATFLB0016', 'SATFLB0017', 'SATFLB0018', 'SATFLB0019', 'SATFLB0020', 'SATFLB0021', 'SATFLB0022', 
                   'SATFLB0023', 'SATFLB0024', 'SATFLB0025', 'SATFLB0026', 'SATFLB0027', 'SATFLB0028', 'SATKAN0001', 'SATKAN0002', 
                   'SATKAN0003', 'SATKAN0004', 'SATSTP0001', 'SATSTP0002', 'SATSTP0003', 'SATSTP0004', 'SATSTP0005', 'SATSTP0006', 
                   'SATSTP0007', 'SATSTP0008'
                  )

In [78]:
# Here we extract all available data types for the sites in 'transect_list'
AP.data.transect = get_ausplots( my.Plot_IDs=transect_list, 
                       structural_summaries=TRUE, 
                       basal.wedge=TRUE,
                       soil_subsites=TRUE, 
                       soil_bulk_density=TRUE, 
                       soil_character=TRUE  )

User-supplied Plot_IDs located. 


# Vegetation Cover data by Growth Form and/or Height
## `single_cover_value` function

The `single_cover_value` function in the `ausplotsR` package calculates Vegetation Cover Values for particular Growth Form Types and/or Height Thresholds per Site from Raw AusPlots Vegetation Point Intercept data. The `growth_form_table` function can also be used to calculate Cover Values for all Vegetation Growth Form Types; however, `single_cover_value` can perform these computations for:
* Particular vegetation growth form types (i.e. for individual growth forms or any combination of growth form types).
* Vegetation higher than a specified height threshold
* Vegetation with any combination of growth form types and minimum height

Specifically `single_cover_value` takes the following inputs via its arguments:
* `veg.PI`: Raw Vegetation Point Intercept data from AusPlots. A veg.PI data frame generated by the `get_ausplots` function (see above).
* `in_canopy_sky`: Method used to calculate Cover. A logical value that indicates whether to count ‘in canopy sky’ hits (i.e. calculate ‘opaque canopy cover’) or to calculate ‘projected foliage cover’. The default value, ‘FALSE’, calculates ‘projected foliage cover’. To calculate ‘opaque canopy cover’ the argument must be set to ‘TRUE’.
* `by.growth_form`: Whether to calculate Cover for a Subset by Growth Form type. A logical value that indicates whether to subset by growth form type. The default, ‘TRUE’, calculates cover for the growth form types specified in the argument ‘my.growth_forms’ (see next). If set to ‘FALSE’, cover calculations are conducted only for the vegetation sub-set by a provided Minimum Height Threshold.
* `my.growth_forms`: Growth Form Types used to Subset Data used for the Cover Calculations. A character vector specifying the growth form types to subset the data used for the cover calculations. Any combination of growth form types can be used. The default, ‘c("Tree/Palm", "Tree Mallee")’, is set to represent trees. It applies only when ‘by.growth_form=TRUE’; otherwise, this argument is ignored and only height sub-setting is applied.
* `min.height`: Minimum Height Threshold used to Subset Data used for the Cover Calculations. A numeric value indicating the minimum height (in metres) of the vegetation to be included in the subset of the data used for the cover calculations. A height must always be provided. The default, ‘5’, is set up for a cover of trees. It can be set to ‘0’ to ignore height and thus include any plant hit. A negative value returns nonsensical output.

The `single_cover_value` function returns a data frame with two columns. The data frame rows correspond to unique sites, while the two columns correspond to the unique site and the percentage cover for the requested subset of vegetation (e.g. “Tree/Palm” higher than '5' metres).
 
When `by.growth_form = FALSE` and `min.height = 0`, the output is nearly the same as the green cover fraction returned by the `fractional_cover` function (see above). The values can differ because ‘fractional_cover’ applies a ‘height rule’ in which the highest intercept at a given point is taken, whereas ‘single_cover_value’ finds any green cover. For example, when dead trees overhang green understorey the values returned by both functions can differ. For general cover purposes, using ‘fractional_cover’ is recommended.  ‘single_cover_value’ is best suited to calculate cover subset by height and growth form.
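
To make the subsetting logic described above concrete, here is a simplified sketch in base R. It is not the `ausplotsR` implementation: `toy_cover` is a hypothetical helper, the `hits` data frame uses made-up values and far fewer columns and points than real `veg.PI` data, and treating the threshold as "at or above `min_height`" is an assumption. It ignores `in_canopy_sky` and dead vegetation entirely.

```r
# Toy point-intercept hits (hypothetical). Each row is one hit at one point;
# an NA growth form stands for a point with no plant recorded.
hits <- data.frame(
  site_unique = c(rep("S1", 6), rep("S2", 5)),
  point       = c(1, 1, 2, 3, 4, 5,   1, 2, 3, 4, 5),
  growth_form = c("Tree/Palm", "Shrub", "Tree/Palm", "Forb", NA, "Shrub",
                  "Tussock grass", "Tree Mallee", NA, NA, "Shrub"),
  height      = c(8, 1.2, 3, 0.2, NA, 2.5,   0.4, 6, NA, NA, 0.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT the ausplotsR code: percentage of points per site
# with at least one plant hit that passes the growth form and height filters
toy_cover <- function(hits, growth_forms = NULL, min_height = 0) {
  keep <- !is.na(hits$growth_form)
  if (min_height > 0) {
    keep <- keep & !is.na(hits$height) & hits$height >= min_height
  }
  if (!is.null(growth_forms)) {
    keep <- keep & hits$growth_form %in% growth_forms
  }
  sites <- sort(unique(hits$site_unique))
  cover <- sapply(sites, function(s) {
    at_site  <- hits$site_unique == s
    n_points <- length(unique(hits$point[at_site]))
    n_hit    <- length(unique(hits$point[at_site & keep]))
    100 * n_hit / n_points
  })
  data.frame(site_unique = sites, percentCover = unname(cover))
}

cover_any   <- toy_cover(hits, min_height = 0)   # any plant hit
cover_trees <- toy_cover(hits, growth_forms = c("Tree/Palm", "Tree Mallee"),
                         min_height = 5)         # trees of 5 m or more
print(cover_any)
print(cover_trees)
```

Points are counted once even when several plants are hit at the same point, which is why the two hits at point 1 of site "S1" contribute a single point to the cover.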


## Example 3B: Explore veg cover for the transect in South Australia using the single cover value function

In [83]:
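# Vegetation Cover of any Growth Form > 0m
# ----------------------------------------
# 'by.growth_form=FALSE' is needed here; otherwise only the default tree growth forms are counted
AP.data.transect.vegPI.gt0 = single_cover_value(AP.data.transect$veg.PI, by.growth_form=FALSE, min.height=0)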

In [None]:
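# Store the raw point intercept data frame in its own object and look at the first rows
AP.data.transect.vegPI = AP.data.transect$veg.PI
head(AP.data.transect.vegPI)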

In [64]:
# Save an AusPlots derived Data Frame using 'write.csv'
# =====================================================================================
# Provide Path for Directory where data will be stored
file.path = "workspace"

# Create Name of the file to be stored
file.name.sting = "trans.VC.gt0" #############  TYPE HERE WHAT YOU WANT TO CALL YOUR FILE
# Add the "csv" extension
file.name = paste(file.name.sting,"csv",sep=".")

# Save the veg cover data to a CSV file (columns separated by commas)
write.csv(AP.data.transect.vegPI.gt0, paste(file.path, file.name, sep="/"))

In [85]:
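# Vegetation Cover of any Growth Form > 2m
# ----------------------------------------
# 'by.growth_form=FALSE' is needed here; otherwise only the default tree growth forms are counted
AP.data.transect.vegPI.gt2 = single_cover_value(AP.data.transect$veg.PI, by.growth_form=FALSE, min.height=2)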

In [85]:
# Save an AusPlots derived Data Frame using 'write.csv'
# =====================================================================================
# Provide Path for Directory where data will be stored
file.path = "workspace"

# Create Name of the file to be stored
file.name.sting = "trans.VC.gt2" #############  TYPE HERE WHAT YOU WANT TO CALL YOUR FILE
# Add the "csv" extension
file.name = paste(file.name.sting,"csv",sep=".")

# Save the veg cover data to a CSV file (columns separated by commas)
write.csv(AP.data.transect.vegPI.gt2, paste(file.path, file.name, sep="/"))

In [None]:
# Results (> 0m, > 2m, and 0 to 2m) combined in a single Data Frame
# -----------------------------------------------------------------
AP.data.VC.Height = data.frame(site_unique=AP.data.transect$site.info$site_unique, 
                               VCF.gt0=AP.data.transect.vegPI.gt0$percentCover, 
                               VCF.gt2=AP.data.transect.vegPI.gt2$percentCover, 
                               VCG.0to2=(AP.data.transect.vegPI.gt0$percentCover-AP.data.transect.vegPI.gt2$percentCover),
                               latitude=AP.data.transect$site.info$latitude,
                               longitude=AP.data.transect$site.info$longitude
                              )
head(AP.data.VC.Height)
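
The combined table above lines up columns from separate data frames by row position, which only works if every table lists the same sites in the same order. A more defensive alternative, sketched here on hypothetical toy tables, is to join on the shared `site_unique` column with base R's `merge()`. The object names and values below are made up for illustration.

```r
# Toy cover tables (hypothetical values), deliberately in different row orders
cover_gt0 <- data.frame(site_unique  = c("S1", "S2", "S3"),
                        percentCover = c(80, 60, 45))
cover_gt2 <- data.frame(site_unique  = c("S3", "S1", "S2"),
                        percentCover = c(10, 30, 20))
site_info <- data.frame(site_unique = c("S2", "S3", "S1"),
                        latitude    = c(-33.1, -32.5, -34.0),
                        longitude   = c(138.2, 137.9, 138.6))

# merge() matches rows on the shared key instead of on row position;
# 'suffixes' renames the two 'percentCover' columns so they stay distinct
combined <- merge(cover_gt0, cover_gt2, by = "site_unique",
                  suffixes = c(".gt0", ".gt2"))

# Cover between 0 and 2 m is the difference between the two thresholds
combined$cover.0to2 <- combined$percentCover.gt0 - combined$percentCover.gt2

# Attach the coordinates, again matching on the site identifier
combined <- merge(combined, site_info, by = "site_unique")
print(combined)
```

A site missing from either table is dropped by default, so comparing `nrow(combined)` with the number of sites you expected is a useful check.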

In [None]:
# Save an AusPlots derived Data Frame using 'write.csv'
# =====================================================================================
# Provide Path for Directory where data will be stored
file.path = "workspace"

# Create Name of the file to be stored
file.name.sting = "VCF.Height" #############  TYPE HERE WHAT YOU WANT TO CALL YOUR FILE
# Add the "csv" extension
file.name = paste(file.name.sting,"csv",sep=".")

# Save the veg cover data to a CSV file (columns separated by commas)
write.csv(AP.data.VC.Height, paste(file.path, file.name, sep="/"))

## Example 3C: Explore vegetation growth form for the South Australian transect

In [65]:
# Get a unique list of the vegetation growth forms
# The following code gets all the unique values in the column 'growth_form' of the 'veg.PI' data frame
uniq_gowthform = unique(AP.data.transect$veg.PI$growth_form)

In [68]:
# print the list of unique growth forms
print(uniq_gowthform)

 [1] "Tree/Palm"     "Sedge"         "Shrub"         NA             
 [5] "Forb"          "Tree Mallee"   "Tussock grass" "Vine"         
 [9] "Epiphyte"      "Fern"          "Chenopod"      "Hummock grass"
[13] "NC"            "Shrub Mallee"  "Grass-tree"    "Heath-shrub"  
[17] "Rush"          "Bryophyte"    
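
The list above includes `NA`, so it can help to count how often each value occurs before choosing growth forms to subset by. The sketch below uses a short hypothetical vector in place of the real `growth_form` column, and only base R functions.

```r
# Toy vector of growth forms for individual hits (hypothetical values);
# NA here stands for a hit with no growth form recorded
growth_form <- c("Shrub", "Tree/Palm", NA, "Shrub", "Forb", NA, "Shrub")

# unique() lists each value once, including NA
unique(growth_form)

# table() counts the hits per growth form; useNA = "ifany" keeps the NA count
counts <- table(growth_form, useNA = "ifany")
print(counts)

# Drop NA before listing the growth forms that were actually recorded
recorded <- sort(unique(growth_form[!is.na(growth_form)]))
print(recorded)
```

The same two calls can be applied to the real column once the data have been downloaded.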


In [None]:
# Vegetation cover of Tree growth forms > 0m
AP.data.transect.vegPI.trees.gt0 = single_cover_value(AP.data.transect$veg.PI, by.growth_form=TRUE, my.growth_forms = c("Tree/Palm", "Tree Mallee"), min.height=0)

* Save the new data frame (AP.data.transect.vegPI.trees.gt0) as a unique CSV file

In [69]:
# make lists of different growth forms
ground_list = c("Sedge", "Forb", "Epiphyte", "Rush", "Bryophyte", "Fern", "Tussock grass", "Grass-tree", "Hummock grass")
Shrub_list = c("Shrub", "Heath-shrub", "Shrub Mallee", "Chenopod")
grass_list = c("Tussock grass", "Hummock grass")

In [69]:
# Vegetation cover of Shrub growth forms > 0m
AP.data.transect.vegPI.shrubs.gt0 = single_cover_value(AP.data.transect$veg.PI, by.growth_form=TRUE, my.growth_forms = Shrub_list, min.height=0)

* Save the new data frame (AP.data.transect.vegPI.shrubs.gt0) as a unique CSV file

In [None]:
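# Vegetation Cover of any Growth Form > 0m (Adelaide region; needed for the combined table below)
# ----------------------------------------
AP.data.VC.gt0 = single_cover_value(AP.data.AdelReg$veg.PI, by.growth_form=FALSE, min.height=0)

# Vegetation Cover of any Growth Form > 2m
# ----------------------------------------
AP.data.VC.gt2 = single_cover_value(AP.data.AdelReg$veg.PI, by.growth_form=FALSE, min.height=2)
#class(AP.data.VC.gt2)
#dim(AP.data.VC.gt2)
head(AP.data.VC.gt2)
summary(AP.data.VC.gt2)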

# Results (> 0m, > 2m, and 0 to 2m) combined in a single Data Frame
# -----------------------------------------------------------------
AP.data.VC.Height = data.frame(site_unique=AP.data.VC.gt0$site_unique, 
                               VCF.gt0=AP.data.VC.gt0$percentCover, 
                               VCF.gt2=AP.data.VC.gt2$percentCover, 
                               VCG.0to2=(AP.data.VC.gt0$percentCover-AP.data.VC.gt2$percentCover))
head(AP.data.VC.Height)
summary(AP.data.VC.Height)

# Vegetation Cover data, sub-setting only by Taxonomy
# ===================================================

# Trees (my.growth_forms=c("Tree/Palm", "Tree Mallee"), which is the default)
# ---------------------------------------------------------------------------
AP.data.VC.trees = single_cover_value(AP.data.AdelReg$veg.PI, min.height=0)
#class(AP.data.VC.trees)
#dim(AP.data.VC.trees)
head(AP.data.VC.trees)
summary(AP.data.VC.trees)

# Grasses (my.growth_forms=c("Hummock grass", "Tussock grass"))
# ------------------------------------------------------------------
AP.data.VC.grass = single_cover_value(AP.data.AdelReg$veg.PI, my.growth_forms=c("Hummock grass", "Tussock grass"), min.height=0)
#class(AP.data.VC.grass)
#dim(AP.data.VC.grass)
head(AP.data.VC.grass)
summary(AP.data.VC.grass)

# Results (trees & grass) combined in a single Data Frame
# -----------------------------------------------------------------
AP.data.VC.TreesGrass = data.frame(site_unique=AP.data.VC.trees$site_unique,
                                   VCF.trees=AP.data.VC.trees$percentCover,
                                   VCF.grass=AP.data.VC.grass$percentCover)
head(AP.data.VC.TreesGrass)
summary(AP.data.VC.TreesGrass)


# Vegetation Cover data, sub-setting by both Height and Taxonomy
# ==============================================================
# Trees (my.growth_forms=c("Tree/Palm", "Tree Mallee")) > 5 m.
# 'c("Tree/Palm", "Tree Mallee")' is the default value for 'my.growth_forms',
# so it is not strictly necessary to supply it
AP.data.VC.Trees.gt5 = single_cover_value(AP.data.AdelReg$veg.PI, 
                                          my.growth_forms=c("Tree/Palm", "Tree Mallee"), min.height=5)
#class(AP.data.VC.Trees.gt5)
#dim(AP.data.VC.Trees.gt5)
head(AP.data.VC.Trees.gt5)
summary(AP.data.VC.Trees.gt5)