# Exploring OGSL.ca/bio/ data using `robis` and `obistools`

`robis` and `obistools` are two packages maintained by the programmers at OBIS. `robis` gives users access to the contents of the OBIS database, and `obistools` provides QA/QC checks for data held on your local machine.

In [None]:
# obistools is a github-only install
library(devtools)
# QC helper tools
devtools::install_github('iobis/obistools')
# Access and visualization of OBIS-held data
devtools::install_github('iobis/robis')

In [None]:
# Get your source file(s) from ogsl.ca/bio/ and put them into a data frame
library(tidyverse)

# Use latin1 as the encoding so that accented characters are read correctly.
data <- read_csv('~/Downloads/data_20200119-1640_fd181d2d/export.csv', locale = locale(encoding = "latin1"))

In [None]:
# Begin to explore it

# print a summary of the dataframe
data

In [None]:
# print the column names of the dataframe
names(data)

In [None]:
# Explore the contents of individual columns to help with classification.

unique(data$"Institution propriétaire")
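
`unique()` shows which values exist but not how often each one occurs. A frequency table can help with that. The following is a minimal sketch in base R that uses a small made-up data frame in place of the export; the institution names are placeholders, not values from ogsl.ca/bio.

```r
# Toy stand-in for the export; the values are hypothetical placeholders.
toy <- data.frame(
  "Institution propriétaire" = c("Inst A", "Inst B", "Inst A", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Count how often each value occurs, keeping missing values visible.
counts <- table(toy[["Institution propriétaire"]], useNA = "ifany")

# Put the most frequent values first.
counts <- sort(counts, decreasing = TRUE)
print(counts)
```

The same pattern should apply to any categorical column of `data`, for example `Provenance` or `Méthode d'échantillonnage`.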

In [None]:
# Based on this export from ogsl.ca/bio, the portal's occurrence data appears to be
# fully mappable to Darwin Core (DwC) and ready for OBIS ingestion,
# either as Occurrence or as Occurrence + MeasurementOrFact (MoF).

# Date + Format              -> eventDate (combine timezone and format into an ISO 8601 string)
# Emplacement                -> locality? Basis of higherGeography? [Emplacement]
# Longitude / Latitude       -> decimalLongitude, decimalLatitude, footprintWKT, geodeticDatum
# Taxon                      -> vernacularName
# Nom latin                  -> scientificName
# Nombre d'individus         -> individualCount
# Poids                      -> dynamicProperties {weightInGrams: [Poids].toGrams()}
#                               and/or a MeasurementOrFact entry with weight(s)
# Présence                   -> occurrenceStatus = present / absent
# Biomasse                   -> if mass is a percentage: organismQuantity = [Biomasse], organismQuantityType = %biomass
#                               else if total mass:      dynamicProperties {weightInGrams: [Biomasse]}
#                               else:                    MeasurementOrFact entry with individuals + weight(s)
# Densité                    -> not the same as individualCount; perhaps dynamicProperties {density: [Densité]}
# Couverture                 -> ?
#
# Méthode d'échantillonnage  -> basisOfRecord map {
#                                 Visuel, Chalut, Trappe à anguilles + other fishing methods -> HumanObservation,
#                                 ?? -> LivingSpecimen,
#                                 ?? -> MachineObservation }
# Provenance                 -> establishmentMeans map {
#                                 Exotique Envahissante -> invasive,
#                                 Exotique Naturalisée  -> naturalized,
#                                 ??                    -> introduced,
#                                 ??                    -> managed }
# Collection                 -> Title? datasetName?
# Institution propriétaire   -> institutionCode
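
The one-to-one parts of the mapping above could be applied with a small helper. What follows is only a sketch in base R under several assumptions: the toy data frame and its values are made up, `to_dwc`, `dwc_names` and `provenance_map` are hypothetical names, the `Date` column is assumed to hold strings like `2019-07-15 14:30`, and the timezone is assumed to be `America/Toronto`. The real export's `Format` column is ignored here and would need to be consulted before parsing real dates.

```r
# Toy stand-in for a few columns of the export; all values are made up.
toy <- data.frame(
  "Date" = c("2019-07-15 14:30", "2019-08-02 09:05"),
  "Nom latin" = c("Carcinus maenas", "Gadus morhua"),
  "Taxon" = c("Crabe vert", "Morue franche"),
  "Nombre d'individus" = c(3, 12),
  "Longitude" = c(-68.5, -66.1),
  "Latitude" = c(48.5, 49.2),
  "Provenance" = c("Exotique Envahissante", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Source column -> Darwin Core term, for the one-to-one cases in the notes.
dwc_names <- c(
  "Nom latin" = "scientificName",
  "Taxon" = "vernacularName",
  "Nombre d'individus" = "individualCount",
  "Longitude" = "decimalLongitude",
  "Latitude" = "decimalLatitude"
)

# Value lookup for establishmentMeans; values without a match become NA.
provenance_map <- c(
  "Exotique Envahissante" = "invasive",
  "Exotique Naturalisée" = "naturalized"
)

# Hypothetical helper: build a Darwin Core style data frame from the export.
to_dwc <- function(df, date_format = "%Y-%m-%d %H:%M", tz = "America/Toronto") {
  # Keep and rename the columns that map one-to-one.
  out <- df[, names(dwc_names), drop = FALSE]
  names(out) <- unname(dwc_names)

  # Parse local times (assumed format and timezone), then write them as UTC ISO 8601.
  parsed <- as.POSIXct(df[["Date"]], format = date_format, tz = tz)
  out$eventDate <- format(parsed, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

  # Translate Provenance values through the lookup table.
  out$establishmentMeans <- unname(provenance_map[df[["Provenance"]]])
  out
}

occ <- to_dwc(toy)
print(occ)
```

Columns that need a decision first (`Poids`, `Biomasse`, `Densité`, `Couverture`, `Méthode d'échantillonnage`) are left out on purpose, since the notes above still list open questions for them.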

In [None]:
# What sort of metadata do we have about the collection?
# What is the format of the data in the ogsl.ca system? Is there an even more direct mapping?