dbca-wa/biosysR

biosysR makes BioSys data accessible in R

BioSys is a data warehouse for biological survey data run by the Western Australian Department of Biodiversity, Conservation and Attractions (DBCA).

BioSys itself is accessible to DBCA staff behind a single-sign-on (SSO) firewall. The BioSys API is accessible in two ways: to staff through a GUI (also behind the SSO firewall), and to scripts, for both reading and writing, through basicauth using a BioSys username and password. The BioSys API documentation provides a graphical API browser.

The BioSys API returns JSON dictionaries of projects, datasets, records, and other entities.

If a data consumer wishes to analyse data in a statistical package like R, the data need to be transformed from a nested list of lists (JSON) into a two-dimensional tabular structure.

The main purpose of this R package, somewhat uncreatively named biosysR, is to facilitate accessing and using BioSys data by providing helpers to access the API and flatten the API outputs into a tidy dplyr::tibble.
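To illustrate the kind of transformation meant here, the following is a minimal base R sketch that flattens a nested list into a table. It is not the implementation used by biosysR; the field names (`id`, `name`, `data`, `Site`, `Year`) and the helper `flatten_record()` are hypothetical, and the sketch assumes every record carries the same fields.

```r
# A hypothetical API response, already parsed from JSON into a nested list.
# All field names and values are illustrative only.
parsed <- list(
  list(id = 1, name = "Project A", data = list(Site = "North", Year = "2016")),
  list(id = 2, name = "Project B", data = list(Site = "South", Year = "2017"))
)

# Flatten one record: keep the top-level fields and lift the fields
# nested under "data" to the top level.
flatten_record <- function(rec) {
  top <- rec[setdiff(names(rec), "data")]
  as.data.frame(c(top, rec$data), stringsAsFactors = FALSE)
}

# Flatten every record and stack the one-row data frames into one table.
flat <- do.call(rbind, lapply(parsed, flatten_record))
print(flat)
```

Each record becomes one row and each field, nested or not, becomes one column.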

Installation

Install biosysR from GitHub:

# install.packages("devtools")
devtools::install_github("parksandwildlife/biosysR")
library(biosysR)

Setup

The BioSys API is only accessible with basicauth using a valid BioSys username and password. To get up and running, execute the following commands with your own BioSys username and password:

Sys.setenv(BIOSYS_UN = "USERNAME")
Sys.setenv(BIOSYS_PW = "PASSWORD")

See the package vignette for a comprehensive run-down on BioSys API authentication and setup options.
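Before calling the API it can help to confirm that both environment variables are actually set. The sketch below uses only base R; `biosys_credentials()` is a hypothetical helper written for this example and is not part of biosysR.

```r
# Hypothetical helper (not part of biosysR): fail early with a clear
# message if either credential is missing from the environment.
biosys_credentials <- function() {
  un <- Sys.getenv("BIOSYS_UN")
  pw <- Sys.getenv("BIOSYS_PW")
  if (!nzchar(un) || !nzchar(pw)) {
    stop("Set BIOSYS_UN and BIOSYS_PW before calling the BioSys API.",
         call. = FALSE)
  }
  list(un = un, pw = pw)
}

# Placeholder values, as in the setup commands above.
Sys.setenv(BIOSYS_UN = "USERNAME", BIOSYS_PW = "PASSWORD")
creds <- biosys_credentials()
```

To keep credentials out of scripts, the same two variables can instead be defined in `~/.Renviron`, which R reads at startup.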

Usage example

All examples assume that authentication credentials are available as environment variables. See the vignette for more authentication options.

BioSys projects

projects <- biosysR::biosys_projects()
dplyr::glimpse(projects)
#> Observations: 7
#> Variables: 13
#> $ id                <chr> "1", "2", "3", "4", "7", "6", "5"
#> $ name              <chr> "Berkeley Incidental Records", "Kimberley Is...
#> $ code              <chr> "BER", "KI", "LCI", "KNC", "PRS", "SBS", "SCTI"
#> $ description       <chr> "Incidental mainland records captured as par...
#> $ site_count        <int> 41, 163, 208, 27, 104, 0, 64
#> $ dataset_count     <int> 3, 9, 13, 10, 3, 2, 8
#> $ record_count      <int> 154, 42118, 14561, 1621, 726, 3696, 3730
#> $ longitude         <dbl> 127.8207, 125.5086, 126.8049, 128.5613, NA, ...
#> $ latitude          <dbl> -14.48498, -14.60075, -15.54562, -16.08126, ...
#> $ datum             <chr> "4326", "4326", "4326", "4326", "4326", "432...
#> $ timezone          <chr> "Australia/Perth", "Australia/Perth", "Austr...
#> $ site_data_package <list> [NULL, NULL, NULL, NULL, NULL, NULL, NULL]
#> $ custodians        <list> [2, 2, 2, 2, 2, [2, 9, 12], 2]
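Once flattened, the projects table can be handled like any other data frame. The sketch below rebuilds a few columns from the output above so that it runs without API access; `projects_demo` is a stand-in for the real `projects` object, and the example uses base R only.

```r
# Columns copied from the glimpse() output above.
projects_demo <- data.frame(
  code = c("BER", "KI", "LCI", "KNC", "PRS", "SBS", "SCTI"),
  site_count = c(41L, 163L, 208L, 27L, 104L, 0L, 64L),
  record_count = c(154L, 42118L, 14561L, 1621L, 726L, 3696L, 3730L),
  stringsAsFactors = FALSE
)

# custodians is a list-column: one vector of values per project.
projects_demo$custodians <- list(2, 2, 2, 2, 2, c(2, 9, 12), 2)

# Number of custodians per project.
projects_demo$n_custodians <- lengths(projects_demo$custodians)

# Projects ordered by record count, largest first.
ranked <- projects_demo[order(-projects_demo$record_count),
                        c("code", "record_count", "n_custodians")]
print(ranked)
```

`lengths()` is a convenient way to summarise a list-column such as `custodians` without unnesting it.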

BioSys datasets

datasets <- biosysR::biosys_datasets(project_id = 6)
dplyr::glimpse(datasets)
#> Observations: 48
#> Variables: 7
#> $ id           <chr> "101", "107", "118", "30", "45", "99", "108", "11...
#> $ record_count <int> 4582, 38, 3307, 33, 414, 426, 95, 42, 23, 1163, 6...
#> $ data_package <list> [["tabular-data-package", "BioSys Config", "anim...
#> $ name         <chr> "Animal Observations", "Animal Observations", "An...
#> $ type         <chr> "species_observation", "species_observation", "sp...
#> $ description  <chr> "", "", "", "", "", "", "", "", "", "", "", "", "...
#> $ project_id   <int> 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 3, 3, 4, 5, 2...
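As a small example of working with the datasets table, the sketch below sums `record_count` per `project_id`. It uses only the first ten rows visible in the output above (the remaining rows are truncated there), so the totals are illustrative and not the true per-project totals; `datasets_demo` is a stand-in for the real `datasets` object.

```r
# First ten rows of project_id and record_count, copied from the
# glimpse() output above. The full table has 48 rows.
datasets_demo <- data.frame(
  project_id = c(3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L),
  record_count = c(4582L, 38L, 3307L, 33L, 414L, 426L, 95L, 42L, 23L, 1163L)
)

# Total number of records per project across these ten datasets.
per_project <- aggregate(record_count ~ project_id,
                         data = datasets_demo, FUN = sum)
print(per_project)
```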

BioSys records

records <- biosysR::biosys_records(project_id = 6)
dplyr::glimpse(records)
#> Observations: 3,696
#> Variables: 54
#> $ id                     <chr> "147647", "147648", "147649", "147650",...
#> $ datetime               <chr> "2016-03-17T16:00:00Z", "2016-03-17T16:...
#> $ species_name           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
#> $ name_id                <chr> "-1", "-1", "-1", "-1", "-1", "-1", "-1...
#> $ file_name              <chr> "SBY_2016-03_Seagrass_biosys.csv", "SBY...
#> $ file_row               <chr> "2", "3", "4", "5", "6", "7", "8", "9",...
#> $ last_modified          <chr> "2017-09-20T08:34:13.411874Z", "2017-09...
#> $ dataset                <chr> "126", "126", "126", "126", "126", "126...
#> $ site                   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
#> $ Impact                 <list> ["", "Epiphyte", "", "Epiphyte", "", "...
#> $ Level3Class            <list> ["SAND", "RUBBLE", "Posidonia spp.", "...
#> $ RecordNo               <list> ["0", "1", "2", "3", "4", "5", "6", "7...
#> $ Level4Class            <list> ["SAND", "RUBBLE", "Posidonia australi...
#> $ Level2Class            <list> ["SAND", "RUBBLE", "Posidoniaceae", "P...
#> $ Replicate              <list> ["Transect 1", "Transect 1", "Transect...
#> $ Level1ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Level3ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Latitude               <list> ["-26.20245", "-26.20245", "-26.20245"...
#> $ TotalPoints            <list> ["6", "6", "6", "6", "6", "6", "6", "6...
#> $ SubstrateCode          <list> ["SBC22022011171457536", "SBC220220111...
#> $ ZoneCode               <list> ["SBY-GUZ-WG", "SBY-GUZ-WG", "SBY-GUZ-...
#> $ Level2ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Zone                   <list> ["Western Gulf", "Western Gulf", "West...
#> $ Level4ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Baseclassmodifiers     <list> ["NA", "Rubble", "NA", "NA", "NA", "Ru...
#> $ Date                   <list> ["18/03/2016", "18/03/2016", "18/03/20...
#> $ SubstrateModifier      <list> ["No relief", "No relief", "No relief"...
#> $ Level5Class            <list> ["SAND", "RUBBLE", "Posidonia australi...
#> $ Level1Class            <list> ["SAND", "RUBBLE", "SEAGRASS", "SEAGRA...
#> $ ImpactCode             <list> ["", "IMP15042011111556150", "", "IMP1...
#> $ ClassLevel             <list> ["Level 1", "Level 1", "Level 4", "Lev...
#> $ Substrate              <list> ["Sand", "Sand", "Sand", "Sand", "Sand...
#> $ FeatureType            <list> ["Point", "Point", "Point", "Point", "...
#> $ Region                 <list> ["Shark Bay Marine Park", "Shark Bay M...
#> $ Analysis               <list> ["Random 6 pts", "Random 6 pts", "Rand...
#> $ PointNo                <list> ["0", "1", "2", "3", "4", "5", "0", "1...
#> $ RegionCode             <list> ["SBY", "SBY", "SBY", "SBY", "SBY", "S...
#> $ ImageName              <list> ["SBY-GUZ-WG-ULS-T1-L_20160318134127",...
#> $ Survey                 <list> ["SBY-GUZ-WG-ULS-T1-2016318134127", "S...
#> $ Longitude              <list> ["113.46631", "113.46631", "113.46631"...
#> $ Time                   <list> ["1:41:27 PM", "1:41:27 PM", "1:41:27 ...
#> $ BaseclassmodifiersCode <list> ["NA", "BMC22022011163936550", "NA", "...
#> $ SectorCode             <list> ["SBY-GUZ", "SBY-GUZ", "SBY-GUZ", "SBY...
#> $ Sector                 <list> ["General Use Zone", "General Use Zone...
#> $ ImageNo                <list> ["0", "0", "0", "0", "0", "0", "1", "1...
#> $ Projection             <list> ["+proj=longlat +ellps=WGS84 +no_defs"...
#> $ Level5ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Site                   <list> ["Useless Loop South", "Useless Loop S...
#> $ Month                  <list> ["March", "March", "March", "March", "...
#> $ SubstrateModifierCode  <list> ["SBM22022011171457583", "SBM220220111...
#> $ Year                   <list> ["2016", "2016", "2016", "2016", "2016...
#> $ CameraSide             <list> ["Left", "Left", "Left", "Left", "Left...
#> $ SiteCode               <list> ["SBY-GUZ-WG-ULS", "SBY-GUZ-WG-ULS", "...
#> $ ReplicateCode          <list> ["SBY-GUZ-WG-ULS-T1", "SBY-GUZ-WG-ULS-...
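In the output above, the dataset-specific fields (such as `Latitude`) arrive as list-columns of strings, and `datetime` is a character string in UTC. The following base R sketch shows one way to convert such values into typed vectors. The helper `list_to_numeric()` is hypothetical and not part of biosysR, and the `NULL` entry is an invented missing value added to show how gaps could be handled.

```r
# Convert a list of length-one strings into a numeric vector,
# mapping NULL entries to NA.
list_to_numeric <- function(x) {
  vapply(x, function(v) if (is.null(v)) NA_real_ else as.numeric(v),
         numeric(1))
}

# Latitude values as shown above, plus one hypothetical missing entry.
latitude <- list("-26.20245", "-26.20245", NULL)
lat_num <- list_to_numeric(latitude)

# Parse the first timestamp shown above as UTC ...
timestamp <- as.POSIXct("2016-03-17T16:00:00Z",
                        format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# ... and display it in the project's timezone (Perth is UTC+8).
local_time <- format(timestamp, format = "%Y-%m-%d %H:%M",
                     tz = "Australia/Perth")
print(lat_num)
print(local_time)
```

The local date of 18 March 2016 agrees with the `Date` field ("18/03/2016") of the same record.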

Example data

In case the BioSys API is not accessible, a sample of available data is supplied.

data(projects)
data(datasets)
data(records)

dplyr::glimpse(projects)
#> Observations: 7
#> Variables: 13
#> $ id                <chr> "1", "2", "3", "4", "7", "6", "5"
#> $ name              <chr> "Berkeley Incidental Records", "Kimberley Is...
#> $ code              <chr> "BER", "KI", "LCI", "KNC", "PRS", "SBS", "SCTI"
#> $ description       <chr> "Incidental mainland records captured as par...
#> $ site_count        <int> 41, 163, 208, 27, 104, 0, 64
#> $ dataset_count     <int> 3, 9, 13, 10, 3, 2, 8
#> $ record_count      <int> 154, 42118, 14561, 1621, 726, 3696, 3730
#> $ longitude         <dbl> 127.8207, 125.5086, 126.8049, 128.5613, NA, ...
#> $ latitude          <dbl> -14.48498, -14.60075, -15.54562, -16.08126, ...
#> $ datum             <chr> "4326", "4326", "4326", "4326", "4326", "432...
#> $ timezone          <chr> "Australia/Perth", "Australia/Perth", "Austr...
#> $ site_data_package <list> [NULL, NULL, NULL, NULL, NULL, NULL, NULL]
#> $ custodians        <list> [2, 2, 2, 2, 2, [2, 9, 12], 2]
dplyr::glimpse(datasets)
#> Observations: 48
#> Variables: 7
#> $ id           <chr> "101", "107", "118", "30", "45", "99", "108", "11...
#> $ record_count <int> 4582, 38, 3307, 33, 414, 426, 95, 42, 23, 1163, 6...
#> $ data_package <list> [["tabular-data-package", "BioSys Config", "anim...
#> $ name         <chr> "Animal Observations", "Animal Observations", "An...
#> $ type         <chr> "species_observation", "species_observation", "sp...
#> $ description  <chr> "", "", "", "", "", "", "", "", "", "", "", "", "...
#> $ project_id   <int> 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 3, 3, 4, 5, 2...
dplyr::glimpse(head(records))
#> Observations: 6
#> Variables: 54
#> $ id                     <chr> "147647", "147648", "147649", "147650",...
#> $ datetime               <chr> "2016-03-17T16:00:00Z", "2016-03-17T16:...
#> $ species_name           <chr> NA, NA, NA, NA, NA, NA
#> $ name_id                <chr> "-1", "-1", "-1", "-1", "-1", "-1"
#> $ file_name              <chr> "SBY_2016-03_Seagrass_biosys.csv", "SBY...
#> $ file_row               <chr> "2", "3", "4", "5", "6", "7"
#> $ last_modified          <chr> "2017-09-20T08:34:13.411874Z", "2017-09...
#> $ dataset                <chr> "126", "126", "126", "126", "126", "126"
#> $ site                   <chr> NA, NA, NA, NA, NA, NA
#> $ Impact                 <list> ["", "Epiphyte", "", "Epiphyte", "", ""]
#> $ Level3Class            <list> ["SAND", "RUBBLE", "Posidonia spp.", "...
#> $ RecordNo               <list> ["0", "1", "2", "3", "4", "5"]
#> $ Level4Class            <list> ["SAND", "RUBBLE", "Posidonia australi...
#> $ Level2Class            <list> ["SAND", "RUBBLE", "Posidoniaceae", "P...
#> $ Replicate              <list> ["Transect 1", "Transect 1", "Transect...
#> $ Level1ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Level3ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Latitude               <list> ["-26.20245", "-26.20245", "-26.20245"...
#> $ TotalPoints            <list> ["6", "6", "6", "6", "6", "6"]
#> $ SubstrateCode          <list> ["SBC22022011171457536", "SBC220220111...
#> $ ZoneCode               <list> ["SBY-GUZ-WG", "SBY-GUZ-WG", "SBY-GUZ-...
#> $ Level2ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Zone                   <list> ["Western Gulf", "Western Gulf", "West...
#> $ Level4ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Baseclassmodifiers     <list> ["NA", "Rubble", "NA", "NA", "NA", "Ru...
#> $ Date                   <list> ["18/03/2016", "18/03/2016", "18/03/20...
#> $ SubstrateModifier      <list> ["No relief", "No relief", "No relief"...
#> $ Level5Class            <list> ["SAND", "RUBBLE", "Posidonia australi...
#> $ Level1Class            <list> ["SAND", "RUBBLE", "SEAGRASS", "SEAGRA...
#> $ ImpactCode             <list> ["", "IMP15042011111556150", "", "IMP1...
#> $ ClassLevel             <list> ["Level 1", "Level 1", "Level 4", "Lev...
#> $ Substrate              <list> ["Sand", "Sand", "Sand", "Sand", "Sand...
#> $ FeatureType            <list> ["Point", "Point", "Point", "Point", "...
#> $ Region                 <list> ["Shark Bay Marine Park", "Shark Bay M...
#> $ Analysis               <list> ["Random 6 pts", "Random 6 pts", "Rand...
#> $ PointNo                <list> ["0", "1", "2", "3", "4", "5"]
#> $ RegionCode             <list> ["SBY", "SBY", "SBY", "SBY", "SBY", "S...
#> $ ImageName              <list> ["SBY-GUZ-WG-ULS-T1-L_20160318134127",...
#> $ Survey                 <list> ["SBY-GUZ-WG-ULS-T1-2016318134127", "S...
#> $ Longitude              <list> ["113.46631", "113.46631", "113.46631"...
#> $ Time                   <list> ["1:41:27 PM", "1:41:27 PM", "1:41:27 ...
#> $ BaseclassmodifiersCode <list> ["NA", "BMC22022011163936550", "NA", "...
#> $ SectorCode             <list> ["SBY-GUZ", "SBY-GUZ", "SBY-GUZ", "SBY...
#> $ Sector                 <list> ["General Use Zone", "General Use Zone...
#> $ ImageNo                <list> ["0", "0", "0", "0", "0", "0"]
#> $ Projection             <list> ["+proj=longlat +ellps=WGS84 +no_defs"...
#> $ Level5ClassCode        <list> ["CBC22022011163926270", "CBC220220111...
#> $ Site                   <list> ["Useless Loop South", "Useless Loop S...
#> $ Month                  <list> ["March", "March", "March", "March", "...
#> $ SubstrateModifierCode  <list> ["SBM22022011171457583", "SBM220220111...
#> $ Year                   <list> ["2016", "2016", "2016", "2016", "2016...
#> $ CameraSide             <list> ["Left", "Left", "Left", "Left", "Left...
#> $ SiteCode               <list> ["SBY-GUZ-WG-ULS", "SBY-GUZ-WG-ULS", "...
#> $ ReplicateCode          <list> ["SBY-GUZ-WG-ULS-T1", "SBY-GUZ-WG-ULS-...

Learn more

See the vignette for in-depth examples of authenticating, transforming, analysing and visualising BioSys data. (Note: work in progress)

vignette("biosysR")

Contribute

Every contribution, constructive feedback, or suggestion is welcome!

Send us your ideas and requests as issues or submit a pull request.

Pull requests should eventually pass tests and checks (not introducing new ERRORs, WARNINGs or NOTEs apart from the "New CRAN package" NOTE):

devtools::document()
devtools::test()
pkgdown::build_site()
devtools::check(check_version = TRUE, force_suggests = TRUE, cran = TRUE)

Code coverage is automatically calculated and reported from TravisCI. To manually submit code coverage reports, run:

Sys.setenv(CODECOV_TOKEN=Sys.getenv("BIOSYS_CODECOV_TOKEN"))
covr::codecov()
