Added new vignette, colors #51
Added new vignette to the vignettes, and the file colors.Rda in the data folder, which is used for the color palettes
Current coverage is 0.00% (diff: 100%)

| | master | #51 | diff |
|---|---|---|---|
| Files | 19 | 19 | |
| Lines | 689 | 690 | +1 |
| Methods | 0 | 0 | |
| Messages | 0 | 0 | |
| Branches | 0 | 0 | |
| Hits | 0 | 0 | |
| Misses | 689 | 690 | +1 |
| Partials | 0 | 0 | |
nice work!
Can we add a small section on when to use griddap vs. tabledap?
```{r}
library("rerddap")
load("colors.rda")
change this to data(colors)
as it gets loaded on package load
Besides `rerddap` the following libraries are used in this vignette:

```{r, warning = FALSE}
library("akima")
if possible limit the package dependencies here since these all have to be put in `Suggests` in the rerddap `DESCRIPTION` file, and then our Travis builds need all those, etc.
Fine to leave any that you absolutely need though
MUR (Multi-scale Ultra-high Resolution) is an analyzed SST product at 0.01-degree resolution going back to 2002, providing one of the longest satellite based time series at such high resolution. We extract the latest data available for a region off the west coast.

```{r MUR, fig.width = 6, fig.height = 3, fig.align = 'center', warning = FALSE}
I would include `cache = TRUE` here so this chunk doesn't take too long
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{rerddapVignette}
Change the `VignetteIndexEntry` to something a little more informative, the one for the other vignette is rerddap introduction
```{r}
require("ncdf4")
sstFile <- nc_open('MWsstd1day.nc')
this file is missing, so this part fails on vignette build
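One way to avoid this kind of build failure is to ship the sample file with the package and resolve its path with `system.file()` instead of relying on the working directory. The sketch below is only an illustration: it assumes the file ends up under `inst/extdata` of `rerddap`, and `find_sample_file()` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: resolve a file shipped under inst/extdata of a package.
# system.file() returns "" when nothing is found, so fail early with a clear
# message instead of letting nc_open() fail later in the vignette build.
find_sample_file <- function(name, package) {
  path <- system.file("extdata", name, package = package)
  if (!nzchar(path)) {
    stop("sample file '", name, "' not found in package '", package, "'")
  }
  path
}

# Assumed usage in the vignette (needs the file to be installed with rerddap):
# sstFile <- ncdf4::nc_open(find_sample_file("MWsstd1day.nc", "rerddap"))
```

The early `stop()` makes a missing file show up with an explicit message rather than as an opaque error from the netCDF reader.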
`rerddap` is a general purpose <span style="color:blue">R</span> client for working with <span style="color:blue">ERDDAP</span> servers. <span style="color:blue">ERDDAP</span> is a web service developed by Bob Simons of NOAA. At the time of this writing, there are over fifty <span style="color:blue">ERDDAP</span> servers (though not all are public facing) providing access to literally petabytes of data and model output relevant to oceanography, meteorology, fisheries and marine mammals, among other areas. <span style="color:blue">ERDDAP</span> is a simple to use, RESTful web service, that allows data to be subsetted and returned in a variety of formats.

In this vignette we go over some of the nuts and bolts of using the `rerddap` package, and show the power of the combination of the `rerddap` package with <span style="color:blue">ERDDAP</span> servers. Some of the examples are taken from the `xtractomatic` package (available from CRAN), and some from the `rerddapXtracto` package available on Github, but reworked to use `rerddap` directly. Other examples are new to this vignette, and include both gridded and non-gridded datasets from several <span style="color:blue">ERDDAPs</span>.
include links to xtractomatic and rerddapXtracto for easy finding
library("parsedate")
library("plot3D")
library("xts")
```
There's a bunch of messages, etc. when these pkgs load, e.g.,
library("akima")
library("dplyr")
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library("ggfortify")
maybe duplicate this block of code, and with one of them just set to `eval=FALSE` so it doesn't run, but is shown to reader, and with the other one, add `warn.conflicts = FALSE` and `quietly = TRUE` (or maybe just `quietly = TRUE`)
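A minimal sketch of the quiet-loading half of this suggestion follows. The packages `stats` and `tools` are stand-ins chosen only so the snippet runs on a plain R installation; in the vignette the vector would hold the packages listed above, and the chunk options (`eval = FALSE` on the displayed copy, `echo = FALSE` on the one that runs) are assumptions about how the two chunks would be set up.

```r
# Stand-in package names; the vignette would list its own packages here.
pkgs <- c("stats", "tools")

# Attach each package without start-up chatter or masking notices.
for (pkg in pkgs) {
  library(pkg, character.only = TRUE, warn.conflicts = FALSE, quietly = TRUE)
}
```

`character.only = TRUE` is needed because the package name is held in a variable rather than typed literally.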
## Introduction

`rerddap` is a general purpose <span style="color:blue">R</span> client for working with <span style="color:blue">ERDDAP</span> servers. <span style="color:blue">ERDDAP</span> is a web service developed by Bob Simons of NOAA. At the time of this writing, there are over fifty <span style="color:blue">ERDDAP</span> servers (though not all are public facing) providing access to literally petabytes of data and model output relevant to oceanography, meteorology, fisheries and marine mammals, among other areas. <span style="color:blue">ERDDAP</span> is a simple to use, RESTful web service, that allows data to be subsetted and returned in a variety of formats.
I like the blue styling for ERDDAP text, but it does look like a link since the actual links are the same color. I wonder if you could use a different color, or bold or italicize?
## Cacheing, "last", "now", idempotency, and a gotcha

`rerddap` by default caches the requests you make, so that if you happen to make the same request again, the data is restored from the cache, rather than having to go out and retrieve it remotely. For most applications, this is a boon (such as when "knitting" and "reknitting" this document), as it can speed things up when doing a lot of requests in a script, and works because in most cases an <span style="color:blue">ERDDAP</span> request is "idempotent". This means that the request will always return the same thing no matter what requests came before - it doesn't depend on state. However this is not true if the script uses either "last" in `griddap()` or "now" in `tabledap()` as these will return different values as time elapses and data are added to the datasets. While it is desirable to have <span style="color:blue">ERDDAP</span> purely idempotent, the "last" and "now" constructs are very helpful for people using <span style="color:blue">ERDDAP</span> in dashboards, webpages, regular input to models and the like, and the benefits far outweigh the problems. However, if you are using either "last" or "now" in an `rerddap` based script, you want to be very careful to clear the `rerddap` cache, otherwise the request will be viewed as the same, and the data from the last request, rather than the latest data, will be returned. Note that several examples in this vignette use "last", and therefore the graphics may look different depending on when you "knitted" the vignette.
We should probably add something to the documentation about this, will open an issue
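For whoever writes that documentation, the gotcha can be shown with a toy cache in base R. Nothing below is `rerddap` code: `fake_server()`, `cached_request()` and `clear_cache()` are hypothetical names, and the "server" simply returns the current date for any request containing "last". It is a sketch of the staleness problem only.

```r
# Toy request cache keyed on the request text (not rerddap's real cache).
cache <- new.env()

# Hypothetical server: the answer to a "last" request depends on `today`.
fake_server <- function(request, today) {
  if (grepl("last", request, fixed = TRUE)) today else request
}

# Return the cached answer if this exact request text was seen before.
cached_request <- function(request, today) {
  if (!exists(request, envir = cache, inherits = FALSE)) {
    assign(request, fake_server(request, today), envir = cache)
  }
  get(request, envir = cache, inherits = FALSE)
}

# Empty the toy cache.
clear_cache <- function() {
  rm(list = ls(cache, all.names = TRUE), envir = cache)
}

first <- cached_request("sst[(last)]", today = as.Date("2017-01-03"))
stale <- cached_request("sst[(last)]", today = as.Date("2017-01-04"))
clear_cache()
fresh <- cached_request("sst[(last)]", today = as.Date("2017-01-04"))
```

Here `stale` still holds the first day's answer because the request text is unchanged, while `fresh` reflects the later date once the cache has been emptied. In `rerddap` itself the equivalent step is, as far as I know, a call such as `cache_delete_all()`; check the documentation of the installed version before relying on that name.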
Yes, but tell me the proper way to do so based on what you sent, and how I re-sync everything. Is it another pull request once I have made the changes? I have only used git for my own projects, and use the GitHub Desktop app, so I need a little coaching as to how to do this process properly. Do these changes appear already somewhere, and I just rsync upstream?
Thanks,
-Roy
On Jan 3, 2017, at 5:35 PM, Scott Chamberlain ***@***.***> wrote:
@sckott requested changes on this pull request.
nice work!
Can we add a small section on when to use griddap vs. tabledap?
In vignettes/Using_rerddap.Rmd:
> +
+```{r, eval = FALSE}
+install.packages("rerddap")
+```
+
+or the development version can be installed from GitHub:
+
+```{r, eval = FALSE}
+devtools::install_github("ropensci/rerddap")
+```
+
+and to load the library:
+
+```{r}
+library("rerddap")
+load("colors.rda")
change this to data(colors)
In vignettes/Using_rerddap.Rmd:
> +
+```{r, eval = FALSE}
+install.packages("rerddap")
+```
+
+or the development version can be installed from GitHub:
+
+```{r, eval = FALSE}
+devtools::install_github("ropensci/rerddap")
+```
+
+and to load the library:
+
+```{r}
+library("rerddap")
+load("colors.rda")
as it gets loaded on package load
In vignettes/Using_rerddap.Rmd:
> +
+```{r, eval = FALSE}
+devtools::install_github("ropensci/rerddap")
+```
+
+and to load the library:
+
+```{r}
+library("rerddap")
+load("colors.rda")
+```
+
+Besides `rerddap` the following libraries are used in this vignette:
+
+```{r, warning = FALSE}
+library("akima")
if possible limit the package dependencies here since these all have to be put in Suggests in the rerddap DESCRIPTION file, and then our Travis builds need all those, etc.
Fine to leave any that you absolutely need though
In vignettes/Using_rerddap.Rmd:
> +100*100*4*8*365
+```
+
+
+Yes, 116,800,000 bytes or roughly 115MB for that request. Moreover the user wanted the data as a .csv file, which usually makes the resulting file 8-10 times larger, so now we are over 1GB for the request. Even more so, there are four parameters in that dataset, and in `rerddap::griddap()` if "fields" is not specified, all fields are downloaded, therefore the resulting files will be four times as large as given above.
+
+So the gist of this is to think about your request before you make it. Do a little mental math to get a rough estimate of the size of the download. There are times receiving the data as a .csv file is convenient, but make certain the request will not be too large. For larger requests, obtain the data as netCDF files. By default, `rerddap::griddap()` "melts" the data into a dataframe, so a .csv only provides a small convenience. But for really large downloads, you should select the option in `rerddap::griddap()` to not read in the data, and use instead the `ncdf4` package to read in the data, as this allows for only reading in parts of the data at a time. [Below](#ncdf4) we provide a brief tutorial on reading in data using the `ncdf4` package.
+
+
+## griddap
+
+### MUR SST
+
+MUR (Multi-scale Ultra-high Resolution) is an analyzed SST product at 0.01-degree resolution going back to 2002, providing one of the longest satellite based time series at such high resolution. We extract the latest data available for a region off the west coast.
+
+```{r MUR, fig.width = 6, fig.height = 3, fig.align = 'center', warning = FALSE}
I would include cache = TRUE here so this chunk doesn't take too long
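The "mental math" in the hunk quoted above can also be written as a small helper. This is a sketch under assumptions: `estimate_bytes()` is a hypothetical name, the factor 8 is taken to be bytes per value, and the factor 4 in `100*100*4*8*365` is treated as one more dimension length because the hunk does not say what it stands for.

```r
# Rough size of a gridded request: product of the dimension lengths,
# times bytes per value, times the number of fields downloaded.
estimate_bytes <- function(dim_lengths, bytes_per_value = 8, n_fields = 1) {
  prod(dim_lengths) * bytes_per_value * n_fields
}

# The request from the quoted text, for a single field.
one_field <- estimate_bytes(c(100, 100, 4, 365))

# The quoted text says .csv output is usually 8-10 times larger.
csv_range <- one_field * c(8, 10)

# All four parameters of the dataset, if "fields" is left unspecified.
all_fields <- estimate_bytes(c(100, 100, 4, 365), n_fields = 4)
```

Running an estimate like this before calling `griddap()` gives a quick sense of whether to ask for netCDF rather than .csv.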
In vignettes/Using_rerddap.Rmd:
> @@ -0,0 +1,671 @@
+---
+title: "Using rerddap to Access Data from ERDDAP Servers"
+author: "Roy Mendelssohn and Scott Chamberlain"
+date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+ %\VignetteIndexEntry{rerddapVignette}
Change the VignetteIndexEntry to something a little more informative, the one for the other vignette is rerddap introduction
In vignettes/Using_rerddap.Rmd:
> +
+## Reading data from a netCDF file. {#ncdf4}
+
+Here we give a brief summary of how to read in part of the data from a netCDF file. The basic steps are:
+
+* Open the netCDF file
+* Map coordinate values to array indices
+* Extract the data
+
+A sample netCDF file, "MWsstd1day.nc" is included. This is a small file and is a toy example, but the basic principles remain the same for a larger file.
+
+Open the netCDF file:
+
+```{r}
+require("ncdf4")
+sstFile <- nc_open('MWsstd1day.nc')
this file is missing, so this part fails on vignette build
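The middle step of the list quoted above, mapping coordinate values to array indices, can be illustrated without any file at all. In this sketch the coordinate vectors and the array are made-up stand-ins for what would be read from a netCDF file, and `range_to_index()` is a hypothetical helper; only base R is used.

```r
# Made-up coordinates and data (stand-ins for values read from a netCDF file);
# the array dimensions are longitude x latitude.
lon <- seq(230, 240, by = 0.5)
lat <- seq(30, 40, by = 0.5)
sst <- outer(seq_along(lon), seq_along(lat),
             function(i, j) 10 + 0.1 * i + 0.2 * j)

# Map a coordinate range to the array indices that fall inside it.
range_to_index <- function(coord, lower, upper) {
  which(coord >= lower & coord <= upper)
}

lon_idx <- range_to_index(lon, 232, 234)
lat_idx <- range_to_index(lat, 35, 36)

# Extract only the requested part of the array.
sst_subset <- sst[lon_idx, lat_idx, drop = FALSE]
```

With a real file, the first index and the length of each index vector would typically become the `start` and `count` arguments of `ncdf4::ncvar_get()`, so that only that part of the variable is read from disk.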
In vignettes/Using_rerddap.Rmd:
> +vignette: >
+ %\VignetteIndexEntry{rerddapVignette}
+ %\VignetteEngine{knitr::rmarkdown}
+ \usepackage[utf8]{inputenc}
+---
+```{r initialize, echo = FALSE}
+knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
+library(rerddap)
+```
+---
+
+## Introduction
+
+`rerddap` is a general purpose <span style="color:blue">R</span> client for working with <span style="color:blue">ERDDAP</span> servers. <span style="color:blue">ERDDAP</span> is a web service developed by Bob Simons of NOAA. At the time of this writing, there are over fifty <span style="color:blue">ERDDAP</span> servers (though not all are public facing) providing access to literally petabytes of data and model output relevant to oceanography, meteorology, fisheries and marine mammals, among other areas. <span style="color:blue">ERDDAP</span> is a simple to use, RESTful web service, that allows data to be subsetted and returned in a variety of formats.
+
+In this vignette we go over some of the nuts and bolts of using the `rerddap` package, and show the power of the combination of the `rerddap` package with <span style="color:blue">ERDDAP</span> servers. Some of the examples are taken from the `xtractomatic` package (available from CRAN), and some from the `rerddapXtracto` package available on Github, but reworked to use `rerddap` directly. Other examples are new to this vignette, and include both gridded and non-gridded datasets from several <span style="color:blue">ERDDAPs</span>.
include links to xtractomatic and rerddapXtracto for easy finding
In vignettes/Using_rerddap.Rmd:
> +```
+
+Besides `rerddap` the following libraries are used in this vignette:
+
+```{r, warning = FALSE}
+library("akima")
+library("dplyr")
+library("ggfortify")
+library("ggplot2")
+library("lubridate")
+library("mapdata")
+library("ncdf4")
+library("parsedate")
+library("plot3D")
+library("xts")
+```
There's a bunch of messages, etc. when these pkgs load, e.g.,
library("akima")
library("dplyr")
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library("ggfortify")
maybe duplicate this block of code, and with one of them just set to eval=FALSE so it doesn't run, but is shown to reader, and with the other one, add warn.conflicts = FALSE and quietly = TRUE (or maybe just quietly = TRUE)
In vignettes/Using_rerddap.Rmd:
> +date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+ %\VignetteIndexEntry{rerddapVignette}
+ %\VignetteEngine{knitr::rmarkdown}
+ \usepackage[utf8]{inputenc}
+---
+```{r initialize, echo = FALSE}
+knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
+library(rerddap)
+```
+---
+
+## Introduction
+
+`rerddap` is a general purpose <span style="color:blue">R</span> client for working with <span style="color:blue">ERDDAP</span> servers. <span style="color:blue">ERDDAP</span> is a web service developed by Bob Simons of NOAA. At the time of this writing, there are over fifty <span style="color:blue">ERDDAP</span> servers (though not all are public facing) providing access to literally petabytes of data and model output relevant to oceanography, meteorology, fisheries and marine mammals, among other areas. <span style="color:blue">ERDDAP</span> is a simple to use, RESTful web service, that allows data to be subsetted and returned in a variety of formats.
I like the blue styling for ERDDAP text, but it does look like a link since the actual links are the same color. I wonder if you could use a different color, or bold or italicize?
In vignettes/Using_rerddap.Rmd:
> +urlBase <- "https://data.ioos.us/gliders/erddap/"
+gliderInfo <- info("sp064-20161214T1913", url = urlBase)
+glider <- tabledap(gliderInfo, fields = c("longitude", "latitude", "depth", "salinity"), 'time>=2016-12-14', 'time<=2016-12-23', url = urlBase)
+glider$longitude <- as.numeric(glider$longitude)
+glider$latitude <- as.numeric(glider$latitude)
+glider$depth <- as.numeric(glider$depth)
+scatter3D(x = glider$longitude , y = glider$latitude , z = -glider$depth, colvar = glider$salinity, col = colors$salinity, phi = 40, theta = 25, bty = "g", type = "p",
+ ticktype = "detailed", pch = 10, clim = c(33.2,34.31), clab = 'Salinity',
+ xlab = "longitude", ylab = "latitude", zlab = "depth",
+ cex = c(0.5, 1, 1.5))
+```
+
+
+## Cacheing, "last", "now", idempotency, and a gotcha
+
+`rerddap` by default caches the requests you make, so that if you happen to make the same request again, the data is restored from the cache, rather than having to go out and retrieve it remotely. For most applications, this is a boon (such as when "knitting" and "reknitting" this document), as it can speed things up when doing a lot of requests in a script, and works because in most cases an <span style="color:blue">ERDDAP</span> request is "idempotent". This means that the request will always return the same thing no matter what requests came before - it doesn't depend on state. However this is not true if the script uses either "last" in `griddap()` or "now" in `tabledap()` as these will return different values as time elapses and data are added to the datasets. While it is desirable to have <span style="color:blue">ERDDAP</span> purely idempotent, the "last" and "now" constructs are very helpful for people using <span style="color:blue">ERDDAP</span> in dashboards, webpages, regular input to models and the like, and the benefits far outweigh the problems. However, if you are using either "last" or "now" in an `rerddap` based script, you want to be very careful to clear the `rerddap` cache, otherwise the request will be viewed as the same, and the data from the last request, rather than the latest data, will be returned. Note that several examples in this vignette use "last", and therefore the graphics may look different depending on when you "knitted" the vignette.
We should probably add something to the documentation about this, will open an issue
**********************
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: Roy.Mendelssohn@noaa.gov www: http://www.pfeg.noaa.gov/
"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
@rmendels you can just add changes to your version of rerddap on GitHub and they will be added here automatically. If you had made a different branch off
All requested edits done. Number of external packages used sharply reduced. Several new examples included. Example netCDF file included in /inst/extdata, and successfully used in vignette. Info file for colors.rda included (colors.R). A lot of minor edits.
Bugs in SODA and IMI titles fixed. References to the original sources of the data for many of the datasets improved.