-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing readGWLdata to work with new data #10
Conversation
- downloads well data from BC gov website - match formatting to previous version
- Alphebatize - Add tidyr - Start directly referring to imports explicity (i.e. dplyr::mutate) - Import "%>%" for entire package
- Alphebatize - Add tidyr - Start directly referring to imports explicity (i.e. dplyr::mutate) - Import "%>%" for entire package
… (otherwise failed on check)
- Add rmarkdown to Suggests for vignettes - Update contents to use get_gwl() - Add details for `which` argument
This pull does the following:
|
Also forgot to mention: The only thing missing from the downloaded/formatted data is two additional metadata fields: EMS_ID and Station_Name. Right now they're filled with NA, and I'm not sure where to get the information from without web-scraping the well details (for example, but even this one doesn't have an EMS ID...). |
Thanks @steffilazerte! I'll try and have a look today or tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @steffilazerte - this looks really great! The tests and the working vignette are fantastic!
Just a couple of small points which are open for discussion (comments inline).
The only other thing is that I generally try to avoid pipes in package functions because it makes debugging really difficult - I know the package already had pipes in it, just wondering what you think?
R/readGWLdata.R
Outdated
|
||
well_avg <- utils::read.csv(text = data[[2]], stringsAsFactors = FALSE) | ||
well_avg <- well_avg[, names(well_avg)[names(well_avg) != "year"]] | ||
well_avg <- tidyr::spread(well_avg, "type", "Value") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know they were in the original, but I'm not sure we want the historical mean/max/min attached to every data frame... what about a fourth which
option ("hist_avg"
or something) that allows downloading just that on its own, and leave it up to the user to join them if they need to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about an addition argument, hist_avg = FALSE
? I've already coded it in so this way we can just wrap the code that downloads and joins in if(hist_avg) {...}
. We can make the argument FALSE by default to minimize unnecessary downloading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great idea!
by = "dummydate") | ||
|
||
################################ | ||
# Need station name/location meta information! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is a source for well metadata coming in the next couple of weeks, if you're able to hold off then we can add that then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, no hurry. We could still push to master, if you think people may want to use it without the meta data. Or we can wait if you think it could get confusing.
gwlZypTest <- function(dataframe, wells=NULL, byID, col, method="both") { | ||
#' @export | ||
|
||
gwlZypTest <- function(dataframe, wells = NULL, byID, col, method = "both") { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check the existence of columns named in byID
and col
. I will open an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#12 - probably should look across the board to make sure it's done consistently (for later)
dplyr::group_by(.data$EMS_ID, .data$Well_Num, | ||
Year = lubridate::year(.data$Date), | ||
Month = lubridate::month(.data$Date)) %>% | ||
dplyr::mutate(Date = dplyr::case_when(length(.data$Well_Num) < 5 ~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
R/readGWLdata.R
Outdated
if (!inherits(path, "textConnection") && !file.exists(path)) { | ||
stop(paste0("The file ", path, "does not exist.")) | ||
} | ||
if(!(which %in% c("all", "recent", "daily"))) stop("type must be either 'all', 'recent', or 'daily'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about using which = c("all", "recent", "daily")
in the argument list, and then which <- match.arg(which)
to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to admit it never occurred to me, but yes, that would totally work. The only small thing I don't like is that the error message is:
Error in match.arg(which) :
'arg' should be one of “all”, “recent”, “daily”
Which, to an R newbie, isn't really a clear error message.
But either way, 'type' should be 'which' in the original!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope you don't mind, but this inspired me to write a function arg_match
that does a better job of this. I implemented it for get_gwl
in 65acc50
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely don't mind, that's a great function! I like it much better than the original.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll make use of it in a lot of other places!
R/readGWLdata.R
Outdated
httr::stop_for_status(gwl_data) | ||
gwl_data <- httr::content(gwl_data, as = "text", encoding = "UTF-8") | ||
|
||
gwl_avg <- httr::GET(paste0(url, "minMaxMean.csv")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See question below for whether we should always get the minMaxMean...
|
||
#' Retrive and format groundwater data from BC Government GWL site | ||
#' | ||
#' Go to <http://www.env.gov.bc.ca/wsd/data_searches/obswell/map/> to find your |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice documentation
DESCRIPTION
Outdated
ggplot2 (>= 2.0.0), | ||
ggmap (>= 2.6.1), | ||
httr (>= 1.3.1), | ||
lubridate (>= 1.5.0), | ||
rgdal (>= 1.1-3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking we should move rgdal and sp to Suggests as they're only used for the one function (utm_dd
). Then at the top of the body of utm_dd
use:
if (!requireNamespace("rgdal") || !requireNamespace("sp")) {
stop("You need the sp and rgdal packages installed to use this function")
}
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, especially as rgdal and sp aren't lightweight packages
Commit 225c64e was odd. In order to keep |
R/utm_dd.R
Outdated
utm <- SpatialPoints(d[2:3], proj4string=CRS(paste0("+proj=utm +datum=", d[4], " +zone=", d[1]))) | ||
sp <- spTransform(utm, CRS("+proj=longlat")) | ||
utm <- sp::SpatialPoints(d[2:3], | ||
proj4string = CRS(paste0("+proj=utm +datum=", d[4], " +zone=", d[1]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need sp::CRS
here and sp::coordinates
below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
No description provided.