Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Submission: noaastnr (R) #29

Open
12 of 27 tasks
chenzhao299 opened this issue Mar 19, 2021 · 4 comments
Open
12 of 27 tasks

Submission: noaastnr (R) #29

chenzhao299 opened this issue Mar 19, 2021 · 4 comments

Comments

@chenzhao299
Copy link

chenzhao299 commented Mar 19, 2021


Submitting Author: Chen Zhao (@chenzhao2020)
Other Authors: Chirag Rank (@chiragrank), Steffen Pentelow (@spentelow)
Repository: noaastnr
Version submitted: 0.3.1
Editor: Tiffany Timbers
Reviewers: TBD

Archive: TBD
Version accepted: TBD


  • Paste the full DESCRIPTION file inside a code block below:
Package: noaastnr
Title: R Package that Downloads, Processes and Visualizes Weather Data from NOAA Website
Version: 0.3.1
Authors@R: 
    c(person(given = "Chen",
           family = "Zhao",
           role = c("aut", "cre"),
           email = "cz2020@student.ubc.ca"),
      person(given = "Chirag",
           family = "Rank",
           role = "aut"),
      person(given = "Steffen",
           family = "Pentelow",
           role = "aut"))
Description: The US National Oceanic and Atmospheric Administration (NOAA) collects and provides access to weather data from land-based weather stations within the US and around the world. One method for accessing these data is through a publically accessible FTP site.  This package allows users to easily download data from a given station for a given year, extract several key weather parameters from the raw data files, and visualize the variation in these parameters over time. 
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.1
Suggests:
    testthat (>= 3.0.0),
    covr,
    knitr,
    rmarkdown
Config/testthat/edition: 3
URL: https://github.com/UBC-MDS/noaastnr
BugReports: https://github.com/UBC-MDS/noaastnr/issues
Imports: 
    curl,
    stringr,
    lubridate,
    dplyr,
    tibble,
    ggplot2,
    tidyr,
    rlang,
    readr,
    utils
VignetteBuilder: knitr

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

This package allows users to download and retrieve data from the US National Oceanic and Atmospheric Administration (NOAA) website, carries out data mungling by extracting and cleaning weather parameters, and visualizes the variation in these parameters over time.

  • Who is the target audience and what are scientific applications of this package?

The target audience would be anyone who is interested in learning about and analyzing weather data. This package could benefit scientific projects that need an tool to obtain historical weather data from the NOAA's FTP site, and process and visualize key weather parameters.

The package rnoaa provides extensive functionality for interfacing with the NOAA's data systems including their APIs and FTP service. The noaastnr package is a lightweight alternative to the rnoaa package which provides access to a specific subset of NOAA data, namely, their Integrated Surface Database (historical global hourly weather data) through their FTP service.

Tiffany Timbers (@ttimbers)

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@dbandrews
Copy link

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • We are in the same year of the MDS program at UBC.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples (that run successfully locally) for all exported functions
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 3

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Hi Chirag, Steffen and Chen - great work on this package! I really like the abstraction and quick introduction to accessing weather observations from all over the world. I can definitely see myself using this for some personal projects with weather related data.

I found the package structure was spot on - testing coverage looks really thorough as well.

Functionality

I haven't checked the Functionality box due to one small issue with using the time_basis="daily" in plot_weather_data. The issue can be reproduced with nearly the example given in the vignette:

library(noaastnr)
weather_data <- get_weather_data("911650-22536", 2020)
plot_weather_data(weather_data, col_name="air_temp", time_basis = "daily")

Which produces:

Error in as.Date.default(date) : 
  do not know how to convert 'date' to class “Date”

I believe the solution should just be changing month to date on this line

I think the tests are missing this because they aren't rendering the objects which would cause ggplot2 to evaluate the objects mapped in the graph. You could look at using vdiffr to do visual regression tests to catch this plus other errors that may come up - I find it not too painful to use and it locks in desired visuals pretty well. Send me a message if you want more info.

Code suggestions

Otherwise I noticed a few things that could potentially be tweaked to speed things up / reduce duplication if you wanted.

  • get_weather_data: Instead of looping through the lines in the data that is returned, you can cast to tibble and then use a dplyr::mutate statement with vectorized stringr statements to speed this up. From quick testing I think time for one call goes from ~ 2.5 minutes -> 5 seconds for the example weather station (I'll open a PR for this addition to show what I mean).
    image

  • plot_weather_data: Instead of duplicating all the plotting code except for the x-axis (either monthly or daily) and y-axis (temp, pressure etc), you could have one plotting function that takes both as arguments. This should reduce copy/paste mistakes when adding more options - I find I always make a small mistake!

Typos/Documentation

There are just a few typos in your README - The "Dependancies" section header, and "publically" in your opening paragraph. In your vignette, you also reference the plot_weather_data function returning an Altair object.


Thanks for making this package! I think the open source weather/geospatial space is really exciting and this package is a welcome addition to quickly grab data from nearby stations. The links to NOAA's weather station GIS portal is a welcome addition as well and helped me poke around what is available. Let me know if you have any questions or thoughts -

Dustin

@kmoravej
Copy link

kmoravej commented Mar 24, 2021

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
  • I am in the same MDS cohort as the authors of noaastnr package.

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples (that run successfully locally) for all exported functions
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Please refer to the comments.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 3

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Hi Noaastnr team,

Congratulation! You have come up with a great idea and very well structured package. I'm sure it would useful for so many people outside of the MDS. Your code is very easy to follow and readable, Great Job. I have couple of suggestions for you that you may consider:

  • The get_weather_data() function is quite slow. It would be good to vectorize your code to speed it up.
  • I was not able to generate the daily plot using the plot_weather_data() function. I went through your code and it seems you have not defined date column. You can fix it by changing the line 254 in your noaastnr.R to:
  df <-
      dplyr::group_by(df, date = lubridate::floor_date(datetime, "day"))

daily

  • It would be good to mention your names in the License files as well instead of your group number.

  • It would be nice to briefly describe the inputs and outputs of your functions in your Roxygens.

  • In the following part of your documentation, I believe the Altair should be ggplot object.

The function returns an Altair chart object which can be saved or displayed in any environment which can render Altair objects.
  • Your plot_weather_data() can be slightly drier by making a sepeare function for plotting. I believe it makes your code even more easier to follow.

Great Job! I really enjoyed going through your package and I have learned something new as well.
All the best,
Kamal

@chenzhao299
Copy link
Author

Hello @dbandrews and @kmoravej , thank you so much for your comments, those are really helpful. Our team has discussed all the issues you mentioned, and we are going to address them as best as we can. Will keep you updated about our progress. Thanks again for your valuable feedbacks and help!

@chenzhao299
Copy link
Author

Thank you @dbandrews and @kmoravej for the helpful reviews! We have addressed the changes based on your suggestions in the latest pull requests. The details are as follows:

  • Dustin:

    • Fix issue with using time_basis="daily" in `plot_weather_data" function
    • Simplified the repeated codes in plot_weather_data function
    • Fixed readme typo ('publically' to 'publicly')
    • Fixed the output of plot_weather_data function as ggplot object
  • Kamal:

    • Fix issue with using time_basis="daily" in `plot_weather_data" function
    • Added group member names explicitly to License file
    • Fixed the output of plot_weather_data function as ggplot object
    • Simplified the repeated codes in plot_weather_data function

Other fixes:

  • Changed regex in assert statement in get_weather_data() function to allow station numbers that begin with a letter

Please feel free to check them out. Let us know if you have any more suggestions or advices. Thank you very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants