Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable ci and make stuff reproduce as per #2 #8

Merged
merged 23 commits into from
May 7, 2020
Merged

enable ci and make stuff reproduce as per #2 #8

merged 23 commits into from
May 7, 2020

Conversation

maxheld83
Copy link
Contributor

@maxheld83 maxheld83 commented Apr 29, 2020

Sorry @njahn82, this is not ready for prime time yet as per #2.
But we do technically have CI running and in place (if failing), so it's correct to write that up in the vignette 😏.

It's not so much the CI, per se, but there are still a bunch of things that need to be resolved to make this reproducible and elegant, including #7 #6 #5 #4.

@maxheld83 maxheld83 marked this pull request as draft April 29, 2020 23:01
@maxheld83 maxheld83 self-assigned this Apr 29, 2020
@njahn82
Copy link
Collaborator

njahn82 commented Apr 30, 2020

No worries, will start fixing the remaining issues.

@maxheld83
Copy link
Contributor Author

@njahn82 the version of the website from this branch is now live at https://subugoe.github.io/openairegraph/refs/heads/ci/

Comment on lines +1 to +106

Because hoad is not only a research project, but primarily a data product meant to be used by many, hoad is organised as an R package.

Please adhere by the package development best practices laid out in:

- Hadley Wickham's [R Packages](https://r-pkgs.org/index.html)
- [rOpenSci Development Guide](https://devguide.ropensci.org/)


### Documentation

Individual functions are documented using [{roxygen2}](https://roxygen2.r-lib.org).
Functions can be further organised using curated reference index in [{pkgdown}](https://pkgdown.r-lib.org/).

Scientific articles, blog posts or any other long-form documentation is written as a [{pkgdown}](https://pkgdown.r-lib.org/) *article* inside the packages `/vignettes/` folder.
These articles are written as `bookdown::html_document2()` documents *not* as proper R vignettes, to avoid duplicate publication and to allow more expensive computations.


### Shiny Application

tba.


### Computing Environments

There are two sets of computing environments on which this project is guaranteed to work as per the CI setup:

1. The {hoad} package, including all its functions, is guaranteed to work (pass `R CMD check`) on a wide variety of computing environments as per the CI setup, a subset of the [tidyverse setup](https://github.com/r-lib/actions/tree/master/examples#quickstart-ci-workflow) and [r-hub linux builders](https://github.com/r-hub/rhub-linux-builders).
Because these tests can be quite extensive, they are only run on commits to `master`, as well as pull requests to `master`.
Other branches are thus not guaranteed to work.
2. All other elements of this projects can be reproduced using a custom docker image in the `Dockerfile`:
- the [{pkgdown}](http://pkgdown.r-lib.org) website for hoad (https://subugoe.github.io/hoad/), including its articles (~ vignettes)
- the shiny web application
- (batch) jobs to reproduce more expensive analyses

This image is also published on every commit to [docker hub](https://hub.docker.com/repository/docker/subugoe/hoad), tagged by git sha and reference (branch, release).

This image also includes the RStudio IDE, so you can also develop *inside* this image.

To run this image, navigate your shell to the root of the repository on your machine and run:

```
docker run --env DISABLE_AUTH=true \
--publish 8787:8787 \
--volume $(pwd):/home/rstudio/Documents
subugoe/hoad:refactor-as-pkg
```

This will automatically download the build image from Docker Hub.
You can also rebuild it locally by running

```
docker build --tag hoad .
```

## Fixing typos

You can fix typos, spelling mistakes, or grammatical errors in the documentation directly using the GitHub web interface, as long as the changes are made in the _source_ file.
This generally means you'll need to edit [roxygen2 comments](https://roxygen2.r-lib.org/articles/roxygen2.html) in an `.R`, not a `.Rd` file.
You can find the `.R` file that generates the `.Rd` by reading the comment in the first line.

## Bigger changes

If you want to make a bigger change, it's a good idea to first file an issue and make sure someone from the team agrees that it’s needed.
If you’ve found a bug, please file an issue that illustrates the bug with a minimal
[reprex](https://www.tidyverse.org/help/#reprex) (this will also help you write a unit test, if needed).


### Pull request process

This requires kno

* Fork the package and clone onto your computer. If you haven't done this before, we recommend using `usethis::create_from_github("subugoe/hoad", fork = TRUE)`.

* Install all development dependences with `devtools::install_dev_deps()`, and then make sure the package passes R CMD check by running `devtools::check()`.
If R CMD check doesn't pass cleanly, it's a good idea to ask for help before continuing.
* Create a Git branch for your pull request (PR). We recommend using `usethis::pr_init("brief-description-of-change")`.

* Make your changes, commit to git, and then create a PR by running `usethis::pr_push()`, and following the prompts in your browser.
The title of your PR should briefly describe the change.
The body of your PR should contain `Fixes #issue-number`.

* For user-facing changes, add a bullet to the top of `NEWS.md` (i.e. just below the first header). Follow the style described in <https://style.tidyverse.org/news.html>.

### Code style

* New code should follow the tidyverse [style guide](https://style.tidyverse.org).
You can use the [styler](https://CRAN.R-project.org/package=styler) package to apply these styles, but please don't restyle code that has nothing to do with your PR.

* We use [roxygen2](https://cran.r-project.org/package=roxygen2), with [Markdown syntax](https://cran.r-project.org/web/packages/roxygen2/vignettes/markdown.html), for documentation.

* We use [testthat](https://cran.r-project.org/package=testthat) for unit tests.
Contributions with test cases included are easier to accept.


## Code of Conduct

Please note that the hoad project is released with a
[Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this
project you agree to abide by its terms.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the canonical place to store such meta files is now in .github/ -- {pkgdown} will automatically find them there

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contributing guidelines seems to refer to {hoad}

Comment on lines -22 to -25
#' @importFrom progress progress_bar
#' @importFrom purrr pmap
#' @importFrom base64enc base64decode
#' @importFrom tibble tibble
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are redundant; if we always call foo::bar(), then we don't need to pollute our namespaces.
I think @importFrom makes most sense for those packages (very few) where we as pkg devs can't be bothered always explicating foo::bar().
Whenever possible the tidyverse style guide advises being explicit.

@@ -1,253 +1,46 @@

<!-- README.md is generated from README.Rmd. Please edit that file -->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rendering a README.Rmd adds a lot of extra complexity and leads to derivative files in the git repo.
Instead, I've now moved the general pkg introduction to the openairegraph.Rmd vignette, which pkgdown will highlight on the website by default.
Vignettes always get rendered and published on every build, so that's one less thing to worry about and the repo stays clean.

@@ -1,149 +0,0 @@
<!-- Generated by pkgdown: do not edit by hand -->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everything in docs/ is deleted, because that pollutes the repo. Instead we know publish from the gh-pages branch automatically on every commit.

pdf_document:
fig_caption: yes
keep_tex: yes
template: lncs.sty
link-citations: yes
pkgdown:
as_is: true
vignette: >
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are only necessary for actually vignettes, and since we're only doing articles (= never plan on publishing vignettes to cran), this complexity can happily go.
I think it makes more sense to only publish articles, not vignettes, because vignettes just add more headaches and then there's two URLs where users can find roughly the same content (confusing).
I think the tidyverse team has also stopped publishing actual "vignettes" though everyone still calls them vignettes.
(Articles is what pkgdown calls vignettes which are pkgdown-only).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, was not aware of it!

Comment on lines -54 to -61
```r
library(openairegraph)
library(jsonlite) # tools to work with json files
library(tidyverse) # tools for tidy data analysis
# NB, you need to download the file from Zenodo first
oaire <- jsonlite::stream_in(file("data/h2020_results.gz"))
openairegraph::oarg_decode(oaire, records_path = "data/records/",
limit = 500, verbose = FALSE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made this into a chunk which actually is evaluated in the CI to strengthen reproducibiity.

Comment on lines -80 to -92
future::plan(multisession) # initalize parallel computing
oaire_data <- future.apply::future_lapply(openaire_records,
function(files) {
# load xml file
doc <- xml2::read_xml(files)
# parser
out <- oarg_publications_md(doc)
out$linked_projects <- list(oarg_linked_projects(doc))
out$linked_ftxt <- list(oarg_linked_ftxt(doc))
# use file path as id
out$id <- files
out
})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, this currently doesn't work on the CI as per #12 so I had to build a conditional around it

Comment on lines -107 to -111
library(openairegraph)
library(jsonlite)
library(tidyverse)
library(scales)
library(here)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only need this once for the entire vignette/article. library() calls persist.
The danger here, though is, that the chunk in question is cached.
Caches, unfortunately, don't carry side effects, so we should place whatever has side effects (such as library() inside non-cached chunks).

tibble::as_tibble()

pubs_projects <- oaire_df %>%
select(id, type, best_access_right, linked_projects) %>%
unnest(linked_projects)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super slow performance with this, no idea why #16

@@ -247,7 +262,7 @@ oa_monitor_ec %>%
labs(x = NULL,
y = "Open Access Percentage",
caption = "Data: OpenAIRE Research Graph") +
theme_minimal_vgrid(font_family = "Roboto") +
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must enable this later in ci #5

@maxheld83 maxheld83 marked this pull request as ready for review May 6, 2020 16:41
@njahn82 njahn82 merged commit 3522dac into master May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants