Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: README_to_vignette() #327

Closed
jonocarroll opened this issue Apr 9, 2018 · 11 comments
Closed

Proposal: README_to_vignette() #327

jonocarroll opened this issue Apr 9, 2018 · 11 comments

Comments

@jonocarroll
Copy link
Contributor

As per the contributing guide, I'm posting an issue as a preface to a PR.

I've been working towards a goal of increasing vignette usage and part of that has been volunteering to write some. I've noticed along the way that while many packages might lack a vignette, they do tend to have a rather nice/complete rendered README.md file. This achieves a certain goal: that new users should see what the package does from the GitHub landing page. The downsides being that this file is not visible from the installed package itself, and that there is not guarantee that the README is not stale due to recent updates. A vignette is likely a more suitable place for this content, but users have either chosen not to use one or are not aware how to do so.

I propose a new function in usethis which converts an existing README.Rmd file to a proper vignette (as best as can be automated). This isn't a complete solution: the user would still need to edit this file, but it significantly lowers the barrier to entry. I feel that usethis is the right home for this function, alongside the usual use_vignette() functionality.

The second part of this would be teaching users how to source part or all of a vignette from the README.Rmd file using the child option from knitr: https://www.garrickadenbuie.com/blog/2018/03/05/dry-vignette-and-readme/

The benefit of this enabled workflow is that the vignette stays with the package, the README still contains the relevant information, updates still only need to occur in one place, and testing is performed on the example code on installation.

I seek the usethis maintainers' input on whether or not this would be a welcome PR. It is ready to go, here.

For additional context, I have processed several packages' README files into suitable vignettes as tests:

@jimhester
Copy link
Member

My main hang up with this idea is if something is in a vignette then there are more strict requirements about what packages you use in it. e.g. in fs I use dplyr and magrittr in the README, so if I made the readme also a vignette I would need to put dplyr and magrittr in Suggests, which means I would have to build them on travis, slowing down my build times (at least when they are not in the package cache). Practically, this means I would likely not have added the dplyr example to the fs README in the first place.

Having the README as a vignette also has complications for READMEs using external resources, e.g. you have to selectively evaluate chunks depending on whether the resource is available on travis / CRANs machines.

Also this PR as-is provides no mechanism for maintaining / updating the vignette as the README changes, which would make this cumbersome to use in practice. I don't really want to maintain similar code and documentation in two places.

I think the idea has some potential merit, but wanted to voice some concerns for package authors to keep in mind.

@jonocarroll
Copy link
Contributor Author

Those are all valid points. I see the requirement that the examples be generated from packages in Suggests as a positive one, rather than a negative. If a user wishes to be sure they can recreate that example, which presumably you've shown because you think it highlights important features of your package, then I would think that testing that would be a good thing, even if it does slow down your build times. At the moment there is no guarantee that code in the README currently works, but this is often the only place the author has detailed a use-case.

This idea doesn't require that the entire README be in the vignettes, merely gives a starting point for a vignette, chunks of which can be extracted into a separate README. I actually don't think most of the README files are good vignettes, but they're better than nothing (which is the most common scenario).

The 'similar code' part is hopefully countered via the child argument in knitr -- the code itself should be maintained in a single place (the vignette), with testing, and the README will reflect the current state, which can then be trusted to be working.

By all means, this is no panacea. I would hope the user ends up creating a detailed vignette for a proper use-case, and has a simplified landing-page-esque README, but one which uses validated code. Those authors already creating good vignettes have no need for this function. The truth is that they are a minority.

@jennybc
Copy link
Member

jennybc commented Apr 9, 2018

I'm sympathetic to the "better than nothing" argument re: turning a README.md into a vignette. But agree with @jimhester about the complications.

As an author of packages with lots of user interaction (like usethis!) and external API calls, I value the fact I can pre-render README.Rmd to README.md and sleep soundly knowing that CRAN won't try to to rerun that code every 24 hours, in two different ways. The points about CI and formal dependencies also resonate. I think our team is generally making a very conscious choice when we focus documentation in README and pkgdown sites vs vignettes. We're tired of fighting battles about example/vignette run time, internet calls that can be slow or fail sporadically, inability to link to or demonstrate use with packages that aren't formal dependencies, conflation of user-facing documentation with package testing, etc.

If a user wishes to be sure they can recreate that example, which presumably you've shown because you think it highlights important features of your package, then I would think that testing that would be a good thing, even if it does slow down your build times. At the moment there is no guarantee that code in the README currently works, but this is often the only place the author has detailed a use-case.

Our strong recommendation is to generate README.md from README.Rmd. This does ensure that the worked example actually works.

OK, so I think we've concluded that the tidyverse/r-lib team is not be the target audience for this functionality. Which means we won't take on maintenance of something with substantial moving parts.

I see two possible ways forward:

  • A function meant to be run once, like use_vignette(), to initiate a vignette. But instead of inserting boilerplate content, it seeds it with content from the README. It's possible this could live in usethis. Or you could also make a separate package to facilitate such one-time events, like Rd2roxygen.
  • The idea of sharing code between the vignette(s) and the README is good one. I know a lot of people do this. But as an actual, ongoing workflow, it would be outside the scope of usethis. Maybe this workflow should be supported by a focused package and it would naturally offer a function for initial conversion of a README to a vignette. Another area of vignette agony is the need to release static vignettes that you, as a developer, still want to render by running code. That might also fit into a package for more complicated vignette workflows.

@jonocarroll
Copy link
Contributor Author

Thankyou @jennybc but I must be misinterpreting something or have failed to present my case accurately.

Our strong recommendation is to generate README.md from README.Rmd. This does ensure that the worked example actually works.

This ensures that the code worked at some point (when the conversion was performed). I've had several instances where development has left the README in a stale state - the code has been updated and the example in the README has not. Moreoever, the README is only available online from that particular VCS, it does not travel (without additional configuration) with the package in a way the user can refer to it from an installed package. In any case, my proposed workflow (turn an existing README into a vignette, refer to the relevant chunks in a README via child) does not preclude that workflow.

I think our team is generally making a very conscious choice when we focus documentation in README and pkgdown sites vs vignettes.

Is the r-lib focus then to move away from vignettes? I find these are the most useful part of a pgkdown site. Else the only information there is the README itself and the per-function documentation. This too is not available to the installation, and can easily become stale.

A function meant to be run once, like use_vignette(), to initiate a vignette. But instead of inserting boilerplate content, it seeds it with content from the README. It's possible this could live in usethis.

I'm definitely missing something/have misrespresented myself because this is precisely my proposal.

@jennybc
Copy link
Member

jennybc commented Apr 9, 2018

I agree that the rendering of README.Rmd should probably wired up to some documentation process, so that it's less likely to grow stale.

Yes I now see that the current (almost) PR is about a one time translation of README.Rmd to a vignette, and I think that could be a useful function. The issue went on to talk about teaching people how to use vignette code as child chunks in a README and making the vignette primary, the README secondary. That moves into new territory that would require different tools and documentation.

Is the r-lib focus then to move away from vignettes?

No, that's certainly not some avowed policy. We want more rendered documentation in this world, not less! But I'd say we probably don't share your view that the vignette (as currently implemented in our ecosystem) is necessarily the ideal delivery format. If I can document something in a vignette, without putting myself at high risk of CRAN grief, I do (readxl, for example). But vignettes (and examples and tests) also have a certain "no good deed goes unpunished" effect on CRAN. Vignette code that could sporadically fail or that requires auth or that could take longer than a few seconds is just another point of vulnerability. So googledrive, for example, has no vignettes but several articles in its pkgdown site. It's just not worth the trouble.

Let's zoom out again. I agree it's great for packages to have some form of narrative documentation with at least one worked example. If someone has a README, then that is perfect content with which to initialize a vignette. We don't need to agree about the overall role of vignettes to move forward on that. Shall we?

@jonocarroll
Copy link
Contributor Author

Thank you, I think we are on the same page now.

The issue went on to talk about teaching people how to use vignette code as child chunks in a README and making the vignette primary, the README secondary. That moves into new territory that would require different tools and documentation.

The goal of this function is indeed for a user (who has provided a README but not a vignette) to migrate that information to a new vignette once, and for that to serve as a starting point for a better vignette. The additional step of making the README dependent on that vignette is beyond this scope but I think is of benefit. At this point both the README and the new vignette spawned from that would be present.

To the above point, I think having a message to the user on successful conversion along the lines of

You can include short independent documents in README.Rmd using the following syntax:
```{r child='vignettes/rmdhunks/example1.Rmd'} 
```
https://yihui.name/knitr/demo/child/

or read-in chunks stored in an independent R script using knitr::read_chunk()
https://yihui.name/knitr/demo/externalization/

but that is perhaps a bit much for a simple message. A better link to a guide might be good there.

As for the rest of the discussion, I find it unfortunate that such an integral component of the ecosystem manages to turn people away from a core feature of the package functionality (potentially a reliable and superior feature) but that's certainly a discussion for elsewhere. If I were to play devil's advocate then I'd suggest that if (strong if) testing/building/checking is properly implemented on the receiving side, then any potential failures of a vignette (be they uncaught auth failures or sporadic failures) signal that the code requires improvement and thus serves as invitation towards additional hardening of code. That said, I see no benefit in enforcing short runtimes or filesystem isolation for these.

My goal here is to make it as easy as possible to start making a vignette, and I think we agree that existing READMEs can be a useful first pass.

I'll update my proposed code to add stripping out of badges and (maybe) installation instructions (these are indeed better suited to the README and not needed from an installed package view) then submit a PR for further discussion/approval/rejection.

@jennybc
Copy link
Member

jennybc commented Apr 10, 2018

Sounds good. Let's concentrate on capturing the current value of README.Rmd in a basic vignette. I'd err on the side of keeping the function simple. The user will have to examine and edit the vignette no matter what.

I agree the above is too much for a message. People don't read them and they are ephemeral. You could put brief links re: workflow advice going forward in the help. Or, even better, insert it in the new vignette, with the idea that user eventually reads it and deletes it. That could include standard advice to, e.g., remove installation instructions.

uncaught auth failures

As soon as CRAN handles encrypted env vars and files, as Travis and AppVeyor do, I'd love to know.

sporadic failures) signal that the code requires improvement and thus serves as invitation towards additional hardening of code

You can't wrap an API like Google Drive or Google Sheets and completely avoid the occasional 502, even with retries. Therefore it's impossible to submit a vignette or example to CRAN than runs such code, because any failure leads to rejection and relegation to human intervention.

jonocarroll added a commit to jonocarroll/usethis that referenced this issue Apr 11, 2018
@hadley
Copy link
Member

hadley commented Apr 17, 2018

I don't think there's much value in converting a readme into a vignette. READMEs are a separate beast and, in my opinion, making them more vignette-like only detracts from their value. You hard-line stand on reproducibility illustrates that you've been lucky that the packages that you have worked on fit gracefully in to the strictures of CRAN.

@hadley hadley closed this as completed Apr 17, 2018
@hadley
Copy link
Member

hadley commented Apr 17, 2018

Of course, you're welcome to include such code in your own package; I just don't think it belongs in usethis.

@jonocarroll
Copy link
Contributor Author

Well, that seems disappointingly summarised, especially since it appears to misrepresent both the motivation and intended usage.

The intention was not that the README should be a drop-in vignette, but that the users who already have a README file could easily consolidate that work and create a first-pass vignette. I had no intention that the "README should be more vignette-like" at all or that the README-vignette.Rmd would be submitted as-is to CRAN in exactly the same way that I would not expect the boilerplate vignette to be submitted.

Moreover, this PR in no way affects the actual README file, so I am left confused by the comment and closure.

@hadley
Copy link
Member

hadley commented Apr 18, 2018

I don't find the motivation compelling and I briefly summarised my reasons - but there's nothing stopping you from including the code in another package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants