Specifying options at knit time #19

Closed
halpo opened this issue Nov 23, 2011 · 9 comments
Labels
feature Feature requests

Comments


halpo commented Nov 23, 2011

Sweave and its extension packages let you specify certain supported options at compile time, for example:

> Sweave("test.Rnw", prefix.string="figures/fig")
Writing to file test.tex
Processing code chunks with options ...
 1 : echo term verbatim (label = first)
 2 : echo term verbatim (label = cachesomething)
 3 : term verbatim pdf (label = plotit)

You can now run (pdf)latex on 'test.tex'

This creates a tex document and places all figures in the figures directory, with fig prefixed to each figure file name. I see no way to specify this in knitr:

> knit("test.Rnw", prefix.string='figures/fig-')
Error in knit("test.Rnw", prefix.string = "figures/fig-") : 
  unused argument(s) (prefix.string = "figures/fig-")

Maybe knit could accept extra arguments in the form knit(input, output, pattern, ...), where ... would specify defaults that file-level and chunk-level options then override.
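The precedence proposed here (call-time defaults, overridden by file-level options, overridden in turn by chunk options) could be sketched in base R roughly as follows. This is only an illustration of the idea, not how knitr resolves options; resolve_options and its argument names are hypothetical.

```r
# Hypothetical sketch: later sources override earlier ones.
# Order of precedence (lowest to highest):
#   package defaults < call-time options < file-level options < chunk options
resolve_options <- function(defaults,
                            call_time = list(),
                            file_level = list(),
                            chunk_level = list()) {
  # modifyList(x, val) returns x with the entries of val replacing or extending it
  Reduce(modifyList, list(call_time, file_level, chunk_level), defaults)
}

opts <- resolve_options(
  defaults    = list(prefix.string = "fig", echo = TRUE),
  call_time   = list(prefix.string = "figures/test-"),
  chunk_level = list(echo = FALSE)
)
str(opts)
```

With this ordering, a value given at call time only acts as a default: anything set in the document itself still wins.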

Passing options to knit directly would be extremely useful in a Makefile:

%.tex: %.Rnw
	Rscript -e "library(knitr); knit('$^', '$@', prefix.string='figures/$*-')"

This creates a tex file and would also put graphics in a specific graphics folder, which helps with cleanup, and prefix graphics with the file name stem to prevent naming collisions. That is very useful for automated reporting.

Preferably, I would like to see knitr accept anything as an option and discard, either silently or with a warning, those that are not valid or not used by specific blocks.
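A minimal sketch of that "accept anything, discard what is not valid" behaviour, in base R. The function name filter_options and the list of valid option names are assumptions made up for the example; knitr itself does not expose such a function as far as this thread shows.

```r
# Hypothetical sketch: keep only recognised options, optionally warning
# about the ones that are dropped.
filter_options <- function(opts, valid, warn = TRUE) {
  unknown <- setdiff(names(opts), valid)
  if (warn && length(unknown) > 0) {
    warning("discarding unknown options: ", paste(unknown, collapse = ", "))
  }
  opts[names(opts) %in% valid]
}

# Example with an assumed set of valid option names
valid_names <- c("echo", "prefix.string")
kept <- suppressWarnings(
  filter_options(list(echo = FALSE, bogus = 1), valid = valid_names)
)
names(kept)
```

Setting warn = FALSE gives the silent variant.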

Also, as a note: the original Sweave allowed options to be set through the SWEAVE_OPTIONS environment variable, although not all extension packages use it, nor do I believe it is used much by people in general.

Owner

yihui commented Nov 23, 2011

All options are listed at http://yihui.github.com/knitr/options (see prefix.string in this case).

I do not think it is a good idea to put options in knit(...); to make a document more reproducible, options should be shipped with the document itself (otherwise you need to tell other people how you called knit()), so a better approach for me is to use \SweaveOpts{} inside an Rnw file to make it really self-contained.

Basically I want to keep the number of arguments in a function minimal. In my early packages I used ... a lot, but I have since come to feel that it is more and more of a disaster.

The environment variable is even more of a disaster; I believe it drags Sweave further away from reproducible research.

Author

halpo commented Nov 23, 2011

The downside to that is that it pushes people into the box of reproducibility, even if they do not want it. Consider automated reporting where the data changes. Someone might want to retain all graphics and output for the last month of runs, which could be accomplished very easily with run-time global options, but otherwise would require editing the file every time to adjust the single parameter you want changed.
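As a rough illustration of such a run-time parameter, a figure prefix that changes with each run could be derived from the input file name and the run date. This is only a sketch: run_prefix and the default figures directory are hypothetical names, not part of knitr or Sweave.

```r
# Hypothetical helper: build a per-run figure prefix of the form
# <root>/<date>/<file stem>- so that output from earlier runs is kept.
run_prefix <- function(input, run_date = Sys.Date(), root = "figures") {
  # file stem without directory or extension, e.g. "test" for "test.Rnw"
  stem <- tools::file_path_sans_ext(basename(input))
  file.path(root, format(run_date, "%Y-%m-%d"), paste0(stem, "-"))
}

run_prefix("test.Rnw", run_date = as.Date("2011-11-23"))
```

The resulting string is the kind of value one would want to hand over at compile time rather than edit into the document before every run.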

Author

halpo commented Nov 23, 2011

Your pointing me to the objects showed me how to get around this. My Makefile rule is now:

%.tex:%.Rnw | $(CACHEDIR) $(FIGUREDIR)
    Rscript -e "library('knitr');knitr::opts_chunk[['set']](prefix.string='$(FIGUREDIR)/$*-');knit('$<','$@')"

which achieves the desired result. I still think that the cache directory should be separable from the figures directory, but I totally agree that the environment variable is messy. On the other hand, the setup as now configured almost replicates the environment problem, since the options are set in the R environment rather than in the function call. This means someone must show the R environment objects in effect at compile time rather than just the function call used to compile.

Owner

yihui commented Nov 24, 2011

In fact the initial design separated the two directories (as cacheSweave did), but later I changed my mind, and the reason was that I saw your post on SO: http://stackoverflow.com/questions/7759914/is-there-a-way-to-define-all-sweave-options-in-code

I thought about this issue for a long time and came to the conclusion that we probably should not care about the output figures and cache files. What is most important is usually the final PDF document, so it might not hurt to put all the messy intermediate output files into a single directory.

If these two directories are separate, imagine now you have a second Rnw document under the same directory, and you would probably end up with four directories (figure and cache for two Rnw documents respectively), which I believe is messy to manage.

Again, I want to keep the number of options minimal; more options mean the user has to memorize more. Since I have a prefix.string, I will use it for everything that is intermediate output. Note that knitr is indeed neater than cacheSweave: it only produces two files for each cached chunk, whereas cacheSweave produces a whole bunch of sub-directories.

Author

halpo commented Nov 24, 2011

Simplicity can be combined with power. My gripe was with the fragmentation of setting options, which is why I am grateful for knitr. Unifying the method for specifying options is a good thing, but I'm not sure that I like breaking with the established method of passing them along as extra arguments.

On the argument about separating cache and figures, I certainly think that there are reasons for separating them. The most obvious is that for publication, journals want your figures but not your cached intermediate results. Finding, examining, and sharing figures is something that I want to do and do often, sometimes just pulling them out for a separate manuscript. The cache, on the other hand, I want to work but stay out of sight. It would probably even be a good idea for those directories to be hidden by default. We really are mixing this with issue #20 now.

Owner

yihui commented Nov 24, 2011

Let's focus on this issue and forget about #20.

The decision on ... is pretty hard, although the implementation effort would be trivial (one short line of code). This is usually the problem with R itself: there are tons of approaches to doing the same thing. I used to think of this flexibility as an advantage, but my opinion has changed. I won't add ... unless I see an obvious gain over \SweaveOpts{}.

I'm convinced now on the separation of the cache directory. So you'd like it to be a chunk option as well? (instead of calling something like setCacheDir())

Owner

yihui commented Nov 24, 2011

You have the chunk option prefix.cache now.

Author

halpo commented Nov 29, 2011

prefix.cache works great. Thank you.

yihui closed this as completed Dec 3, 2011
@github-actions

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

github-actions bot locked as resolved and limited conversation to collaborators Nov 10, 2020