Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal : could roxygen2 parse the R files in collate order, to improve @describeIn and @rdname #323

Closed
Geoff99 opened this issue Feb 22, 2015 · 1 comment

Comments

@Geoff99
Copy link
Contributor

Geoff99 commented Feb 22, 2015

Synopsis

Could 3 lines of code in roxygen2 be changed so that it parses the R files in collate order, when it is generating documentation? This would improve the readability of the generated documentation when using @describeIn and/or @rdname split across several .R files, as may happen when working with S4 classes, generic functions and methods.

Background

The present version of roxygen2, 4.1.0, reads the R code files for a package in alphabetical order when it parses them --- inside roxygen2:::parse_package() there is a line of code which says :

parsed <- lapply(r_files(base_path), parse_file, env = env)

r_files(base-path) returns a list of .r filenames in alphabetical order.

In contrast, both roxygen2::roxygenise() and devtools::document() source the files in collate order.

Motivation

The consequence is that although the sourced code itself has been loaded in the correct (collate) order, the documentation that roxygen2 generates appears in the alphabetical order of the R files. This doesn't matter most of the time, because often the contents of a single Rd file come from a single .r file.

However when @rdname <some-name>, or @describeIn <some-other-name> have been used in more than one R file, and the correct (i.e. collate order) is not alphabetical, it leads to the generated documentation appearing in the wrong order. This typically might happen when using S4 classes, or generic functions and methods. For example if the method appears in a file which alphabetically precedes the file in which the generic function is created, the generated documentation describes the method first, and the generic function second, even if #' @include ... has been used to specify the correct collate order for the files. See the example below.

Proposed change

It requires changes to only 3 lines of code inside roxygen2 to have it parse the code files for a package in collate order.

Two of those lines are in the roxygen2:::package_files function (in source.r) and also have the effect of making that function, which is currently only used by roxygen2:::source_package, behave more as I suspect is actually intended. At the moment roxygen2:::package_files fails to deduplicate the list of file names that it returns, which means that roxygen2:::source_package will source the code files more than once. See the addendum to this example below for an illustration of this.

The third line to be changed is in roxygen2:::parse_package itself.

If you are amenable to considering this change, I can easily submit a pull request. I have written a branched version of roxygen2 already while I was investigating why my documentation didn't look like I expected.

Example to illustrate the issue

File b.r is logically the first file, but alphabetically second.

It creates and documents a generic function, f_gen

#' @rdname Explanation
#'
#' @name Explanation
#'
#' @title First file in code logic order, but second file alphabetically.
#'
#' @details Alphabetically b.r is the second file, but in collate order it is the first.
#'
NULL

a <- 1

#' @title f_gen
#'
#' @description f_gen is a generic function
#'
#' @details The default f_gen just emits a simple message
#'
#' @param x The compulsory parameter
#'
setGeneric('f_gen', def = function(x) {message('I am f_gen, x was ', x)})
File a.r is logically the second file, but alphabetically first.

It creates and documents a method for f_gen

#' @include b.r
#'
#' @rdname Explanation
#'
#' @name Explanation
#'
#' @title Second file in code logic order, but first alphabetically.
#'
#' @details Alphabetically a.r is the first file, but in collate order it is the second.
#'
NULL

a <- 2

#' @describeIn f_gen
#'
#' @title f_gen_logical
#'
#' @description f_gen_logical is the method for f_gen when x is logical
#'
#' @details f_gen_logical is just a pseudo name for documentation purposes.  This is really an f_gen method with signature "logical".  It emits a different message to the f_gen default method.
#'
setMethod('f_gen', signature = "logical", def = function(x) {
  message('I am the f_gen method for a logical signature, x was ', x)
})
Rd for f_gen using current CRAN version of roxygen2

Using the a very recent github version of devtools with the CRAN release version of roxygen2 produces an .Rd file for f_gen which looks as follows. The key point to note is that the documented description and details for the method come before the corresponding information about the generic.

% Generated by roxygen2 (4.1.0): do not edit by hand
% Please edit documentation in R/a.r, R/b.r
\docType{methods}
\name{f_gen,logical-method}
\alias{f_gen}
\alias{f_gen,logical-method}
\title{f_gen_logical}
\usage{
\S4method{f_gen}{logical}(x)

f_gen(x)
}
\arguments{
\item{x}{The compulsory parameter}
}
\description{
f_gen_logical is the method for f_gen when x is logical

f_gen is a generic function
}
\details{
f_gen_logical is just a pseudo name for documentation purposes.  This is really an f_gen method with signature "logical".  It emits a different message to the f_gen default method.

The default f_gen just emits a simple message
}
\section{Methods (by class)}{
\itemize{
\item \code{logical}: 
}}
Rd for f_gen generated using a branched version of roxygen2 with the proposed alterations

The key point to note is that the generic f_gen is described before the method in the documentation (which makes more sense to me).

% Generated by roxygen2 (4.1.0.9001): do not edit by hand
% Please edit documentation in R/b.r, R/a.r
\docType{methods}
\name{f_gen}
\alias{f_gen}
\alias{f_gen,logical-method}
\title{f_gen}
\usage{
f_gen(x)

\S4method{f_gen}{logical}(x)
}
\arguments{
\item{x}{The compulsory parameter}
}
\description{
f_gen is a generic function

f_gen_logical is the method for f_gen when x is logical
}
\details{
The default f_gen just emits a simple message

f_gen_logical is just a pseudo name for documentation purposes.  This is really an f_gen method with signature "logical".  It emits a different message to the f_gen default method.
}
\section{Methods (by class)}{
\itemize{
\item \code{logical}: 
}}

session_info() for the run that created the first (not quite right) Rd file
> session_info()
Session info ------------------------------------------------------------------
 setting  value                       
 version  R version 3.1.2 (2014-10-31)
 system   x86_64, mingw32             
 ui       RStudio (0.98.953)          
 language (EN)                        
 collate  English_Australia.1252      
 tz       Australia/Sydney            

Packages ----------------------------------------------------------------------
 package          * version    date       source        
 devtools           1.7.0.9000 2015-02-20 local         
 digest           * 0.6.4      2013-12-03 CRAN (R 3.1.1)
 htmltools        * 0.2.6      2014-09-08 CRAN (R 3.1.2)
 Rcpp             * 0.11.4     2015-01-24 CRAN (R 3.1.2)
 rmarkdown        * 0.5.1      2015-01-26 CRAN (R 3.1.2)
 roxygen2         * 4.1.0      2014-12-13 CRAN (R 3.1.2)
 rstudio          * 0.98.953   2014-08-02 local         
 rstudioapi       * 0.2        2014-12-31 CRAN (R 3.1.2)
 stringr          * 0.6.2      2012-12-06 CRAN (R 3.1.1)
 testPackageFiles   0.0.0.9000 <NA>       local         
 yaml             * 2.1.13     2014-06-12 CRAN (R 3.1.2)
> 

Addendum --- example of the problem with the current CRAN version of roxygen2:::package_files()

Note that package_files returns the file names twice, once in collate order but without any path information, and again in alphabetical order, but including a relative path.

> roxygen2:::package_files('.')
[1] "b.r"     "a.r"     "./R/a.r" "./R/b.r"
> roxygen2:::r_files('.')
[1] "./R/a.r" "./R/b.r"

@hadley
Copy link
Member

hadley commented Feb 23, 2015

Seems reasonable to me. Can you submit a pull request please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants