-
Notifications
You must be signed in to change notification settings - Fork 238
Description
Synopsis
Could 3 lines of code in roxygen2 be changed so that it parses the R files in collate order, when it is generating documentation? This would improve the readability of the generated documentation when using @describeIn and/or @rdname split across several .R files, as may happen when working with S4 classes, generic functions and methods.
Background
The present version of roxygen2, 4.1.0, reads the R code files for a package in alphabetical order when it parses them --- inside roxygen2:::parse_package() there is a line of code which says :
parsed <- lapply(r_files(base_path), parse_file, env = env)
r_files(base-path) returns a list of .r filenames in alphabetical order.
In contrast, both roxygen2::roxygenise() and devtools::document() source the files in collate order.
Motivation
The consequence is that although the sourced code itself has been loaded in the correct (collate) order, the documentation that roxygen2 generates appears in the alphabetical order of the R files. This doesn't matter most of the time, because often the contents of a single Rd file come from a single .r file.
However when @rdname <some-name>, or @describeIn <some-other-name> have been used in more than one R file, and the correct (i.e. collate order) is not alphabetical, it leads to the generated documentation appearing in the wrong order. This typically might happen when using S4 classes, or generic functions and methods. For example if the method appears in a file which alphabetically precedes the file in which the generic function is created, the generated documentation describes the method first, and the generic function second, even if #' @include ... has been used to specify the correct collate order for the files. See the example below.
Proposed change
It requires changes to only 3 lines of code inside roxygen2 to have it parse the code files for a package in collate order.
Two of those lines are in the roxygen2:::package_files function (in source.r) and also have the effect of making that function, which is currently only used by roxygen2:::source_package, behave more as I suspect is actually intended. At the moment roxygen2:::package_files fails to deduplicate the list of file names that it returns, which means that roxygen2:::source_package will source the code files more than once. See the addendum to this example below for an illustration of this.
The third line to be changed is in roxygen2:::parse_package itself.
If you are amenable to considering this change, I can easily submit a pull request. I have written a branched version of roxygen2 already while I was investigating why my documentation didn't look like I expected.
Example to illustrate the issue
File b.r is logically the first file, but alphabetically second.
It creates and documents a generic function, f_gen
#' @rdname Explanation
#'
#' @name Explanation
#'
#' @title First file in code logic order, but second file alphabetically.
#'
#' @details Alphabetically b.r is the second file, but in collate order it is the first.
#'
NULL
a <- 1
#' @title f_gen
#'
#' @description f_gen is a generic function
#'
#' @details The default f_gen just emits a simple message
#'
#' @param x The compulsory parameter
#'
setGeneric('f_gen', def = function(x) {message('I am f_gen, x was ', x)})
File a.r is logically the second file, but alphabetically first.
It creates and documents a method for f_gen
#' @include b.r
#'
#' @rdname Explanation
#'
#' @name Explanation
#'
#' @title Second file in code logic order, but first alphabetically.
#'
#' @details Alphabetically a.r is the first file, but in collate order it is the second.
#'
NULL
a <- 2
#' @describeIn f_gen
#'
#' @title f_gen_logical
#'
#' @description f_gen_logical is the method for f_gen when x is logical
#'
#' @details f_gen_logical is just a pseudo name for documentation purposes. This is really an f_gen method with signature "logical". It emits a different message to the f_gen default method.
#'
setMethod('f_gen', signature = "logical", def = function(x) {
message('I am the f_gen method for a logical signature, x was ', x)
})Rd for f_gen using current CRAN version of roxygen2
Using the a very recent github version of devtools with the CRAN release version of roxygen2 produces an .Rd file for f_gen which looks as follows. The key point to note is that the documented description and details for the method come before the corresponding information about the generic.
% Generated by roxygen2 (4.1.0): do not edit by hand
% Please edit documentation in R/a.r, R/b.r
\docType{methods}
\name{f_gen,logical-method}
\alias{f_gen}
\alias{f_gen,logical-method}
\title{f_gen_logical}
\usage{
\S4method{f_gen}{logical}(x)
f_gen(x)
}
\arguments{
\item{x}{The compulsory parameter}
}
\description{
f_gen_logical is the method for f_gen when x is logical
f_gen is a generic function
}
\details{
f_gen_logical is just a pseudo name for documentation purposes. This is really an f_gen method with signature "logical". It emits a different message to the f_gen default method.
The default f_gen just emits a simple message
}
\section{Methods (by class)}{
\itemize{
\item \code{logical}:
}}
Rd for f_gen generated using a branched version of roxygen2 with the proposed alterations
The key point to note is that the generic f_gen is described before the method in the documentation (which makes more sense to me).
% Generated by roxygen2 (4.1.0.9001): do not edit by hand
% Please edit documentation in R/b.r, R/a.r
\docType{methods}
\name{f_gen}
\alias{f_gen}
\alias{f_gen,logical-method}
\title{f_gen}
\usage{
f_gen(x)
\S4method{f_gen}{logical}(x)
}
\arguments{
\item{x}{The compulsory parameter}
}
\description{
f_gen is a generic function
f_gen_logical is the method for f_gen when x is logical
}
\details{
The default f_gen just emits a simple message
f_gen_logical is just a pseudo name for documentation purposes. This is really an f_gen method with signature "logical". It emits a different message to the f_gen default method.
}
\section{Methods (by class)}{
\itemize{
\item \code{logical}:
}}
session_info() for the run that created the first (not quite right) Rd file
> session_info()
Session info ------------------------------------------------------------------
setting value
version R version 3.1.2 (2014-10-31)
system x86_64, mingw32
ui RStudio (0.98.953)
language (EN)
collate English_Australia.1252
tz Australia/Sydney
Packages ----------------------------------------------------------------------
package * version date source
devtools 1.7.0.9000 2015-02-20 local
digest * 0.6.4 2013-12-03 CRAN (R 3.1.1)
htmltools * 0.2.6 2014-09-08 CRAN (R 3.1.2)
Rcpp * 0.11.4 2015-01-24 CRAN (R 3.1.2)
rmarkdown * 0.5.1 2015-01-26 CRAN (R 3.1.2)
roxygen2 * 4.1.0 2014-12-13 CRAN (R 3.1.2)
rstudio * 0.98.953 2014-08-02 local
rstudioapi * 0.2 2014-12-31 CRAN (R 3.1.2)
stringr * 0.6.2 2012-12-06 CRAN (R 3.1.1)
testPackageFiles 0.0.0.9000 <NA> local
yaml * 2.1.13 2014-06-12 CRAN (R 3.1.2)
>
Addendum --- example of the problem with the current CRAN version of roxygen2:::package_files()
Note that package_files returns the file names twice, once in collate order but without any path information, and again in alphabetical order, but including a relative path.
> roxygen2:::package_files('.')
[1] "b.r" "a.r" "./R/a.r" "./R/b.r"
> roxygen2:::r_files('.')
[1] "./R/a.r" "./R/b.r"