Documenting package datasets
+
Datasets used in package examples are an important part of making a package understandable and usable, but they are often overlooked.
In developing the heplots package I assembled a large collection of datasets illustrating a variety of multivariate linear models, together with some analyses and graphical displays. Each of these has much more than the usual stub examples, which often look like:
+
data(dataset)
# str(dataset); plot(dataset)
+
But .Rd, and now roxygen, don’t make it easy to work with numerous datasets in a package or, more importantly, to document what they illustrate. I describe here the work that went into creating this vignette, in case these ideas are useful to others.
+
In this release, I started with a file generated by:
+
vcdExtra::datasets("heplots") |> head(4)
#>        Item      class    dim                         Title
#> 1 AddHealth data.frame 4344x3 Adolescent Mental Health Data
#> 2   Adopted data.frame   62x6              Adopted Children
#> 3      Bees data.frame  246x6   Captive and maltreated bees
#> 4  Diabetes data.frame  145x6              Diabetes Dataset
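A similar listing can be approximated in base R for readers who do not have vcdExtra installed. The following is only a sketch, not a description of how vcdExtra::datasets() is implemented. It relies on utils::data(package = ) returning a results matrix with Item and Title columns, and it uses the built-in datasets package as a stand-in so that it runs anywhere. The helper name list_datasets is hypothetical.

```r
# Sketch: list the datasets in an installed package using base R only.
# list_datasets() is a hypothetical helper, not part of heplots or vcdExtra.
list_datasets <- function(package) {
  res <- utils::data(package = package)$results
  data.frame(
    # Items such as "beaver1 (beavers)" give the object name first and the
    # file that holds it in parentheses; keep only the object name.
    Item  = sub(" .*$", "", res[, "Item"]),
    Title = res[, "Title"],
    stringsAsFactors = FALSE
  )
}

# The base "datasets" package stands in for heplots here.
listing <- list_datasets("datasets")
head(listing, 4)
```

Adding the class and dim columns shown above would additionally require loading each object, which this sketch leaves out.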
+
Then, in the roxygen documentation, I added @concept tags to classify these datasets according to the methods used. For example, the documentation for the AddHealth data contains these lines:
+
#' @name AddHealth
#' @docType data
...
#' @keywords datasets
#' @concept MANOVA
#' @concept ordered
+
With standard processing, these concepts, along with the keywords, appear in the Index section of the manual constructed by devtools::build_manual(). In the pkgdown site for this package, they are also searchable in the search box.
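The same concept entries can also be queried from an R session, because help.search() can restrict its search to the concept field. A minimal sketch, assuming heplots is installed: find_by_concept is a hypothetical wrapper, and the topics it returns will depend on the installed version of the package.

```r
# Sketch: look up help topics by their \concept entries in an installed package.
# find_by_concept() is a hypothetical convenience wrapper.
find_by_concept <- function(concept, package) {
  utils::help.search(
    concept,
    fields  = "concept",  # search only the \concept entries
    package = package,
    agrep   = FALSE       # plain (case-insensitive) matching, no fuzzy matching
  )
}

# Only run the heplots query when the package is available.
if (requireNamespace("heplots", quietly = TRUE)) {
  hits <- find_by_concept("MANOVA", "heplots")
  print(hits$matches[, c("Topic", "Title")])
}
```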
+
With a bit of extra processing, I created a dataset, datasets.csv, used below.
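The extra processing is not shown here. One way it could be done, offered only as a sketch, is to read the \concept entries back out of the Rd files that roxygen generates, using tools::parse_Rd(). The Rd text, the helper rd_tag_values, and the column names below are illustrative assumptions, and the result is written to a temporary file rather than to the package's actual datasets.csv.

```r
# Hypothetical Rd source, standing in for a file that roxygen would generate.
rd_text <- c(
  "\\name{AddHealth}",
  "\\alias{AddHealth}",
  "\\docType{data}",
  "\\title{Adolescent Mental Health Data}",
  "\\description{Example entry.}",
  "\\keyword{datasets}",
  "\\concept{MANOVA}",
  "\\concept{ordered}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_text, rd_file)
rd <- tools::parse_Rd(rd_file)

# Collect the text of every top-level section carrying a given Rd tag,
# e.g. "\\concept" or "\\keyword". rd_tag_values() is a hypothetical helper.
rd_tag_values <- function(rd, tag) {
  els  <- unclass(rd)
  keep <- vapply(els, function(el) identical(attr(el, "Rd_tag"), tag),
                 logical(1))
  vapply(els[keep],
         function(el) trimws(paste(unlist(el), collapse = "")),
         character(1))
}

concepts <- rd_tag_values(rd, "\\concept")

# One row per dataset, with its concepts collapsed into a single field.
index <- data.frame(
  dataset = rd_tag_values(rd, "\\name"),
  concept = paste(concepts, collapse = "; "),
  stringsAsFactors = FALSE
)

csv_file <- tempfile(fileext = ".csv")
write.csv(index, csv_file, row.names = FALSE)
read.csv(csv_file)
```

For a whole installed package, tools::Rd_db() returns the parsed Rd objects for all topics, so the same helper could be applied to each element in turn.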