Documentation with markdown and remove mentions of master branch (#551)

* Remove the docs/ directory (already deployed via GitHub pages) last updated 3 years ago.
* `usethis::use_roxygen_md()`
* `roxygen2md::roxygen2md()` and `devtools::document()`
* usethis::use_tidy_description()
* master -> main
* Removal of `@title`, and cosmetic changes
* `devtools::document()`
* Revert
* Redocument
* Fix Warnings (no need for \%. % is fine.
* styler::style_pkg()
* Add `@keywords internal`, so that they don't show up in the function index.
* However, the .Rd file is still created, so that people using `?convert_to_NA` will still see the documentation file.
* Address comments and redocument.
* Revert moving to `@details`
* WS
* Update DESCRIPTION
  Missed it when solving conflicts
* Address comments
* Minor additional cleanup
* WS
* `use_tidy_description()`
sfirke · Aug 13, 2023 · b2700cf
1 parent 7375941
commit b2700cf
Showing 68 changed files with 526 additions and 430 deletions.
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -23,7 +23,7 @@ If your proposed contribution addresses multiple issues, it should ideally be br
 * Make sure to track progress upstream (i.e., on our version of `janitor` at `sfirke/janitor`) by doing `git remote add upstream https://github.com/sfirke/janitor.git`. Before making changes make sure to pull changes in from upstream by doing either `git fetch upstream` then merge later or `git pull upstream` to fetch and merge in one step
 * Make your changes (bonus points for making changes on a new feature branch)
 * Push up to your account
-* Submit a pull request to the master branch at `sfirke/janitor`
+* Submit a pull request to the main branch at `sfirke/janitor`
 
 ### Prefer to discuss over email?
 Email Sam.  His email address is in the `DESCRIPTION` file of this repo.

diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,22 +1,25 @@
 Package: janitor
 Title: Simple Tools for Examining and Cleaning Dirty Data
 Version: 2.2.0.9000
-Authors@R: c(person("Sam", "Firke", email = "samuel.firke@gmail.com", role = c("aut", "cre")),
-    person("Bill", "Denney", email = "wdenney@humanpredictions.com", role = "ctb"),
-    person("Chris", "Haid", email = "chrishaid@gmail.com", role = "ctb"),
-    person("Ryan", "Knight", email = "ryangknight@gmail.com", role = "ctb"),
-    person("Malte", "Grosser", email = "malte.grosser@gmail.com", role = "ctb"),
-    person("Jonathan", "Zadra", email = "jonathan.zadra@sorensonimpact.com", role = "ctb"))
-Description: The main janitor functions can: perfectly format data.frame column
-    names; provide quick counts of variable combinations (i.e., frequency
-    tables and crosstabs); and explore duplicate records. Other janitor functions
-    nicely format the tabulation results. These tabulate-and-report functions
-    approximate popular features of SPSS and Microsoft Excel. This package
-    follows the principles of the "tidyverse" and works well with the pipe function
-    %>%. janitor was built with beginning-to-intermediate R users in mind and is
-    optimized for user-friendliness.
-URL: https://github.com/sfirke/janitor,
-    https://sfirke.github.io/janitor/
+Authors@R: c(
+    person("Sam", "Firke", , "samuel.firke@gmail.com", role = c("aut", "cre")),
+    person("Bill", "Denney", , "wdenney@humanpredictions.com", role = "ctb"),
+    person("Chris", "Haid", , "chrishaid@gmail.com", role = "ctb"),
+    person("Ryan", "Knight", , "ryangknight@gmail.com", role = "ctb"),
+    person("Malte", "Grosser", , "malte.grosser@gmail.com", role = "ctb"),
+    person("Jonathan", "Zadra", , "jonathan.zadra@sorensonimpact.com", role = "ctb")
+  )
+Description: The main janitor functions can: perfectly format data.frame
+    column names; provide quick counts of variable combinations (i.e.,
+    frequency tables and crosstabs); and explore duplicate records. Other
+    janitor functions nicely format the tabulation results. These
+    tabulate-and-report functions approximate popular features of SPSS and
+    Microsoft Excel. This package follows the principles of the
+    "tidyverse" and works well with the pipe function %>%. janitor was
+    built with beginning-to-intermediate R users in mind and is optimized
+    for user-friendliness.
+License: MIT + file LICENSE
+URL: https://github.com/sfirke/janitor, https://sfirke.github.io/janitor/
 BugReports: https://github.com/sfirke/janitor/issues
 Depends:
     R (>= 3.1.2)
@@ -28,14 +31,11 @@ Imports:
     magrittr,
     purrr,
     rlang,
+    snakecase (>= 0.9.2),
     stringi,
     stringr,
-    snakecase (>= 0.9.2),
-    tidyselect (>= 1.0.0),
-    tidyr (>= 1.0.0)
-License: MIT + file LICENSE
-LazyData: true
-RoxygenNote: 7.2.3
+    tidyr (>= 1.0.0),
+    tidyselect (>= 1.0.0)
 Suggests:
     dbplyr,
     knitr,
@@ -45,6 +45,10 @@ Suggests:
     testthat (>= 3.0.0),
     tibble,
     tidygraph
-VignetteBuilder: knitr
-Encoding: UTF-8
+VignetteBuilder: 
+    knitr
 Config/testthat/edition: 3
+Encoding: UTF-8
+LazyData: true
+Roxygen: list(markdown = TRUE)
+RoxygenNote: 7.2.3
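
The `Roxygen: list(markdown = TRUE)` field added to DESCRIPTION above is what lets roxygen2 interpret markdown in documentation comments, which in turn allows the `\code{}` to backtick conversions in the R files of this commit. A minimal sketch of the resulting comment style follows; `double_it()` is a hypothetical function used only for illustration and is not part of janitor.

```r
# Hypothetical function (not part of janitor), documented in the markdown
# style this commit adopts: the first line is the title, so no `@title` tag
# is needed, and inline code uses backticks instead of \code{}.

#' Double a numeric vector.
#'
#' Multiplies every element of `x` by 2.
#'
#' @param x a numeric vector.
#' @return a numeric vector the same length as `x`.
#' @export
double_it <- function(x) {
  x * 2
}

double_it(c(1, 2, 3))
```

The roxygen comments only take effect when the documentation is regenerated (the commit message mentions `devtools::document()` for this); the function itself runs the same either way.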
diff --git a/R/adorn_ns.R b/R/adorn_ns.R
@@ -1,18 +1,16 @@
-#' @title Add underlying Ns to a tabyl displaying percentages.
+#' Add underlying Ns to a tabyl displaying percentages.
 #'
-#' @description
-#' This function adds back the underlying Ns to a \code{tabyl} whose percentages were calculated using \code{adorn_percentages()}, to display the Ns and percentages together.  You can also call it on a non-tabyl data.frame to which you wish to append Ns.
+#' This function adds back the underlying Ns to a `tabyl` whose percentages were calculated using `adorn_percentages()`, to display the Ns and percentages together.  You can also call it on a non-tabyl data.frame to which you wish to append Ns.
 #'
-#' @param dat a data.frame of class \code{tabyl} that has had \code{adorn_percentages} and/or \code{adorn_pct_formatting} called on it.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
+#' @param dat a data.frame of class `tabyl` that has had `adorn_percentages` and/or `adorn_pct_formatting` called on it.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
 #' @param position should the N go in the front, or in the rear, of the percentage?
-#' @param ns the Ns to append.  The default is the "core" attribute of the input tabyl \code{dat}, where the original Ns of a two-way \code{tabyl} are stored.  However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
-#' @param format_func a formatting function to run on the Ns.  Consider defining with \code{base::format()}.
-#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all columns are adorned except for the first column and columns not of class \code{numeric}, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
+#' @param ns the Ns to append.  The default is the "core" attribute of the input tabyl `dat`, where the original Ns of a two-way `tabyl` are stored.  However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
+#' @param format_func a formatting function to run on the Ns.  Consider defining with [base::format()].
+#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all columns are adorned except for the first column and columns not of class `numeric`, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
 #'
 #' @return a data.frame with Ns appended
 #' @export
 #' @examples
-#'
 #' mtcars %>%
 #'   tabyl(am, cyl) %>%
 #'   adorn_percentages("col") %>%

diff --git a/R/adorn_pct_formatting.R b/R/adorn_pct_formatting.R
@@ -1,21 +1,34 @@
-#' @title Format a data.frame of decimals as percentages.
+#' Format a `data.frame` of decimals as percentages.
 #'
 #' @description
-#' Numeric columns get multiplied by 100 and formatted as percentages according to user specifications.  This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the \code{...} argument.  Non-numeric columns are always excluded.
+#' Numeric columns get multiplied by 100 and formatted as
+#' percentages according to user specifications. This function defaults to
+#' excluding the first column of the input data.frame, assuming that it contains
+#' a descriptive variable, but this can be overridden by specifying the columns
+#' to adorn in the `...` argument.  Non-numeric columns are always excluded.
 #'
-#' The decimal separator character is the result of \code{getOption("OutDec")}, which is based on the user's locale.  If the default behavior is undesirable,
-#' change this value ahead of calling the function, either by changing locale or with \code{options(OutDec = ",")}.  This aligns the decimal separator character with that used in \code{base::print()}.
+#' The decimal separator character is the result of `getOption("OutDec")`, which
+#' is based on the user's locale. If the default behavior is undesirable,
+#' change this value ahead of calling the function, either by changing locale or
+#' with `options(OutDec = ",")`.  This aligns the decimal separator character
+#' with that used in `base::print()`.
 #'
-#' @param dat a data.frame with decimal values, typically the result of a call to \code{adorn_percentages} on a \code{tabyl}.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
+#' @param dat a data.frame with decimal values, typically the result of a call
+#'   to `adorn_percentages` on a `tabyl`. If given a list of data.frames, this
+#'   function will apply itself to each data.frame in the list (designed for
+#'   3-way `tabyl` lists).
 #' @param digits how many digits should be displayed after the decimal point?
-#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
-#' @param affix_sign should the \% sign be affixed to the end?
-#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
-#'
+#' @param rounding method to use for rounding - either "half to even", the base
+#'   R default method, or "half up", where 14.5 rounds up to 15.
+#' @param affix_sign should the % sign be affixed to the end?
+#' @param ... columns to adorn. This takes a tidyselect specification.  By
+#'   default, all numeric columns (besides the initial column, if numeric) are
+#'   adorned, but this allows you to manually specify which columns should be
+#'   adorned, for use on a data.frame that does not result from a call to
+#'   `tabyl`.
 #' @return a data.frame with formatted percentages
 #' @export
 #' @examples
-#'
 #' mtcars %>%
 #'   tabyl(am, cyl) %>%
 #'   adorn_percentages("col") %>%
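
The `adorn_pct_formatting()` documentation above says the decimal separator follows `getOption("OutDec")`. The sketch below shows that option at work using base R's `format()` only; it does not call janitor, so it illustrates the underlying mechanism rather than the package's own output.

```r
# Base R only; janitor is not loaded here.
old <- options(OutDec = ",")   # switch the decimal separator to a comma
with_comma <- format(12.5)     # formatted using the current OutDec value
options(old)                   # restore whatever was set before
restored <- format(12.5)       # formatted using the restored setting

with_comma
restored
```

Saving the return value of `options()` and restoring it afterwards keeps the change from leaking into the rest of the session.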

diff --git a/R/adorn_percentages.R b/R/adorn_percentages.R
@@ -1,12 +1,11 @@
-#' @title Convert a data.frame of counts to percentages.
+#' Convert a data.frame of counts to percentages.
 #'
-#' @description
-#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the \code{...} argument.
+#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the `...` argument.
 #'
-#' @param dat a \code{tabyl} or other data.frame with a tabyl-like layout.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
+#' @param dat a `tabyl` or other data.frame with a tabyl-like layout.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
 #' @param denominator the direction to use for calculating percentages.  One of "row", "col", or "all".
 #' @param na.rm should missing values (including NaN) be omitted from the calculations?
-#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
+#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
 #'
 #' @return Returns a data.frame of percentages, expressed as numeric values between 0 and 1.
 #' @export

diff --git a/R/adorn_rounding.R b/R/adorn_rounding.R
@@ -1,14 +1,14 @@
-#' @title Round the numeric columns in a data.frame.
+#' Round the numeric columns in a data.frame.
 #'
 #' @description
-#' Can run on any data.frame with at least one numeric column.  This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the \code{...} argument.
+#' Can run on any data.frame with at least one numeric column.  This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the `...` argument.
 #'
-#' If you're formatting percentages, e.g., the result of \code{adorn_percentages()}, use \code{adorn_pct_formatting()} instead.  This is a more flexible variant for ad-hoc usage.  Compared to \code{adorn_pct_formatting()}, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame.   This function retains the class of numeric input columns.
+#' If you're formatting percentages, e.g., the result of `adorn_percentages()`, use `adorn_pct_formatting()` instead.  This is a more flexible variant for ad-hoc usage.  Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame.   This function retains the class of numeric input columns.
 #'
-#' @param dat a \code{tabyl} or other data.frame with similar layout.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
+#' @param dat a `tabyl` or other data.frame with similar layout.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
 #' @param digits how many digits should be displayed after the decimal point?
 #' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
-#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
+#' @param ... columns to adorn.  This takes a tidyselect specification.  By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
 #'
 #' @return Returns the data.frame with rounded numeric columns.
 #' @export
@@ -39,7 +39,7 @@
 #'
 #' cases %>%
 #'   adorn_percentages(, , ends_with("ed")) %>%
-#'   adorn_rounding(, , one_of(c("recovered", "died")))
+#'   adorn_rounding(, , all_of(c("recovered", "died")))
 adorn_rounding <- function(dat, digits = 1, rounding = "half to even", ...) {
   # if input is a list, call purrr::map to recursively apply this function to each data.frame
   if (is.list(dat) && !is.data.frame(dat)) {
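
The `rounding` parameter documented above distinguishes "half to even" (the base R default) from "half up", where 14.5 rounds to 15. The sketch below contrasts the two in base R. The `half_up()` helper is hypothetical and simplified for illustration; it is not janitor's implementation and ignores floating-point edge cases.

```r
# Base R rounds exact halves to the nearest even number ("half to even").
round(14.5)   # 14
round(15.5)   # 16

# Simplified "half up" helper (hypothetical, for illustration only).
half_up <- function(x, digits = 0) {
  scale <- 10^digits
  # Shift by 0.5 and floor the magnitude, then restore the sign.
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

half_up(14.5)  # 15
half_up(15.5)  # 16
```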

diff --git a/R/adorn_title.R b/R/adorn_title.R
@@ -1,13 +1,13 @@
 #' @title Add column name to the top of a two-way tabyl.
 #'
 #' @description
-#' This function adds the column variable name to the top of a \code{tabyl} for a complete display of information.  This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
+#' This function adds the column variable name to the top of a `tabyl` for a complete display of information.  This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
 #'
-#' @param dat a data.frame of class \code{tabyl} or other data.frame with a tabyl-like layout.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
-#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row \code{"top"} or appended to the already-present row name variable (\code{"combined"}).  The formatting in the \code{"top"} option has the look of base R's \code{table()}; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting.  The \code{"combined"} option is more conservative in this regard.
-#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input \code{tabyl} object.  If you wish to override that text, or if your input is not a \code{tabyl}, supply a string here.
-#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input \code{tabyl} object.  If you wish to override that text, or if your input is not a \code{tabyl}, supply a string here.
-#' @return the input tabyl, augmented with the column title.  Non-tabyl inputs that are of class \code{tbl_df} are downgraded to basic data.frames so that the title row prints correctly.
+#' @param dat a data.frame of class `tabyl` or other data.frame with a tabyl-like layout.  If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
+#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row `"top"` or appended to the already-present row name variable (`"combined"`).  The formatting in the `"top"` option has the look of base R's `table()`; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting.  The `"combined"` option is more conservative in this regard.
+#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input `tabyl` object.  If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
+#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input `tabyl` object.  If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
+#' @return the input tabyl, augmented with the column title.  Non-tabyl inputs that are of class `tbl_df` are downgraded to basic data.frames so that the title row prints correctly.
 #'
 #' @export
 #' @examples