Documentation with markdown and remove mentions to master branch (#551)
* Remove the docs/ directory (already deployed via GitHub pages) last updated 3 years ago.

* `usethis::use_roxygen_md()`

* `roxygen2md::roxygen2md()` and `devtools::document()`

* `usethis::use_tidy_description()`

* master -> main

* Removal of `@title`, and cosmetic changes

* `devtools::document()`

* Revert

* Redocument

* Fix warnings (no need for `\%`; `%` is fine).

* styler::style_pkg()

* Add `@keywords internal`, so that they don't show up in the function index.

* However, the .Rd file is still created, so that people using `?convert_to_NA` will still see the documentation file.

* Address comments and redocument.

* Revert moving to `@details`

* WS

* Update DESCRIPTION

Missed it when solving conflicts

* Address comments

* Minor additional cleanup

* WS

* `use_tidy_description()`
olivroy committed Aug 13, 2023
1 parent 7375941 commit b2700cf
Showing 68 changed files with 526 additions and 430 deletions.
2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -23,7 +23,7 @@ If your proposed contribution addresses multiple issues, it should ideally be br
* Make sure to track progress upstream (i.e., on our version of `janitor` at `sfirke/janitor`) by doing `git remote add upstream https://github.com/sfirke/janitor.git`. Before making changes make sure to pull changes in from upstream by doing either `git fetch upstream` then merge later or `git pull upstream` to fetch and merge in one step
* Make your changes (bonus points for making changes on a new feature branch)
* Push up to your account
* Submit a pull request to the master branch at `sfirke/janitor`
* Submit a pull request to the main branch at `sfirke/janitor`

### Prefer to discuss over email?
Email Sam. His email address is in the `DESCRIPTION` file of this repo.
52 changes: 28 additions & 24 deletions DESCRIPTION
@@ -1,22 +1,25 @@
Package: janitor
Title: Simple Tools for Examining and Cleaning Dirty Data
Version: 2.2.0.9000
Authors@R: c(person("Sam", "Firke", email = "samuel.firke@gmail.com", role = c("aut", "cre")),
person("Bill", "Denney", email = "wdenney@humanpredictions.com", role = "ctb"),
person("Chris", "Haid", email = "chrishaid@gmail.com", role = "ctb"),
person("Ryan", "Knight", email = "ryangknight@gmail.com", role = "ctb"),
person("Malte", "Grosser", email = "malte.grosser@gmail.com", role = "ctb"),
person("Jonathan", "Zadra", email = "jonathan.zadra@sorensonimpact.com", role = "ctb"))
Description: The main janitor functions can: perfectly format data.frame column
names; provide quick counts of variable combinations (i.e., frequency
tables and crosstabs); and explore duplicate records. Other janitor functions
nicely format the tabulation results. These tabulate-and-report functions
approximate popular features of SPSS and Microsoft Excel. This package
follows the principles of the "tidyverse" and works well with the pipe function
%>%. janitor was built with beginning-to-intermediate R users in mind and is
optimized for user-friendliness.
URL: https://github.com/sfirke/janitor,
https://sfirke.github.io/janitor/
Authors@R: c(
person("Sam", "Firke", , "samuel.firke@gmail.com", role = c("aut", "cre")),
person("Bill", "Denney", , "wdenney@humanpredictions.com", role = "ctb"),
person("Chris", "Haid", , "chrishaid@gmail.com", role = "ctb"),
person("Ryan", "Knight", , "ryangknight@gmail.com", role = "ctb"),
person("Malte", "Grosser", , "malte.grosser@gmail.com", role = "ctb"),
person("Jonathan", "Zadra", , "jonathan.zadra@sorensonimpact.com", role = "ctb")
)
Description: The main janitor functions can: perfectly format data.frame
column names; provide quick counts of variable combinations (i.e.,
frequency tables and crosstabs); and explore duplicate records. Other
janitor functions nicely format the tabulation results. These
tabulate-and-report functions approximate popular features of SPSS and
Microsoft Excel. This package follows the principles of the
"tidyverse" and works well with the pipe function %>%. janitor was
built with beginning-to-intermediate R users in mind and is optimized
for user-friendliness.
License: MIT + file LICENSE
URL: https://github.com/sfirke/janitor, https://sfirke.github.io/janitor/
BugReports: https://github.com/sfirke/janitor/issues
Depends:
R (>= 3.1.2)
@@ -28,14 +31,11 @@ Imports:
magrittr,
purrr,
rlang,
snakecase (>= 0.9.2),
stringi,
stringr,
snakecase (>= 0.9.2),
tidyselect (>= 1.0.0),
tidyr (>= 1.0.0)
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 7.2.3
tidyr (>= 1.0.0),
tidyselect (>= 1.0.0)
Suggests:
dbplyr,
knitr,
@@ -45,6 +45,10 @@ Suggests:
testthat (>= 3.0.0),
tibble,
tidygraph
VignetteBuilder: knitr
Encoding: UTF-8
VignetteBuilder:
knitr
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
14 changes: 6 additions & 8 deletions R/adorn_ns.R
@@ -1,18 +1,16 @@
#' @title Add underlying Ns to a tabyl displaying percentages.
#' Add underlying Ns to a tabyl displaying percentages.
#'
#' @description
#' This function adds back the underlying Ns to a \code{tabyl} whose percentages were calculated using \code{adorn_percentages()}, to display the Ns and percentages together. You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#' This function adds back the underlying Ns to a `tabyl` whose percentages were calculated using `adorn_percentages()`, to display the Ns and percentages together. You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#'
#' @param dat a data.frame of class \code{tabyl} that has had \code{adorn_percentages} and/or \code{adorn_pct_formatting} called on it. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a data.frame of class `tabyl` that has had `adorn_percentages` and/or `adorn_pct_formatting` called on it. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param position should the N go in the front, or in the rear, of the percentage?
#' @param ns the Ns to append. The default is the "core" attribute of the input tabyl \code{dat}, where the original Ns of a two-way \code{tabyl} are stored. However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
#' @param format_func a formatting function to run on the Ns. Consider defining with \code{base::format()}.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all columns are adorned except for the first column and columns not of class \code{numeric}, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#' @param ns the Ns to append. The default is the "core" attribute of the input tabyl `dat`, where the original Ns of a two-way `tabyl` are stored. However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
#' @param format_func a formatting function to run on the Ns. Consider defining with [base::format()].
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all columns are adorned except for the first column and columns not of class `numeric`, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return a data.frame with Ns appended
#' @export
#' @examples
#'
#' mtcars %>%
#' tabyl(am, cyl) %>%
#' adorn_percentages("col") %>%
33 changes: 23 additions & 10 deletions R/adorn_pct_formatting.R
@@ -1,21 +1,34 @@
#' @title Format a data.frame of decimals as percentages.
#' Format a `data.frame` of decimals as percentages.
#'
#' @description
#' Numeric columns get multiplied by 100 and formatted as percentages according to user specifications. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the \code{...} argument. Non-numeric columns are always excluded.
#' Numeric columns get multiplied by 100 and formatted as
#' percentages according to user specifications. This function defaults to
#' excluding the first column of the input data.frame, assuming that it contains
#' a descriptive variable, but this can be overridden by specifying the columns
#' to adorn in the `...` argument. Non-numeric columns are always excluded.
#'
#' The decimal separator character is the result of \code{getOption("OutDec")}, which is based on the user's locale. If the default behavior is undesirable,
#' change this value ahead of calling the function, either by changing locale or with \code{options(OutDec = ",")}. This aligns the decimal separator character with that used in \code{base::print()}.
#' The decimal separator character is the result of `getOption("OutDec")`, which
#' is based on the user's locale. If the default behavior is undesirable,
#' change this value ahead of calling the function, either by changing locale or
#' with `options(OutDec = ",")`. This aligns the decimal separator character
#' with that used in `base::print()`.
#'
#' @param dat a data.frame with decimal values, typically the result of a call to \code{adorn_percentages} on a \code{tabyl}. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a data.frame with decimal values, typically the result of a call
#' to `adorn_percentages` on a `tabyl`. If given a list of data.frames, this
#' function will apply itself to each data.frame in the list (designed for
#' 3-way `tabyl` lists).
#' @param digits how many digits should be displayed after the decimal point?
#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
#' @param affix_sign should the \% sign be affixed to the end?
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#'
#' @param rounding method to use for rounding - either "half to even", the base
#' R default method, or "half up", where 14.5 rounds up to 15.
#' @param affix_sign should the % sign be affixed to the end?
#' @param ... columns to adorn. This takes a tidyselect specification. By
#' default, all numeric columns (besides the initial column, if numeric) are
#' adorned, but this allows you to manually specify which columns should be
#' adorned, for use on a data.frame that does not result from a call to
#' `tabyl`.
#' @return a data.frame with formatted percentages
#' @export
#' @examples
#'
#' mtcars %>%
#' tabyl(am, cyl) %>%
#' adorn_percentages("col") %>%
9 changes: 4 additions & 5 deletions R/adorn_percentages.R
@@ -1,12 +1,11 @@
#' @title Convert a data.frame of counts to percentages.
#' Convert a data.frame of counts to percentages.
#'
#' @description
#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the \code{...} argument.
#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the `...` argument.
#'
#' @param dat a \code{tabyl} or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param denominator the direction to use for calculating percentages. One of "row", "col", or "all".
#' @param na.rm should missing values (including NaN) be omitted from the calculations?
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return Returns a data.frame of percentages, expressed as numeric values between 0 and 1.
#' @export
12 changes: 6 additions & 6 deletions R/adorn_rounding.R
@@ -1,14 +1,14 @@
#' @title Round the numeric columns in a data.frame.
#' Round the numeric columns in a data.frame.
#'
#' @description
#' Can run on any data.frame with at least one numeric column. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the \code{...} argument.
#' Can run on any data.frame with at least one numeric column. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the `...` argument.
#'
#' If you're formatting percentages, e.g., the result of \code{adorn_percentages()}, use \code{adorn_pct_formatting()} instead. This is a more flexible variant for ad-hoc usage. Compared to \code{adorn_pct_formatting()}, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame. This function retains the class of numeric input columns.
#' If you're formatting percentages, e.g., the result of `adorn_percentages()`, use `adorn_pct_formatting()` instead. This is a more flexible variant for ad-hoc usage. Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame. This function retains the class of numeric input columns.
#'
#' @param dat a \code{tabyl} or other data.frame with similar layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a `tabyl` or other data.frame with similar layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param digits how many digits should be displayed after the decimal point?
#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return Returns the data.frame with rounded numeric columns.
#' @export
@@ -39,7 +39,7 @@
#'
#' cases %>%
#' adorn_percentages(, , ends_with("ed")) %>%
#' adorn_rounding(, , one_of(c("recovered", "died")))
#' adorn_rounding(, , all_of(c("recovered", "died")))
adorn_rounding <- function(dat, digits = 1, rounding = "half to even", ...) {
# if input is a list, call purrr::map to recursively apply this function to each data.frame
if (is.list(dat) && !is.data.frame(dat)) {
12 changes: 6 additions & 6 deletions R/adorn_title.R
@@ -1,13 +1,13 @@
#' @title Add column name to the top of a two-way tabyl.
#'
#' @description
#' This function adds the column variable name to the top of a \code{tabyl} for a complete display of information. This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
#' This function adds the column variable name to the top of a `tabyl` for a complete display of information. This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
#'
#' @param dat a data.frame of class \code{tabyl} or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row \code{"top"} or appended to the already-present row name variable (\code{"combined"}). The formatting in the \code{"top"} option has the look of base R's \code{table()}; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting. The \code{"combined"} option is more conservative in this regard.
#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input \code{tabyl} object. If you wish to override that text, or if your input is not a \code{tabyl}, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input \code{tabyl} object. If you wish to override that text, or if your input is not a \code{tabyl}, supply a string here.
#' @return the input tabyl, augmented with the column title. Non-tabyl inputs that are of class \code{tbl_df} are downgraded to basic data.frames so that the title row prints correctly.
#' @param dat a data.frame of class `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row `"top"` or appended to the already-present row name variable (`"combined"`). The formatting in the `"top"` option has the look of base R's `table()`; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting. The `"combined"` option is more conservative in this regard.
#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @return the input tabyl, augmented with the column title. Non-tabyl inputs that are of class `tbl_df` are downgraded to basic data.frames so that the title row prints correctly.
#'
#' @export
#' @examples