Skip to content
Permalink
master
Go to file
 
 
Cannot retrieve contributors at this time
175 lines (120 sloc) 7.1 KB
---
title: "Function Overview"
author: "Lukas Burk"
date: "`r format(Sys.time(), '%F', tz = 'UTC', usetz = T)`"
output:
rmarkdown::html_vignette:
toc: yes
vignette: >
%\VignetteIndexEntry{Function Overview}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include=FALSE}
library(tadaatoolbox)
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
```
This toolbox has been compiled to make the intro to R and statistics with R a little easier.
Besides that, it also contains some neat helper functions for tasks or problems one might run in frequently in our field.
A neat overview of functions and stuff can be found [on tadaatoolbox.tadaa-data.de](http://tadaatoolbox.tadaa-data.de/)
# High Level Functions
There are several high level functions aimed at quick output generation:
```{r highlevel_intro, results='asis', echo=F}
cat(paste0("- **", apropos("tadaa_"), "**", collapse = "\n"))
```
The point of these function is to combine routine steps into one function, so let's showcase them.
## t-Tests: `tadaa_t.test`
The effect size $d$ is calculated with pooled/weighted variances to ensure accuracy. It then formats the output according to `broom::tidy`, sprinkles it with `pixiedust` and prints either to console, markdown or whatever printing method is passed via the `print` argument to `pixiedust::sprinkle_print_method`.
```{r t-test}
tadaa_t.test(ngo, stunzahl, geschl, print = "markdown")
```
Or, if you're having a non-parametric day, try this:
```{r wilcox}
tadaa_wilcoxon(ngo, stunzahl, geschl, print = "markdown")
```
## ANOVA: `tadaa_aov`
The function knows 3 types of sums of squares, adjustable via the `type` argument and defaulting to type 3 sums of squares. Additionally for type 3, the function also automatically checks the contrasts associated with the factor variables (only for unordered factors), and if `check_names = TRUE`, the contrasts are set to `contr.sum`.
```{r anova}
tadaa_aov(stunzahl ~ jahrgang * geschl, data = ngo, print = "markdown")
```
Or, if you're still unsure about the parametricity of your day:
```{r kruskal}
tadaa_kruskal(stunzahl ~ jahrgang, data = ngo, print = "markdown")
```
## Pairwise tests
### Pairwise t-tests
Since we found `stats::pairwise.t.test` insufficient in cases of two-way ANOVAs, we wrapped the function to also enable the testing of interactions. The adjusted p-values are only adjusted within each term, so it is like performing `stats::pairwise.t.test` 3 times with each factor and additionally the interaction of the two factors (which is what the function does internally).
As a bonus, this functions knows the two additional p-adjustment methods `sidak` and `sidakSD` for the Sidak adjustement and the Sidak-like step-down procedure respectively.
```{r pairw_t}
tadaa_pairwise_t(ngo, response = deutsch, group1 = jahrgang, p.adjust = "bonf", print = "markdown")
tadaa_pairwise_t(ngo, response = deutsch, group1 = jahrgang, group2 = geschl, p.adjust = "sidakSD", print = "markdown")
```
# Statistics Wrappers
These are pretty self-explanatory. The goal is to provide simple functions for commonly used statistics that look and behave the same, and also only return a single numerical value to play nice with `dplyr::summarize`.
* `modus`: A simple function to extract the mode of a frequency table.
* `nom_chisqu`: Simple wrapper for `chisq.test` that produces a single value.
* `nom_phi`: Simple wrapper for `vcd::assocstats` to extract phi.
* `nom_v`: Simple wrapper for `vcd::assocstats` to extract Cramer's V.
* `nom_c`: Simple wrapper for `vcd::assocstats` to extract the contingency coefficient c.
* `nom_lambda`: Simple wrapper for `ryouready::nom.lambda` to extract appropriate lambda.
* `ord_gamma`: Simple wrapper for `ryouready::ord.gamma`.
* `ord_somers_d`: Simple wrapper for `ryouready::ord.somers.d`.
## Summaries
* `tadaa_nom`: All the nominal stats in one table.
* `tadaa_ord`: All the ordinal stats in one table.
# General Helper Functions
## Intervals and recoding
* `generate_recodes`: To produce recode assignments for `car::recode` for evenly sequenced clusters.
* `interval_labels`: To produce labels for clusters created by `cut`.
* `delete_na`: Customizable way to drop `NA` observations from a dataset.
* `pval_string`: Shamelessly adapted from `pixiedust::pvalString`, this will format a p-value as a character string in common `p < 0.001` notation and so on. The difference from the `pixiedust` version is that this function will also print `p < 0.05`.
## Plotting helpers
* `mean_ci_t`: Returns a `data.frame` with `y` (`mean`), `ymin` and `ymax` for the CI bounds.
* `confint_t`: For the underlying function to get the CI width. Returns a single value.
* `confint_norm`: Similar, but baes on normal distribution. Returns a single value.
* `mean_ci_sem`: Standard error and CI, you guessed it, in one table.
```{r confint}
library(ggplot2)
ggplot(data = ngo, aes(x = jahrgang, y = deutsch)) +
stat_summary(fun.data = "mean_ci_t", geom = "errorbar") +
theme_tadaa()
```
As a convenience, we added `tadaa_mean_ci` to quickly plot means with errorbars to get a quick glance at your data.
```{r mean_ci_plot}
tadaa_mean_ci(data = ngo, response = deutsch, group = jahrgang) +
theme_tadaa()
```
### Prebuilt plotting functions
* `tadaa_int`: Simple interaction plot template.
```{r tadaa_int, fig.width=6}
library(ggplot2)
if (!("cowplot" %in% installed.packages())) install.packages("cowplot")
tadaa_int(data = ngo, response = stunzahl, group1 = jahrgang, group2 = geschl, grid = T, brewer_palette = "Set1")
```
# Data
The infamous *ngo* dataset is included for teaching purposes as well. It differs from [ryouready](https://github.com/markheckmann/ryouready)'s provided version with regards to classes and labels. The code below was used to generate the provided version of the dataset:
(Note that `\u00e4` is a unicode encoded Umlaut for compatibility reasons)
```{r data_ngo, eval=FALSE}
ngo <- ryouready::d.ngo
## sjPlot value labels
ngo$geschl <- sjmisc::set_labels(ngo$geschl, c("M\u00e4nnlich", "Weiblich"))
ngo$abschalt <- sjmisc::set_labels(ngo$abschalt, c("Ja", "Nein"))
ngo$jahrgang <- sjmisc::set_labels(ngo$jahrgang, c("11", "12", "13"))
ngo$hausauf <- sjmisc::set_labels(ngo$hausauf, c("gar nicht", "weniger als halbe Stunde",
"halbe Stunde bis Stunde", "1 bis 2 Stunden",
"2 bis 3 Stunden", "3 bis 4 Stunden",
"mehr als 4 Stunden"))
## factors
ngo$geschl <- factor(ngo$geschl, labels = c("M\u00e4nnlich", "Weiblich"))
ngo$jahrgang <- factor(ngo$jahrgang, labels = c("11", "12", "13"), ordered = TRUE)
ngo$hausauf <- car::recode(ngo$hausauf, "0 = NA")
ngo$abschalt <- car::recode(ngo$abschalt, "0 = NA")
ngo$abschalt <- factor(ngo$abschalt, labels = c("Ja", "Nein"))
## Variable labels
ngo$geschl <- sjmisc::set_label(ngo$geschl, "Geschlecht")
ngo$abschalt <- sjmisc::set_label(ngo$abschalt, "Abschalten")
ngo$jahrgang <- sjmisc::set_label(ngo$jahrgang, "Jahrgang")
ngo$hausauf <- sjmisc::set_label(ngo$hausauf, "Hausaufgaben")
## Saving
ngo <- dplyr::tbl_df(ngo)
```
You can’t perform that action at this time.