README.Rmd

---
output: github_document
link-citations: TRUE
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "75%",
  fig.width = 6,
  fig.asp = 7/7.5
)

# Determine ggplot2 theme for the session:
ggplot2::theme_set(ggplot2::theme_minimal())
```

<!-- badges: start -->

[![R-CMD-check](https://github.com/lhdjung/scrutiny/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/lhdjung/scrutiny/actions/workflows/R-CMD-check.yaml) [![Codecov test coverage](https://codecov.io/gh/lhdjung/scrutiny/branch/main/graph/badge.svg)](https://app.codecov.io/gh/lhdjung/scrutiny?branch=main)

<!-- badges: end -->

# scrutiny: Error detection in science

The goal of scrutiny is to test published summary statistics for consistency using techniques like GRIM and to check their plausibility. The package makes these methods easy to use in a tidyverse-friendly way. It hopes to help the new field of error detection go mainstream.

Besides ready-made tests, scrutiny features a complete system for implementing new consistency tests. It also has duplication analysis, more general infrastructure for implementing error detection techniques, as well as specialized data wrangling functions. See the *Articles* tab for vignettes.

scrutiny is a work in progress. You are welcome to contribute with pull requests. However, please [open an issue](https://github.com/lhdjung/scrutiny/issues) first.

Install the package from CRAN:

```{r warning=FALSE, message=FALSE, eval=FALSE}
install.packages("scrutiny")
```

Alternatively, install the development version from GitHub:

```{r, warning=FALSE, message=FALSE, eval=FALSE}
remotes::install_github("lhdjung/scrutiny")
```

## Get started

Here is how to GRIM-test all values in a data frame. When using `grim_map()`, the `consistency` column tells you if the means (`x`) and sample sizes (`n`) are mutually consistent.

```{r example}
library(scrutiny)

# Example data:
pigs1

# GRIM-testing for data frames:
grim_map(pigs1)
```

Test percentages instead of means:

```{r}
pigs2

grim_map(pigs2, percent = TRUE)
```

You can choose how the means are reconstructed for testing --- below, rounded up from 5. When visualizing results, the plot will adjust automatically. Blue dots are consistent values, red dots are inconsistent ones:

```{r}
pigs1 %>% 
  grim_map(rounding = "up") %>% 
  grim_plot()
```

Similarly, use DEBIT to test means and standard deviations of binary data:

```{r}
pigs3

pigs3 %>% 
  debit_map()

pigs3 %>% 
  debit_map() %>% 
  debit_plot()
```

## Guiding ideas

> (...) a critical inspection of the published literature should not be mischaracterized as a hobby for the overly cynical, nor as so-called "methodological terrorism". On the contrary, carefully evaluating presented data is a cornerstone of scientific investigation, and it is only logical to apply this also to the published literature. If we are not willing to critically assess published studies, we also cannot guarantee their veracity.

--- van der Zee et al. (2017, pp. 8-9)

> (...) [data thugs](https://jamesheathers.medium.com/hugs-shrugs-and-data-thugs-663858757c4a) (...) demand data and if they do not receive it, they contact editors and universities and threaten to write blogs and tweets about the errors uncovered.

--- Eric A. Stewart (six retractions; quoted in Pickett 2020, p. 178)

# References

Pickett, J. T. (2020). The Stewart Retractions: A Quantitative and Qualitative Analysis. *Econ Journal Watch*, *17*(1), 152--190. <https://econjwatch.org/articles/the-stewart-retractions-a-quantitative-and-qualitative-analysis>.

van der Zee, T., Anaya, J., & Brown, N. J. L. (2017). Statistical heartburn: An attempt to digest four pizza publications from the Cornell Food and Brand Lab. *BMC Nutrition*, *3*(1), 54. <https://doi.org/10.1186/s40795-017-0167-x>.