statart is a set of functions that combines Stata-inspired features with tidyverse-style operations.
You can install the development version of statart like so:
devtools::install_github("socimh/statart")
I highly recommend using tidyverse with statart for better performance.
library(statart)
library(tidyverse)
View the codebook of lifeexp.
codebook(lifeexp)
#> # A tibble: 6 × 5
#> variable label type n unique
#> <chr> <chr> <chr> <int> <int>
#> 1 region Region double+label 68 3
#> 2 country Country character 68 68
#> 3 popgrowth Avg. annual % growth double 68 30
#> 4 lexp Life expectancy at birth double 68 18
#> 5 gnppc GNP per capita double 63 62
#> 6 safewater Safe water double 40 29
Summarise the numeric variables in lifeexp.
summ(lifeexp)
#> Warning: country is non-numeric and thus removed.
#> Warning: region is a labelled variable (*).
#> # A tibble: 5 × 8
#> name type n unique min mean sd max
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 *region dbl+lbl 68 3 1 1.5 0.743 3
#> 2 popgrowth dbl 68 30 -0.5 0.972 0.931 3
#> 3 lexp dbl 68 18 54 72.3 4.72 79
#> 4 gnppc dbl 63 62 370 8675. 10635. 39980
#> 5 safewater dbl 40 29 28 76.1 17.9 100
Note that region is a labelled (categorical) variable, so its mean and standard deviation may be meaningless.
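For readers who want to see what such a summary involves, here is a minimal dplyr/tidyr sketch that computes the same kind of columns for the numeric variables of starwars. It is an illustration only, not statart's implementation; the object name num_summary is hypothetical, and it assumes n and unique count non-missing values.

```r
library(dplyr)
library(tidyr)

# Illustration only: approximate a summ()-like table with dplyr/tidyr.
num_summary <- starwars %>%
  select(where(is.numeric)) %>%      # keep numeric columns only
  pivot_longer(everything()) %>%     # one row per (variable, value) pair
  group_by(name) %>%
  summarise(
    n      = sum(!is.na(value)),                # assumed: non-missing count
    unique = n_distinct(value, na.rm = TRUE),   # assumed: distinct non-missing values
    min    = min(value, na.rm = TRUE),
    mean   = mean(value, na.rm = TRUE),
    sd     = sd(value, na.rm = TRUE),
    max    = max(value, na.rm = TRUE)
  )

num_summary
```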
Tabulate a single variable:
tab(starwars, sex)
#> # A tibble: 5 × 6
#> sex n percent cum valid valid_cum
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 female 16 18.4 18.4 19.3 19.3
#> 2 hermaphroditic 1 1.15 19.5 1.20 20.5
#> 3 male 60 69.0 88.5 72.3 92.8
#> 4 none 6 6.90 95.4 7.23 100
#> 5 <NA> 4 4.60 100 NA NA
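The columns in this table can be read as follows: percent and cum use all rows, while valid and valid_cum exclude missing values. A rough dplyr sketch of that arithmetic, assuming this is how the columns are defined (it is not statart's own code, and sex_counts is a hypothetical name):

```r
library(dplyr)

# Illustration only: rebuild the tab()-style columns by hand.
sex_counts <- starwars %>%
  count(sex) %>%
  mutate(
    percent   = n / sum(n) * 100,    # share of all rows, NA included
    cum       = cumsum(percent),     # running total of percent
    # share among non-missing rows; NA for the missing category
    valid     = if_else(is.na(sex), NA_real_, n / sum(n[!is.na(sex)]) * 100),
    valid_cum = if_else(is.na(sex), NA_real_, cumsum(coalesce(valid, 0)))
  )

sex_counts
```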
tab1() tabulates variables one by one and returns the results as a list, and s_match() selects variables in a Stata style.
tab1(starwars, s_match("*color"))
#> $hair_color
#> # A tibble: 13 × 6
#> value n percent cum valid valid_cum
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 auburn 1 1.15 1.15 1.22 1.22
#> 2 auburn, grey 1 1.15 2.30 1.22 2.44
#> 3 auburn, white 1 1.15 3.45 1.22 3.66
#> 4 black 13 14.9 18.4 15.9 19.5
#> 5 blond 3 3.45 21.8 3.66 23.2
#> 6 blonde 1 1.15 23.0 1.22 24.4
#> 7 brown 18 20.7 43.7 22.0 46.3
#> 8 brown, grey 1 1.15 44.8 1.22 47.6
#> 9 grey 1 1.15 46.0 1.22 48.8
#> 10 none 37 42.5 88.5 45.1 93.9
#> 11 unknown 1 1.15 89.7 1.22 95.1
#> 12 white 4 4.60 94.3 4.88 100
#> 13 <NA> 5 5.75 100 NA NA
#>
#> $skin_color
#> # A tibble: 31 × 6
#> value n percent cum valid valid_cum
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 blue 2 2.30 2.30 2.30 2.30
#> 2 blue, grey 2 2.30 4.60 2.30 4.60
#> 3 brown 4 4.60 9.20 4.60 9.20
#> 4 brown mottle 1 1.15 10.3 1.15 10.3
#> 5 brown, white 1 1.15 11.5 1.15 11.5
#> 6 dark 6 6.90 18.4 6.90 18.4
#> 7 fair 17 19.5 37.9 19.5 37.9
#> 8 fair, green, yellow 1 1.15 39.1 1.15 39.1
#> 9 gold 1 1.15 40.2 1.15 40.2
#> 10 green 6 6.90 47.1 6.90 47.1
#> # ℹ 21 more rows
#>
#> $eye_color
#> # A tibble: 15 × 6
#> value n percent cum valid valid_cum
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 black 10 11.5 11.5 11.5 11.5
#> 2 blue 19 21.8 33.3 21.8 33.3
#> 3 blue-gray 1 1.15 34.5 1.15 34.5
#> 4 brown 21 24.1 58.6 24.1 58.6
#> 5 dark 1 1.15 59.8 1.15 59.8
#> 6 gold 1 1.15 60.9 1.15 60.9
#> 7 green, yellow 1 1.15 62.1 1.15 62.1
#> 8 hazel 3 3.45 65.5 3.45 65.5
#> 9 orange 8 9.20 74.7 9.20 74.7
#> 10 pink 1 1.15 75.9 1.15 75.9
#> 11 red 5 5.75 81.6 5.75 81.6
#> 12 red, blue 1 1.15 82.8 1.15 82.8
#> 13 unknown 3 3.45 86.2 3.45 86.2
#> 14 white 1 1.15 87.4 1.15 87.4
#> 15 yellow 11 12.6 100 12.6 100
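For comparison, the same three columns can be selected with plain tidyselect. The sketch below assumes that the pattern "*color" behaves like an ordinary wildcard (glob); it converts the glob to a regular expression with utils::glob2rx() and passes it to matches(). The names pattern and color_vars are hypothetical.

```r
library(dplyr)

# Assumption: "*color" is treated as a standard glob pattern.
pattern <- utils::glob2rx("*color")   # glob -> regular expression

color_vars <- starwars %>%
  select(matches(pattern)) %>%        # tidyselect helper taking a regex
  names()

color_vars
```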
tab2() cross-tabulates two variables.
tab2(starwars, sex, gender)
#> # A tibble: 5 × 4
#> `sex \\ gender` feminine masculine `NA`
#> <chr> <int> <int> <int>
#> 1 female 16 0 0
#> 2 hermaphroditic 0 1 0
#> 3 male 0 60 0
#> 4 none 1 5 0
#> 5 <NA> 0 0 4
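A similar cross-tabulation can be built by hand with dplyr and tidyr, which may help when reading the table above. This is a sketch of the general technique (count, then pivot wider), not statart's implementation; cross is a hypothetical name.

```r
library(dplyr)
library(tidyr)

# Illustration only: count each sex/gender pair, then spread gender into columns.
cross <- starwars %>%
  count(sex, gender) %>%
  pivot_wider(
    names_from  = gender,
    values_from = n,
    values_fill = 0L    # pairs that never occur get a zero count
  )

cross
```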
The fre() family of functions simply adds total rows (and, in fre2(), total columns) to their tab() counterparts.
fre(starwars, sex)
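The idea of appending a total row can be sketched with dplyr as shown here. The output of fre() is not reproduced in this document, so the row label "total" and the layout are assumptions made for illustration; counts and with_total are hypothetical names.

```r
library(dplyr)

# Illustration only: a frequency table with a total row appended.
counts <- starwars %>% count(sex)

with_total <- bind_rows(
  counts,
  tibble(sex = "total", n = sum(counts$n))   # "total" label is an assumption
)

with_total
```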
Please note that the statart project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.