---
title: "Categorical Data"
author: "Aravind Hebbali"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Categorical Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
```{r, echo=FALSE, message=FALSE}
library(descriptr)
library(dplyr)
```
## Introduction
This vignette introduces the descriptr functions for exploring and
visualizing categorical data.

## Data
The examples use `mtcarz`, a modified copy of the `mtcars` data set. The two
data sets differ only in their variable types.
```{r egdata}
str(mtcarz)
```
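For readers who want to build a similar data set themselves, the sketch below converts a few `mtcars` columns to factors using base R. The column list (`cyl`, `vs`, `am`, `gear`, `carb`) and the name `mtcars_fct` are assumptions made for illustration; compare them with the `str(mtcarz)` output above before relying on them.

```{r egdata_sketch}
# Assumed list of categorical columns; verify against str(mtcarz)
cat_cols <- c("cyl", "vs", "am", "gear", "carb")

# Copy mtcars and convert the assumed columns to factors
mtcars_fct <- mtcars
mtcars_fct[cat_cols] <- lapply(mtcars_fct[cat_cols], as.factor)

# List the columns that are now factors
names(Filter(is.factor, mtcars_fct))
```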
## Cross Tabulation
The `ds_cross_table()` function creates two-way tables of categorical variables.
```{r cross}
ds_cross_table(mtcarz, cyl, gear)
```
If you want the above result as a tibble, use `ds_twoway_table()`.
```{r cross_tibble}
ds_twoway_table(mtcarz, cyl, gear)
```
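To make explicit what a two-way table contains, here is a rough base R sketch on the original `mtcars` data. It is not how descriptr computes its output; it only shows the counts and proportions that a cross tabulation of `cyl` and `gear` typically reports. The object name `counts` is an illustrative choice.

```{r cross_base}
# Counts of cars for each cyl (rows) by gear (columns) combination
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)
counts

# Share of each cell in the overall total
round(prop.table(counts), 3)

# Row-wise and column-wise proportions
round(prop.table(counts, margin = 1), 3)
round(prop.table(counts, margin = 2), 3)

# Counts with row and column totals added
addmargins(counts)
```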
The object returned by `ds_cross_table()` has a `plot()` method that can
generate the following bar plots.

### Grouped Bar Plots
```{r cross_group, fig.width=7, fig.height=7, fig.align='center'}
k <- ds_cross_table(mtcarz, cyl, gear)
plot(k)
```
### Stacked Bar Plots
```{r cross_stack, fig.width=7, fig.height=7, fig.align='center'}
k <- ds_cross_table(mtcarz, cyl, gear)
plot(k, stacked = TRUE)
```
### Proportional Bar Plots
```{r cross_prop, fig.width=7, fig.height=7, fig.align='center'}
k <- ds_cross_table(mtcarz, cyl, gear)
plot(k, proportional = TRUE)
```
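The three plot types differ only in how the same counts are arranged. The base R sketch below illustrates the distinction with `barplot()`. It is an approximation for comparison, not the descriptr plotting code, and placing `gear` on the x-axis is an assumption.

```{r cross_plots_base, fig.width=7, fig.height=7, fig.align='center'}
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# Grouped: one bar per cyl level, placed side by side within each gear
barplot(counts, beside = TRUE, legend.text = TRUE, xlab = "gear")

# Stacked: cyl counts stacked into a single bar per gear
barplot(counts, legend.text = TRUE, xlab = "gear")

# Proportional: each gear bar rescaled so that its segments sum to 1
barplot(prop.table(counts, margin = 2), legend.text = TRUE, xlab = "gear")
```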
## Frequency Table
The `ds_freq_table()` function creates frequency tables.
```{r ftable}
ds_freq_table(mtcarz, cyl)
```
The object returned by `ds_freq_table()` has a `plot()` method that creates a bar plot.
```{r ftable_bar, fig.width=7, fig.height=7, fig.align='center'}
k <- ds_freq_table(mtcarz, cyl)
plot(k)
```
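As a point of comparison, a frequency table can be assembled by hand in base R. The sketch below uses the original `mtcars` data. The column names and the choice of statistics (cumulative frequency, percent, cumulative percent) are assumptions for illustration and may not match what `ds_freq_table()` prints.

```{r ftable_base}
# Frequency of each level of cyl
freq <- table(mtcars$cyl)

# Assemble frequency, cumulative frequency and percentages
freq_tbl <- data.frame(
  level       = names(freq),
  frequency   = as.vector(freq),
  cum_freq    = cumsum(as.vector(freq)),
  percent     = round(100 * as.vector(freq) / sum(freq), 2),
  cum_percent = round(100 * cumsum(as.vector(freq)) / sum(freq), 2)
)
freq_tbl
```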
## Multiple One Way Tables
The `ds_auto_freq_table()` function creates multiple one-way tables: one
frequency table for each categorical variable in a data set. You can also
specify a subset of variables if you do not want to use every variable in the
data set.
```{r oway}
ds_auto_freq_table(mtcarz)
```
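The idea of one frequency table per categorical variable can be sketched in base R by selecting the factor columns and tabulating each one. This illustrates the concept rather than the descriptr implementation; the names `dat` and `one_way`, and the choice of `cyl`, `gear` and `am` as the categorical columns, are assumptions.

```{r oway_base}
# Copy mtcars and convert an assumed set of columns to factors
cat_cols <- c("cyl", "gear", "am")
dat <- mtcars
dat[cat_cols] <- lapply(dat[cat_cols], as.factor)

# Keep only the factor columns, then tabulate each one
factor_cols <- Filter(is.factor, dat)
one_way <- lapply(factor_cols, table)
one_way
```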
## Multiple Two Way Tables
The `ds_auto_cross_table()` function creates multiple two-way tables: one
cross table for each unique pair of categorical variables in a data set. You
can also specify a subset of variables if you do not want to use every variable
in the data set.
```{r tway}
ds_auto_cross_table(mtcarz, cyl, gear, am)
```
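"Each unique pair" means every unordered combination of two variables, so three variables produce three tables. The base R sketch below enumerates the pairs with `combn()` and builds one table per pair from the original `mtcars` data. It is an illustration of the pairing logic under these assumptions, not the descriptr implementation; the names `pairs` and `two_way` are hypothetical.

```{r tway_base}
vars <- c("cyl", "gear", "am")

# Every unordered pair of variables: 3 variables give choose(3, 2) = 3 pairs
pairs <- combn(vars, 2, simplify = FALSE)

# Build one two-way table per pair, labelling dimensions with the variable names
two_way <- lapply(pairs, function(p) {
  table(mtcars[[p[1]]], mtcars[[p[2]]], dnn = p)
})
names(two_way) <- vapply(pairs, paste, character(1), collapse = " x ")
two_way
```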