-
Notifications
You must be signed in to change notification settings - Fork 0
/
factorAnalysis.Rmd
105 lines (76 loc) · 2.99 KB
/
factorAnalysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
---
title: "Dimension Reduction"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Dimension Reduction}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(qacr)
library(GPArotation)
```
Based on the William Revelle's comprehensive [psych](https://cran.r-project.org/web/packages/psych/index.html) package, `qacr` provides wrapper functions for simplified input, intuitive output, and easily interpretable graphs, making these techniques more accessible to data analysts new to these forms of analysis.
## Principal Components Analysis
Let's perform a principal components analysis on some ratings data. First we'll import the data into
R.
```{r}
# input data
ratings <- read.csv("https://www.promptcloud.com/wp-content/uploads/2017/02/EFA.csv")
head(ratings)
```
Next, well create a scree plot of the data.
```{r fig.width=6, fig.height=6}
# scree plot
scree_plot(ratings)
```
The scree plot suggests two components. In the next step, we'll extract two principal components and rotate them using a *varimax* rotation.
```{r fig.width=6, fig.height=6}
# extract 2 principal components
fit.pca <- PCA(ratings, nfactor=2, rotate="varimax")
```
The two components account for 35% of the variance in the original data. Next, we'll plot the pattern matrix as both a table and a bar chart.
```{r fig.width=6, fig.height=6}
# plot factor pattern as table
plot(fit.pca)
# plot factor pattern as bar chart
plot(fit.pca, type="bar")
```
Finally, we'll add the component scores to the original data.
```{r}
# save component scores
mydata <- score(ratings, fit.pca)
head(mydata)
```
## Factor Analysis
In this section, the ratings data are re-analyzed using a *maximum likelihood* factor analysis.
Again, we'll start with a scree plot. A random number seed is set to assure reproducibility.
```{r fig.height=6, fig.width=6, message=FALSE, warning=FALSE, message=FALSE}
# scree plot
set.seed(1234)
scree_plot(ratings, method = "ml")
```
The parallel analysis suggests a maximum 4 factors, while the bend in the scree plot strongly suggests two factors. In the next step, we'll extract 2 factors and rotate them using a *promax* rotation.
```{r fig.width=6, fig.height=6}
# extract 2 oblique factors
fit.fa <- FA(ratings, nfactor=2, rotate="promax", fm="ml")
```
The printout includes the factor pattern, the factor structure, and the factor intercorrelations.
The four factors account for 26% of the variance in the original data. Factors 1 and 2 have a low correlation (0.07), suggesting that an orthogonal (e.g., varimax) rotation may give a simpler solution.
Next, we'll plot the pattern matrix as both a table and a bar chart.
```{r fig.width=6, fig.height=6}
# plot factor pattern as table
plot(fit.fa)
# plot factor pattern as bar chart
plot(fit.fa, type="bar")
```
Finally, we'll add the factor scores to the original data.
```{r}
# save component scores
mydata <- score(ratings, fit.fa)
head(mydata)
```