Permalink
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
456 lines (389 sloc) 24.3 KB
---
title: "Introduction to SSM Analysis"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{ssm-introduction}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(circumplex)
library(ggforce)
library(kableExtra)
library(ggplot2)
library(tibble)
library(dplyr)
library(forcats)
library(psych)
library(knitr)
set.seed(12345)
```
## 1. Background and Motivation
### Circumplex models, scales, and data
Circumplex models are popular within many areas of psychology because they offer a parsimonious account of complex psychological domains, such as emotion and interpersonal functioning. This parsimony is achieved by understanding phenomena in a domain as being a "blend" of two primary dimensions. For instance, circumplex models of emotion typically represent affective phenomena as a blend of *valence* (pleasantness versus unpleasantness) and *arousal* (activity versus passivity), whereas circumplex models of interpersonal functioning typically represent interpersonal phenomena as a blend of *communion* (affiliation versus separation) and *agency* (dominance versus submissiveness). These models are often depicted as circles around the intersection of the two dimensions (see figure). Any given phenomenon can be located within this circular space through reference to the two underlying dimensions (e.g., anger is a blend of unpleasantness and activity).
Circumplex scales contain multiple subscales that attempt to measure different blends of the two primary dimensions (i.e., different parts of the circle). Although there have historically been circumplex scales with as many as sixteen subscales, it has become most common to use eight subscales: one for each "pole" of the two primary dimensions and one for each "quadrant" that combines the two dimensions. In order for a set of subscales to be considered circumplex, they must exhibit certain properties. Circumplex fit analyses can be used to quantify these properties.
Circumplex data is composed of scores on a set of circumplex scales for one or more participants (e.g., persons or organizations). Such data is usually collected via self-report, informant-report, or observational ratings in order to locate psychological phenomena within the circular space of the circumplex model. For example, a therapist might want to understand the interpersonal problems encountered by an individual patient, a social psychologist might want to understand the emotional experiences of a group of participants during an experiment, and a personality psychologist might want to understand what kind of interpersonal behaviors are associated with a trait (e.g., extraversion).
```{r model, echo = FALSE, fig.width = 7.5, fig.height = 4, out.width = "100%"}
angles <- c(90, 135, 180, 225, 270, 315, 360, 45)
alabel <- c("PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO")
# Create plot ------------------------------------------------------------------
ggplot() +
# Require plot to be square and remove default styling
coord_fixed() +
theme_void() +
# Expand the axes multiplicatively to fit the labels
scale_x_continuous(expand = c(0.10, 0)) +
scale_y_continuous(expand = c(0.10, 0)) +
# Draw line segments corresponding to the octants
geom_segment(
aes(
x = 0,
y = 0,
xend = 5 * cos(angles[c(1, 3, 5, 7)] * pi / 180),
yend = 5 * sin(angles[c(1, 3, 5, 7)] * pi / 180)
),
color = "gray60",
size = 1
) +
# Draw inner labels for the octant angles
geom_label(
aes(
x = 3 * cos(angles * pi / 180),
y = 3 * sin(angles * pi / 180),
label = sprintf("%d\u00B0", angles)
),
size = 5,
color = "black",
label.size = NA,
hjust = "center",
vjust = "center"
) +
# Draw the circle
geom_circle(aes(x0 = 0, y0 = 0, r = 5),
color = "gray50", size = 1.5
) +
# Draw outer labels for octant abbreviations
geom_label(
aes(
x = 5.1 * cos(angles * pi / 180),
y = 5.1 * sin(angles * pi / 180),
label = alabel
),
size = 5,
color = "black",
label.size = NA,
hjust = "outward",
vjust = "outward"
)
```
### The Structural Summary Method
The Structural Summary Method (SSM) is a technique for analyzing circumplex data that offers practical and interpretive benefits over alternative techniques. It consists of fitting a cosine curve to the data, which captures the pattern of correlations among scores associated with a circumplex scale (i.e., mean scores on circumplex scales or correlations between circumplex scales and an external measure). By plotting a set of example scores below, we can gain a visual intuition that a cosine curve makes sense in this case. First, we can examine the scores with a bar chart ignoring the circular relationship among them.
```{r column, echo = FALSE, fig.width = 7.5, fig.height = 4, out.width = "100%"}
requireNamespace("forcats", quietly = TRUE)
data("jz2017")
rci <- jz2017 %>%
select(NARPD, PA:NO) %>%
cor.ci(plot = FALSE)
# Format data for plotting
dat_r <- tibble(
Scale = factor(c("PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO")),
Group = rep(1, 8),
est = rci$rho[2:9],
lci = rci$ci$lower[1:8],
uci = rci$ci$upper[1:8]
)
# Create column plot with 95% CI error bars
ggplot(dat_r, aes(x = forcats::fct_reorder(Scale, est, .desc = TRUE), y = est)) +
geom_hline(yintercept = 0, size = 1.25, color = "darkgray") +
geom_col(position = position_dodge(.9), fill = "red") +
geom_errorbar(aes(ymin = lci, ymax = uci),
width = .20,
position = position_dodge(.9), size = 1
) +
scale_y_continuous(
limits = c(-0.02, 0.475)
) +
labs(title = "Scores with 95% Confidence Intervals") +
theme(
axis.title = element_blank(),
panel.grid.major = element_line(size = 1.0),
panel.grid.minor.y = element_line(size = 0.5),
panel.grid.minor.x = element_blank()
)
```
Next, we can leverage the fact that these subscales have specific angular displacements in the circumplex model (and that 0 and 360 degrees are the same) to create a path diagram.
```{r path, echo = FALSE, fig.width = 7.5, fig.height = 4, out.width = "100%"}
dat_r <- tibble(
Scale = factor(c("LM", "PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO"),
levels = c("PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO")
),
Group = rep(1, 9),
est = c(rci$rho[[8]], rci$rho[2:9]),
lci = c(rci$ci$lower[[7]], rci$ci$lower[1:8]),
uci = c(rci$ci$upper[[7]], rci$ci$upper[1:8]),
Angle = c(0, octants())
) %>%
arrange(Angle)
# Plot correlations as connected point ranges with 95% CI ranges
ggplot(data = dat_r, mapping = aes(x = Angle, y = est)) +
geom_hline(yintercept = 0, size = 1.25, color = "darkgray") +
geom_pointrange(aes(ymin = lci, ymax = uci), size = 1.25, color = "red") +
geom_path(size = 1.25, color = "red") +
scale_x_continuous(
limits = c(0, 360),
breaks = c(0, octants()),
expand = c(0.01, 0),
labels = function(x) sprintf("%.0f\U00B0", x)
) +
scale_y_continuous(
limits = c(-0.02, 0.475)
) +
labs(title = "Scores with 95% CIs by Angle") +
theme(
axis.title = element_blank(),
plot.margin = unit(c(10, 30, 10, 10), "points"),
panel.grid.major = element_line(size = 1.0),
panel.grid.minor.y = element_line(size = 0.5),
panel.grid.minor.x = element_blank()
)
```
This already looks like a cosine curve, and we can finally use the SSM to estimate the parameters of the curve that best fits the observed data. By plotting it alongside the data, we can get a sense of how well the model fits our example data.
```{r curve, echo = FALSE, fig.width = 7.5, fig.height = 4, out.width = "100%"}
# Calculate SSM parameters
sp <- circumplex:::ssm_parameters(rci$means[1:8], octants() * pi / 180)
# Create function for SSM cosine model
f <- function(x) {
sp[[1]] + sp[[4]] * cos((x - sp[[5]] * 180 / pi) * pi / 180)
}
# Plot correlations along with SSM cosine model
ggplot(data = dat_r, mapping = aes(x = Angle, y = est)) +
geom_hline(yintercept = 0, size = 1.25, color = "darkgray") +
geom_pointrange(aes(ymin = lci, ymax = uci), size = 1.25) +
geom_path(size = 1.25) +
stat_function(fun = f, size = 2, color = "red") +
scale_x_continuous(
limits = c(0, 360),
breaks = c(0, octants()),
expand = c(0.01, 0),
labels = function(x) sprintf("%.0f\U00B0", x)
) +
scale_y_continuous(
limits = c(-0.02, 0.475)
) +
labs(title = "Cosine Curve Estimated by SSM") +
theme(
axis.title = element_blank(),
plot.margin = unit(c(10, 30, 10, 10), "points"),
panel.grid.major = element_line(size = 1.0),
panel.grid.minor.y = element_line(size = 0.5),
panel.grid.minor.x = element_blank()
)
```
### Understanding the SSM parameters
The SSM estimates a cosine curve to the data using the following equation:
$$S_i = e + a \times \cos(\theta_i - d)$$
where $S_i$ and $\theta_i$ are the score and angle on scale $i$, respectively, and $e$, $a$, and $d$ are the elevation, amplitude, and displacement parameters, respectively. Before we discuss these parameters, however, we can also estimate the fit of the SSM model. This is essentially how close the cosine curve is to the observed data points. Deviations (in red, below) will lower model fit.
```{r residuals, echo = FALSE, fig.width = 7.5, fig.height = 4, out.width = "100%"}
# Plot correlations as path, SSM cosine model, and differences
ggplot(data = dat_r, mapping = aes(x = Angle, y = est)) +
geom_hline(yintercept = 0, size = 1.25, color = "darkgray") +
stat_function(fun = f, size = 2, color = "gray20") +
geom_point(size = 5.5, color = "black") +
geom_path(size = 1.25, color = "black") +
geom_segment(aes(x = Angle, xend = Angle, y = est, yend = f(Angle)),
size = 4, linetype = "solid", color = "red"
) +
scale_x_continuous(
limits = c(0, 360),
breaks = c(0, octants()),
expand = c(0.01, 0),
labels = function(x) sprintf("%.0f\U00B0", x)
) +
scale_y_continuous(
limits = c(-0.02, 0.475)
) +
labs(title = sprintf(
"Fit = %.2f", sp[[6]]
)) +
theme(
axis.title = element_blank(),
plot.margin = unit(c(10, 30, 10, 10), "points"),
panel.grid.major = element_line(size = 1.0),
panel.grid.minor.y = element_line(size = 0.5),
panel.grid.minor.x = element_blank()
)
```
If fit is less than 0.70, it is considered "unacceptable" and only the elevation parameter should be interpreted. If fit is between 0.70 and 0.80, it is considered "adequate," and if it is greater than 0.80, it is considered "good." Sometimes SSM model fit is called prototypicality or denoted using $R^2$.
The first SSM parameter is elevation or $e$, which is calculated as the mean of all scores. It is the size of the general factor in the circumplex model and its interpretation varies from scale to scale. For measures of interpersonal problems, it is interpreted as generalized interpersonal distress. When using correlation-based SSM, $|e|\ge.15$ is considered "marked" and $|e|<.15$ is considered "modest."
```{r elev, echo = FALSE, out.width = "100%"}
knitr::include_graphics("VIG1-e.gif")
```
The second SSM parameter is amplitude or $a$, which is calculated as the difference between the highest point of the curve and the curve's mean. It is interpreted as the distinctiveness or differentiation of a profile: how much it is peaked versus flat. Similar to elevation, when using correlation-based SSM, $a\ge.15$ is considered "marked" and $a<.15$ is considered "modest."
```{r ampl, echo = FALSE, out.width = "100%"}
knitr::include_graphics("VIG1-a.gif")
```
The final SSM parameter is displacement or $d$, which is calculated as the angle at which the curve reaches its highest point. It is interpreted as the style of the profile. For instance, if $d=90^\circ$ and we are using a circumplex scale that defines 90 degrees as "domineering," then the profile's style is domineering.
```{r disp, echo = FALSE, out.width = "100%"}
knitr::include_graphics("VIG1-d.gif")
```
By interpreting these three parameters, we can understand a profile much more parsimoniously than by trying to interpret all eight subscales individually. This approach also leverages the circumplex relationship (i.e., dependency) among subscales. It is also possible to transform the amplitude and displacement parameters into estimates of distance from the x-axis and y-axis, which will be shown in the output discussed below.
## 2. Example data: jz2017
To illustrate the SSM functions, we will use the example dataset `jz2017`, which was provided by Zimmermann & Wright (2017) and reformatted for this package. This dataset includes self-report data from 1166 undergraduate students. Students completed a circumplex measure of interpersonal problems with eight subscales (PA, BC, DE, FG, HI, JK, LM, and NO) and a measure of personality disorder symptoms with ten subscales (PARPD, SCZPD, SZTPD, ASPD, BORPD, HISPD, NARPD, AVPD, DPNPD, and OCPD). More information about these variables can be accessed using the `?jz2017` command in R.
```{r jz2017}
data("jz2017")
print(jz2017)
```
The circumplex scales in `jz2017` come from the Inventory of Interpersonal Problems - Short Circumplex (IIP-SC). These scales can be arranged into the following circular model, which is organized around the two primary dimensions of agency (y-axis) and communion (x-axis). Note that the two-letter scale abbreviations and angular values are based in convention. A high score on PA indicates that one has interpersonal problems related to being "domineering" or too high on agency, whereas a high score on DE indicates problems related to being "cold" or too low on communion. Scales that are not directly on the y-axis or x-axis (i.e., BC, FG, JK, and NO) represent blends of agency and communion.
```{r iipsc, echo = FALSE, fig.width = 7.5, fig.height = 4, out.width = "100%"}
angles <- c(90, 135, 180, 225, 270, 315, 360, 45)
flabel <- c(
"Domineering",
"Vindictive",
"Cold",
"Socially\nAvoidant",
"Nonassertive",
"Easily\nExploited",
"Overly\nNurturant",
"Intrusive"
)
alabel <- c("PA", "BC", "DE", "FG", "HI", "JK", "LM", "NO")
# Create plot ------------------------------------------------------------------
ggplot() +
# Require plot to be square and remove default styling
coord_fixed() +
theme_void() +
# Expand both axes multiplicatively to fit the labels
scale_x_continuous(expand = c(0.30, 0)) +
scale_y_continuous(expand = c(0.10, 0)) +
# Draw line segments corresponding to the octants
geom_segment(
aes(
x = 0,
y = 0,
xend = 5 * cos(angles * pi / 180),
yend = 5 * sin(angles * pi / 180)
),
color = "gray60",
size = 0.5
) +
# Draw inner labels for the octant abbreviations
geom_label(
aes(
x = 3.75 * cos(angles * pi / 180),
y = 3.75 * sin(angles * pi / 180),
label = alabel
),
size = 5,
color = "gray40",
label.size = NA,
hjust = "center",
vjust = "center"
) +
# Draw inner labels for the octant angles
geom_label(
aes(
x = 2 * cos(angles * pi / 180),
y = 2 * sin(angles * pi / 180),
label = sprintf("%d\u00B0", angles)
),
size = 4,
color = "gray50",
label.size = NA,
hjust = "center",
vjust = "center"
) +
# Draw the circle
geom_circle(aes(x0 = 0, y0 = 0, r = 5),
color = "gray50", size = 1.5
) +
# Draw outer labels for the octant names
geom_label(
aes(
x = 5.1 * cos(angles * pi / 180),
y = 5.1 * sin(angles * pi / 180),
label = flabel
),
size = 5,
color = "black",
label.size = NA,
hjust = "outward",
vjust = "outward"
)
```
## 3. Mean-based SSM Analysis
### Conducting SSM for a group's mean scores
To begin, let's say that we want to use the SSM to describe the interpersonal problems of the average individual in the entire dataset. Although it is possible to analyze the raw scores contained in `jz2017`, our results will be more interpretable if we standardize the scores first. We can do this using the `standardize()` function. The first argument to this function is `.data`, a data frame frame containing the circumplex scales to be standardized. The second argument is `scales` and specifies where in `.data` the circumplex scales are (either in terms of their variable names or their column numbers). The third argument is `angles` and specifies the angle of each of the circumplex scales included in `scales`. Note that the `scales` and `angles` arguments need to be vectors (hence the `c()` function) that have the same ordering and length. Finally, the fourth argument is `norms`, a data frame containing the normative data we will use to standardize the circumplex scales. Here, we will use normative data for the IIP-SC by loading the `iipsc` data frame.
```{r}
data("iipsc")
jz2017s <- standardize(
.data = jz2017,
scales = c(PA, BC, DE, FG, HI, JK, LM, NO),
angles = c(90, 135, 180, 225, 270, 315, 360, 45),
instrument = iipsc,
sample = 1
)
print(jz2017s)
```
Now we can use the `ssm_analyze()` function to perform the SSM analysis. The first three arguments are the same as the first three arguments to `standardize()`. We can pass the new `jz2017s` data frame that contains standardized scores as `.data` and the same vectors to `scales` and `angles` since these haven't changed.
```{r analyze}
results <- ssm_analyze(
.data = jz2017s,
scales = c(PA_z, BC_z, DE_z, FG_z, HI_z, JK_z, LM_z, NO_z),
angles = c(90, 135, 180, 225, 270, 315, 360, 45)
)
```
The output of the function has been saved in the `results` object, which we can examine in detail using the `summary()` function. This will output the call we made to create the output, as well as some of the default options that we didn't bother changing (see `?ssm_analyze` to learn how to change them) and, most importantly, the estimated SSM parameter values with bootstrapped confidence intervals.
```{r summary1a}
summary(results)
```
That was pretty easy! We can now write up these results. However, the `circumplex` package has some features that can make what we just did even easier. First, because the first three arguments of the `ssm_analyze()` function are always the same, we can omit their names. Second, because we organized the `jz2017s` data frame to have the circumplex scale variables adjacent and in order from PA to NO, we can simplify their specification by using the `PA:NO` shortcut. Finally, because the use of octant scales is so common, the `circumplex` package comes with a convenience function for outputting their angular displacements: `octants()`. Note how, even when using these shortcuts, the results are the same except for minor stochastic differences in the confidence intervals due to the randomness inherent to bootstrapping. (To get the exact same results, we could use the `set.seed()` function to control the random number generator in R.)
```{r summary1b}
results2 <- ssm_analyze(jz2017s, PA_z:NO_z, octants())
summary(results2)
```
### Visualizing the results with a table and figure
Next, we can produce a table to display our results. With only a single set of parameters, this table is probably overkill, but in future analyses we will see how this function saves a lot of time and effort. To create the table, simply pass the `results` (or `results2`) object to the `ssm_table()` function.
```r
ssm_table(results2)
```
```{r table1, echo = FALSE}
ssm_table(results2, render = FALSE) %>%
kable(caption = circumplex:::dcaption(results2)) %>%
kable_styling(full_width = TRUE, font_size = 14)
```
Next, let's leverage the fact that we are working within a circumplex space by creating a nice-looking circular plot by mapping the amplitude parameter to the points' distance from the center of the circle and the displacement parameter to the points' rotation from due-east (as is conventional). This, again, is as simple as passing the `results` object to the `ssm_plot()` function.
```{r plot1, fig.width = 7.2, fig.height = 4, out.width = "100%"}
ssm_plot(results2)
```
## 4. Correlation-based SSM Analysis
### Conducting SSM for a group's correlations with an external measure
Next, let's say that we are interested in analyzing not the mean scores on the circumplex scales but rather their correlations with an external measure. This is sometimes referred to as "projecting" that external measure into the circumplex space. As an example, let's project the NARPD variable, which captures symptoms of narcissistic personality disorder, into the circumplex space defined by the IIP-SC. Based on theory and previous findings, we can expect this measure to be associated with some general interpersonal distress and a style that is generally high in agency.
To conduct this analysis, we can start with the syntax from the mean-based analysis. All SSM analyses use the `ssm_analyze()` function and the data, scales, and angles are the same as before. However, we also need to let the function know that we want to analyze correlations with NARPD as opposed to scale means. To do this, we add an additional argument `measures`. Note that since correlations are already standardized, we don't need to worry about standardizing the circumplex scales when `measures` is used.
```{r summary2}
results3 <- ssm_analyze(jz2017, PA:NO, octants(), measures = NARPD)
summary(results3)
```
Note that this output looks very similar to the mean-based output except that the statistical basis is now correlation scores instead of mean scores and instead of saying "Profile [All]" it now says "Profile [NARPD]".
### Visualizing the results with a table and figure
We can also create a similar table and figure using the exact same syntax as before. The `ssm_table()` and `ssm_plot()` functions are smart enough to know whether the results are mean-based or correlation-based and will work in both cases.
```r
ssm_table(results3)
```
```{r table2, echo = FALSE}
ssm_table(results3, render = FALSE) %>%
kable(caption = circumplex:::dcaption(results3)) %>%
kable_styling(full_width = TRUE, font_size = 14)
```
From the table, we can see that the model fit is good (>.80) and that all three SSM parameters are significantly different from zero, i.e., their confidence intervals do not include zero. Furthermore, the confidence intervals for the elevation and amplitude parameters are greater than or equal to 0.15, which can be interpreted as being "marked." So, consistent with our hypotheses, NARPD was associated with marked general interpersonal distress (elevation) and was markedly distinctive in its profile (amplitude). The displacement parameter was somewhere between 100 and 120 degrees; to interpret this we would need to either consult the mapping between scales and angles or plot the results.
```{r plot2, fig.width = 7.5, fig.height = 4, out.width = "100%"}
ssm_plot(results3)
```
From this figure, it is very easy to see that, consistent with our hypotheses, the displacement for NARPD was associated with high agency and was somewhere between the "domineering" and "vindictive" octants.
## 5. Wrap-up
In this vignette, we learned about circumplex models, scales, and data as well as the Structural Summary Method (SSM) for analyzing such data. We learned about the `circumplex` package and how to use the `ssm_analyze()` function to generate SSM results for a single group's mean scores and for correlations with a single external measure. We learned several shortcuts for making calls to this function easier and then explored the basics of SSM visualization by creating simple tables and circular plots. In the next vignette, "Intermediate SSM Analysis", we will build upon this knowledge to learn how to (1) generalize our analyses to multiple groups and multiple measures, (2) perform contrast analyses to compare groups or measures, and (3) export and make basic changes to tables and figures.
## References
* Gurtman, M. B. (1992). Construct validity of interpersonal personality measures: The interpersonal circumplex as a nomological net. _Journal of Personality and Social Psychology, 63_(1), 105–118.
* Gurtman, M. B., & Pincus, A. L. (2003). The circumplex model: Methods and research applications. In J. A. Schinka & W. F. Velicer (Eds.), _Handbook of psychology. Volume 2: Research methods in psychology_ (pp. 407–428). Hoboken, NJ: John Wiley & Sons, Inc.
* Wright, A. G. C., Pincus, A. L., Conroy, D. E., & Hilsenroth, M. J. (2009). Integrating methods to optimize circumplex description and comparison of groups. _Journal of Personality Assessment, 91_(4), 311–322.
* Zimmermann, J., & Wright, A. G. C. (2017). Beyond description in interpersonal construct validation: Methodological advances in the circumplex Structural Summary Approach. _Assessment, 24_(1), 3–23.