/
calcSD.Rmd
206 lines (145 loc) · 8.03 KB
/
calcSD.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
---
title: "Calculate sexual dimorphism of faces"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
bibliography: facefuns.bib
csl: apa-numeric-superscript-brackets.csl
link-citations: yes
vignette: >
%\VignetteIndexEntry{Calculate sexual dimorphism of faces}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
**WORK IN PROGRESS**
This tutorial will explain how to calculate facial sexual dimorphism (SD) in a set of 2D faces, using a [vector method](#vs) (e.g., [@holzleitner2014; @komori2011; @valenzano2006]) and [linear discriminant analysis](#ds) (e.g., [@scott2010]).
```{r setup}
library(facefuns)
library(ggplot2)
```
## Read and prep landmark data
```{r}
path_to_tem <- system.file("extdata", "tem", package="facefuns")
shapedata <- facefuns(data = read_lmdata(lmdata = path_to_tem,
plot = FALSE),
remove_points = "frlgmm",
pc_criterion = "broken_stick",
plot_sample = FALSE,
quiet = TRUE)
```
In addition, we'll also need to know which of the faces in our sample are female, and which are male.
```{r}
data(LondonSet_info)
head(LondonSet_info)
```
## 1. Vector scores: `calc_(shape)vs`{#vs}
`facefuns::calc_vs` calculates sexual dimorphism by computing an n-dimensional vector between the average female and the average male shape, and then projecting each face onto this vector. Scores are scaled so that a score of 0 corresponds to the average female, and a score of 1 to the average male shape; a negative score, e.g., would thus indicate a hyperfeminine face.
You can use `calc_vs` to calculate vector scores from a data frame or matrix of principal component scores (see `help(calc_vs)`), or `calc_shapevs` to calculate them from the Procrustes-aligned templates in the facefuns output.
`calc_shapevs` takes three arguments, all of which should be a data frame or matrix with specimen as rows, and PC scores as columns:
* `facefuns_obj`: Output from `facefuns::facefuns`
* `anchor1_index`: Vector specifying indices of faces that will constitute lower anchor point
* `anchor2_index`: Vector specifying indices of faces that will constitute upper anchor point
We get the index of which faces in our array of Procrustes-aligned faces are female/male by testing which IDs have a corresponding entry in the info table for which sex == female/male
```{r}
fem_i <- gsub("^ID=","", dimnames(shapedata$aligned)[[3]]) %in%
LondonSet_info$face_id[which(LondonSet_info$face_sex == "female")]
mal_i <- gsub("^ID=", "", dimnames(shapedata$aligned)[[3]]) %in%
LondonSet_info$face_id[which(LondonSet_info$face_sex == "male")]
```
Now, we can run our function:
```{r}
sd_vector <- calc_shapevs(shapedata, fem_i, mal_i)
```
`calc_(shape)vs` will return a tibble that has two columns, `id` and `VS`:
```{r}
head(sd_vector)
```
```{r}
LondonSet_info %>%
dplyr::left_join(sd_vector, by = c("face_id" = "id")) %>%
ggplot(aes(x = VS, color=face_sex, fill = face_sex)) +
geom_histogram(binwidth = .1) +
theme_bw()
```
### Note: calculating vector scores within- and out-of-set
If your current sample is small, but you have a larger sample at hand that was delineated with the same landmarks or a compatible sub-/superset, you could use that bigger sample to build your shape PCA model and construct the sexual dimorphism vector, and then project faces from the smaller sample onto that vector. You can find an example [here](https://osf.io/s7c2z/) (to be updated and added to this vignette in the future).
### Note: calculating sexual dimorphism from symmetrized faces
Two-dimensional images are strongly affected by head posture, which is reflected in the fact that in 2D sets, the first principal components usually show shape associated with heads being tilted up/down, or heads being turned left/right. While it is not possible to "correct" for up/down head postures (plus shape differences associated with an upward-/downward-tilt overlap with those of sexual dimorphism), slight asymmetries due to a left/right tilt *could* be "remedied" by symmetrizing faces.
You could either symmetrize templates before submitting them to facefuns (using `facefuns::symm_templates`) which has the added benefit that it's very little fuss to plot the PCs on which the scores are based; or you could do so by using the option `symm = TRUE` in `calc_shapevs`.
Symmetrizing works by mirroring each landmark template, and calculating the mean between the original and mirrored template. After mirroring, the landmarks need to be relabeled - what used to be point 0 (right pupil) in the original template, is point 1 (the left pupil) after mirroring.
Unfortunately, this information needs to be manually entered, and will very much depend on the landmark template you want to mirror. For the set of 132 landmarks used throughout the examples in `facefuns`, this info is stored in `mirroredlandmarks`.
If `symm = TRUE`, `calc_shapevs` will first symmetrize the faces, re-conduct a GPA and PCA on these symmetrized templates, and then run `calc_vs`.
```{r}
data("mirroredlandmarks")
sd_vector_symm <- calc_shapevs(shapedata, fem_i, mal_i, symm = TRUE, mirroredlandmarks = mirroredlandmarks)
head(sd_vector_symm)
LondonSet_info %>%
dplyr::left_join(sd_vector_symm, by = c("face_id" = "id")) %>%
ggplot(aes(x = VS, color=face_sex, fill = face_sex)) +
geom_histogram(binwidth = .1) +
theme_bw()
```
## 2. Discriminant scores: `calc_ds`{#ds}
Another way to calculate sexual dimorphism is by employing a linear discriminant analyses to predict group membership.
`calc_ds` first conducts a (forward) stepwise variable selection and then carries out a linear discriminant analysis. It takes two arguments:
* `data`: A data frame or matrix with specimen as rows, and PCs as columns. Can also contain an additional column that defines group membership.
* `group_info`: Either a data frame with two columns (col 1 = id matching the rownames in data; col 2 = group), or a single numeric value indexing the column in data that contains group membership info
For this example, we'll specify group membership with a separate data frame.
```{r}
group_info <- LondonSet_info %>%
dplyr::select(face_id, face_sex)
sd_discrim <- calc_ds(shapedata$pc_scores, group_info = group_info)
```
Like `calc_vs`, `calc_ds` will return a tibble that has two columns, in this case `id` and `DS`:
```{r eval=FALSE}
head(sd_discrim)
```
## Compare vector and discriminant scores
```{r message=FALSE}
compareSD <- LondonSet_info %>%
dplyr::left_join(sd_vector, by = c("face_id" = "id")) %>%
dplyr::left_join(sd_discrim, by = c("face_id" = "id"))
compareSD %>%
dplyr::group_by(face_sex) %>%
dplyr::summarise(mean_vs = round(mean(VS), 3),
mean_ds = round(mean(DS), 3))
compareSD %>%
ggplot(aes(x = VS, y = DS)) +
geom_point(aes(color = face_sex)) +
geom_smooth(method='lm', formula = y ~ x, color = "purple") +
theme_bw()
```
### Visualize differences
Plot difference between lowest-scoring (top) and highest-scoring faces (bottom) on VS (left) and DS (right)
```{r, fig.width=7, fig.height=7, fig.show='hold', fig.align='center', echo=FALSE}
.pardefault <- par(no.readonly = T)
par(mar = c(0, 0, 0, 0))
lo_vs <- compareSD %>%
dplyr::slice_min(VS, n=5) %>%
dplyr::pull(face_id)
lo_ds <- compareSD %>%
dplyr::slice_min(DS, n=5) %>%
dplyr::pull(face_id)
hi_vs <- compareSD %>%
dplyr::slice_max(VS, n=5) %>%
dplyr::pull(face_id)
hi_ds <- compareSD %>%
dplyr::slice_max(DS, n=5) %>%
dplyr::pull(face_id)
plotl <- list(geomorph::mshape(shapedata$aligned[,,lo_vs]),
geomorph::mshape(shapedata$aligned[,,lo_ds]),
geomorph::mshape(shapedata$aligned[,,hi_vs]),
geomorph::mshape(shapedata$aligned[,,hi_ds]))
ref <- shapedata$average
plot_list <- rapply(plotl, function(x) {
function() { geomorph::plotRefToTarget(ref, x, mag=1.5) }
}, how = "list")
do.call(cowplot::plot_grid, c(plot_list, list(ncol = 2), scale = 1.5))
par(.pardefault)
```
## References