tidy.prcomp() returns incorrect result with method = "rotation" #923

@clauswilke

When using tidy() on a prcomp object with matrix = "rotation" to extract the rotation matrix, the returned output is incorrect. The problem is that the variables and PCs are paired incorrectly. We would expect each variable in the original dataset to be paired exactly once with each PC. Instead, the first variable is paired multiple times with the first PC, the second variable is paired multiple times with the second PC, and so on. Reprex follows below.

library(broom)
library(dplyr)
library(tidyr)

iris_pca <- iris %>%
  select(-Species) %>%
  scale() %>%
  prcomp()

# output generated by tidy.prcomp()
tidy(iris_pca, matrix = "rotation")
#> # A tibble: 16 x 3
#>    column          PC   value
#>    <chr>        <dbl>   <dbl>
#>  1 Sepal.Length     1  0.521 
#>  2 Sepal.Width      2 -0.377 
#>  3 Petal.Length     3  0.720 
#>  4 Petal.Width      4  0.261 
#>  5 Sepal.Length     1 -0.269 
#>  6 Sepal.Width      2 -0.923 
#>  7 Petal.Length     3 -0.244 
#>  8 Petal.Width      4 -0.124 
#>  9 Sepal.Length     1  0.580 
#> 10 Sepal.Width      2 -0.0245
#> 11 Petal.Length     3 -0.142 
#> 12 Petal.Width      4 -0.801 
#> 13 Sepal.Length     1  0.565 
#> 14 Sepal.Width      2 -0.0669
#> 15 Petal.Length     3 -0.634 
#> 16 Petal.Width      4  0.524

# expected output would be something like this
iris_pca$rotation %>%
  as.data.frame() %>%
  mutate(column = row.names(.)) %>%
  pivot_longer(PC1:PC4) %>%
  arrange(name, column)
#> # A tibble: 16 x 3
#>    column       name    value
#>    <chr>        <chr>   <dbl>
#>  1 Petal.Length PC1    0.580 
#>  2 Petal.Width  PC1    0.565 
#>  3 Sepal.Length PC1    0.521 
#>  4 Sepal.Width  PC1   -0.269 
#>  5 Petal.Length PC2   -0.0245
#>  6 Petal.Width  PC2   -0.0669
#>  7 Sepal.Length PC2   -0.377 
#>  8 Sepal.Width  PC2   -0.923 
#>  9 Petal.Length PC3   -0.142 
#> 10 Petal.Width  PC3   -0.634 
#> 11 Sepal.Length PC3    0.720 
#> 12 Sepal.Width  PC3   -0.244 
#> 13 Petal.Length PC4   -0.801 
#> 14 Petal.Width  PC4    0.524 
#> 15 Sepal.Length PC4    0.261 
#> 16 Sepal.Width  PC4   -0.124

Created on 2020-09-03 by the reprex package (v0.3.0)
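
Comparing the two outputs above, the first four values returned by tidy() (0.521, -0.377, 0.720, 0.261) are the four Sepal.Length loadings from the expected output. This suggests the values are being read across rows of the rotation matrix while the variable labels cycle as if they ran down columns. As a possible interim workaround, here is a minimal base-R sketch that builds the long table directly from the matrix and checks the pairing. It is not broom's implementation; the name `rotation_long` is made up for illustration.

```r
# Minimal sketch, not broom's implementation: build the long-format
# rotation table from the matrix using base R only.
iris_pca <- prcomp(scale(iris[, 1:4]))
rot <- iris_pca$rotation

# R stores matrices column-major, so as.vector(rot) runs down PC1 first,
# then PC2, and so on. The labels are built in that same order.
rotation_long <- data.frame(
  column = rep(rownames(rot), times = ncol(rot)),
  PC     = rep(seq_len(ncol(rot)), each = nrow(rot)),
  value  = as.vector(rot),
  stringsAsFactors = FALSE
)

# Each variable should be paired exactly once with each PC.
stopifnot(all(table(rotation_long$column, rotation_long$PC) == 1))

# Every value should equal the matrix entry its labels point to.
idx <- cbind(match(rotation_long$column, rownames(rot)), rotation_long$PC)
stopifnot(isTRUE(all.equal(rotation_long$value, unname(rot[idx]))))
```

The two stopifnot() checks could also be run against the result of tidy(iris_pca, matrix = "rotation") to confirm whether a given broom version is affected.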
