Correlation "long" data to matrix #12

Closed
DominiqueMakowski opened this issue Feb 20, 2019 · 7 comments
Labels
enhancement 💥 Implemented features can be improved or revised

Comments

@DominiqueMakowski
Member

I would like the report package to create traditional correlation matrices from the output of the new correlation package, which is in long format.

For square matrices (i.e., all variables correlated with all variables), something like this could be a first step:

model <- correlation::correlation(iris)
cells <- model$r
m <- matrix(cells, nrow = as.integer(sqrt(length(cells))), ncol=as.integer(sqrt(length(cells))), byrow = TRUE)

However, the row and column names still need to be set appropriately. Moreover, this wouldn't work for non-square matrices (one set of variables correlated with a different set), such as:

library(dplyr)  # provides select() and starts_with()

model <- correlation::correlation(
 select(iris, Sepal.Length),
 select(iris, starts_with("Petal"))
)
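
A minimal sketch of one way to handle both the naming and the non-square case, using only base R. It assumes the long-format result has one row per variable pair, with columns named Parameter1, Parameter2, and r; those column names are an assumption here and should be adjusted to the actual output. The helper long_to_matrix() is hypothetical, and the data frame is built with cor() as a stand-in for the package output.

```r
# Stand-in for a long-format correlation result (column names are assumed).
long <- data.frame(
  Parameter1 = c("Sepal.Length", "Sepal.Length"),
  Parameter2 = c("Petal.Length", "Petal.Width"),
  r = c(
    cor(iris$Sepal.Length, iris$Petal.Length),
    cor(iris$Sepal.Length, iris$Petal.Width)
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: place each pair's value at [row variable, column variable].
long_to_matrix <- function(long, value = "r") {
  rows <- unique(long$Parameter1)
  cols <- unique(long$Parameter2)
  m <- matrix(
    NA_real_,
    nrow = length(rows),
    ncol = length(cols),
    dimnames = list(rows, cols)
  )
  # A two-column index matrix assigns one cell per row of the long data.
  idx <- cbind(match(long$Parameter1, rows), match(long$Parameter2, cols))
  m[idx] <- long[[value]]
  m
}

m <- long_to_matrix(long)
```

Because each cell is looked up by its pair of names rather than by position, this does not depend on the order of the rows or on the matrix being square; pairs that are absent from the long data stay NA.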

@strengejacke, do you by any chance have an idea of how to handle this?

@strengejacke
Member

When I try to run your sample, I get an error... Is anything on my end out of date? I just installed the latest versions of all easystats packages with install_github().

#>  Correlation: 
#> Error in round(x[ll], digits = rdig) : 
#>   non-numeric argument to mathematical function

@strengejacke
Member

Ok, works now.

@DominiqueMakowski
Member Author

Added, might not be the best solution, but seems like it works :)

@IndrajeetPatil
Member

I think the only thing that remains now is to retain the variable names.

Below are two workflows, one using psych and one using correlation; the only difference is that the variable names are lost in the correlation-based workflow.

# setup
set.seed(123)
library(psych)
library(report)
library(ggcorrplot)
#> Loading required package: ggplot2
#> 
#> Attaching package: 'ggplot2'
#> The following objects are masked from 'package:psych':
#> 
#>     %+%, alpha

# correlation object
corr_obj <- corr.test(mtcars)

# checking class of object containing correlation coefficients
class(corr_obj$r)
#> [1] "matrix"

# plot
ggcorrplot(corr_obj$r)

# using `correlation` package
model <- correlation::correlation(mtcars)
cells <- model$r
m <- matrix(
  data = cells,
  nrow = as.integer(sqrt(length(cells))),
  ncol = as.integer(sqrt(length(cells))),
  byrow = TRUE
)

# plot
ggcorrplot(m)

Created on 2019-09-20 by the reprex package (v0.3.0)
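
For the square case in the reprex above, the names could be attached when the matrix is built. This is only a sketch under two assumptions: the vector of coefficients covers every pair of variables, and the pairs are ordered row by row in the same order as the columns of the data. Here `cells` is a stand-in built from cor(), not the actual `model$r`.

```r
vars <- names(mtcars)

# Stand-in for model$r: all pairwise coefficients in row-major order (assumed).
cells <- as.vector(t(cor(mtcars)))

k <- length(vars)
# Guard against the non-square case before reshaping.
stopifnot(length(cells) == k^2)

m <- matrix(
  data = cells,
  nrow = k,
  ncol = k,
  byrow = TRUE,
  dimnames = list(vars, vars)  # keeps the variable names on both axes
)
```

If the ordering assumption does not hold, matching each value to its pair of variable names is safer than reshaping by position.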

Session info
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       Europe/Berlin               
#>  date     2019-09-20                  
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version    date       lib
#>  assertthat    0.2.1      2019-03-21 [1]
#>  backports     1.1.4      2019-04-10 [1]
#>  bayestestR    0.2.5      2019-08-06 [1]
#>  callr         3.3.1      2019-07-18 [1]
#>  cli           1.1.0      2019-03-19 [1]
#>  colorspace    1.4-1      2019-03-18 [1]
#>  correlation   0.1.0      2019-09-16 [1]
#>  crayon        1.3.4      2017-09-16 [1]
#>  curl          4.1        2019-09-16 [1]
#>  desc          1.2.0      2019-04-03 [1]
#>  devtools      2.2.0.9000 2019-09-19 [1]
#>  digest        0.6.20     2019-07-04 [1]
#>  dplyr         0.8.3      2019-07-04 [1]
#>  ellipsis      0.2.0.1    2019-07-02 [1]
#>  evaluate      0.14       2019-05-28 [1]
#>  foreign       0.8-71     2018-07-20 [2]
#>  fs            1.3.1      2019-05-06 [1]
#>  ggcorrplot  * 0.1.3      2019-05-19 [1]
#>  ggplot2     * 3.2.1      2019-08-10 [1]
#>  glue          1.3.1      2019-03-12 [1]
#>  gtable        0.3.0      2019-03-25 [1]
#>  highr         0.8        2019-03-20 [1]
#>  htmltools     0.3.6      2017-04-28 [1]
#>  httr          1.4.1      2019-08-05 [1]
#>  insight       0.5.0.9000 2019-09-16 [1]
#>  knitr         1.25       2019-09-18 [1]
#>  labeling      0.3        2014-08-23 [1]
#>  lattice       0.20-38    2018-11-04 [2]
#>  lazyeval      0.2.2      2019-03-15 [1]
#>  magrittr      1.5        2014-11-22 [1]
#>  memoise       1.1.0      2017-04-21 [1]
#>  mime          0.7        2019-06-11 [1]
#>  mnormt        1.5-5      2016-10-15 [1]
#>  munsell       0.5.0      2018-06-12 [1]
#>  nlme          3.1-140    2019-05-12 [2]
#>  parameters    0.1.0.9000 2019-09-12 [1]
#>  pillar        1.4.2      2019-06-29 [1]
#>  pkgbuild      1.0.5      2019-08-26 [1]
#>  pkgconfig     2.0.2      2018-08-16 [1]
#>  pkgload       1.0.2      2018-10-29 [1]
#>  plyr          1.8.4      2016-06-08 [1]
#>  prettyunits   1.0.2      2015-07-13 [1]
#>  processx      3.4.1      2019-07-18 [1]
#>  ps            1.3.0      2018-12-21 [1]
#>  psych       * 1.8.12     2019-01-12 [1]
#>  purrr         0.3.2      2019-03-15 [1]
#>  R6            2.4.0      2019-02-14 [1]
#>  Rcpp          1.0.2      2019-07-25 [1]
#>  remotes       2.1.0      2019-06-24 [1]
#>  report      * 0.1.0      2019-09-20 [1]
#>  reshape2      1.4.3      2017-12-11 [1]
#>  rlang         0.4.0      2019-06-25 [1]
#>  rmarkdown     1.15       2019-08-21 [1]
#>  rprojroot     1.3-2      2018-01-03 [1]
#>  scales        1.0.0      2018-08-09 [1]
#>  sessioninfo   1.1.1      2018-11-05 [1]
#>  stringi       1.4.3      2019-03-12 [1]
#>  stringr       1.4.0      2019-02-10 [1]
#>  testthat      2.2.1      2019-07-25 [1]
#>  tibble        2.1.3      2019-06-06 [1]
#>  tidyselect    0.2.5      2018-10-11 [1]
#>  usethis       1.5.1.9000 2019-09-12 [1]
#>  withr         2.1.2      2018-03-15 [1]
#>  xfun          0.9        2019-08-21 [1]
#>  xml2          1.2.2      2019-08-09 [1]
#>  yaml          2.2.0      2018-07-25 [1]
#>  source                                
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  Github (easystats/correlation@e7bd465)
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.1)                        
#>  Github (r-lib/desc@c860e7b)           
#>  Github (r-lib/devtools@2765fbe)       
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.1)                        
#>  local                                 
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.5.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.5.0)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.1)                        
#>  Github (easystats/parameters@bf16cf2) 
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  Github (easystats/report@47dd064)     
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.0)                        
#>  CRAN (R 3.5.1)                        
#>  Github (r-lib/usethis@a2342b8)        
#>  CRAN (R 3.5.1)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.6.1)                        
#>  CRAN (R 3.5.1)                        
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.1/library

@IndrajeetPatil IndrajeetPatil transferred this issue from easystats/report Oct 16, 2019
@IndrajeetPatil IndrajeetPatil added the enhancement 💥 Implemented features can be improved or revised label Oct 16, 2019
@DominiqueMakowski
Member Author

library(correlation)
library(ggcorrplot)
#> Warning: package 'ggcorrplot' was built under R version 3.6.1
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 3.6.1

cormat <- as.table(correlation(iris))
row.names(cormat) <- cormat$Parameter
cormat <- cormat[-1]

ggcorrplot(cormat)

Created on 2019-10-21 by the reprex package (v0.3.0)

@strengejacke, do you think we could add that to the see package as a plot method?

@DominiqueMakowski
Member Author

Can now do:

correlation(iris) %>%
  as.matrix() %>%
  ggcorrplot()

We could also include an argument to get a graph network instead of a heatmap-like plot (see the updated README).
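
As a rough illustration of the data a network-style plot would need, the sketch below turns a correlation matrix into an edge list, one row per unique pair of variables. It uses base R only and is not the package's implementation; the names `edges` and `strong` and the 0.5 cutoff are hypothetical choices for the example.

```r
# Correlation matrix of the numeric iris columns.
m <- cor(iris[, 1:4])

# Positions of the upper triangle: each unique pair once, no self-correlations.
idx <- which(upper.tri(m), arr.ind = TRUE)

edges <- data.frame(
  from = rownames(m)[idx[, "row"]],
  to = colnames(m)[idx[, "col"]],
  r = m[idx],
  stringsAsFactors = FALSE
)

# Keep only the stronger links (0.5 is an arbitrary, illustrative threshold).
strong <- edges[abs(edges$r) >= 0.5, ]
```

An edge list like this is the usual input for graph-drawing packages, with the coefficient available for edge width or colour.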

@DominiqueMakowski
Member Author

Closing this as the main point has been addressed!


3 participants