dev #55

DominiqueMakowski · 2020-03-22T10:21:06Z

No description provided.

codecov-io · 2020-03-22T10:25:56Z

Codecov Report

Merging #55 into master will increase coverage by 0.61%.
The diff coverage is 81.25%.

@@            Coverage Diff             @@
##           master      #55      +/-   ##
==========================================
+ Coverage   73.69%   74.30%   +0.61%     
==========================================
  Files          29       31       +2     
  Lines         764      833      +69     
==========================================
+ Hits          563      619      +56     
- Misses        201      214      +13

Impacted Files	Coverage Δ
R/correlation.R	`78.30% <ø> (ø)`
R/simulate_simpson.R	`0.00% <0.00%> (ø)`
R/utils_bootstrapping.R	`0.00% <ø> (ø)`
R/z_fisher.R	`0.00% <ø> (ø)`
R/cor_test_distance.R	`59.21% <42.85%> (-1.61%)`	⬇️
R/utils_find_correlationtype.R	`80.55% <80.55%> (ø)`
R/cor_test_biserial.R	`88.09% <88.09%> (ø)`
R/cor_test.R	`84.78% <100.00%> (+0.78%)`	⬆️
R/cor_test_freq.R	`76.92% <100.00%> (ø)`
R/cor_test_tetrachoric.R	`90.90% <100.00%> (-2.20%)`	⬇️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 19f73b8...81393c9.

DominiqueMakowski · 2020-03-22T13:01:01Z

@strengejacke @mattansb @IndrajeetPatil although JOSS submissions are closed anyway for now, I just felt inspired to initialize the paper. It can be fast and light work since the scope is quite narrow. Feel free to expand and add details and things.

I'd like to get a nice main figure, but I'm not sure what it would look like... ideas?

mattansb · 2020-03-22T13:21:59Z

I really like the last two plots here:
https://easystats.github.io/see/articles/correlation.html

DominiqueMakowski · 2020-03-22T14:04:22Z

I really like the last two plots here:

yeah, these will go in as part of the examples, but I was thinking about like a main figure 1 that would summarize the goal and features of the package... ^^

Images that come to my mind are:

- a scatterplot showing a slightly non-linear relationship between two variables
- some obvious outliers
And on it, or on the side or something, somehow show the output (but what? how?) of the different methods for correlations... The estimate's value as barplots on the side? behind the scatterplot?
Or one scatter plot repeated for each method, and for each method, we visualize what the method does. For instance, for the rank transformed, we show each point mirroring into its rank position (like a translation). For the percentage bend, we show the extreme points as transparent (suggesting they are not "included"), for Bayesian we show all of the posterior draws etc. But then this gets impossible to do for all I think.

So I don't know...

mattansb · 2020-03-22T14:14:48Z

Hmmm.. seeing as how the methods aren't new, I am inclined to stick to plots we can actually show with our package/s?

But I'm also not sure what would be "nice" and what would be "informative" about the pkg... 🤷‍♂️

DominiqueMakowski · 2020-03-22T15:14:44Z

For bayestestR, we also had in the intro nice plots to "visualize" HDIs and BFs etc., although these were not new methods either, and then in the examples, we had an example of what you could obtain using it.

Here I have in mind the one figure that will be the illustration of the paper...

DominiqueMakowski · 2020-03-23T03:13:34Z

R/cor_test_biserial.R

+#' @keywords internal
+.cor_test_biserial_biserial <- function(data, x, y, continuous, binary, ci){
+
+  # TODO: get rid off psych https://www.statisticshowto.datasciencecentral.com/point-biserial-correlation/


@strengejacke @mattansb, the maths geniuses: there's an easy formula for the biserial correlation (https://www.statisticshowto.datasciencecentral.com/point-biserial-correlation/), so that we don't have to depend on psych... care to have a look? 😁

Maybe like:

  set.seed(123)
  y <- rbinom(100, 1, .3)
  x <- rnorm(100)

  m1 <- mean(x[y == 1])
  m0 <- mean(x[y == 0])
  sn <- sd(x)
  q <- mean(y)
  p <- 1 - q

  ((m1 - m0) / sn) * sqrt(p * q)
  #> [1] 0.06151908

  y2 <- y
  y2[y == 0] <- "a"
  y2[y == 1] <- "f"
  y3 <- performance:::.factor_to_numeric(y2, lowest = 0)

  m1 <- mean(x[y == 1])
  m0 <- mean(x[y == 0])
  sn <- sd(x)
  q <- mean(y)
  p <- 1 - q

  ((m1 - m0) / sn) * sqrt(p * q)
  #> [1] 0.06151908

Created on 2020-03-23 by the reprex package (v0.3.0)

Oops, that was the point-biserial...

Here you are:

  own_biserial <- function(x, y) {
    cc <- complete.cases(x, y)
    x <- x[cc]
    y <- y[cc]
    y <- performance:::.factor_to_numeric(y, lowest = 0)
    m1 <- mean(x[y == 1])
    m0 <- mean(x[y == 0])
    sn <- sd(x)
    q <- mean(y)
    p <- 1 - q
    zp <- dnorm(qnorm(q))
    (((m1 - m0) * (p * q / zp)) / sd(x))
  }

  set.seed(123)
  y <- rbinom(100, 1, .3)
  x <- rnorm(100)
  own_biserial(x, y)
  #> [1] 0.08155037
  psych::biserial(x, y)
  #>            [,1]
  #> [1,] 0.08155037

  set.seed(456)
  y <- rbinom(100, 1, .3)
  x <- rnorm(100)
  own_biserial(x, y)
  #> [1] 0.02964972
  psych::biserial(x, y)
  #>            [,1]
  #> [1,] 0.02964972

Created on 2020-03-23 by the reprex package (v0.3.0)
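For readers following along, here is a minimal base-R sketch of how the two formulas in this thread relate. The function names `point_biserial_sketch()` and `biserial_sketch()` are hypothetical and are not part of the package. The sketch assumes `y` is already coded 0/1 with no missing values. One detail that may be worth noting: the point-biserial expression matches `cor(x, y)` exactly only when the population SD (dividing by n) is used, whereas the snippets above use `sd()`, which divides by n - 1.

```r
# Hypothetical helpers for illustration only (base R, y assumed coded 0/1)
point_biserial_sketch <- function(x, y) {
  m1 <- mean(x[y == 1])
  m0 <- mean(x[y == 0])
  q <- mean(y)                        # proportion of 1s
  p <- 1 - q
  n <- length(x)
  s_pop <- sd(x) * sqrt((n - 1) / n)  # population SD (divides by n)
  (m1 - m0) / s_pop * sqrt(p * q)
}

biserial_sketch <- function(x, y) {
  m1 <- mean(x[y == 1])
  m0 <- mean(x[y == 0])
  q <- mean(y)
  p <- 1 - q
  zp <- dnorm(qnorm(q))               # normal density at the cut point
  (m1 - m0) * (p * q / zp) / sd(x)    # same expression as own_biserial() above
}

set.seed(123)
y <- rbinom(100, 1, 0.3)
x <- rnorm(100)

# The point-biserial is Pearson's r between x and the 0/1 variable
all.equal(point_biserial_sketch(x, y), cor(x, y))

biserial_sketch(x, y)
```

The biserial value is the point-biserial value multiplied by `sqrt(p * q) / zp` (up to the n versus n - 1 factor in the SD), which is why it is always at least as large in absolute value.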

DominiqueMakowski · 2020-03-23T05:13:16Z

Something like this, it's just for illustrative purposes

strengejacke · 2020-03-23T10:50:57Z

@DominiqueMakowski See #56

@strengejacke

close #56 well done @strengejacke

DominiqueMakowski · 2020-03-24T02:15:52Z

Alright, what abouuuuut we submit 😁

is it sudden? definitely, but one has to use the motivation when it comes 🤷‍♂

mattansb · 2020-03-24T05:30:56Z

Awesome fig! Would make sure to note in the caption that Bayesian is also Pearson (:

On Tue, Mar 24, 2020, 04:16 Dominique Makowski wrote:
[image: figure1] <https://user-images.githubusercontent.com/8875533/77381260-638fcd00-6db8-11ea-9cad-9e72cd51ad03.png>

DominiqueMakowski · 2020-03-24T06:20:30Z

Would make sure to note in the caption that Bayesian is also Pearson (:

Is it tho? Isn't Pearson's correlation by nature frequentist corresponding to a particular formula? Isn't it more like Bayesian pseudo-Pearson 😅

mattansb · 2020-03-24T07:04:43Z

Pearson defined the population linear correlation Rho, which both the freq and Bayesian methods try to estimate :)

(While the other methods are estimating something else - close sometimes, but not Rho exactly.)
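To make the distinction concrete, here is a small base-R sketch using only `stats::cor()`. The simulated data are an assumption chosen for illustration (a monotonic but non-linear relationship); nothing here comes from the package itself.

```r
set.seed(1)
x <- rnorm(200)
y <- exp(x) + rnorm(200, sd = 0.5)  # monotonic, but clearly non-linear

pearson  <- cor(x, y, method = "pearson")   # estimates the linear correlation
spearman <- cor(x, y, method = "spearman")  # correlation of the ranks
kendall  <- cor(x, y, method = "kendall")   # based on concordant/discordant pairs

c(pearson = pearson, spearman = spearman, kendall = kendall)

# Spearman's coefficient is Pearson's r computed on the ranks,
# so it targets a different quantity than Pearson's r on the raw data
all.equal(spearman, cor(rank(x), rank(y)))
```

The three coefficients generally differ on data like these, which is the argument for keeping distinct column names rather than labelling everything "r" or "rho".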

DominiqueMakowski · 2020-03-24T07:08:30Z

On a side note, what are your thoughts about standardizing the output name? For now it's sometimes "r", "rho" or "tau". What about naming them all "r"? Or "rho"?

mattansb · 2020-03-24T07:26:21Z

They're not all estimates of Rho, so that would be misleading. (This is why Spearman isn't just a robust Pearson!)
I thought the easyverse style guide states that column names should be specific to what they contain (and not go the broom route)? 🤔

DominiqueMakowski · 2020-03-24T07:27:12Z

Oh, I thought that JOSS closed the papers processing but that you could still submit, but in fact they closed the submission form so we'll have to wait a bit until the situation improves 🤞

I thought the easyverse style guide states that column names should be specific to what they contain (and not go the broom route)? 🤔

Right right 😅

strengejacke · 2020-03-24T14:08:24Z

although it doesn't seem to impact the results? Why not leave it as is?

  m1 <- mean(var_x[var_y == 1])
  m0 <- mean(var_x[var_y == 0])

DominiqueMakowski · 2020-03-24T14:36:01Z

right...

DominiqueMakowski · 2020-03-24T14:43:39Z

that was via voice recognition when I was on my bike

on a home exercise bike you mean 👁

strengejacke · 2020-03-24T14:49:30Z

on a home exercise bike you mean 👁

No, I'm not doing any sports (besides biking to work and running up'n'down the stairs to pick up the things the kids just drop anywhere)

strengejacke · 2020-03-24T14:50:18Z

R/cor_test_biserial.R

-  m1 <- mean(var_x[var_y == 1])
-  m0 <- mean(var_x[var_y == 0])
+  m1 <- mean(var_x[var_y == unique(var_y)[1]])
+  m0 <- mean(var_x[var_y == unique(var_y)[2]])
  sn <- stats::sd(var_x)
  q <- mean(var_y)


q <- mean(var_y) computes the proportion, which only works when the values are 0/1...
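A tiny illustration of this point, using made-up values (the vectors below are hypothetical, not taken from the package tests): with a 1/2 coding, `mean()` no longer returns a proportion, and `unique()` returns values in order of first appearance, so indexing it with `[1]` and `[2]` depends on how the data happen to be sorted.

```r
# Hypothetical data: a binary variable coded 1/2 instead of 0/1
var_y <- c(2, 1, 1, 2, 1)

mean(var_y)    # 1.4, which is not a proportion
unique(var_y)  # 2 1 (order of first appearance, not sorted)

# Recoding to 0/1 first makes the mean a proportion again
y01 <- as.numeric(var_y == max(var_y))
mean(y01)      # 0.4, the share of observations in the higher category
```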

DominiqueMakowski · 2020-03-24T14:52:10Z

you forgot gym for fingers via coding in R

DominiqueMakowski · 2020-03-24T14:54:44Z

R/cor_test_biserial.R

+  if (.vartype(data[[binary]])$is_factor | .vartype(data[[binary]])$is_character) {
+    data[[binary]] <- as.numeric(as.factor(data[[binary]]))
+  }
+  data[[binary]] <- as.vector((data[[binary]] - min(data[[binary]], na.rm = TRUE)) / diff(range(data[[binary]], na.rm = TRUE), na.rm = TRUE))


@strengejacke it should be alright coz it's normalized here
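As a rough sketch of what the normalization in the diff above does, here is the same min-max rescaling applied to a made-up character vector (the data and variable names are hypothetical). It assumes the variable has exactly two distinct non-missing values; with more levels, intermediate values between 0 and 1 would appear.

```r
# Hypothetical example data: a two-level character variable with one NA
binary <- c("a", "f", "f", "a", NA)

# Characters/factors are first mapped to integer codes (1, 2, ...)
codes <- as.numeric(as.factor(binary))

# Min-max rescaling: the lower code becomes 0, the higher becomes 1
scaled <- (codes - min(codes, na.rm = TRUE)) / diff(range(codes, na.rm = TRUE))

scaled                      # 0 1 1 0 NA
mean(scaled, na.rm = TRUE)  # proportion of the higher level among non-missing values
```

After this step the variable is on a 0/1 scale, so taking its mean as a proportion behaves as intended.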

DominiqueMakowski · 2020-03-25T02:49:53Z

R/cor_test_biserial.R

+  # )[1])
+  m1 <- mean(var_x[var_y == unique(var_y)[2]])
+  m0 <- mean(var_x[var_y == unique(var_y)[1]])
+  sn <- stats::sd(var_x)


btw @strengejacke do you know what purpose this sn serves? It doesn't seem to be reused

I have rewritten the code once or twice, maybe I just forgot to remove it...

update CITATION and README with new see plot

a27b7e5

DominiqueMakowski added 4 commits March 22, 2020 18:45

add future blogpost as vignette

73d2b4a

prepare paper folder

5d09115

website

fcfaf2d

initialize paper

df89051

Update paper.md

f320804

isolate biserial correlations

1c4f36a

DominiqueMakowski commented Mar 23, 2020


DominiqueMakowski added 2 commits March 23, 2020 13:12

figure1 test

2ed9248

Create make_figures.R

b36d6db

Update paper.md

19242c5

strengejacke mentioned this pull request Mar 23, 2020

Biserial Correlation #56

Closed

DominiqueMakowski added 5 commits March 24, 2020 09:06

remove psych for biserial

7eb7877

close #56 well done @strengejacke

update fig and paper

9310bb7

styler + website

4e3f04d

figure captions

9e50dab

add refs

81393c9

strengejacke and others added 3 commits March 24, 2020 09:36

docs

3644669

method less verbose

f2ccc7e

@strengejacke I put it there but not 100% sure it's the best location

463ec99

DominiqueMakowski added 2 commits March 24, 2020 22:38

do not hardcode values

7a4a20b

hotfix

3605772

strengejacke requested changes Mar 24, 2020


DominiqueMakowski commented Mar 24, 2020


clean

fd9d4c6

DominiqueMakowski commented Mar 25, 2020


DominiqueMakowski added 2 commits March 25, 2020 10:50

Update cor_test_biserial.R

b3f05a1

Update cor_test_biserial.R

72828a3

DominiqueMakowski merged commit 99ef225 into master Mar 25, 2020
