dev #55
Codecov Report
@@            Coverage Diff             @@
##           master      #55      +/-   ##
==========================================
+ Coverage   73.69%   74.30%   +0.61%
==========================================
  Files          29       31       +2
  Lines         764      833      +69
==========================================
+ Hits          563      619      +56
- Misses        201      214      +13
@strengejacke @mattansb @IndrajeetPatil although JOSS submissions are closed for now anyway, I just felt inspired to initialize the paper. It can be fast and light work since the scope is quite narrow. Feel free to expand and add details and things. I'd like to get a nice main figure, but I'm not sure what it would look like... ideas?
I really like the last two plots here: |
yeah, these will go in as part of the examples, but I was thinking about like a main figure 1 that would summarize the goal and features of the package... ^^ Images that come to my mind are:
So I don't know... |
Hmmm.. seeing as how the methods aren't new, I am inclined to stick to plots we can actually show with our package/s? But I'm also not sure what would be "nice" and what would be "informative" about the pkg... 🤷♂️ |
For bayestestR, we also had in the intro nice plots to "visualize" HDIs and BFs etc., although these were not new methods either, and then in the examples, we had an example of what you could obtain using it. Here I have in mind the one figure that will be the illustration of the paper... |
R/cor_test_biserial.R
Outdated
#' @keywords internal
.cor_test_biserial_biserial <- function(data, x, y, continuous, binary, ci){

  # TODO: get rid of psych https://www.statisticshowto.datasciencecentral.com/point-biserial-correlation/
@strengejacke @mattansb, the maths geniuses: there's an easy formula for the biserial correlation (https://www.statisticshowto.datasciencecentral.com/point-biserial-correlation/), so that we don't depend on psych... care to have a look? 😁
Maybe like:
set.seed(123)
y <- rbinom(100, 1, .3)
x <- rnorm(100)
m1 <- mean(x[y == 1])
m0 <- mean(x[y == 0])
sn <- sd(x)
q <- mean(y)
p <- 1 - q
((m1 - m0) / sn) * sqrt(p * q)
#> [1] 0.06151908
y2 <- y
y2[y == 0] <- "a"
y2[y == 1] <- "f"
# recode the character labels back to 0/1 and reuse the same formula
y3 <- performance:::.factor_to_numeric(y2, lowest = 0)
m1 <- mean(x[y3 == 1])
m0 <- mean(x[y3 == 0])
sn <- sd(x)
q <- mean(y3)
p <- 1 - q
((m1 - m0) / sn) * sqrt(p * q)
#> [1] 0.06151908
Created on 2020-03-23 by the reprex package (v0.3.0)
Oops, that was the point-biserial...
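For reference, the first snippet in this thread is the point-biserial form. A minimal sketch, assuming simulated data and the hypothetical names `s_pop` and `r_pb_sample`, suggests that the formula matches `cor(x, y)` exactly only when the population SD (denominator n) is used; with `sd()` (denominator n - 1) the value shrinks by a factor of sqrt((n - 1) / n).

```r
# Sketch only: simulated data, not taken from the package.
set.seed(1)
y <- rbinom(200, 1, 0.3)          # binary variable coded 0/1
x <- rnorm(200) + 0.5 * y         # continuous variable
n <- length(x)

m1 <- mean(x[y == 1])
m0 <- mean(x[y == 0])
q <- mean(y)                      # proportion of 1s
p <- 1 - q

# population SD (divides by n), as the textbook formula assumes
s_pop <- sqrt(mean((x - mean(x))^2))
r_pb <- (m1 - m0) / s_pop * sqrt(p * q)

# same formula with sd(), which divides by n - 1
r_pb_sample <- (m1 - m0) / sd(x) * sqrt(p * q)

all.equal(r_pb, cor(x, y))
all.equal(r_pb_sample, cor(x, y) * sqrt((n - 1) / n))
```

For n = 100 the difference is about half a percent, so it rarely matters in practice, but it explains small mismatches against `cor()`.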
Here you are:
own_biserial <- function(x, y) {
  # keep only complete pairs
  cc <- complete.cases(x, y)
  x <- x[cc]
  y <- y[cc]
  # recode the binary variable to 0/1
  y <- performance:::.factor_to_numeric(y, lowest = 0)
  m1 <- mean(x[y == 1])
  m0 <- mean(x[y == 0])
  sn <- sd(x)
  q <- mean(y)
  p <- 1 - q
  # normal density at the quantile that splits the two groups
  zp <- dnorm(qnorm(q))
  (((m1 - m0) * (p * q / zp)) / sd(x))
}
set.seed(123)
y <- rbinom(100, 1, .3)
x <- rnorm(100)
own_biserial(x, y)
#> [1] 0.08155037
psych::biserial(x, y)
#> [,1]
#> [1,] 0.08155037
set.seed(456)
y <- rbinom(100, 1, .3)
x <- rnorm(100)
own_biserial(x, y)
#> [1] 0.02964972
psych::biserial(x, y)
#> [,1]
#> [1,] 0.02964972
Created on 2020-03-23 by the reprex package (v0.3.0)
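Written out, and assuming the notation M_1, M_0 for the group means, s_x for `sd(x)`, q for the proportion of 1s, p = 1 - q, and φ, Φ for the standard normal density and distribution function (symbols introduced here for illustration), the two snippets in this thread appear to compute:

$$
r_{pb} = \frac{M_1 - M_0}{s_x}\,\sqrt{p\,q},
\qquad
r_{b} = \frac{M_1 - M_0}{s_x}\cdot\frac{p\,q}{\varphi\!\left(\Phi^{-1}(q)\right)}
      = r_{pb}\cdot\frac{\sqrt{p\,q}}{\varphi\!\left(\Phi^{-1}(q)\right)}
$$

Here φ(Φ⁻¹(q)) corresponds to `dnorm(qnorm(q))` in `own_biserial()`. By symmetry of the normal density it equals φ(Φ⁻¹(p)), so it does not matter which group proportion is passed to `qnorm()`.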
Awesome fig!
Would make sure to note in the caption that Bayesian is also Pearson (:
…--
Mattan S. Ben-Shachar, PhD student
Department of Psychology & Zlotowski Center for Neuroscience
Ben-Gurion University of the Negev
The Developmental ERP Lab
On Tue, Mar 24, 2020, 04:16 Dominique Makowski ***@***.***> wrote:
[image: figure1]
<https://user-images.githubusercontent.com/8875533/77381260-638fcd00-6db8-11ea-9cad-9e72cd51ad03.png>
Alright, what abouuuuut we submit 😁
is it sudden? definitely, but one has to use the motivation when it comes
🤷♂
|
Is it tho? Isn't Pearson's correlation by nature frequentist corresponding to a particular formula? Isn't it more like Bayesian pseudo-Pearson 😅 |
Pearson defined the population linear correlation Rho, which both the freq and Bayesian methods try to estimate :) (While the other methods are estimating something else - close sometimes, but not Rho exactly.) |
On a side note, what are your thoughts about standardizing the output name, for now it's sometimes "r", "rho" or "tau". What about naming them all "r"? or "rho"? |
They're not all estimates of Rho, so that would be misleading. (This is why Spearman isn't just a robust Pearson!)
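A small illustration of that point, as a sketch on simulated data (the variable names are hypothetical): for a monotone but non-linear relationship, Spearman's coefficient is 1 while Pearson's r stays below 1, so the two are not estimating the same quantity.

```r
# Sketch only: y is a strictly increasing, non-linear function of x.
set.seed(42)
x <- rnorm(500)
y <- exp(x)

r_pearson <- cor(x, y, method = "pearson")      # linear association
rho_spearman <- cor(x, y, method = "spearman")  # rank (monotone) association

r_pearson
rho_spearman
```

Reporting both under a single column name such as "r" or "rho" would hide that difference.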
Oh, I thought that JOSS closed the papers processing but that you could still submit, but in fact they closed the submission form so we'll have to wait a bit until the situation improves 🤞
Right right 😅 |
m1 <- mean(var_x[var_y == 1])
m0 <- mean(var_x[var_y == 0]) |
right... |
on a home exercise bike you mean 👁 |
No, I'm not doing any sports (besides riding to work by bike and running up'n'down the stairs to pick up things from the kids they just drop anywhere)
m1 <- mean(var_x[var_y == 1])
m0 <- mean(var_x[var_y == 0])
m1 <- mean(var_x[var_y == unique(var_y)[1]])
m0 <- mean(var_x[var_y == unique(var_y)[2]])
sn <- stats::sd(var_x)
q <- mean(var_y)
q <- mean(var_y) computes the proportion, which only works when the values are 0/1...
you forgot gym for fingers via coding in R |
if (.vartype(data[[binary]])$is_factor | .vartype(data[[binary]])$is_character) {
  data[[binary]] <- as.numeric(as.factor(data[[binary]]))
}
data[[binary]] <- as.vector((data[[binary]] - min(data[[binary]], na.rm = TRUE)) / diff(range(data[[binary]], na.rm = TRUE), na.rm = TRUE))
@strengejacke it should be alright coz it's normalized here
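To make the normalization step concrete, here is a minimal sketch, assuming a hypothetical helper `to_binary01()` that mirrors the min/range rescaling in the diff above (it is not the package's function). Once the two levels are mapped to 0 and 1, `mean()` returns the proportion of the upper level.

```r
# Hypothetical helper, for illustration only.
to_binary01 <- function(v) {
  # factors and characters are first turned into integer codes (1, 2)
  if (is.factor(v) || is.character(v)) {
    v <- as.numeric(as.factor(v))
  }
  rng <- range(v, na.rm = TRUE)
  # rescale so the lower level becomes 0 and the upper level becomes 1
  as.vector((v - rng[1]) / diff(rng))
}

raw <- c("a", "f", "a", "f", "f", NA)
b <- to_binary01(raw)
b                        # 0 1 0 1 1 NA
mean(b, na.rm = TRUE)    # 0.6, the proportion of "f"

# Without rescaling, codes 1/2 give a mean that is not a proportion
mean(c(1, 2, 1, 2, 2))   # 1.6
```

This assumes the variable has exactly two distinct non-missing values; with a single level `diff(rng)` is 0 and the division is undefined.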
R/cor_test_biserial.R
Outdated
# )[1])
m1 <- mean(var_x[var_y == unique(var_y)[2]])
m0 <- mean(var_x[var_y == unique(var_y)[1]])
sn <- stats::sd(var_x)
btw @strengejacke do you know what purpose this sn serves? It doesn't seem to be re-used
I have rewritten the code once or twice, maybe I just forgot to remove it...
No description provided.