Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] CBPM not aligned with paper #170

Closed
samihamdan opened this issue Aug 30, 2022 · 5 comments
Closed

[BUG] CBPM not aligned with paper #170

samihamdan opened this issue Aug 30, 2022 · 5 comments
Labels
bug Something isn't working
Milestone

Comments

@samihamdan
Copy link
Collaborator

The CBPM method we implemented has slight variations from the original: using the mean instead of sum for summation; the weighting might also not be aligned.
While this should not make a fundamental difference, users should at least have the option to exactly follow the paper.

@samihamdan samihamdan added the enhancement New feature or request label Aug 30, 2022
@kaurao
Copy link
Collaborator

kaurao commented Sep 1, 2022

the code from authors is available here: https://www.nitrc.org/projects/bioimagesuite/ (see file codeshare_behavioralprediction.m)
the only difference I can notice is what @samihamdan mentioned, using sum instead of mean.

@NevenaK
Copy link

NevenaK commented Sep 1, 2022

Thanks for the issue @samihamdan! I am using the CBPM implementation and would like to see if/what difference sum vs mean makes. And would also like to use it as the authors intended :) If I can help adjusting that, let me know :)

@kaurao
Copy link
Collaborator

kaurao commented Sep 1, 2022

The mean versus sum should not make any difference to predictions as the difference is same linear transformation across observations (just multiplying features with a constant). The linear model used for learning will take care of this and the predictions will be the same. The difference will be the weights assigned to the two features (pos and neg) but they will be again proportional so that should not be an issue as well. But I still agree that sum should be available to users.

Here is empirical demonstration of this using R ;)

# generate data
set.seed(1)
x1 = rnorm(100)
x2 = rnorm(100)
y = x1 + x2
# get a linear model
l1 = lm(y~., data=data.frame(y=y,x1=x1,x2=x2))
# get a linear model with scaled x
l2 = lm(y~., data=data.frame(y=y,x1=10*x1,x2=5*x2))
# plot
plot(l1$fitted.values, l2$fitted.values, xlab='l1', ylab='l2')

image

@NevenaK
Copy link

NevenaK commented Sep 1, 2022

Nice, thank you!

@fraimondo fraimondo added bug Something isn't working needs thinking We need more discussion around this topic labels Sep 9, 2022
@samihamdan
Copy link
Collaborator Author

This PR: #221 introduces sum as default option for cbpm, but still allows other aggregation functions.

@fraimondo fraimondo added this to the v0.3.0 milestone Apr 6, 2023
@fraimondo fraimondo removed needs thinking We need more discussion around this topic enhancement New feature or request labels Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants