negative p-values #2

Closed
jamesdalg opened this issue May 2, 2019 · 4 comments

Comments

@jamesdalg

Using your example code, I get negative p-values, as of version 1.4.2.
library(MXM)
library(magrittr)  ## provides %>%

set.seed(5)
## assume these are longitudinal data, each column is a variable (or feature)
dataset <- matrix( rnorm(400 * 100), ncol = 100 )
id <- rep(1:80, each = 5)  ## 80 subjects
reps <- rep( seq(4, 12, by = 2), 80)  ## 5 time points for each subject
## dataset contains the regression coefficients of each subject's values on the
## reps (which is assumed to be time in this example)
target <- rep(0:1, each = 200)
a <- MMPC.timeclass(target, reps, id, dataset)
a@pvalues %>% summary()

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-4.01762 -1.39835 -0.68720 -0.98512 -0.37326 -0.01365

@statlink

statlink commented May 2, 2019

This is the logarithm of the p-values. The reason for this is mentioned in the help files.
If the p-value is too small, R rounds it to zero. If you have the logarithm, though, you can still see which of two or more p-values is smaller.
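
A minimal base-R sketch of the rounding problem described above. It does not use MXM; the z statistics are made-up values chosen only to force underflow, and the last line assumes the reported values are natural logarithms.

```r
# Upper-tail p-values for two very large (hypothetical) z statistics.
z <- c(40, 50)

# On the ordinary scale both p-values underflow to exactly zero,
# so they can no longer be ranked against each other.
p <- pnorm(z, lower.tail = FALSE)
p_underflow <- all(p == 0)

# On the log scale both stay finite, and the second is clearly smaller.
logp <- pnorm(z, lower.tail = FALSE, log.p = TRUE)
ranked <- logp[2] < logp[1]

# When no underflow occurs, exp() recovers the ordinary p-value.
# Assumption: natural log. The smallest value reported above is -4.01762.
exp(-4.01762)  # about 0.018
```

Under that assumption, a negative entry in `a@pvalues` is simply the log of a p-value between 0 and 1, and more negative means a smaller p-value.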

@jamesdalg
Author

Thank you so much for responding. I've been trying to figure this out for days. The help file doesn't explain this for MXM::MMPC.timeclass(), at least not as it is written in https://cran.r-project.org/web/packages/MXM/MXM.pdf:
"pvalues For each feature included in the dataset, this vector reports the strength of its
association with the target in the context of all other variables. Particularly, this
vector reports the max p-values found when the association of each variable
with the target is tested against different conditional sets. Lower values indicate
higher association."

Throughout the document, you mention log p-values, but the entry for this particular function doesn't make it clear that the values are logged. I think this small change to the documentation would be a huge help to users of the package. Just a suggestion.

@statlink

statlink commented May 2, 2019

Hi James, I will change the help file. I will add your name in the acknowledgements for this.

@jamesdalg
Author

Wow! Thanks! Glad to help.

2 participants