run_lefse, subgroup argument gives Error in if ((sx < sy) != dir_cmp || sx == sy) { : missing value where TRUE/FALSE needed #62
Comments
I have the same error.... |
Can you post the result of the following?
|
Yes, of course
The function runs if I have only |
I remember now that there is a bug in the code for run_lefse, right in the area of
Alternatively, you can start using another package called MicrobiotaProcess. It will let you run lefse with a second group as follows:
It is pretty simple to use but powerful. Overall, it is a better package for 16S microbiome analysis. |
Thank you! Great package! I didn't know about it. |
@cpavloud, yeah, it is an up-and-coming R package for microbiome analysis. The link you had is actually for the older version, though the commands listed in that tutorial still work. I like the older version better than the newer one. I am gonna close this issue, since I don't think the developer will fix it any time soon. |
@akhst7 @cpavloud Thanks for your interest in microbiomeMarker, and sorry for the late reply. Could you provide a reproducible example that would help me fix this issue? For more details on how to make a great minimal reproducible example, see https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and https://www.tidyverse.org/help/#reprex. |
I cannot use reprex to create a reproducible example in the form you want, as it only renders the last line of code in the console. I am using R version 4.1.1 with tidyverse (1.3.1), phyloseq (1.38.0) and microbiomeMarker (1.0.2). When I try to run lefse with a group and a subgroup
I get this error
But when I run it with
Also, when I run it with
This is what my sample_data looks like
|
@cpavloud Please install the latest development version (1.3.2) from GitHub, in which your issue has been fixed.
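For anyone landing here later, a minimal sketch of one way to install a development version from GitHub. It assumes the repository is `yiluheihei/microbiomeMarker` and uses the `remotes` package, neither of which is stated in the comment above, so adjust as needed.

```r
# Assumption: the development version lives in the GitHub repository
# "yiluheihei/microbiomeMarker"; change this if the repository has moved.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("yiluheihei/microbiomeMarker")

# Check that the installed version is at least the one mentioned in this thread.
packageVersion("microbiomeMarker") >= "1.3.2"
```

Restart the R session after installing so that the new version is the one that gets loaded.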
|
Ok! That worked! |
Could you do me a big favor? If you have a chance, could you run lefse in MicrobiotaProcess, compare it to microbiomeMarker, and post the results somewhere so that I can see them? Thanks. |
@akhst7 lefse in microbiomeMarker is a reimplementation of LEfSe. For details of the method, please see the LEfSe paper. |
This is quite interesting. The result of
is that
Also, as probably expected, the result of the
is that
However, the
|
OK, that's what I thought you would get. I get exactly the same result. Initially I thought it was due to the different default normalization approaches used by MP and MB. MP uses "hellinger" (I think, according to the vignette) while MB uses "cpm". But then, when I swapped MP's RelRareAbundance with MB's cpm and ran diff_analysis, I got the same result. I have not done the opposite, and I have not yet manually calculated lefse (basically a Kruskal-Wallis test followed by LDA) with Huttenhower's lefse package, but it appears that different normalization significantly affects lefse results, particularly for data with borderline differential abundance. Perhaps yiluheihei could address which normalization method we should be using for lefse. |
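To make the "test first, LDA second" idea concrete, here is a rough base R sketch of only the first screening step as I understand it from the LEfSe paper: a Kruskal-Wallis test per feature across classes. The abundances, the class labels and the 0.05 cutoff are made-up assumptions for illustration; this is not the code of microbiomeMarker or MicrobiotaProcess.

```r
# Toy abundances of a single taxon in ten samples (hypothetical numbers).
abundance <- c(1, 2, 3, 4, 5, 10, 11, 12, 13, 14)

# Hypothetical class labels: five samples per class.
grp <- factor(rep(c("control", "case"), each = 5))

# Non-parametric test for a difference between classes (stats::kruskal.test).
kw <- kruskal.test(abundance ~ grp)

# Assumed significance cutoff for the screening step.
alpha <- 0.05

# A feature would move on to the later steps only if it passes this screen.
passes_screen <- kw$p.value < alpha
passes_screen
```

Only features that pass a screen like this would then get an LDA effect size, which is where the scale of the normalized values can start to matter.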
@akhst7 As you said, the choice of normalization method has a large effect on the result of differential analysis, and there is no consensus on which method is better than the others. For lefse, I recommend using "CPM", the default normalization method in the LEfSe paper, and trying other methods if the result is not good enough. Moreover, microbiomeMarker currently provides a function
library(microbiomeMarker)
data(kostic_crc)
# Remove taxa not seen in at least 20% of the samples
kostic_crc <- phyloseq::filter_taxa(
kostic_crc,
function(x) sum(x > 0) > (0.2*length(x)), TRUE)
compare_res <- compare_DA(
ps = kostic_crc,
group = "DIAGNOSIS",
methods = "lefse",
args = list(lefse = list(list(norm = "CPM"), list(norm = "TSS"))),
BPPARAM = BiocParallel::SnowParam(workers = 2, progressbar = TRUE),
n_rep = 2,
effect_size = 5)
#> | | | 0% | |================== | 25% | |=================================== | 50% | |==================================================== | 75% | |======================================================================| 100%
compare_summary <- summary(compare_res)
#> Warning: Best score is <= 0.
#> You might require to preprocessing your data or re-run with a higher effect size.
compare_summary
#>   call
#> 2 run_lefse(ps = "kostic_crc", group = "DIAGNOSIS", taxa_rank = "none", norm = "TSS")
#> 1 run_lefse(ps = "kostic_crc", group = "DIAGNOSIS", taxa_rank = "none", norm = "CPM")
#>         auc        fpr     power       fdr      score score_0.05 score_0.95
#> 2 0.5000000 0.00000000 0.0000000 0.0000000  0.0000000       0.00  0.0000000
#> 1 0.8755844 0.03832834 0.3333333 0.6388889 -0.5136941      -0.55 -0.4823232
Created on 2022-06-07 by the reprex package (v2.0.1)
The higher the
Hope this helps. |
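As a side note on the two normalizations compared above, here is a small base R sketch under the usual definitions, which I am assuming rather than quoting from the package: TSS divides each sample by its library size, and CPM rescales each sample to sum to one million. The count matrix is made up.

```r
# Hypothetical count matrix: rows are taxa, columns are samples.
counts <- matrix(
  c(10, 0, 90,
    5, 20, 75),
  nrow = 3,
  dimnames = list(c("taxonA", "taxonB", "taxonC"), c("s1", "s2"))
)

# TSS (total sum scaling): each sample sums to 1.
tss <- sweep(counts, 2, colSums(counts), "/")

# CPM (counts per million): each sample sums to 1e6.
cpm <- tss * 1e6

colSums(tss)
colSums(cpm)
```

Under these assumed definitions the two differ only by a constant factor, so the ordering of taxa within a sample is unchanged. Any difference in the lefse output would then have to come from steps that are sensitive to the absolute scale, for example an effect size cutoff.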
Thanks. It is much clearer now. |
I have the following phyloseq object:
I run the following:
I get the following error:
Here is the sample_data of ps:
What am I missing here?
Thanks.
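For context on the error message itself, here is a minimal base R sketch of how this kind of error can arise in general. The names `sx`, `sy` and `dir_cmp` are taken from the error text; the values, and the idea that one of them ends up as `NA`, are assumptions for illustration and not a trace of what run_lefse actually does.

```r
# Hypothetical values: sx is missing, for example because a statistic
# was computed for a subgroup that had no usable samples.
sx <- NA_real_
sy <- 1.5
dir_cmp <- TRUE

# Comparisons involving NA give NA, and NA || NA is still NA.
cond <- (sx < sy) != dir_cmp || sx == sy
is.na(cond)

# if() cannot branch on NA, so it raises an error; capture its message.
msg <- tryCatch(
  if (cond) "reject" else "keep",
  error = function(e) conditionMessage(e)
)
msg
```

If that is what happens here, checking the group and subgroup columns of sample_data for `NA` values or for very small subgroups would be a reasonable first step.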