lefser producing empty results when given pathway data #33

jmartin77777 · 2024-01-24T16:42:12Z

I am trying to use the lefser package to generate LDA effect size on a table of pathway relative abundances. I am able to use the tool to reproduce your example case with the zeller14 dataset. But when I try loading my own data it gives me an empty table. I am starting with an input tab-delimited text file that looks like this:

ID	COMPARISON	12DICHLORETHDEG-PWY	1CMET2-PWY
MB20-007-MB20-007_01	CTRL	0	366.878
MB19-001-MB19-001_01	CTRL	4.50847	318.903
MB20-010-MB20-010_01	PD	0	439.004
MB20-012-MB20-012_01	PD	151.322	433.688
MB20-014-MB20-014_01	PD	98.831	435.307

with each row being a sample, column 1 if the samplename, column 2 is the class, further columns are pathway CPM abundance values generated by Humann3. My R script looks like this:

data <- read.csv("Campbell_PWY_abundance.tsv", sep = "\t", header = TRUE, check.names = FALSE)
group <- data$COMPARISON # Capture the 'COMPARISON' column
abundance_data <- data[, -c(1,2)] # Exclude the 'ID' & 'COMPARISON' column
abundance_data_transposed <- t(abundance_data) # Transpose table
sample_metadata <- DataFrame(COMPARISON = group, row.names = data$ID)
se <- SummarizedExperiment(assays = list(counts = abundance_data_transposed),
                           colData = sample_metadata)
se_relative <- relativeAb(se)
lefse_results <- lefser(se_relative, groupCol = "COMPARISON")

It does generate the lefse_results object, but its just an empty table. I was initially concerned that maybe lefser was not happy with the features being pathways, and not the usual formatted taxonomy. But I then tried importing a taxonomy based table (of the same format as shown above) and am getting the same result. Can you help me troubleshoot what I am doing wrong?

The text was updated successfully, but these errors were encountered:

lwaldron · 2024-02-01T14:57:48Z

@jmartin77777 can you reproduce this with a dataset from curatedMetagenomicData, so that we can reproduce it?

jmartin77777 · 2024-02-07T22:09:55Z

Sorry for the delayed response. I'll try and find a curated dataset to test. Can you recommend any small datasets that might be appropriate?

lwaldron · 2024-02-08T19:09:05Z

AsnicarF_2017 from curatedMetagenomicData has 24 samples and includes HUMAnN3 pathways https://waldronlab.io/curatedMetagenomicData/articles/available-studies.html

lwaldron · 2024-03-18T10:17:30Z

With issues #32 and #35 resolved in release and devel, things should be clearer now as you will see a message if there are no results that pass both the statistical significance threshold and the LDA threshold. You can test which of these is responsible for the lack of results, such as follows (such as setting a p-value threshold to 1, or a LDA threshold to 0).

suppressPackageStartupMessages({
  library(lefser)
  library(curatedMetagenomicData)
})
zeller <-
  curatedMetagenomicData("ZellerG_2014.pathway_abundance",
                         counts = TRUE,
                         dryrun = FALSE)[[1]]
#> Cannot connect to ExperimentHub server, using 'localHub=TRUE' instead
#> Using 'localHub=TRUE'
#>   If offline, please also see BiocManager vignette section on offline use
zeller <- zeller[, zeller$study_condition != "adenoma"]
zeller <- relativeAb(zeller)

lefse_results1 <-
  lefser(
    zeller,
    groupCol = "study_condition",
    kruskal.threshold = 0.05,
    lda.threshold = 2
  )
#> The outcome variable is specified as 'study_condition' and the reference category is 'control'.
#>  See `?factor` or `?relevel` to change the reference category.
head(lefse_results1)
#>                                                                 Names    scores
#> 1                 UNINTEGRATED|g__Ruminococcus.s__Ruminococcus_bromii -3.473240
#> 2 UNINTEGRATED|g__Lachnospiraceae_unclassified.s__Eubacterium_rectale -3.423555
#> 3                 UNINTEGRATED|g__Anaerostipes.s__Anaerostipes_hadrus -3.200505
#> 4                            UNINTEGRATED|g__Blautia.s__Blautia_obeum -3.199587
#> 5                         UNINTEGRATED|g__Blautia.s__Blautia_wexlerae -3.160973
#> 6           UNINTEGRATED|g__Bifidobacterium.s__Bifidobacterium_longum -3.037694

lefse_results2 <-
  lefser(
    zeller,
    groupCol = "study_condition",
    kruskal.threshold = 1e-6,
    lda.threshold = 2
  )
#> The outcome variable is specified as 'study_condition' and the reference category is 'control'.
#>  See `?factor` or `?relevel` to change the reference category.
#> No significant features found.
head(lefse_results2)
#> [1] Names  scores
#> <0 rows> (or 0-length row.names)

lefse_results3 <- 
  lefser(
    zeller, 
    groupCol = "study_condition", 
    kruskal.threshold = 0.05,
    lda.threshold = 2000
  )
#> The outcome variable is specified as 'study_condition' and the reference category is 'control'.
#>  See `?factor` or `?relevel` to change the reference category.
#> No significant features found.
head(lefse_results3)
#> [1] Names  scores
#> <0 rows> (or 0-length row.names)

^{Created on 2024-03-18 with reprex v2.1.0}

lwaldron assigned asyakhl Feb 1, 2024

lwaldron closed this as completed Mar 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lefser producing empty results when given pathway data #33

lefser producing empty results when given pathway data #33

jmartin77777 commented Jan 24, 2024

lwaldron commented Feb 1, 2024

jmartin77777 commented Feb 7, 2024

lwaldron commented Feb 8, 2024

lwaldron commented Mar 18, 2024

lefser producing empty results when given pathway data #33

lefser producing empty results when given pathway data #33

Comments

jmartin77777 commented Jan 24, 2024

lwaldron commented Feb 1, 2024

jmartin77777 commented Feb 7, 2024

lwaldron commented Feb 8, 2024

lwaldron commented Mar 18, 2024