Error in edgeR analysis: library size of zero detected when executing DGEList function #134

Closed
4 tasks
BioinfoHub-PeiQinNg opened this issue Mar 16, 2022 · 6 comments
Labels
bug Something isn't working

Comments

@BioinfoHub-PeiQinNg

I have checked the following places for your error:

Description of the bug

I tried running the nf-core/smrnaseq workflow. It runs until the edgeR step, where the following error appears:
<Command exit status:
1

Command output:
$mature
[1] "SAGCFN_22_00196.mature.stats" "SAGCFN_22_00194.mature.stats"
[3] "SAGCFN_22_00188.mature.stats" "SAGCFN_22_00190.mature.stats"
[5] "SAGCFN_22_00189.mature.stats" "SAGCFN_22_00193.mature.stats"
[7] "SAGCFN_22_00186.mature.stats" "SAGCFN_22_00187.mature.stats"
[9] "SAGCFN_22_00197.mature.stats" "SAGCFN_22_00191.mature.stats"
[11] "SAGCFN_22_00195.mature.stats" "SAGCFN_22_00192.mature.stats"

$hairpin
[1] "SAGCFN_22_00187.hairpin.stats" "SAGCFN_22_00192.hairpin.stats"
[3] "SAGCFN_22_00191.hairpin.stats" "SAGCFN_22_00188.hairpin.stats"
[5] "SAGCFN_22_00186.hairpin.stats" "SAGCFN_22_00195.hairpin.stats"
[7] "SAGCFN_22_00189.hairpin.stats" "SAGCFN_22_00197.hairpin.stats"
[9] "SAGCFN_22_00196.hairpin.stats" "SAGCFN_22_00190.hairpin.stats"
[11] "SAGCFN_22_00193.hairpin.stats" "SAGCFN_22_00194.hairpin.stats"

Command error:
Loading required package: limma
Loading required package: edgeR
Loading required package: statmod
Loading required package: data.table
Loading required package: gplots

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

  lowess

Error in cpm.default(y$counts, lib.size = lib.size, log = log, prior.count = prior.count) :
library sizes should be finite and non-negative
Calls: cpm -> cpm.DGEList -> cpm.default
In addition: Warning message:
In DGEList(counts = data, genes = rownames(data)) :
library size of zero detected
Execution halted

Work dir:
/cancer/storage/SAGC/projects/SAGCQA0237-RainerHaberberger/raw_data/work/9a/6eedaca4a80898bec9761a56b5a926

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line>

Steps to reproduce

Steps to reproduce the behaviour:

  1. Command line:
    'nextflow run nf-core/smrnaseq --input '*.fastq.gz' --genome GRCh38 -profile sahmri --custom_config_base 'https://raw.githubusercontent.com/sagc-bioinformatics/configs/sahmri' --mature ../miRBase/mature.fa.gz --hairpin ../miRBase/hairpin.fa.gz -resume -r 1.1.0'

  2. See error:
< Error in cpm.default(y$counts, lib.size = lib.size, log = log, prior.count = prior.count) :
library sizes should be finite and non-negative
Calls: cpm -> cpm.DGEList -> cpm.default
In addition: Warning message:
In DGEList(counts = data, genes = rownames(data)) :
library size of zero detected
Execution halted>

Expected behaviour

Log files

Have you provided the following extra information/files:

  • The command used to run the pipeline
  • The .nextflow.log file

System

  • Hardware:
  • Executor:

Nextflow Installation

  • Version:

Container engine

  • Engine:
  • Image tag: <nf-core 1.1.0>
BioinfoHub-PeiQinNg added the bug label Mar 16, 2022
@apeltzer
Member

Can you check whether this also happens on dev, the version that will be released soon?
You can run it with -r dev and provide a samplesheet for your samples.

@apeltzer
Member

@lpantano any idea what might be causing this? I'm a bit clueless here, as we see the same behaviour on dev too...

apeltzer added a commit that referenced this issue Mar 17, 2022
@apeltzer apeltzer mentioned this issue Mar 17, 2022
11 tasks
@apeltzer
Member

@BioinfoHub-PeiQinNg - you could give -r fix-issue-134 a try. We're running some tests to see whether it fixes the issue permanently (I believe it does, by filtering out samples with zero expression across all genes).
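
For anyone hitting the same warning outside the pipeline, here is a minimal sketch of that kind of filter. It is not the code from the fix branch: the matrix, sample names, and miRNA names are made up for illustration, and the edgeR step only runs if the package is installed.

```r
# Hypothetical count matrix: rows are miRNAs, columns are samples.
# All names and values are invented for this example.
counts <- matrix(
  c(10, 0, 5,
     3, 0, 8,
     0, 0, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("miR-a", "miR-b", "miR-c"), c("s1", "s2", "s3"))
)

# The library size of a sample is the sum of its column.
lib_sizes <- colSums(counts)

# Keep only samples with at least one counted read.
keep <- lib_sizes > 0
filtered <- counts[, keep, drop = FALSE]

if (any(!keep)) {
  message("Dropping samples with zero library size: ",
          paste(colnames(counts)[!keep], collapse = ", "))
}

# With edgeR available, the filtered matrix can be passed on as before.
if (requireNamespace("edgeR", quietly = TRUE)) {
  dge <- edgeR::DGEList(counts = filtered, genes = rownames(filtered))
  log_cpm <- edgeR::cpm(dge, log = TRUE)
}
```

Dropping a sample silently can hide an upstream problem (for example, nothing mapped for that sample), so it is probably worth logging which samples were removed, as the `message()` call does here.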

@lpantano
Contributor

I believe @apeltzer is right; some people have reproduced this in the past, and that should fix it.

apeltzer added a commit that referenced this issue Mar 18, 2022
@apeltzer
Member

Fixed in #136

@rrashford

Hello,

I am running into this problem, but I am using RStudio instead of the command line because I'm pretty new to bioinformatics. I am trying to obtain normalization factors from my alignment files using the edgeR package. I run regionCounts, which takes my alignment BAM files (pe.bams, the list of all .bam files in my working directory), the consensus peak set for my experiment, and my parameters, and then feed the result into asDGEList to get aveLogCPM. This produces the following warning:

> peak.counts <- regionCounts(pe.bams, all.peaks.GRanges, param=param)
> peak.abundances <- aveLogCPM(asDGEList(peak.counts)) 
Warning message:
In (function (counts = matrix(0, 0, 0), lib.size = colSums(counts),  :
  library size of zero detected

Is there some way to fix this by running some code within RStudio so that peak.counts picks up counts from the reads? Any suggestions greatly appreciated!

> sessionInfo
function (package = NULL) 
{
    z <- list()
    z$R.version <- R.Version()
    z$platform <- z$R.version$platform
    if (nzchar(.Platform$r_arch)) 
        z$platform <- paste(z$platform, .Platform$r_arch, sep = "/")
    z$platform <- paste0(z$platform, " (", 8 * .Machine$sizeof.pointer, 
        "-bit)")
    z$locale <- Sys.getlocale()
    z$running <- osVersion
    z$RNGkind <- RNGkind()
    if (is.null(package)) {
        package <- grep("^package:", search(), value = TRUE)
        keep <- vapply(package, function(x) x == "package:base" || 
            !is.null(attr(as.environment(x), "path")), NA)
        package <- .rmpkg(package[keep])
    }
    pkgDesc <- lapply(package, packageDescription, encoding = NA)
    if (length(package) == 0) 
        stop("no valid packages were specified")
    basePkgs <- sapply(pkgDesc, function(x) !is.null(x$Priority) && 
        x$Priority == "base")
    z$basePkgs <- package[basePkgs]
    if (any(!basePkgs)) {
        z$otherPkgs <- pkgDesc[!basePkgs]
        names(z$otherPkgs) <- package[!basePkgs]
    }
    loadedOnly <- loadedNamespaces()
    loadedOnly <- loadedOnly[!(loadedOnly %in% package)]
    if (length(loadedOnly)) {
        names(loadedOnly) <- loadedOnly
        pkgDesc <- c(pkgDesc, lapply(loadedOnly, packageDescription))
        z$loadedOnly <- pkgDesc[loadedOnly]
    }
    z$matprod <- as.character(options("matprod"))
    es <- extSoftVersion()
    z$BLAS <- as.character(es["BLAS"])
    z$LAPACK <- La_library()
    l10n <- l10n_info()
    if (!is.null(l10n["system.codepage"])) 
        z$system.codepage <- as.character(l10n["system.codepage"])
    if (!is.null(l10n["codepage"])) 
        z$codepage <- as.character(l10n["codepage"])
    class(z) <- "sessionInfo"
    z
}
<bytecode: 0x000002353acde460>
<environment: namespace:utils>
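
One way to narrow this down is to look at the per-library read totals before calling asDGEList. The sketch below assumes peak.counts comes from csaw's regionCounts, which stores those totals in `peak.counts$totals`. The helper name and the example values are hypothetical, and the function itself only needs a named numeric vector.

```r
# Return the names of libraries whose total read count is zero or missing.
# `totals` is a numeric vector named by library.
find_empty_libraries <- function(totals) {
  names(totals)[is.na(totals) | totals == 0]
}

# Made-up totals standing in for peak.counts$totals.
totals <- c(sampleA.bam = 120345, sampleB.bam = 0, sampleC.bam = 98211)
empty <- find_empty_libraries(totals)
print(empty)
```

With the real object this would be something like `find_empty_libraries(setNames(peak.counts$totals, peak.counts$bam.files))`. If every library comes back empty, the read-extraction settings in `param` may be worth checking, for example whether paired-end mode is being applied to single-end BAM files.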
