Error running bcbioRNASeq from within bcbio: there is no package called ‘bcbioRNASeq’ #3565

Open
amizeranschi opened this issue Nov 26, 2021 · 73 comments

@amizeranschi

Hello!

I'm trying to run a bulk RNA-seq analysis using the following template:

# Template for human RNA-seq using Illumina prepared samples
---
details:
  - analysis: RNA-seq
    genome_build: sacCer3
    algorithm:
## for hg38, change the aligner to hisat2
      aligner: hisat2
      tools_on: bcbiornaseq
      bcbiornaseq:
        organism: saccharomyces cerevisiae
        interesting_groups: panel
upload:
  dir: ../final

However, this ends with the following error:

[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/tpm/tximport-tpm.csv
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/counts/tximport-counts.csv
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/tx2gene.csv
[2021-11-26T07:15Z] Storing directory in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/transcriptome
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/bcbio-nextgen.log
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/bcbio-nextgen.log
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/bcbio-nextgen.log
[2021-11-26T07:15Z] Timing: bcbioRNAseq loading
[2021-11-26T07:15Z] multiprocessing: run_bcbiornaseqload
[2021-11-26T07:15Z] Loading bcbioRNASeq object.
[2021-11-26T07:15Z] Error in library(bcbioRNASeq) : there is no package called ‘bcbioRNASeq’
[2021-11-26T07:15Z] Execution halted
[2021-11-26T07:15Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Error in library(bcbioRNASeq) : there is no package called ‘bcbioRNASeq’
Execution halted
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 290, in rnaseqpipeline
    run_parallel("run_bcbiornaseqload", [sample])
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/utils.py", line 59, in wrapper
    return f(*args, **kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multitasks.py", line 92, in run_bcbiornaseqload
    return bcbiornaseq.make_bcbiornaseq_object(*args)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/rnaseq/bcbiornaseq.py", line 33, in make_bcbiornaseq_object
    do.run([rcmd, "--vanilla", r_file], "Loading bcbioRNASeq object.")
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Error in library(bcbioRNASeq) : there is no package called ‘bcbioRNASeq’
Execution halted
' returned non-zero exit status 1.

This is strange to see, because the package does seem to be installed in the rbcbiornaseq environment:

$ bcbio_conda list -n rbcbiornaseq r-bcbiornaseq
# packages in environment at /home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq:
#
# Name                    Version                   Build  Channel
r-bcbiornaseq             0.3.42            r41hdfd78af_0    bioconda
@naumenko-sa

Hi @amizeranschi!

Thanks for testing and reporting!
I've fixed the paths to Rscript: #3567
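
For context, the traceback above invokes /home/user/bcbio-nextgen/anaconda/bin/Rscript, while r-bcbiornaseq is listed under the rbcbiornaseq environment. Below is a minimal sketch of how the two binaries differ, assuming the standard conda layout of <root>/envs/<name>/bin. The helper names env_rscript and base_rscript are hypothetical, and this is not the code from #3567.

```python
from pathlib import PurePosixPath


def base_rscript(anaconda_root: str) -> PurePosixPath:
    """Rscript of the base installation (the binary seen in the traceback)."""
    return PurePosixPath(anaconda_root) / "bin" / "Rscript"


def env_rscript(anaconda_root: str, env_name: str) -> PurePosixPath:
    """Rscript inside a named conda environment.

    Assumption: environments live under <root>/envs/<env_name>/bin.
    """
    return PurePosixPath(anaconda_root) / "envs" / env_name / "bin" / "Rscript"


if __name__ == "__main__":
    root = "/home/user/bcbio-nextgen/anaconda"
    print("base:", base_rscript(root))
    print("env: ", env_rscript(root, "rbcbiornaseq"))
```

An R package installed only in the environment would not be visible to the base binary, which would be consistent with the "there is no package called" message.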

I am now getting the following error:

> library(bcbioRNASeq)
Loading required package: basejump
Error: package or namespace load failed for ‘basejump’:
 object ‘metadataBlacklist’ is not exported by 'namespace:AcidBase'
Error: package ‘basejump’ could not be loaded
> library(basejump)
Error: package or namespace load failed for ‘basejump’:
 object ‘metadataBlacklist’ is not exported by 'namespace:AcidBase'

@mjsteinbaugh could you please help us with this error?

Sergey

@mjsteinbaugh

Hi Sergey, yeah, I'll take a look tonight. If you can post the R session info via sessionInfo(), that would be helpful. I think it's a quick fix.

@mjsteinbaugh

mjsteinbaugh commented Dec 1, 2021

Basically, I just need to know which versions of bcbioRNASeq, basejump, and AcidBase are installed.

For reference, relevant Python code is here:
https://github.com/bcbio/bcbio-nextgen/blob/master/bcbio/rnaseq/bcbiornaseq.py
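
A small sketch of how those three versions could be collected in one call. packageVersion() and Rscript -e are standard R; the helper name version_query_command and the Rscript path passed to it are assumptions for illustration only.

```python
from typing import List, Sequence


def version_query_command(rscript: str, packages: Sequence[str]) -> List[str]:
    """Build an Rscript command that prints 'name version' for each package.

    The command is only built here, not executed.
    """
    calls = "; ".join(
        f'cat("{pkg}", as.character(packageVersion("{pkg}")), "\\n")'
        for pkg in packages
    )
    return [rscript, "--vanilla", "-e", calls]


if __name__ == "__main__":
    # "Rscript" is a placeholder; in practice this would be the environment's binary.
    cmd = version_query_command("Rscript", ["bcbioRNASeq", "basejump", "AcidBase"])
    print(cmd)
```

Passing the resulting list to subprocess.run(cmd, capture_output=True, text=True) would execute it. packageVersion() raises an error for a package that is not installed, so a failure here is informative as well.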

@mjsteinbaugh

mjsteinbaugh commented Dec 1, 2021

@amizeranschi What version of bcbio-nextgen are you running? Is this the latest development build?

@mjsteinbaugh

mjsteinbaugh commented Dec 1, 2021

Yeah I think something weird may be going on with the conda environment in that bcbio install. Here's a clean install of the r-bcbiornaseq v0.3.42 recipe:

conda activate r-bcbiornaseq@0.3.42
R
R.version.string
# [1] "R version 4.1.1 (2021-08-10)"
packageVersion("bcbioRNASeq")
# [1] ‘0.3.42’
packageVersion("basejump")
# [1] ‘0.14.22’
packageVersion("AcidBase")
# [1] ‘0.4.5’
suppressPackageStartupMessages({
    library(bcbioRNASeq)
})
# Loads clean

@mjsteinbaugh

mjsteinbaugh commented Dec 2, 2021

Ah, OK, the conda recipe issue appears to be in CloudBioLinux, here: https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L272

Adjusting the r-basejump version to the latest stable release (v0.14.22) instead of v0.14.19, or removing the r-basejump line from the YAML file, should fix the issue. I need to push a minor bcbioRNASeq update that tightens up the minimum dependency versions a bit. Sorry about that!

For reference, the bcbioRNASeq R package dependencies are defined in the DESCRIPTION file here: https://github.com/hbc/bcbioRNASeq/blob/master/DESCRIPTION
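
As a rough illustration of checking an environment for an outdated pin, the sketch below parses `conda list` output and flags packages below a chosen minimum. The function names are hypothetical, and the 0.14.22 minimum is simply the version reported as working in this thread, not a value read from the DESCRIPTION file.

```python
import re
from typing import Dict, Tuple


def parse_conda_list(text: str) -> Dict[str, str]:
    """Map package name to version from `conda list` output."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the '#' header lines.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        versions[fields[0]] = fields[1]
    return versions


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '0.14.22' into (0, 14, 22); non-numeric parts are ignored."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def below_minimum(installed: Dict[str, str], minimums: Dict[str, str]) -> Dict[str, str]:
    """Return installed packages whose version is lower than the given minimum."""
    return {
        name: installed[name]
        for name, minimum in minimums.items()
        if name in installed and version_key(installed[name]) < version_key(minimum)
    }


if __name__ == "__main__":
    # Illustrative listing only: name and version columns, with the older pin.
    listing = """\
# Name                    Version
r-basejump                0.14.19
r-bcbiornaseq             0.3.42
"""
    print(below_minimum(parse_conda_list(listing), {"r-basejump": "0.14.22"}))
```

The numeric-only comparison is a simplification; versions with letter suffixes (such as 1.1.1l) would need a proper version parser.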

@naumenko-sa

Thanks, @mjsteinbaugh!

Running mamba install r-basejump=0.14.22 -n rbcbiornaseq in the existing installation got me going.
Now I have the following error:

Error in flatFiles(bcb) : could not find function "flatFiles"
Execution halted
' returned non-zero exit status 1.

My versions:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
_r-mutex                  1.0.1               anacondar_1    conda-forge
binutils_impl_linux-64    2.36.1               h193b22a_2    conda-forge
binutils_linux-64         2.36                 hf3e587d_1    conda-forge
bioconductor-affy         1.72.0            r41hd029910_0    bioconda
bioconductor-affyio       1.64.0            r41hd029910_0    bioconda
bioconductor-all          1.36.0            r41hdfd78af_0    bioconda
bioconductor-annotate     1.72.0            r41hdfd78af_0    bioconda
bioconductor-annotationdbi 1.56.1            r41hdfd78af_0    bioconda
bioconductor-annotationfilter 1.18.0            r41hdfd78af_0    bioconda
bioconductor-annotationhub 3.2.0             r41hdfd78af_0    bioconda
bioconductor-apeglm       1.16.0            r41h399db7b_0    bioconda
bioconductor-beachmat     2.10.0            r41h399db7b_0    bioconda
bioconductor-biobase      2.54.0            r41hd029910_0    bioconda
bioconductor-biocfilecache 2.2.0             r41hdfd78af_0    bioconda
bioconductor-biocgenerics 0.40.0            r41hdfd78af_0    bioconda
bioconductor-biocio       1.4.0             r41hdfd78af_0    bioconda
bioconductor-biocparallel 1.28.0            r41h399db7b_0    bioconda
bioconductor-biocstyle    2.22.0            r41hdfd78af_0    bioconda
bioconductor-biocversion  3.14.0            r41hdfd78af_0    bioconda
bioconductor-biomart      2.50.0            r41hdfd78af_0    bioconda
bioconductor-biostrings   2.62.0            r41hd029910_0    bioconda
bioconductor-clusterprofiler 4.2.0             r41hdfd78af_0    bioconda
bioconductor-complexheatmap 2.10.0            r41hdfd78af_0    bioconda
bioconductor-consensusclusterplus 1.58.0            r41hdfd78af_0    bioconda
bioconductor-degreport    1.30.0            r41hdfd78af_0    bioconda
bioconductor-delayedarray 0.20.0            r41hd029910_0    bioconda
bioconductor-delayedmatrixstats 1.16.0            r41hdfd78af_0    bioconda
bioconductor-deseq2       1.34.0            r41h399db7b_0    bioconda
bioconductor-do.db        2.9                           0    bioconda
bioconductor-dose         3.20.0            r41hdfd78af_0    bioconda
bioconductor-dropletutils 1.14.0            r41h399db7b_0    bioconda
bioconductor-edger        3.36.0            r41h399db7b_0    bioconda
bioconductor-enrichplot   1.14.1            r41hdfd78af_0    bioconda
bioconductor-ensdb.hsapiens.v75 2.99.0           r41hdfd78af_11    bioconda
bioconductor-ensembldb    2.18.1            r41hdfd78af_0    bioconda
bioconductor-fgsea        1.20.0            r41h399db7b_0    bioconda
bioconductor-genefilter   1.76.0            r41hba52eb8_0    bioconda
bioconductor-geneplotter  1.72.0            r41hdfd78af_0    bioconda
bioconductor-genomeinfodb 1.30.0            r41hdfd78af_0    bioconda
bioconductor-genomeinfodbdata 1.2.7             r41hdfd78af_0    bioconda
bioconductor-genomicalignments 1.30.0            r41hd029910_0    bioconda
bioconductor-genomicfeatures 1.46.1            r41hdfd78af_0    bioconda
bioconductor-genomicranges 1.46.0            r41hd029910_0    bioconda
bioconductor-ggtree       3.2.0             r41hdfd78af_0    bioconda
bioconductor-go.db        3.14.0            r41hdfd78af_0    bioconda
bioconductor-gosemsim     2.20.0            r41h399db7b_0    bioconda
bioconductor-graph        1.72.0            r41hd029910_0    bioconda
bioconductor-hdf5array    1.22.0            r41ha2fdcc6_1    bioconda
bioconductor-interactivedisplaybase 1.32.0            r41hdfd78af_0    bioconda
bioconductor-iranges      2.28.0            r41hd029910_0    bioconda
bioconductor-kegggraph    1.54.0            r41hdfd78af_0    bioconda
bioconductor-keggrest     1.34.0            r41hdfd78af_0    bioconda
bioconductor-limma        3.50.0            r41hd029910_0    bioconda
bioconductor-matrixgenerics 1.6.0             r41hdfd78af_0    bioconda
bioconductor-org.hs.eg.db 3.14.0            r41hdfd78af_0    bioconda
bioconductor-org.mm.eg.db 3.14.0            r41hdfd78af_0    bioconda
bioconductor-pathview     1.34.0            r41hdfd78af_0    bioconda
bioconductor-preprocesscore 1.56.0            r41hd029910_0    bioconda
bioconductor-protgenerics 1.26.0            r41hdfd78af_0    bioconda
bioconductor-qvalue       2.26.0            r41hdfd78af_0    bioconda
bioconductor-rgraphviz    2.38.0            r41h399db7b_0    bioconda
bioconductor-rhdf5        2.38.0            r41hfe70e90_1    bioconda
bioconductor-rhdf5filters 1.6.0             r41h399db7b_0    bioconda
bioconductor-rhdf5lib     1.16.0            r41hd029910_0    bioconda
bioconductor-rhtslib      1.26.0            r41hd029910_0    bioconda
bioconductor-rsamtools    2.10.0            r41h399db7b_0    bioconda
bioconductor-rtracklayer  1.54.0            r41ha2fdcc6_1    bioconda
bioconductor-s4vectors    0.32.0            r41hd029910_0    bioconda
bioconductor-scuttle      1.4.0             r41h399db7b_0    bioconda
bioconductor-singlecellexperiment 1.16.0            r41hdfd78af_0    bioconda
bioconductor-sparsematrixstats 1.6.0             r41h399db7b_0    bioconda
bioconductor-summarizedexperiment 1.24.0            r41hdfd78af_0    bioconda
bioconductor-treeio       1.18.0            r41hdfd78af_0    bioconda
bioconductor-tximport     1.22.0            r41hdfd78af_0    bioconda
bioconductor-vsn          3.62.0            r41hd029910_0    bioconda
bioconductor-xvector      0.34.0            r41hd029910_0    bioconda
bioconductor-zlibbioc     1.40.0            r41hd029910_0    bioconda
bwidget                   1.9.14               ha770c72_1    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2021.10.8            ha878542_0    conda-forge
cairo                     1.16.0            h6cf1ce9_1008    conda-forge
curl                      7.80.0               h2574ce0_0    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.13.1            hba837de_1005    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
gcc_impl_linux-64         9.4.0               h03d3576_11    conda-forge
gcc_linux-64              9.4.0                h391b98a_1    conda-forge
gettext                   0.19.8.1          h73d1719_1008    conda-forge
gfortran_impl_linux-64    9.4.0               h0003116_11    conda-forge
gfortran_linux-64         9.4.0                hf0ab688_1    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
gsl                       2.7                  he838d99_0    conda-forge
gxx_impl_linux-64         9.4.0               h03d3576_11    conda-forge
gxx_linux-64              9.4.0                h0316aca_1    conda-forge
harfbuzz                  3.1.1                h83ec7ef_0    conda-forge
icu                       68.2                 h9c3ff4c_0    conda-forge
jbig                      2.1               h7f98852_2003    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kernel-headers_linux-64   2.6.32              he073ed8_15    conda-forge
krb5                      1.19.2               hcc1bbae_3    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
lerc                      3.0                  h9c3ff4c_0    conda-forge
libblas                   3.9.0           12_linux64_openblas    conda-forge
libcblas                  3.9.0           12_linux64_openblas    conda-forge
libcurl                   7.80.0               h2574ce0_0    conda-forge
libdeflate                1.8                  h7f98852_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-devel_linux-64     9.4.0               hd854feb_11    conda-forge
libgcc-ng                 11.2.0              h1d223b6_11    conda-forge
libgfortran-ng            11.2.0              h69a702a_11    conda-forge
libgfortran5              11.2.0              h5c6108e_11    conda-forge
libglib                   2.70.0               h174f98d_1    conda-forge
libgomp                   11.2.0              h1d223b6_11    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0           12_linux64_openblas    conda-forge
libnghttp2                1.43.0               h812cca2_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libsanitizer              9.4.0               h79bfe98_11    conda-forge
libssh2                   1.10.0               ha56f1ee_2    conda-forge
libstdcxx-devel_linux-64  9.4.0               hd854feb_11    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_11    conda-forge
libtiff                   4.3.0                h6f004c6_2    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.1                h7f98852_0    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxml2                   2.9.12               h72842e0_0    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
make                      4.3                  hd18ef5c_1    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nomkl                     1.0                  h5ca1d4c_0    conda-forge
openssl                   1.1.1l               h7f98852_0    conda-forge
pandoc                    2.16.1               h7f98852_0    conda-forge
pango                     1.48.10              h54213e6_2    conda-forge
pcre                      8.45                 h9c3ff4c_0    conda-forge
pcre2                     10.37                h032f7d1_0    conda-forge
pip                       21.3.1             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
python                    3.10.0          h62f1059_2_cpython    conda-forge
python_abi                3.10                    2_cp310    conda-forge
r                         4.1             r41hd8ed1ab_1004    conda-forge
r-acidbase                0.4.5             r41hdfd78af_0    bioconda
r-acidcli                 0.1.7             r41hdfd78af_0    bioconda
r-acidexperiment          0.2.2             r41hdfd78af_0    bioconda
r-acidgenerics            0.5.20            r41hdfd78af_0    bioconda
r-acidgenomes             0.2.19            r41hdfd78af_0    bioconda
r-acidgsea                0.6.4             r41hdfd78af_0    bioconda
r-acidmarkdown            0.1.4             r41hdfd78af_0    bioconda
r-acidplots               0.3.9             r41hdfd78af_0    bioconda
r-acidplyr                0.1.22            r41hdfd78af_0    bioconda
r-acidsinglecell          0.1.8             r41hdfd78af_0    bioconda
r-ape                     5.5               r41h306847c_0    conda-forge
r-aplot                   0.1.1             r41hc72bb7e_0    conda-forge
r-ashr                    2.2_47            r41h03ef668_1    conda-forge
r-askpass                 1.1               r41hcfec24a_2    conda-forge
r-assertive               0.3_6             r41hc72bb7e_0    conda-forge
r-assertive.base          0.0_9             r41hc72bb7e_0    conda-forge
r-assertive.code          0.0_3             r41hc72bb7e_2    conda-forge
r-assertive.data          0.0_3             r41hc72bb7e_2    conda-forge
r-assertive.data.uk       0.0_2             r41hc72bb7e_2    conda-forge
r-assertive.data.us       0.0_2             r41hc72bb7e_2    conda-forge
r-assertive.datetimes     0.0_3             r41hc72bb7e_0    conda-forge
r-assertive.files         0.0_2           r41hc72bb7e_1003    conda-forge
r-assertive.matrices      0.0_2             r41hc72bb7e_2    conda-forge
r-assertive.models        0.0_2             r41hc72bb7e_2    conda-forge
r-assertive.numbers       0.0_2           r41hc72bb7e_1003    conda-forge
r-assertive.properties    0.0_4           r41hc72bb7e_1003    conda-forge
r-assertive.reflection    0.0_5             r41hc72bb7e_0    conda-forge
r-assertive.sets          0.0_3           r41hc72bb7e_1003    conda-forge
r-assertive.strings       0.0_3           r41hc72bb7e_1003    conda-forge
r-assertive.types         0.0_3           r41hc72bb7e_1004    conda-forge
r-assertthat              0.2.1             r41hc72bb7e_2    conda-forge
r-backports               1.3.0             r41hcfec24a_0    conda-forge
r-base                    4.1.1                hb93adac_1    conda-forge
r-base64enc               0.1_3           r41hcfec24a_1004    conda-forge
r-basejump                0.14.22           r41hdfd78af_0    bioconda
r-bbmle                   1.0.24            r41hc72bb7e_0    conda-forge
r-bcbiobase               0.6.21            r41hdfd78af_1    bioconda
r-bcbiornaseq             0.3.42            r41hdfd78af_0    bioconda
r-bdsmatrix               1.3_4             r41hcfec24a_1    conda-forge
r-bh                      1.75.0_0          r41hc72bb7e_0    conda-forge
r-biocmanager             1.30.16           r41hc72bb7e_0    conda-forge
r-bit                     4.0.4             r41hcfec24a_0    conda-forge
r-bit64                   4.0.5             r41hcfec24a_0    conda-forge
r-bitops                  1.0_7             r41hcfec24a_0    conda-forge
r-blob                    1.2.2             r41hc72bb7e_0    conda-forge
r-bookdown                0.24              r41hc72bb7e_0    conda-forge
r-boot                    1.3_28            r41hc72bb7e_0    conda-forge
r-brio                    1.1.2             r41hcfec24a_0    conda-forge
r-broom                   0.7.10            r41hc72bb7e_0    conda-forge
r-bslib                   0.3.1             r41hc72bb7e_0    conda-forge
r-cachem                  1.0.6             r41hcfec24a_0    conda-forge
r-callr                   3.7.0             r41hc72bb7e_0    conda-forge
r-caret                   6.0_90            r41hcfec24a_0    conda-forge
r-cellranger              1.1.0           r41hc72bb7e_1003    conda-forge
r-circlize                0.4.13            r41hc72bb7e_0    conda-forge
r-class                   7.3_19            r41hcfec24a_0    conda-forge
r-cli                     3.1.0             r41h03ef668_0    conda-forge
r-clipr                   0.7.1             r41hc72bb7e_0    conda-forge
r-clue                    0.3_60            r41hcfec24a_0    conda-forge
r-cluster                 2.1.2             r41h859d828_0    conda-forge
r-coda                    0.19_4            r41hc72bb7e_0    conda-forge
r-codetools               0.2_18            r41hc72bb7e_0    conda-forge
r-colorspace              2.0_2             r41hcfec24a_0    conda-forge
r-commonmark              1.7             r41hcfec24a_1002    conda-forge
r-conquer                 1.2.1             r41h6dc32e9_0    conda-forge
r-cowplot                 1.1.1             r41hc72bb7e_0    conda-forge
r-cpp11                   0.4.1             r41hc72bb7e_0    conda-forge
r-crayon                  1.4.2             r41hc72bb7e_0    conda-forge
r-crosstalk               1.2.0             r41hc72bb7e_0    conda-forge
r-curl                    4.3.2             r41hcfec24a_0    conda-forge
r-data.table              1.14.2            r41hcfec24a_0    conda-forge
r-dbi                     1.1.1             r41hc72bb7e_0    conda-forge
r-dbplyr                  2.1.1             r41hc72bb7e_0    conda-forge
r-desc                    1.4.0             r41hc72bb7e_0    conda-forge
r-deseqanalysis           0.4.4             r41hdfd78af_0    bioconda
r-diffobj                 0.3.5             r41hcfec24a_0    conda-forge
r-digest                  0.6.28            r41h03ef668_0    conda-forge
r-doparallel              1.0.16            r41hc72bb7e_0    conda-forge
r-downloader              0.4             r41hc72bb7e_1003    conda-forge
r-dplyr                   1.0.7             r41h03ef668_0    conda-forge
r-dqrng                   0.3.0             r41h03ef668_0    conda-forge
r-dt                      0.19              r41hc72bb7e_0    conda-forge
r-e1071                   1.7_9             r41h03ef668_0    conda-forge
r-ellipsis                0.3.2             r41hcfec24a_0    conda-forge
r-emdbook                 1.3.12            r41hc72bb7e_1    conda-forge
r-etrunct                 0.1             r41ha770c72_1002    conda-forge
r-evaluate                0.14              r41hc72bb7e_2    conda-forge
r-fansi                   0.4.2             r41hcfec24a_0    conda-forge
r-farver                  2.1.0             r41h03ef668_0    conda-forge
r-fastmap                 1.1.0             r41h03ef668_0    conda-forge
r-fastmatch               1.1_3             r41hcfec24a_0    conda-forge
r-filelock                1.0.2           r41hcfec24a_1002    conda-forge
r-fontawesome             0.2.2             r41hc72bb7e_0    conda-forge
r-forcats                 0.5.1             r41hc72bb7e_0    conda-forge
r-foreach                 1.5.1             r41hc72bb7e_0    conda-forge
r-foreign                 0.8_81            r41hcfec24a_0    conda-forge
r-formatr                 1.11              r41hc72bb7e_0    conda-forge
r-fs                      1.5.0             r41h03ef668_0    conda-forge
r-futile.logger           1.4.3           r41hc72bb7e_1003    conda-forge
r-futile.options          1.0.1           r41hc72bb7e_1002    conda-forge
r-future                  1.23.0            r41hc72bb7e_0    conda-forge
r-future.apply            1.8.1             r41hc72bb7e_0    conda-forge
r-generics                0.1.1             r41hc72bb7e_0    conda-forge
r-getoptlong              1.0.5             r41hc72bb7e_0    conda-forge
r-ggdendro                0.1.22            r41hc72bb7e_0    conda-forge
r-ggforce                 0.3.3             r41h03ef668_0    conda-forge
r-ggfun                   0.0.4             r41hc72bb7e_0    conda-forge
r-ggnewscale              0.4.5             r41hc72bb7e_0    conda-forge
r-ggplot2                 3.3.5             r41hc72bb7e_0    conda-forge
r-ggplotify               0.1.0             r41hc72bb7e_0    conda-forge
r-ggpmisc                 0.4.4             r41hc72bb7e_0    conda-forge
r-ggpp                    0.4.2             r41hc72bb7e_0    conda-forge
r-ggraph                  2.0.5             r41h03ef668_0    conda-forge
r-ggrepel                 0.9.1             r41h03ef668_0    conda-forge
r-ggridges                0.5.3             r41hc72bb7e_0    conda-forge
r-globaloptions           0.1.2             r41ha770c72_0    conda-forge
r-globals                 0.14.0            r41hc72bb7e_0    conda-forge
r-glue                    1.5.0             r41hcfec24a_0    conda-forge
r-goalie                  0.5.5             r41hdfd78af_0    bioconda
r-gower                   0.2.2             r41hcfec24a_0    conda-forge
r-graphlayouts            0.7.1             r41h03ef668_0    conda-forge
r-gridextra               2.3             r41hc72bb7e_1003    conda-forge
r-gridgraphics            0.5_1             r41hc72bb7e_0    conda-forge
r-gtable                  0.3.0             r41hc72bb7e_3    conda-forge
r-haven                   2.4.3             r41h2713e49_0    conda-forge
r-hexbin                  1.28.2            r41h859d828_0    conda-forge
r-highr                   0.9               r41hc72bb7e_0    conda-forge
r-hms                     1.1.1             r41hc72bb7e_0    conda-forge
r-htmltools               0.5.2             r41h03ef668_0    conda-forge
r-htmlwidgets             1.5.4             r41hc72bb7e_0    conda-forge
r-httpuv                  1.6.3             r41h03ef668_0    conda-forge
r-httr                    1.4.2             r41hc72bb7e_0    conda-forge
r-igraph                  1.2.8             r41he0372cf_0    conda-forge
r-invgamma                1.1               r41hc72bb7e_1    conda-forge
r-ipred                   0.9_12            r41hcfec24a_0    conda-forge
r-irlba                   2.3.3             r41he454529_3    conda-forge
r-isoband                 0.2.5             r41h03ef668_0    conda-forge
r-iterators               1.0.13            r41hc72bb7e_0    conda-forge
r-jquerylib               0.1.4             r41hc72bb7e_0    conda-forge
r-jsonlite                1.7.2             r41hcfec24a_0    conda-forge
r-kernsmooth              2.23_20           r41h742201e_0    conda-forge
r-knitr                   1.35              r41hc72bb7e_0    conda-forge
r-labeling                0.4.2             r41hc72bb7e_0    conda-forge
r-lambda.r                1.2.4             r41hc72bb7e_1    conda-forge
r-lasso2                  1.2_22            r41hcfec24a_0    conda-forge
r-later                   1.2.0             r41h03ef668_0    conda-forge
r-lattice                 0.20_45           r41hcfec24a_0    conda-forge
r-lava                    1.6.10            r41hc72bb7e_0    conda-forge
r-lazyeval                0.2.2             r41hcfec24a_2    conda-forge
r-lifecycle               1.0.1             r41hc72bb7e_0    conda-forge
r-listenv                 0.8.0             r41hc72bb7e_1    conda-forge
r-lmodel2                 1.7_3             r41hc72bb7e_0    conda-forge
r-locfit                  1.5_9.4           r41hcfec24a_1    conda-forge
r-logging                 0.10_108          r41ha770c72_2    conda-forge
r-lubridate               1.8.0             r41h03ef668_0    conda-forge
r-magrittr                2.0.1             r41hcfec24a_1    conda-forge
r-markdown                1.1               r41hcfec24a_1    conda-forge
r-mass                    7.3_54            r41hcfec24a_0    conda-forge
r-matrix                  1.3_4             r41he454529_0    conda-forge
r-matrixmodels            0.5_0             r41hc72bb7e_0    conda-forge
r-matrixstats             0.61.0            r41hcfec24a_0    conda-forge
r-memoise                 2.0.0             r41hc72bb7e_0    conda-forge
r-mgcv                    1.8_38            r41he454529_0    conda-forge
r-mime                    0.12              r41hcfec24a_0    conda-forge
r-mixsqp                  0.3_43            r41h306847c_1    conda-forge
r-mnormt                  2.0.2             r41h859d828_0    conda-forge
r-modelmetrics            1.2.2.2           r41h03ef668_1    conda-forge
r-munsell                 0.5.0           r41hc72bb7e_1003    conda-forge
r-mvtnorm                 1.1_3             r41h859d828_0    conda-forge
r-nlme                    3.1_153           r41h859d828_0    conda-forge
r-nnet                    7.3_16            r41hcfec24a_0    conda-forge
r-nozzle.r1               1.1_1           r41ha770c72_1003    conda-forge
r-numderiv                2016.8_1.1        r41hc72bb7e_3    conda-forge
r-openssl                 1.4.5             r41he36bf35_1    conda-forge
r-openxlsx                4.2.4             r41h03ef668_0    conda-forge
r-parallelly              1.28.1            r41hc72bb7e_0    conda-forge
r-patchwork               1.1.1             r41hc72bb7e_0    conda-forge
r-pheatmap                1.0.12            r41hc72bb7e_2    conda-forge
r-pillar                  1.6.4             r41hc72bb7e_0    conda-forge
r-pipette                 0.7.2             r41hdfd78af_0    bioconda
r-pkgconfig               2.0.3             r41hc72bb7e_1    conda-forge
r-pkgload                 1.2.3             r41h03ef668_0    conda-forge
r-plogr                   0.2.0           r41hc72bb7e_1003    conda-forge
r-plyr                    1.8.6             r41h03ef668_1    conda-forge
r-png                     0.1_7           r41hcfec24a_1004    conda-forge
r-polyclip                1.10_0            r41h03ef668_2    conda-forge
r-polynom                 1.4_0             r41hc72bb7e_2    conda-forge
r-praise                  1.0.0           r41hc72bb7e_1004    conda-forge
r-prettyunits             1.1.1             r41hc72bb7e_1    conda-forge
r-proc                    1.18.0            r41h03ef668_0    conda-forge
r-processx                3.5.2             r41hcfec24a_0    conda-forge
r-prodlim                 2019.11.13        r41h03ef668_1    conda-forge
r-progress                1.2.2             r41hc72bb7e_2    conda-forge
r-progressr               0.9.0             r41hc72bb7e_0    conda-forge
r-promises                1.2.0.1           r41h03ef668_0    conda-forge
r-proxy                   0.4_26            r41hcfec24a_0    conda-forge
r-ps                      1.6.0             r41hcfec24a_0    conda-forge
r-psych                   2.1.9             r41hc72bb7e_0    conda-forge
r-purrr                   0.3.4             r41hcfec24a_1    conda-forge
r-pzfx                    0.3.0             r41hc72bb7e_0    conda-forge
r-quantreg                5.86              r41h52d45c5_0    conda-forge
r-r.methodss3             1.8.1             r41hc72bb7e_0    conda-forge
r-r.oo                    1.24.0            r41hc72bb7e_0    conda-forge
r-r.utils                 2.11.0            r41hc72bb7e_0    conda-forge
r-r6                      2.5.1             r41hc72bb7e_0    conda-forge
r-rappdirs                0.3.3             r41hcfec24a_0    conda-forge
r-rcolorbrewer            1.1_2           r41h785f33e_1003    conda-forge
r-rcpp                    1.0.7             r41h03ef668_0    conda-forge
r-rcpparmadillo           0.10.7.3.0        r41h306847c_0    conda-forge
r-rcppeigen               0.3.3.9.1         r41h306847c_0    conda-forge
r-rcppnumerical           0.4_0             r41h03ef668_1    conda-forge
r-rcurl                   1.98_1.5          r41hcfec24a_0    conda-forge
r-rdrop2                  0.8.2.1           r41hc72bb7e_0    conda-forge
r-readr                   2.0.2             r41h03ef668_0    conda-forge
r-readxl                  1.3.1             r41h2713e49_4    conda-forge
r-recipes                 0.1.17            r41hc72bb7e_0    conda-forge
r-recommended             4.1             r41hd8ed1ab_1004    conda-forge
r-rematch                 1.0.1           r41hc72bb7e_1003    conda-forge
r-rematch2                2.1.2             r41hc72bb7e_1    conda-forge
r-reshape                 0.8.8             r41hcfec24a_2    conda-forge
r-reshape2                1.4.4             r41h03ef668_1    conda-forge
r-restfulr                0.0.13            r41hdf9a8c9_1    bioconda
r-rio                     0.5.27            r41hc72bb7e_0    conda-forge
r-rjson                   0.2.20          r41h03ef668_1002    conda-forge
r-rlang                   0.4.12            r41hcfec24a_0    conda-forge
r-rmarkdown               2.11              r41hc72bb7e_0    conda-forge
r-rpart                   4.1_15            r41hcfec24a_2    conda-forge
r-rprojroot               2.0.2             r41hc72bb7e_0    conda-forge
r-rsqlite                 2.2.8             r41h03ef668_0    conda-forge
r-rstudioapi              0.13              r41hc72bb7e_0    conda-forge
r-rvcheck                 0.1.8             r41hc72bb7e_1    conda-forge
r-sass                    0.4.0             r41h03ef668_0    conda-forge
r-scales                  1.1.1             r41hc72bb7e_0    conda-forge
r-scatterpie              0.1.6             r41hc72bb7e_0    conda-forge
r-sessioninfo             1.2.1             r41hc72bb7e_0    conda-forge
r-shadowtext              0.0.9             r41hc72bb7e_0    conda-forge
r-shape                   1.4.6             r41ha770c72_0    conda-forge
r-shiny                   1.7.1             r41h785f33e_0    conda-forge
r-sitmo                   2.0.2             r41h03ef668_0    conda-forge
r-snow                    0.4_4             r41hc72bb7e_0    conda-forge
r-sourcetools             0.1.7           r41h9c3ff4c_1002    conda-forge
r-sparsem                 1.81              r41h859d828_0    conda-forge
r-spatial                 7.3_14            r41hcfec24a_0    conda-forge
r-splus2r                 1.3_3             r41h859d828_0    conda-forge
r-squarem                 2021.1            r41hc72bb7e_0    conda-forge
r-stringi                 1.7.5             r41hcabe038_0    conda-forge
r-stringr                 1.4.0             r41hc72bb7e_2    conda-forge
r-survival                3.2_13            r41hcfec24a_0    conda-forge
r-syntactic               0.5.0             r41hdfd78af_0    bioconda
r-sys                     3.4               r41hcfec24a_0    conda-forge
r-testthat                3.1.0             r41h03ef668_0    conda-forge
r-tibble                  3.1.6             r41hcfec24a_0    conda-forge
r-tidygraph               1.2.0             r41h03ef668_0    conda-forge
r-tidyr                   1.1.4             r41h03ef668_0    conda-forge
r-tidyselect              1.1.1             r41hc72bb7e_0    conda-forge
r-tidytree                0.3.5             r41hc72bb7e_0    conda-forge
r-timedate                3043.102        r41hc72bb7e_1002    conda-forge
r-tinytex                 0.35              r41hc72bb7e_0    conda-forge
r-tmvnsim                 1.0_2             r41h859d828_3    conda-forge
r-truncnorm               1.0_8           r41hcfec24a_1002    conda-forge
r-tweenr                  1.0.2             r41h03ef668_0    conda-forge
r-tzdb                    0.2.0             r41h03ef668_0    conda-forge
r-upsetr                  1.4.0             r41hc72bb7e_2    conda-forge
r-utf8                    1.2.2             r41hcfec24a_0    conda-forge
r-vctrs                   0.3.8             r41hcfec24a_1    conda-forge
r-viridis                 0.6.2             r41hc72bb7e_0    conda-forge
r-viridislite             0.4.0             r41hc72bb7e_0    conda-forge
r-vroom                   1.5.6             r41h03ef668_0    conda-forge
r-waldo                   0.3.1             r41hc72bb7e_0    conda-forge
r-withr                   2.4.2             r41hc72bb7e_0    conda-forge
r-xfun                    0.28              r41h03ef668_0    conda-forge
r-xml                     3.99_0.8          r41hcfec24a_0    conda-forge
r-xml2                    1.3.2             r41h03ef668_1    conda-forge
r-xtable                  1.8_4             r41hc72bb7e_3    conda-forge
r-xts                     0.12.1            r41hcfec24a_0    conda-forge
r-yaml                    2.2.1             r41hcfec24a_1    conda-forge
r-yulab.utils             0.0.4             r41hc72bb7e_0    conda-forge
r-zip                     2.2.0             r41hcfec24a_0    conda-forge
r-zoo                     1.8_9             r41hcfec24a_1    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
sed                       4.8                  he412f7d_0    conda-forge
setuptools                58.5.3          py310hff52083_0    conda-forge
sqlite                    3.36.0               h9cd32fc_2    conda-forge
sysroot_linux-64          2.12                he073ed8_15    conda-forge
tk                        8.6.11               h27826a3_1    conda-forge
tktable                   2.10                 hb7b940f_3    conda-forge
tzdata                    2021e                he74cb21_0    conda-forge
wheel                     0.37.0             pyhd8ed1ab_1    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.7.2                h7f98852_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-libxt                1.2.1                h7f98852_2    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h36c2ea0_1013    conda-forge
zstd                      1.5.0                ha95c52a_0    conda-forge

@mjsteinbaugh
Contributor

@naumenko-sa Where's flatFiles() being called? I'm not seeing this in the bcbio-nextgen source code. Try renaming that to coerceToList() instead -- flatFiles() was made defunct and then later removed in basejump because the function returned an unstructured list from an S4 object, but not actual "flat files" on disk.

@mjsteinbaugh
Contributor

mjsteinbaugh commented Dec 2, 2021

Ah, never mind, got it. It's here:

flatline = 'flat <- flatFiles(bcb)'

Yeah, rename flatFiles() to coerceToList() instead, and that should fix it. I can also push an update to basejump that restores flatFiles() as a deprecated function for the time being.
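A minimal sketch of that rename, assuming the quoted line is a plain Python string that bcbio later writes into the generated R script. Only the variable name flatline and the two R function names come from this thread; the helper rename_defunct_call is hypothetical.

```python
# Hypothetical helper: swap the removed basejump function for its replacement
# inside a generated R line. The function name is an assumption for illustration.
def rename_defunct_call(r_line: str) -> str:
    return r_line.replace("flatFiles(", "coerceToList(")


# The original line quoted in this thread, passed through the helper.
flatline = rename_defunct_call('flat <- flatFiles(bcb)')
print(flatline)
```

In practice the string literal could simply be edited by hand; the helper only makes the before/after explicit.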

@naumenko-sa
Contributor

Thanks @mjsteinbaugh !

I've fixed that and changed rda to rds.

The next issue is:

subprocess.CalledProcessError: Command 'bcbio_devel/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla -e load("/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/2_bulk_rnaseq/seqc/final/bcbioRNASeq/data/bcb.rds");date=format(Sys.time(), "%Y-%m-%d");dir="/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/2_bulk_rnaseq/seqc/final/bcbioRNASeq/data/../results/2021-12-02/gene/counts";library(tidyverse);library(bcbioRNASeq);counts = bcbioRNASeq::counts(bcb) %>% as.data.frame() %>% round() %>% tibble::rownames_to_column("gene");metadata = colData(bcb) %>% as.data.frame() %>% tibble::rownames_to_column("sample");readr::write_csv(counts, file.path(dir, "counts.csv.gz"));readr::write_csv(metadata, file.path(dir, "metadata.csv.gz"));
Error in load("/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/2_bulk_rnaseq/seqc/final/bcbioRNASeq/data/bcb.rds") : 
  bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning messages:
1: In readChar(con, 5L, useBytes = TRUE) :
  truncating string with embedded nuls
2: file ‘bcb.rds’ has magic number 'X'
  Use of save versions prior to 2 is deprecated 
Execution halted
' returned non-zero exit status 1.

Could you please take a look?

Sergey

@mjsteinbaugh
Contributor

mjsteinbaugh commented Dec 2, 2021

Yeah, for RDS files, use object <- readRDS("bcb.rds"). RDS (serialized R data) files are saved without the object name (e.g. "bcb"), which is preferable, but it means the result has to be assigned in the current environment. That error in base R is too vague; a future update should improve it by suggesting readRDS instead of load.
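One way the Python side could pick the matching R reader, sketched under the assumption that the R command is assembled as a string (as in the failing command above). The function r_read_statement and its parameters are hypothetical and are not part of bcbio.

```python
from pathlib import Path


def r_read_statement(path: str, name: str = "bcb") -> str:
    """Return an R statement that brings the object stored at `path` into scope.

    Hypothetical helper for illustration; not taken from the bcbio source.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".rds":
        # RDS holds a single unnamed object, so the result must be assigned.
        return f'{name} <- readRDS("{path}")'
    if suffix in (".rda", ".rdata"):
        # RDA/RData restores objects under the names they were saved with.
        return f'load("{path}")'
    raise ValueError(f"Unsupported R data file extension: {suffix!r}")


print(r_read_statement("data/bcb.rds"))
print(r_read_statement("data/bcb.rda"))
```

Choosing by extension keeps load() away from .rds files, which is what produced the "bad restore file magic number" error.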

@mjsteinbaugh
Contributor

mjsteinbaugh commented Dec 2, 2021

Another minor thing is that R 4.1 adds support for a native pipe |>, which is more performant for large objects than the magrittr pipe %>%. We may want to switch to this in the bcbio code.

@naumenko-sa
Contributor

naumenko-sa commented Dec 2, 2021

Thanks @mjsteinbaugh ! Almost there!

With these changes + tools_on: keep_gene_version which keeps transcript versions in tx2gene:
#3568

ENST00000456328.2,ENSG00000223972
ENST00000450305.2,ENSG00000223972
ENST00000488147.1,ENSG00000227232
ENST00000619216.1,ENSG00000278267
ENST00000473358.1,ENSG00000243485
ENST00000469289.1,ENSG00000243485
ENST00000607096.1,ENSG00000284332
ENST00000417324.1,ENSG00000237613
ENST00000461467.1,ENSG00000237613
ENST00000606857.1,ENSG00000268020

I am getting:

→ Importing '/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final/2021-12-02_seqc/bcbio-nextgen-commands.log' using base::`readLines()`.
🧪 ## Sample metadata
→ Getting sample metadata from YAML.
Loading a subset of samples:
• HBRR_rep1
• HBRR_rep2
• HBRR_rep3
• UHRR_rep1
• UHRR_rep2
• UHRR_rep3
→ Getting sample quality control metrics from YAML.
🧪 ## Counts
🧪 ### tximport
→ Importing '/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final/2021-12-02_seqc/tx2gene.csv' using data.table::`fread()`.
→ Importing salmon transcript-level counts from 'quant.sf' files using tximport 1.22.0.
countsFromAbundance: lengthScaledTPM
txOut: TRUE
reading in files with read_tsv
1 2 3 4 5 6 
Error in .isTximportReturn(txi) : Assert failure.
[2] identical(rownames(infReps[[1L]]), rownames(abundance)) is not TRUE.
Calls: bcbioRNASeq -> .tximport -> assert -> .isTximportReturn -> assert
Execution halted
' returned non-zero exit status 1.

The sample sheet:

samplename,description,category
UHRR_rep1,UHRR_rep1,UHRR
HBRR_rep1,HBRR_rep1,HBRR
UHRR_rep2,UHRR_rep2,UHRR
HBRR_rep2,HBRR_rep2,HBRR
UHRR_rep3,UHRR_rep3,UHRR
HBRR_rep3,HBRR_rep3,HBRR

The yaml template:

details:
  - analysis: RNA-seq
    genome_build: hg38
    algorithm:
      quality_format: standard
      aligner: false
      strandedness: unstranded
      tools_on:
      - bcbiornaseq
      - keep_gene_version
      bcbiornaseq:
        organism: homo sapiens
        interesting_groups: category
upload:
  dir: ../final
resources:
  star:
    cores: 10
    memory: 10G

The basic tximport companion works ok:
https://github.com/bcbio/bcbio-nextgen/blob/master/bcbio/scripts/R/bcbio2se.R

Sergey

@mjsteinbaugh
Contributor

OK cool I'll take a look and see if we need to publish any fixes in the package

@mjsteinbaugh
Contributor

@naumenko-sa Hi Sergey, following up on this, I'm working on a code update this week and will ping you back soon.

@amizeranschi
Contributor Author

@mjsteinbaugh @naumenko-sa

Thanks a lot for looking into this. Please let me know if I can do anything to help with testing.

@naumenko-sa
Contributor

@mjsteinbaugh sorry for bugging you, any luck with the update?
We need to release bcbio 1.2.9 this week.
I'd be happy to include the updated bcbioRNASeq rather than pin the r35 version.

@mjsteinbaugh
Contributor

mjsteinbaugh commented Dec 13, 2021

@naumenko-sa Yep totally, I'll work on fixing it this week ASAP. What's your timeline for the 1.2.9 release?

@naumenko-sa
Contributor

thanks Michael!
Honestly, this issue is the main release blocker for now. We have fixed PureCN, snpeff5.0, and picard, which were also blocking issues. Releasing 1.2.9 tomorrow or Wed would be ideal to give users some time for post-release testing before the New Year.

@naumenko-sa pinned this issue Dec 13, 2021
@mjsteinbaugh
Contributor

OK I'll work on fixing this today

@mjsteinbaugh
Contributor

Can you send me a copy of the example data from /n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final above? That will be easier to test locally

@naumenko-sa
Contributor

Sure, uploading them here: https://www.dropbox.com/sh/w9ogvhbeqirluq4/AAB-YpjkbhgUP8YHpZOfV9sTa?dl=0
There should be 6 files in about 30 min.

@mjsteinbaugh
Contributor

What's the R code that gets passed in the bcbioRNASeq() function call? I'm having difficulty reproducing this with any test dataset

@mjsteinbaugh
Contributor

Ah OK think I may have it -- seems to be a situation where level = "genes" is working but level = "transcripts" is not working as expected. I'll dig into this further

@mjsteinbaugh
Contributor

Here's the error with more verbosity, I'm working on a version bump that will fix this:

→ Importing salmon transcript-level counts from quant.sf files using tximport 1.22.0.
countsFromAbundance: lengthScaledTPM
txOut: TRUE
reading in files with read_tsv
1 2 3 4 5 6 
Error: Assert failure.
[2] identical(rownames(infReps[[1L]]),
rownames(abundance)) is not TRUE.
Backtrace:
    █
 1. └─bcbioRNASeq::bcbioRNASeq(uploadDir, level = "transcripts")
 2.   └─bcbioRNASeq::.tximport(...) R/AllGenerators.R:397:8
 3.     ├─goalie::assert(.isTximportReturn(txi)) R/internal-tximport.R:110:4
 4.     └─bcbioRNASeq::.isTximportReturn(txi) R/internal-tximport.R:110:4
 5.       └─goalie::assert(...) R/internal-tximport.R:145:8
 6.         └─AcidCLI:::stop(...)
 7.           └─cli::cli_abort(x)

@naumenko-sa
Contributor

Thanks Michael!
The upload is done!

@mjsteinbaugh
Contributor

Cool I think I have a working fix, will push an update to GitHub soon

@mjsteinbaugh
Contributor

I'm working on some additional improvements to the package that we can table for a later release...this fix should work so you can push the bcbio-nextgen 1.2.9 update

@mjsteinbaugh
Contributor

OK, I think bcbioRNASeq v0.3.43 should fix this issue. I'm working on updating it on bioconda.

@naumenko-sa unpinned this issue Dec 16, 2021
@amizeranschi
Contributor Author

@naumenko-sa @mjsteinbaugh

Thanks again for all your help so far. After upgrading to the latest development version and getting the sacCer3 data, the RNA-seq analysis progressed further for me, but still ended up crashing.

Let me know if you want me to share a script with everything I'm doing here, in case it could help with reproducing and debugging. Here's the error I'm running into:

[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] Timing: bcbioRNAseq loading
[2021-12-18T11:58Z] multiprocessing: run_bcbiornaseqload
[2021-12-18T11:58Z] Loading bcbioRNASeq object.
[2021-12-18T11:58Z] Loading required package: basejump
[2021-12-18T11:58Z] Attaching package: ‘basejump’
[2021-12-18T11:58Z] The following objects are masked from ‘package:stats’:
[2021-12-18T11:58Z]     complete.cases, cor, end, median, na.omit, quantile, sd, start, var
[2021-12-18T11:58Z] The following objects are masked from ‘package:utils’:
[2021-12-18T11:58Z]     head, relist, tail
[2021-12-18T11:58Z] The following objects are masked from ‘package:base’:
[2021-12-18T11:58Z]     %in%, anyDuplicated, append, as.factor, as.list, as.matrix,
[2021-12-18T11:58Z]     as.table, basename, cbind, colnames, colnames<-, colSums, dirname,
[2021-12-18T11:58Z]     do.call, duplicated, eval, expand.grid, get, grep, grepl, gsub,
[2021-12-18T11:58Z]     intersect, is.unsorted, lapply, mapply, match, mean, merge, mget,
[2021-12-18T11:58Z]     ncol, nrow, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
[2021-12-18T11:58Z]     rbind, rep.int, rowMeans, rownames, rownames<-, rowSums, sapply,
[2021-12-18T11:58Z]     setdiff, sort, split, sub, subset, summary, t, table, tapply,
[2021-12-18T11:58Z]     union, unique, unsplit, which, which.max, which.min
[2021-12-18T11:58Z] 🧪 # bcbioRNASeq
[2021-12-18T11:58Z] ℹ Importing bcbio-nextgen RNA-seq run.
[2021-12-18T11:58Z] 🧪 ## Run info
[2021-12-18T11:58Z] uploadDir: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final
[2021-12-18T11:58Z] projectDir: 2021-12-18_rna-seq-analysis
[2021-12-18T11:58Z] ℹ 7 samples detected:
[2021-12-18T11:58Z] • AE1
[2021-12-18T11:58Z] • AE2
[2021-12-18T11:58Z] • AE3
[2021-12-18T11:58Z] • bcbioRNASeq
[2021-12-18T11:58Z] • RT1
[2021-12-18T11:58Z] • RT2
[2021-12-18T11:58Z] • RT3
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/project-summary.yaml' using yaml::`yaml.load_file()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/data_versions.csv' using data.table::`fread()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/programs.txt' using data.table::`fread()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log' using base::`readLines()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen-commands.log' using base::`readLines()`.
[2021-12-18T11:58Z] 🧪 ## Sample metadata
[2021-12-18T11:58Z] → Getting sample metadata from YAML.
[2021-12-18T11:58Z] Loading a subset of samples:
[2021-12-18T11:58Z] • AE1
[2021-12-18T11:58Z] • AE2
[2021-12-18T11:58Z] • AE3
[2021-12-18T11:58Z] • RT1
[2021-12-18T11:58Z] • RT2
[2021-12-18T11:58Z] • RT3
[2021-12-18T11:58Z] → Getting sample quality control metrics from YAML.
[2021-12-18T11:58Z] 🧪 ## Counts
[2021-12-18T11:58Z] 🧪 ### tximport
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
[2021-12-18T11:58Z] Error in validObject(.Object) :
[2021-12-18T11:58Z]   invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
[2021-12-18T11:58Z] Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject
[2021-12-18T11:58Z] Execution halted
[2021-12-18T11:58Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Attaching package: ‘basejump’
The following objects are masked from ‘package:stats’:
    complete.cases, cor, end, median, na.omit, quantile, sd, start, var
The following objects are masked from ‘package:utils’:
    head, relist, tail
The following objects are masked from ‘package:base’:
    %in%, anyDuplicated, append, as.factor, as.list, as.matrix,
    as.table, basename, cbind, colnames, colnames<-, colSums, dirname,
    do.call, duplicated, eval, expand.grid, get, grep, grepl, gsub,
    intersect, is.unsorted, lapply, mapply, match, mean, merge, mget,
    ncol, nrow, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rep.int, rowMeans, rownames, rownames<-, rowSums, sapply,
    setdiff, sort, split, sub, subset, summary, t, table, tapply,
    union, unique, unsplit, which, which.max, which.min
🧪 # bcbioRNASeq
ℹ Importing bcbio-nextgen RNA-seq run.
🧪 ## Run info
uploadDir: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final
projectDir: 2021-12-18_rna-seq-analysis
ℹ 7 samples detected:
• AE1
• AE2
• AE3
• bcbioRNASeq
• RT1
• RT2
• RT3
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/project-summary.yaml' using yaml::`yaml.load_file()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/data_versions.csv' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/programs.txt' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log' using base::`readLines()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen-commands.log' using base::`readLines()`.
🧪 ## Sample metadata
→ Getting sample metadata from YAML.
Loading a subset of samples:
• AE1
• AE2
• AE3
• RT1
• RT2
• RT3
→ Getting sample quality control metrics from YAML.
🧪 ## Counts
🧪 ### tximport
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
Error in validObject(.Object) : 
  invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject
Execution halted
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 290, in rnaseqpipeline
    run_parallel("run_bcbiornaseqload", [sample])
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/utils.py", line 59, in wrapper
    return f(*args, **kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multitasks.py", line 92, in run_bcbiornaseqload
    return bcbiornaseq.make_bcbiornaseq_object(*args)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/rnaseq/bcbiornaseq.py", line 31, in make_bcbiornaseq_object
    do.run([rcmd, "--vanilla", r_file], "Loading bcbioRNASeq object.")
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Attaching package: ‘basejump’
The following objects are masked from ‘package:stats’:
    complete.cases, cor, end, median, na.omit, quantile, sd, start, var
The following objects are masked from ‘package:utils’:
    head, relist, tail
The following objects are masked from ‘package:base’:
    %in%, anyDuplicated, append, as.factor, as.list, as.matrix,
    as.table, basename, cbind, colnames, colnames<-, colSums, dirname,
    do.call, duplicated, eval, expand.grid, get, grep, grepl, gsub,
    intersect, is.unsorted, lapply, mapply, match, mean, merge, mget,
    ncol, nrow, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rep.int, rowMeans, rownames, rownames<-, rowSums, sapply,
    setdiff, sort, split, sub, subset, summary, t, table, tapply,
    union, unique, unsplit, which, which.max, which.min
🧪 # bcbioRNASeq
ℹ Importing bcbio-nextgen RNA-seq run.
🧪 ## Run info
uploadDir: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final
projectDir: 2021-12-18_rna-seq-analysis
ℹ 7 samples detected:
• AE1
• AE2
• AE3
• bcbioRNASeq
• RT1
• RT2
• RT3
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/project-summary.yaml' using yaml::`yaml.load_file()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/data_versions.csv' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/programs.txt' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log' using base::`readLines()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen-commands.log' using base::`readLines()`.
🧪 ## Sample metadata
→ Getting sample metadata from YAML.
Loading a subset of samples:
• AE1
• AE2
• AE3
• RT1
• RT2
• RT3
→ Getting sample quality control metrics from YAML.
🧪 ## Counts
🧪 ### tximport
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
Error in validObject(.Object) : 
  invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject
Execution halted
' returned non-zero exit status 1.

@mjsteinbaugh
Contributor

mjsteinbaugh commented Dec 18, 2021

Thanks @amizeranschi, I see the problem there. It appears to be specific to the sacCer3 genome:

🧪 ### tximport
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
Error in validObject(.Object) : 
  invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject

Can you post a copy of the tx2gene.csv file shown here so I can work on a fix?

For reference, the Tx2Gene class and its importer are defined in our AcidGenomes package.
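
For anyone who wants to inspect their own tx2gene.csv in the meantime, the condition named in the error can be approximated outside R. This is a minimal Python sketch, not the AcidGenomes validity check itself. It assumes a headerless two-column CSV (transcript ID first, gene ID second), and `identical_id_rows` is a hypothetical helper name.

```python
import csv
import io


def identical_id_rows(handle):
    """Return (line_number, identifier) for rows whose two IDs are the same string."""
    hits = []
    for line_number, row in enumerate(csv.reader(handle), start=1):
        # Skip blank or one-column rows rather than guessing what they mean.
        if len(row) < 2:
            continue
        tx_id, gene_id = row[0].strip(), row[1].strip()
        if tx_id == gene_id:
            hits.append((line_number, tx_id))
    return hits


# Illustrative input only: the last row is made up to trigger a hit.
example = io.StringIO("ETS1-1_rRNA,ETS1-1\nYPR202W_mRNA,YPR202W\nFAKE1,FAKE1\n")
print(identical_id_rows(example))  # prints [(3, 'FAKE1')]
```

To check a real file, pass an open file handle instead of the `io.StringIO` object, for example `open("tx2gene.csv", newline="")`.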

Best,
Mike

@mjsteinbaugh
Contributor

I also need to update the sample detection to exclude the bcbioRNASeq directory, which is new as of bcbio-nextgen v1.2.9 and is currently being picked up as a sample:

ℹ 7 samples detected:
• AE1
• AE2
• AE3
• bcbioRNASeq
• RT1
• RT2
• RT3

This should return:

ℹ 6 samples detected:
• AE1
• AE2
• AE3
• RT1
• RT2
• RT3

This is handled by the sampleDirs function in our bcbioBase package.
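
The intended filtering can be sketched in Python as follows. This is only an illustration of the idea, not the `sampleDirs` implementation from bcbioBase. It assumes that project directories start with a `YYYY-MM-DD_` prefix (as in the log above) and that `bcbioRNASeq` is the only extra directory to skip; `sample_dir_names` is a hypothetical name.

```python
import re

# Assumption: dated project directories look like "2021-12-18_rna-seq-analysis".
PROJECT_DIR = re.compile(r"^\d{4}-\d{2}-\d{2}_")
# Assumption: non-sample output directories to ignore; only this one is named in the thread.
EXCLUDED = {"bcbioRNASeq"}


def sample_dir_names(names):
    """Keep only directory names that look like samples, in sorted order."""
    return sorted(
        name
        for name in names
        if name not in EXCLUDED and not PROJECT_DIR.match(name)
    )


# Directory names as they appear in the upload directory in this thread.
entries = [
    "AE1", "AE2", "AE3", "bcbioRNASeq",
    "RT1", "RT2", "RT3", "2021-12-18_rna-seq-analysis",
]
print(sample_dir_names(entries))  # prints ['AE1', 'AE2', 'AE3', 'RT1', 'RT2', 'RT3']
```

On a real run, `entries` would come from listing the subdirectories of the upload directory (for example with `os.scandir`).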

@amizeranschi
Contributor Author

Thanks for the reply @mjsteinbaugh

I'm attaching the file you requested: tx2gene.csv

I'm also attaching a file with the commands I used to set up bcbio, to download the data and to set up the bcbio runs.

The relevant lines for this analysis are 115-166 (downloading the data) and 206-235 (setting up and running the analysis).

Hope this helps.

VM-setup.txt

@mjsteinbaugh
Contributor

OK, great, thanks. I'll work on a fix for this over the weekend and will be in touch soon with an update.

@naumenko-sa naumenko-sa reopened this Dec 19, 2021
@mjsteinbaugh
Contributor

mjsteinbaugh commented Jan 8, 2022

OK this tx2gene issue with the sacCer3 genome should be fixed by the pending update to r-acidgenomes 0.2.20. I'm working on pushing this to bioconda today.

See relevant code change here: https://github.com/acidgenomics/r-acidgenomes/blob/main/R/AllClasses.R#L864

You can check this with your install here:

packageVersion("AcidGenomes")
## 0.2.20
library(AcidGenomes)
tx2gene <- importTx2Gene(
    file = pasteURL(
        "github.com",
        "bcbio",
        "bcbio-nextgen",
        "files",
        "7739401",
        "tx2gene.csv",
        protocol = "https"
    )
)
print(tx2gene)
## Tx2Gene with 7036 rows and 2 columns
##                txId      geneId
##         <character> <character>
## 1       ETS1-1_rRNA      ETS1-1
## 2       ETS1-2_rRNA      ETS1-2
## 3       ETS2-1_rRNA      ETS2-1
## 4       ETS2-2_rRNA      ETS2-2
## 5        HRA1_ncRNA        HRA1
## ...             ...         ...
## 7032   YPR202W_mRNA     YPR202W
## 7033   YPR203W_mRNA     YPR203W
## 7034 YPR204C-A_mRNA   YPR204C-A
## 7035   YPR204W_mRNA     YPR204W
## 7036     ZOD1_ncRNA        ZOD1

@naumenko-sa
Contributor

thanks @mjsteinbaugh!
I've pinned it in cloudbiolinux:
https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L280

@amizeranschi please let us know if it works for you.

@amizeranschi
Contributor Author

Hello,

Thanks for looking into this. I upgraded bcbio and its tools to the latest development version and launched R from the directory ${bcbio_dir}/anaconda/envs/rbcbiornaseq/bin. The commands mentioned by @mjsteinbaugh above ran successfully, and AcidGenomes 0.2.20 seems to be available.

However, the bcbio analysis still ended up crashing, this time due to the version of a different package:

[2022-01-09T13:21Z] multiprocessing: run_bcbiornaseqload
[2022-01-09T13:21Z] Loading bcbioRNASeq object.
[2022-01-09T13:21Z] Loading required package: basejump
[2022-01-09T13:21Z] Error: package or namespace load failed for ‘basejump’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
[2022-01-09T13:21Z]  namespace ‘AcidSingleCell’ 0.1.8 is being loaded, but >= 0.1.9 is required
[2022-01-09T13:21Z] Error: package ‘basejump’ could not be loaded
[2022-01-09T13:21Z] Execution halted
[2022-01-09T13:21Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Error: package or namespace load failed for ‘basejump’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 namespace ‘AcidSingleCell’ 0.1.8 is being loaded, but >= 0.1.9 is required
Error: package ‘basejump’ could not be loaded
Execution halted
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 290, in rnaseqpipeline
    run_parallel("run_bcbiornaseqload", [sample])
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/utils.py", line 59, in wrapper
    return f(*args, **kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multitasks.py", line 92, in run_bcbiornaseqload
    return bcbiornaseq.make_bcbiornaseq_object(*args)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/rnaseq/bcbiornaseq.py", line 31, in make_bcbiornaseq_object
    do.run([rcmd, "--vanilla", r_file], "Loading bcbioRNASeq object.")
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Error: package or namespace load failed for ‘basejump’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 namespace ‘AcidSingleCell’ 0.1.8 is being loaded, but >= 0.1.9 is required
Error: package ‘basejump’ could not be loaded
Execution halted
' returned non-zero exit status 1.

@mjsteinbaugh
Contributor

You're seeing this error because the conda environment solver isn't working correctly. We should be installing these versions:

r-acidgenomes                 0.2.20   r41hdfd78af_0  bioconda
r-acidexperiment               0.2.2   r41hdfd78af_0  bioconda
r-acidsinglecell               0.1.9   r41hdfd78af_0  bioconda
r-basejump                   0.14.23   r41hdfd78af_0  bioconda
r-bcbiobase                   0.6.22   r41hdfd78af_0  bioconda
r-bcbiornaseq                 0.3.44   r41hdfd78af_0  bioconda
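
A quick way to see which packages in an environment lag behind this list is a plain version comparison. The Python sketch below is a rough illustration that treats the versions above as minimums (an assumption; the R errors in this thread only state `>=` requirements for individual packages). `parse_version` and `outdated` are hypothetical helpers, and the `installed` mapping is filled in by hand from what has been reported so far.

```python
# Versions listed above, keyed by conda package name (treated here as minimums).
REQUIRED = {
    "r-acidgenomes": "0.2.20",
    "r-acidexperiment": "0.2.2",
    "r-acidsinglecell": "0.1.9",
    "r-basejump": "0.14.23",
    "r-bcbiobase": "0.6.22",
    "r-bcbiornaseq": "0.3.44",
}


def parse_version(text):
    """Turn '0.14.23' into (0, 14, 23); only handles purely numeric versions."""
    return tuple(int(part) for part in text.split("."))


def outdated(installed, required=REQUIRED):
    """Return sorted (name, installed, required) for packages below the required version."""
    return sorted(
        (name, version, required[name])
        for name, version in installed.items()
        if name in required
        and parse_version(version) < parse_version(required[name])
    )


# Illustrative input: only the two versions reported in the thread so far.
installed = {"r-acidgenomes": "0.2.20", "r-acidsinglecell": "0.1.8"}
print(outdated(installed))  # prints [('r-acidsinglecell', '0.1.8', '0.1.9')]
```

Comparing tuples of integers avoids the string-comparison trap where "0.14.3" would sort after "0.14.23".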

@amizeranschi
Contributor Author

@naumenko-sa

Could you pin these package versions as well in cloudbiolinux?

@naumenko-sa
Contributor

@amizeranschi
Contributor Author

Thanks, but that doesn't seem to be enough to get everything installed as it should be. You might have to pin all six package versions mentioned above.

Error: package or namespace load failed for ‘bcbioRNASeq’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 namespace ‘bcbioBase’ 0.6.21 is being loaded, but >= 0.6.22 is required
Execution halted
' returned non-zero exit status 1.

@naumenko-sa
Contributor

Please try again. I hope we got all of them now.

@amizeranschi
Contributor Author

Thanks a lot. We're definitely making progress.

This time, bcbiornaseq complains about the ref-transcripts.gtf file for sacCer3:

🧪 ### featureCounts
→ Importing aligned counts from featureCounts.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2022-01-11_rna-seq-analysis/featureCounts/combined.counts' using data.table::`fread()`.
🧪 ## Feature metadata
bcbio GTF file:
/home/user/bcbio-nextgen/genomes/Scerevisiae/sacCer3/rnaseq/ref-transcripts.gtf
→ Making <GRanges> from GFF file ('ref-transcripts.gtf').
→ Getting GFF metadata for 'ref-transcripts.gtf'.
Error: Failed to detect provider (e.g. "Ensembl") from 'ref-transcripts.gtf'.
Backtrace:
    █
 1. └─bcbioRNASeq::bcbioRNASeq(...)
 2.   └─AcidGenomes::makeGRangesFromGFF(...)
 3.     └─AcidGenomes:::.makeGRangesFromRtracklayer(...)
 4.       └─AcidGenomes::getGFFMetadata(file)
 5.         └─AcidCLI::abort(...)
 6.           └─cli::cli_abort(x)
Execution halted
' returned non-zero exit status 1.

I've checked, and that GTF file doesn't have any header lines. It was installed by bcbio as part of the sacCer3 genome.
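
For reference, a header check like this can be scripted. The following Python sketch collects the leading `#` comment lines of a GTF; `leading_comment_lines` is a hypothetical helper, and the sample lines are made up for illustration (the `#!genome-build` form follows the style of Ensembl GTF headers). Whether AcidGenomes depends on such lines to detect the provider is an assumption on my part.

```python
import io


def leading_comment_lines(handle):
    """Return the '#' comment lines found before the first feature line."""
    header = []
    for line in handle:
        if not line.startswith("#"):
            break
        header.append(line.rstrip("\n"))
    return header


# Illustrative content only, not taken from the actual ref-transcripts.gtf.
feature = 'chrI\tsource\tgene\t1\t100\t.\t+\t.\tgene_id "G1";\n'
with_header = io.StringIO("#!genome-build R64-1-1\n" + feature)
without_header = io.StringIO(feature)

print(leading_comment_lines(with_header))     # prints ['#!genome-build R64-1-1']
print(leading_comment_lines(without_header))  # prints []
```

An empty list for a real file means there is no header to identify where the annotation came from.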

@mjsteinbaugh
Contributor

Thanks @amizeranschi, can you post that GTF file so I can take a look and work on a fix?

@amizeranschi
Contributor Author

Sure thing, here you go. I changed the extension to .txt so that GitHub would accept it.

ref-transcripts.gtf.txt

@mjsteinbaugh
Contributor

OK, this appears to be fixed in the development version of r-acidgenomes, which is not yet suitable for deployment on bioconda. I'll post an update when I finish rolling out a stable release that includes this fix.

https://github.com/acidgenomics/r-acidgenomes/tree/develop

@amizeranschi
Contributor Author

@mjsteinbaugh

Would you consider adding support in bcbiornaseq for differential affinity in ChIP-seq and ATAC-seq peaks? Given that bcbio produces consensus peaks and computes read counts, these could be used in DESeq2 exactly like in the RNA-seq scenario.

https://bcbio-nextgen.readthedocs.io/en/latest/contents/atac.html#differential-affinity-analysis

@mjsteinbaugh
Contributor

@amizeranschi OK I think this should be fixed on bioconda.

@Melisa-Magallanes

Hello,
I'm getting a similar error when trying to install Trinity with conda:
ERROR conda.core.link:_execute(730): An error occurred while installing package 'bioconda::bioconductor-go.db-3.14.0-r41hdfd78af_0'.
I've tried many conda versions, but the error persists.
What can I do to fix it?

@mjsteinbaugh
Contributor

Hi @Melisa-Magallanes, thanks for the update. I'll try a clean install of bcbio and see if I can reproduce it.

@amizeranschi
Contributor Author

@Melisa-Magallanes

In case your error is similar to what I've been seeing (post-link script failed for package bioconda::bioconductor-go.db-3.14.0-r41hdfd78af_0): this is a relatively common problem right now, and it's being addressed.

Have a look here: bioconda/bioconda-recipes#36499 (comment)

@mjsteinbaugh
Contributor

Thanks for the update! I'll see if we can come up with a fix in bioconda-recipes.

@amizeranschi
Contributor Author

amizeranschi commented Aug 20, 2022

Great, thanks a lot. Please have a look at bioconductor-org.hs.eg.db as well. I've been getting a similar error with it while attempting to install bcbio.

Edit: I've submitted a couple of pull requests.
bioconda/bioconda-recipes#36554
bioconda/bioconda-recipes#36555

@mjsteinbaugh
Contributor

@amizeranschi r-bcbiornaseq has been updated to 0.5.1 on bioconda. I'm working on updating this in the main bcbio-nextgen install with @naumenko-sa

@naumenko-sa
Contributor

Thanks @mjsteinbaugh !
