Error executing process > 'pipeline:process_bams:combine_bams_and_tags (1)' #81

Closed
ktpolanski opened this issue Mar 15, 2024 · 12 comments

@ktpolanski

Operating System

Other Linux (please specify below)

Other Linux

No response

Workflow Version

v1.1.0

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

~/nextflow-23.12.0-edge-all run epi2me-labs/wf-single-cell \
    --fastq fastq/ \
    --kit_name multiome \
    --kit_version v1 \
    --expected_cells 5000 \
    --ref_genome_dir /home/ubuntu/cellranger/GRCh38-2020-A/ \
    --sample $SAMPLE \
    -c openstack.cfg \
    --max_threads 20 \
    -profile standard \
    -resume

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

I am rerunning a sample that has previously worked fine on older versions of the workflow, most recently -r prerelease under v1.0.3 finishing on March 7th. I am encountering an error which I have previously not seen. I checked the git blame and it seems the part of the workflow that is causing the explosion was recently modified.

It seems the samtools part of the command runs fine. There is a tags folder with 20 symlinked TSVs, each 452MB in size. There is a chr_tags folder but it's empty.
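One thing worth ruling out when csvtk reports "no content" for inputs that look populated is whether the staged symlinks actually resolve to non-empty files from where the process runs. The following is a minimal diagnostic sketch, not a diagnosis of this issue; `check_inputs` is a hypothetical helper name, and it assumes it is run from inside the task's work directory (or inside the container) against the `tags` folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow): count entries in a directory
# that are dangling symlinks or zero-byte files after following symlinks.
check_inputs() {
    local dir="$1" bad=0 f
    for f in "$dir"/*; do
        # [ -s ] follows symlinks: it is false for dangling links and for
        # empty targets. An empty directory leaves the glob unexpanded, so
        # that case is reported as one bad entry too.
        if [ ! -s "$f" ]; then
            echo "unreadable or empty: $f" >&2
            bad=$((bad + 1))
        fi
    done
    # Print the number of problem entries on stdout
    echo "$bad"
}
```

Calling `check_inputs tags` prints `0` when every staged TSV resolves to a non-empty file; any other count lists the offending paths on stderr.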

Relevant log output

ERROR ~ Error executing process > 'pipeline:process_bams:combine_bams_and_tags (1)'

Caused by:
  Process `pipeline:process_bams:combine_bams_and_tags (1)` terminated with an error exit status (255)

Command executed:

  samtools merge -@ 7 --write-index -o "PAQ62150.tagged.sorted.bam##idx##PAQ62150.tagged.sorted.bam.bai" bams/*.bam
  
  mkdir chr_tags
  # merge the tags TSVs, keep header from first
  csvtk concat -tT tags/*         | csvtk split -tl -f chr -o chr_tags/
  # Strip appended source filename ("stdin-"") from the split TSVs
  for file in chr_tags/*; do mv "${file}" "${file//stdin-//}"; done

Command exit status:
  255

Command output:
  (empty)

Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  [ERRO] xopen: no content
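
For readers puzzling over the rename loop in the command above: it relies on bash pattern substitution, `${var//pattern/replacement}`, with `stdin-` as the pattern and `/` as the replacement. A small sketch of what that expansion produces follows; the function names are hypothetical, the example path is made up for illustration, and bash (not plain `sh`) is assumed.

```shell
#!/usr/bin/env bash
# Same expansion as in the executed command: every "stdin-" becomes "/".
strip_stdin_prefix() {
    local file="$1"
    printf '%s\n' "${file//stdin-//}"
}

# Variant with an empty replacement, which avoids the doubled slash.
strip_stdin_prefix_clean() {
    local file="$1"
    printf '%s\n' "${file//stdin-/}"
}

strip_stdin_prefix "chr_tags/stdin-chr1.tsv"        # prints chr_tags//chr1.tsv
strip_stdin_prefix_clean "chr_tags/stdin-chr1.tsv"  # prints chr_tags/chr1.tsv
```

The doubled slash is harmless on Linux because consecutive slashes in a path are treated as one, so the rename itself works as written. With an empty `chr_tags` folder there is nothing for the loop to rename, which is consistent with the failure happening earlier, in the csvtk step.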

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

The demo data does clear the step.
@ddiez

ddiez commented Mar 16, 2024

I got the same error, with the same exit status (255) and the same command error ([ERRO] xopen: no content), and the chr_tags folder is also empty. When I ran the csvtk commands manually on the existing files, they produced the expected output in chr_tags.

@ktpolanski
Author

ktpolanski commented Mar 22, 2024

I reran with just a few of the many thousand input FASTQ files and the workflow cleared the step. Seems to be something about input size?

@ktpolanski
Author

I gave -profile singularity a try and the process just cleared. This is not the exact same compute environment as the one where I first hit the error, as Singularity just refused to behave there (#87). But hey, progress!

@cjw85
Contributor

cjw85 commented Mar 26, 2024

We've found there's a bug in the version of csvtk that's being used in the workflow (shenwei356/csvtk#259). We've replaced the use of csvtk in our development branch.
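
For anyone stuck on v1.1.0 who wants to reproduce the failing step by hand without csvtk, the concatenate-and-split logic can be approximated with awk. This is only a sketch under stated assumptions and not the maintainers' actual replacement: `split_tags_by_chr` is a hypothetical function name, every input TSV is assumed to carry the same header line, and the header is assumed to contain a column literally named `chr` (as implied by `-f chr` in the original command).

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: csvtk concat -tT tags/* | csvtk split -tl -f chr
# Usage: split_tags_by_chr OUT_DIR FILE [FILE...]
split_tags_by_chr() {
    local out_dir="$1"; shift
    mkdir -p "$out_dir"
    awk -F'\t' -v out="$out_dir" '
        FNR == 1 {
            if (NR == 1) {
                # Keep the header from the first file and locate the chr column
                header = $0
                for (i = 1; i <= NF; i++) if ($i == "chr") col = i
                if (!col) { print "no chr column in header" > "/dev/stderr"; exit 1 }
            }
            next    # skip the header line of every input file
        }
        {
            path = out "/" $col ".tsv"
            # Write the header once per output file, then the data rows.
            # awk keeps each output file open, so ">" appends after first use.
            if (!(path in seen)) { print header > path; seen[path] = 1 }
            print $0 > path
        }
    ' "$@"
}
```

From a task work directory this would be called as `split_tags_by_chr chr_tags tags/*`. It writes `chr_tags/<value>.tsv` directly, so the `stdin-` rename loop is not needed. One output file is held open per distinct `chr` value, which may matter for awk implementations with a low open-file limit when the reference has many contigs.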

@ddiez

ddiez commented Mar 29, 2024

@cjw85 Thanks. I can confirm that the latest prerelease version solves this issue.

@cjw85
Contributor

cjw85 commented Mar 29, 2024

We have not made any updates to the code since this issue was reported.

@ddiez

ddiez commented Mar 29, 2024

You mean the "development branch" is not the prerelease one? Well, then for whatever reason the latest version in prerelease did not stop at that point anymore.

@cjw85
Contributor

cjw85 commented Mar 29, 2024

My apologies, yes, the prerelease branch on GitHub tracks our internal mainline dev branch. It does contain changes to the 'pipeline:process_bams:combine_bams_and_tags' stage of the workflow.

@ktpolanski
Author

I'm not fully following.

At the time of encountering the issue, I had 20 symlinked TSVs in the input folder, each 452MB in size. So nothing seems like it was empty. I moved to Singularity and somehow the problem went away, despite me not switching to prerelease.

@cghchuwudai

~/nextflow-23.12.0-edge-all run epi2me-labs/wf-single-cell \
    --fastq fastq/ \
    --kit_name multiome \
    --kit_version v1 \
    --expected_cells 5000 \
    --ref_genome_dir /home/ubuntu/cellranger/GRCh38-2020-A/ \
    --sample $SAMPLE \
    -c openstack.cfg \
    --max_threads 20 \
    -profile standard \
    -resume

We ran the same command with version v1.1.0, and wf-single-cell returned the same error, so it seems the problem has not been resolved. Using the '-profile singularity' option did not help either; it resulted in a different error.

@ktpolanski
Author

Try -r prerelease, as per devs above that should circumvent the problematic process.

I still don't get how me switching to singularity helped, but somehow it did.

@cjw85
Contributor

cjw85 commented May 8, 2024

The fix for this issue is now included in v2.0.0.

@cjw85 cjw85 closed this as completed May 8, 2024
4 participants