Pipeline fails for large studies #236

Closed
alexblaessle opened this issue Nov 3, 2023 · 1 comment
Labels
bug Something isn't working

@alexblaessle

Description of the bug

Trying to download an SRP project with 25000 samples crashes the pipeline at SRA_MERGE_SAMPLESHEET.

The reason for this is that the list of files handed over is too long for the bash commands. Chunking the list, or reading in a list of files in a Python script, is probably a good way to circumvent this.
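To illustrate the Python route, here is a minimal sketch that lists the staged files from inside Python, so no long argument list ever reaches the shell. The function name `merge_csvs` and the directory layout are assumptions made for this example, not code from the pipeline. It mirrors the failing bash logic: keep the header of the first file, then append the data rows of every file.

```python
from pathlib import Path


def merge_csvs(input_dir, output_path):
    """Concatenate every file in input_dir, keeping a single header line.

    Hypothetical helper for illustration; not part of fetchngs.
    """
    # Listing the directory here avoids expanding thousands of paths in bash.
    paths = sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    header_written = False
    with open(output_path, "w", newline="") as out:
        for path in paths:
            with open(path, newline="") as handle:
                header = handle.readline()
                if not header:
                    continue  # skip empty files
                if not header_written:
                    out.write(header if header.endswith("\n") else header + "\n")
                    header_written = True
                for line in handle:
                    # Guard against files that lack a trailing newline.
                    out.write(line if line.endswith("\n") else line + "\n")
    return len(paths)
```

Something like `merge_csvs("samplesheets", "samplesheet.csv")` would then stand in for the `head`/`awk` loop, assuming the output file is written outside the input directory.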

Command used and terminal output

nextflow run /path/to/fetchngs/ -profile singularity --input ids.csv --dbgap_key prj_34697.ngc --outdir out/ --skip_fastq_download -resume

ERROR ~ Error executing process > 'NFCORE_FETCHNGS:SRA:SRA_MERGE_SAMPLESHEET'

Caused by:
  Process `NFCORE_FETCHNGS:SRA:SRA_MERGE_SAMPLESHEET` terminated with an error exit status (139)

Command executed:

  cp ./samplesheets/* ~/tmp/
  cp ./mappings/* ~/tmp/


  head -n 1 `ls ./samplesheets/* | head -n 1` > samplesheet.csv
  for fileid in `ls ./samplesheets/*`; do
      awk 'NR>1' $fileid >> samplesheet.csv
  done

  head -n 1 `ls ./mappings/* | head -n 1` > id_mappings.csv
  for fileid in `ls ./mappings/*`; do
      awk 'NR>1' $fileid >> id_mappings.csv
  done

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_FETCHNGS:SRA:SRA_MERGE_SAMPLESHEET":
      sed: $(echo $(sed --version 2>&1) | sed 's/^.*GNU sed) //; s/ .*$//')
  END_VERSIONS

Command exit status:
  139

Command output:
  (empty)

Work dir:

Relevant files

No response

System information

NF version: 23.04.2
Hardware: HPC
Executor: local
Container: Singularity
OS: CentOS
Version: 1.11

alexblaessle added the bug label Nov 3, 2023
alexblaessle pushed a commit to alexblaessle/fetchngs that referenced this issue Nov 7, 2023
Prefetching large studies resulted in error stated in issue

nf-core#236

Resolved this by handing over text file with filepaths instead
of list of filepaths. Rewrote sra_merge_samplesheet accordingly.

Tested with study with ~25k samples.
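
The approach described in this commit message, passing one text file of paths rather than the paths themselves, could look roughly like the sketch below. This is not the code from the referenced commit; `merge_from_manifest` and the one-path-per-line manifest format are assumptions made for illustration.

```python
from pathlib import Path


def merge_from_manifest(manifest_path, output_path):
    """Merge the CSV files listed, one path per line, in manifest_path.

    Hypothetical helper for illustration; not the code from the commit.
    """
    with open(manifest_path) as manifest:
        paths = [Path(line.strip()) for line in manifest if line.strip()]
    header_written = False
    with open(output_path, "w", newline="") as out:
        for path in paths:
            with open(path, newline="") as handle:
                header = handle.readline()
                if not header:
                    continue  # skip empty files
                if not header_written:
                    out.write(header if header.endswith("\n") else header + "\n")
                    header_written = True
                for line in handle:
                    # Guard against files that lack a trailing newline.
                    out.write(line if line.endswith("\n") else line + "\n")
    return len(paths)
```

Only the manifest path appears on the command line, so the size of the study no longer affects the length of the command.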
drpatelh added this to the 1.12.0 milestone Jan 3, 2024
@drpatelh
Member

drpatelh commented Jan 4, 2024

Fixed in #238 #243

drpatelh closed this as completed Jan 4, 2024