bcbio run not running samples in parallel #3719

toddcreasy · 2023-10-24T13:19:44Z

I'm trying to run bcbio on 20 samples, but it seems to run only one sample at a time as it steps through the YAML. I'm using Slurm. Could someone look at the Slurm script and bcbio YAML snippet below and tell me if anything looks wrong?

Slurm script

#!/bin/bash

#SBATCH --job-name=bcbiopipeline_GPC3_PDx
#SBATCH -t 1-00:00
#SBATCH -c 1
#SBATCH -p core -n 32
#SBATCH --mem-per-cpu=40G
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --export=ALL

echo "Load bcbio..."
module use /projects/ngs_prodifx/local/software/modules/
module load bcbio-nextgen/1.2.9

bcbio_nextgen.py seqc.yaml -n 72 -t ipython -s slurm -q core -r t=0-72:00 -r conmem=20 --timeout 3000 --tag "rna"

yaml snippet:

resources:
  default:
    memory: 30G
    cores: 72
details:
- algorithm:
    aligner: star
    disambiguate: mm10
    expression_caller:
    - salmon
    - kallisto
    quality_format: standard
    strandedness: auto
    transcriptome_fasta: /swiftcache/ngs/oncology/analysis/todd_creasy/working/reference/GRCh38.primary_assembly.genome_GPC3CART.fa
    transcriptome_gtf: /swiftcache/ngs/oncology/analysis/todd_creasy/working/reference/gencode.v29.annotation_GPC3CART.gtf
  analysis: RNA-seq
  description: 10_LI6612_Baseline_8743
  files:
  - /swiftcache/ngs/oncology/analysis/todd_creasy/working/fastq/SM-745/SM-745-10_R1_001.fastq.gz
  - /swiftcache/ngs/oncology/analysis/todd_creasy/working/fastq/SM-745/SM-745-10_R2_001.fastq.gz
  genome_build: hg38
  metadata:
    panel: Baseline
- algorithm:
    aligner: star
    disambiguate: mm10
    expression_caller:
    - salmon
    - kallisto
    quality_format: standard
    strandedness: auto
    transcriptome_fasta: /swiftcache/ngs/oncology/analysis/todd_creasy/working/reference/GRCh38.primary_assembly.genome_GPC3CART.fa
    transcriptome_gtf: /swiftcache/ngs/oncology/analysis/todd_creasy/working/reference/gencode.v29.annotation_GPC3CART.gtf
  analysis: RNA-seq
  description: 11_LI6612_Baseline_8765
  files:
  - /swiftcache/ngs/oncology/analysis/todd_creasy/working/fastq/SM-745/SM-745-11_R1_001.fastq.gz
  - /swiftcache/ngs/oncology/analysis/todd_creasy/working/fastq/SM-745/SM-745-11_R2_001.fastq.gz
  genome_build: hg38
  metadata:
    panel: Baseline
- algorithm:
    aligner: star
    disambiguate: mm10
    expression_caller:
    - salmon
    - kallisto
    quality_format: standard
    strandedness: auto
    transcriptome_fasta: /swiftcache/ngs/oncology/analysis/todd_creasy/working/reference/GRCh38.primary_assembly.genome_GPC3CART.fa
    transcriptome_gtf: /swiftcache/ngs/oncology/analysis/todd_creasy/working/reference/gencode.v29.annotation_GPC3CART.gtf
  analysis: RNA-seq
  description: 12_LI6612_Baseline_8770
  files:
  - /swiftcache/ngs/oncology/analysis/todd_creasy/working/fastq/SM-745/SM-745-12_R1_001.fastq.gz
  - /swiftcache/ngs/oncology/analysis/todd_creasy/working/fastq/SM-745/SM-745-12_R2_001.fastq.gz
  genome_build: hg38
  metadata:
    panel: Baseline
fc_name: seqc
upload:
  dir: ../output/final


naumenko-sa · 2023-10-26T18:38:43Z

Hi @toddcreasy !

You don't need to allocate all these nodes in the start script: bcbio will create and launch jobs according to the resources requested. You just start a 1-core main job (https://github.com/naumenko-sa/bioscripts/blob/master/clusters/bcbio.ipython.o2.sh), which then creates a controller job and many workers.
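
A minimal sketch of what such a 1-core launcher could look like, reusing the partition, module paths, and bcbio command line from the original post. It is not the linked script. The `--mem=8G` and `-t 3-00:00` values are assumptions: the main job only orchestrates, but it probably needs to stay alive at least as long as the workers it submits. The dry-run branch at the end is also an addition, so the sketch can be checked on a machine without bcbio.

```shell
#!/bin/bash
#SBATCH --job-name=bcbio_main
#SBATCH -p core
#SBATCH -n 1                 # a single task: this job only orchestrates
#SBATCH -c 1
#SBATCH --mem=8G             # assumption: modest memory is enough here
#SBATCH -t 3-00:00           # assumption: outlive the 72 h worker limit below
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out

# Load bcbio where environment modules are available (paths from the original post).
if command -v module >/dev/null 2>&1; then
    module use /projects/ngs_prodifx/local/software/modules/
    module load bcbio-nextgen/1.2.9
fi

# Total cores bcbio may spread across its worker jobs.
TOTAL_CORES=72

# Same command line as in the original post.
BCBIO_CMD="bcbio_nextgen.py seqc.yaml -n ${TOTAL_CORES} -t ipython -s slurm -q core -r t=0-72:00 -r conmem=20 --timeout 3000 --tag rna"

if command -v bcbio_nextgen.py >/dev/null 2>&1; then
    $BCBIO_CMD
else
    # Dry-run fallback so the sketch can be checked off-cluster.
    echo "bcbio_nextgen.py not found; would run: ${BCBIO_CMD}"
fi
```

The point of the change is that the `#SBATCH` header asks for one core only; the 72 cores are requested by bcbio itself through `-n 72` when it submits the controller and worker jobs.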

You also need to remove the first 3 lines of the resources specification. The resources are already fine-tuned in the sysconfig of the bcbio installation on that system. Here you are asking for 72 cores and 30G RAM/core, which does not make sense.
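
A rough back-of-the-envelope check of why that request is a problem, written as shell arithmetic. It assumes that bcbio fits jobs by dividing the `-n` total by the per-job `cores` value, and that `memory` is counted per core as described above. The helper name `concurrent_jobs` and the 8-core comparison are hypothetical, for illustration only.

```shell
# Integer number of jobs that fit at once: total cores / cores per job.
concurrent_jobs() {
    echo $(( $1 / $2 ))
}

TOTAL_CORES=72        # from -n 72 on the command line
CORES_PER_JOB=72      # from resources: default: cores: 72
MEM_PER_CORE_GB=30    # from resources: default: memory: 30G

echo "concurrent jobs: $(concurrent_jobs "$TOTAL_CORES" "$CORES_PER_JOB")"
echo "memory per job: $(( CORES_PER_JOB * MEM_PER_CORE_GB ))G"

# Hypothetical 8 cores per job, for comparison only.
echo "concurrent jobs at 8 cores each: $(concurrent_jobs "$TOTAL_CORES" 8)"
```

Under these assumptions a single 72-core job uses up the whole `-n 72` budget, which would match the one-sample-at-a-time behaviour reported, and each job would ask for 2160G of memory.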

SN

toddcreasy · 2023-10-31T13:35:28Z

Issue resolved.

toddcreasy closed this as completed Oct 31, 2023
