Bowtie2 Mapping Alignment exceeded running time limit error #158

Closed
koushik20 opened this issue Apr 17, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@koushik20

Description of the bug

Hi,

Thanks for the detailed documentation!
I am running nf-core/hic version 2.0.0 with the GRCh38 reference genome, but the run always fails with "Process exceeded running time limit (16h)".

Below is the terminal output

executor >  local (2)
[e4/33766a] process > NFCORE_HIC:HIC:INPUT_CHECK:SAMPLESHEET_CHECK (input_file.csv)   [100%] 1 of 1, cached: 1 ✔
[9f/54196b] process > NFCORE_HIC:HIC:PREPARE_GENOME:CUSTOM_GETCHROMSIZES (genome.fa)  [100%] 1 of 1, cached: 1 ✔
[d3/12afd7] process > NFCORE_HIC:HIC:PREPARE_GENOME:GET_RESTRICTION_FRAGMENTS (^GATC) [100%] 1 of 1, cached: 1 ✔
[c6/5b0dc7] process > NFCORE_HIC:HIC:FASTQC (BT549_Rep2)                              [100%] 2 of 2, cached: 2 ✔
[0d/50c45a] process > NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN (BT549_Rep2) [ 25%] 1 of 4, failed: 1
[-        ] process > NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:TRIM_READS                 -
[-        ] process > NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN_TRIMMED      -
[-        ] process > NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:MERGE_BOWTIE2              -
[-        ] process > NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:COMBINE_MATES              -
[-        ] process > NFCORE_HIC:HIC:HICPRO:GET_VALID_INTERACTION                     -
[-        ] process > NFCORE_HIC:HIC:HICPRO:MERGE_VALID_INTERACTION                   -
[-        ] process > NFCORE_HIC:HIC:HICPRO:MERGE_STATS                               -
[-        ] process > NFCORE_HIC:HIC:HICPRO:HICPRO2PAIRS                              -
[d5/1dd856] process > NFCORE_HIC:HIC:COOLER:COOLER_MAKEBINS (null})                   [100%] 7 of 7, cached: 7 ✔
[-        ] process > NFCORE_HIC:HIC:COOLER:COOLER_CLOAD                              -
[-        ] process > NFCORE_HIC:HIC:COOLER:COOLER_BALANCE                            -
[-        ] process > NFCORE_HIC:HIC:COOLER:COOLER_ZOOMIFY                            -
[-        ] process > NFCORE_HIC:HIC:COOLER:COOLER_DUMP                               -
[-        ] process > NFCORE_HIC:HIC:COOLER:SPLIT_COOLER_DUMP                         -
[-        ] process > NFCORE_HIC:HIC:HIC_PLOT_DIST_VS_COUNTS                          -
[-        ] process > NFCORE_HIC:HIC:COMPARTMENTS:COOLTOOLS_EIGSCIS                   -
[-        ] process > NFCORE_HIC:HIC:TADS:COOLTOOLS_INSULATION                        -
[-        ] process > NFCORE_HIC:HIC:CUSTOM_DUMPSOFTWAREVERSIONS                      -
[-        ] process > NFCORE_HIC:HIC:MULTIQC                                          -
Execution cancelled -- Finishing pending tasks before exit
Error executing process > 'NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN (BT549_Rep2)'

Caused by:
  Process exceeded running time limit (16h)

Command executed:

  INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\.rev.1.bt2$//"`
  [ -z "$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\.rev.1.bt2l$//"`
  [ -z "$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1
  
  bowtie2 \
      -x $INDEX \
      -U HiChIP_BT549-B_S6_R2_001.fastq.gz \
      --threads 12 \
      --un-gz BT549_Rep2_0_R2.unmapped.fastq.gz \
      --very-sensitive --end-to-end --reorder \
      2> BT549_Rep2_0_R2.bowtie2.log \
      | samtools view -F 4 --threads 12 -o BT549_Rep2_0_R2.bam -
  
  if [ -f BT549_Rep2_0_R2.unmapped.fastq.1.gz ]; then
      mv BT549_Rep2_0_R2.unmapped.fastq.1.gz BT549_Rep2_0_R2.unmapped_1.fastq.gz
  fi
  
  if [ -f BT549_Rep2_0_R2.unmapped.fastq.2.gz ]; then
      mv BT549_Rep2_0_R2.unmapped.fastq.2.gz BT549_Rep2_0_R2.unmapped_2.fastq.gz
  fi
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN":
      bowtie2: $(echo $(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*$//')
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
      pigz: $( pigz --version 2>&1 | sed 's/pigz //g' )
  END_VERSIONS

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /mnt/hichip_results/BT549/work/0d/50c45a4cea8db207d2ce122b4f009b

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

The pipeline always stops at this Bowtie2 mapping step. I supplied a separate nextflow.config file that assigns more memory to this specific step:

process {
  withName: 'NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN' {
    memory = 80.GB
  }
}

So my questions are:
Why does the pipeline abort at the 16h mark even though I set a maximum time of 240h?
When I ran some samples earlier with GRCh37, the pipeline completed successfully, so is there an issue with using GRCh38?
I tried different --max_cpus, --max_memory, and --max_time configurations, but the pipeline always aborts at this particular step (see "Command executed" above).

Thank you!

Command used and terminal output

Input script filename: run_hicpro.sh

sudo nextflow run nf-core/hic -r 2.0.0 \
       --input '/mnt/hichip_results/BT549/input_file.csv' \
       -profile docker \
       -resume \
       --fastq_chunks_size 20000000 \
       --max_memory '128.GB' \
       --max_time '240.h' \
       --max_cpus 60 \
       --outdir "/mnt/hicpro_results/BT549_Apr2023" \
       --genome GRCh38 \
       --save_pairs_intermediates \
       --bwt2_opts_end2end '--very-sensitive --end-to-end --reorder' \
       --bwt2_opts_trimmed '--very-sensitive --end-to-end --reorder' \
       --digestion 'dpnii' \
       --ligation_site 'GATCGATC' \
       --restriction_site '^GATC' \
       --min_cis_dist 1000 \
       --min_mapq 20 \
       --bin_size '5000,20000,40000,150000,500000,1000000' \
       --saveReference

Input command: sudo bash run_hicpro.sh

Relevant files

nextflow.log

System information

Nextflow version - 22.10.7
Hardware - Desktop
Executor - local
Container engine: Docker
OS Ubuntu - 20.04.5 Linux
Version - nf-core/hic 2.0.0

koushik20 added the bug label on Apr 17, 2023
@ninashenker

@koushik20 I'm having this same issue - were you able to fix it?

@koushik20
Author

I supplied a separate custom Nextflow config file, and the pipeline completed without any errors.

process {
  // Same limits for every resource label
  withLabel:process_high {
    memory = 64.GB
    cpus = 52
    time = 36.h
  }
  withLabel:process_medium {
    memory = 64.GB
    cpus = 52
    time = 36.h
  }
  withLabel:process_low {
    memory = 64.GB
    cpus = 52
    time = 36.h
  }

  // Same limits set by name for the two Bowtie2 alignment steps
  withName:'NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN' {
    memory = 64.GB
    cpus = 52
    time = 36.h
  }
  withName:'NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN_TRIMMED' {
    memory = 64.GB
    cpus = 52
    time = 36.h
  }
}
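
As a narrower variant, the time limit could be raised for the alignment steps alone, leaving the label defaults untouched. This is only a sketch: the 36.h value is taken from the config above, the process names are the ones shown in this thread, and whether the remaining processes finish within their default limits is an assumption that has not been checked here.

```groovy
process {
  // Only the two Bowtie2 alignment processes get a longer wall-time limit;
  // cpus and memory keep whatever the pipeline assigns by default.
  withName:'NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN' {
    time = 36.h
  }
  withName:'NFCORE_HIC:HIC:HICPRO:HICPRO_MAPPING:BOWTIE2_ALIGN_TRIMMED' {
    time = 36.h
  }
}
```

Requesting fewer CPUs per task than the config above may also let the local executor run several alignment chunks at once, though that depends on the machine.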

memory = { check_max( 64.GB * task.attempt, 'memory' ) }

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
  if (type == 'memory') {
    try {
      if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
        return params.max_memory as nextflow.util.MemoryUnit
      else
        return obj
    } catch (all) {
      println "   ### ERROR ###   Max memory '${params.max_memory}' is not valid! Using default value: $obj"
      return obj
    }
  } else if (type == 'time') {
    try {
      if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
        return params.max_time as nextflow.util.Duration
      else
        return obj
    } catch (all) {
      println "   ### ERROR ###   Max time '${params.max_time}' is not valid! Using default value: $obj"
      return obj
    }
  } else if (type == 'cpus') {
    try {
      return Math.min( obj, params.max_cpus as int )
    } catch (all) {
      println "   ### ERROR ###   Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
      return obj
    }
  }
}
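
Read this way, the helper would explain the first question in the thread: check_max only lowers a request that is above the ceiling and never raises one, so a larger --max_time leaves a smaller per-process request unchanged. The fragment below is a sketch of that behaviour. The 16.h default for process_high is an assumption inferred from the error message, not copied from the pipeline's own config.

```groovy
// Ceiling passed on the command line (from the run script in this thread)
params.max_time = '240.h'

process {
  withLabel:process_high {
    // Assumed default request: 16.h on the first attempt.
    // check_max( 16.h, 'time' )  -> 16.h   (below the 240.h ceiling, returned as-is)
    // check_max( 300.h, 'time' ) -> 240.h  (above the ceiling, capped)
    time = { check_max( 16.h * task.attempt, 'time' ) }
  }
}
```

Under that assumption, the task is stopped at 16h no matter how high --max_time is set, which is consistent with the run succeeding once time was set directly on the labels and process names.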

@ninashenker

Thank you so much! This worked nicely, though for some samples the bowtie alignment step is taking over 48 hours... seems too long.
