Wrong BWA index directory in references #70

Closed
Lupezes opened this issue Jan 31, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@Lupezes

Lupezes commented Jan 31, 2023

Description of the bug

CIRIquant step can't find BWA index files.
This happens regardless of whether the reference files are fetched from AWS or supplied locally.

Command used and terminal output

nextflow run nf-core/circrna \
 -profile docker \
 --input test_samples.csv \
 --genome GRCh38 \
 --input_type fastq \
 -r 5e17f6cbbc74b2c3bc807d26662dd7f411759b33 \
 --module 'circrna_discovery, mirna_prediction' \
 --tool 'ciriquant' \
 --bwa "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/" \
 --bowtie "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/" \
 --bowtie2 "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/" \
 --star "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/STARIndex/" \
 --gtf "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf" \
 --bed12 "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.bed" \
 --mature "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/SmallRNA/mature.fa"

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'CIRIQUANT (SRR16316888)'

Caused by:
  Process `CIRIQUANT (SRR16316888)` terminated with an error exit status (1)

Command executed:

  CIRIquant \
      -t 16 \
      -1 SRR16316888_1.fastq.gz \
      -2 SRR16316888_2.fastq.gz \
      --config travis.yml \
      --no-gene \
      -o SRR16316888 \
      -p SRR16316888
  
  ## Apply Filtering
  cp SRR16316888/SRR16316888.gtf .
  
  ## extract counts (convert float/double to int [no loss of information])
  grep -v "#" SRR16316888.gtf | awk '{print $14}' | cut -d '.' -f1 > counts
  grep -v "#" SRR16316888.gtf | awk -v OFS="	" '{print $1,$4,$5,$7}' > SRR16316888.tmp
  paste SRR16316888.tmp counts > SRR16316888_unfilt.bed
  
  ## filter bsj_reads
  awk '{if($5 >= 0) print $0}' SRR16316888_unfilt.bed > SRR16316888_filt.bed
  grep -v '^$' SRR16316888_filt.bed > SRR16316888_ciriquant
  
  ## correct offset bp position
  awk -v OFS="	" '{$2-=1;print}' SRR16316888_ciriquant > SRR16316888_ciriquant.bed
  
  rm SRR16316888.gtf
  
  ## Re-work for Annotation
  awk -v OFS="	" '{print $1, $2, $3, $1":"$2"-"$3":"$4, $5, $4}' SRR16316888_ciriquant.bed > SRR16316888_ciriquant_circs.bed

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/opt/conda/envs/nf-core-circrna-1.0.0/bin/CIRIquant", line 8, in <module>
      sys.exit(main())
    File "/opt/conda/envs/nf-core-circrna-1.0.0/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
      config = check_config(check_file(args.config_file))
    File "/opt/conda/envs/nf-core-circrna-1.0.0/lib/python2.7/site-packages/CIRIquant/utils.py", line 95, in check_config
      BWA_INDEX = os.path.splitext(check_file(config['reference']['bwa_index'] + '.bwt'))[0]
    File "/opt/conda/envs/nf-core-circrna-1.0.0/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
      raise ConfigError('File: {}, not found'.format(file_name))
  CIRIquant.utils.ConfigError: File: /home/lab32/Downloads/teste/results/reference_genome/BWAIndex/genome.fa.bwt, not found

Work dir:
  /home/lab32/Downloads/teste/work/2d/9d56bcc3e7d596e02df26bf69a5383

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

Relevant files

nextflow.log

System information

  • N E X T F L O W ~version 22.10.4, build 5836 (09-12-2022 09:58 UTC)
  • Desktop
  • local
  • docker
  • Ubuntu 22 LTS
  • nf-core/circrna v1.0.0
@Lupezes Lupezes added the bug (Something isn't working) label Jan 31, 2023
@Lupezes
Author

Lupezes commented Jan 31, 2023

It seems that the BWA index directory and the directory that travis.yml refers to don't match. I tried changing the directory of the BWA index files, but that did not help.
In my situation, what happens is the following:

My local BWA index directory is: /home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/

The pipeline creates a folder like this: /home/lab32/Downloads/teste/results/reference_genome/BWAIndex/BWAIndex
(that contains the index files)

However, travis.yml refers to this: /home/lab32/Downloads/teste/results/reference_genome/BWAIndex/
(that contains the folder where the index files are present)

P.S.: I downloaded those files from AWS iGenomes. They have already been tested with the standalone version of CIRIquant, where they worked.
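
The mismatch described above can be reproduced outside the pipeline. According to the traceback, CIRIquant appends '.bwt' to the configured bwa_index value and requires that file to exist. Below is a minimal Python sketch of that check, plus a hypothetical helper (`find_bwa_prefix`, which is not part of CIRIquant or the pipeline) that searches a directory tree for the real prefix. The temporary directory layout only imitates the nested BWAIndex/BWAIndex folder and uses an empty placeholder file.

```python
import os
import tempfile


def prefix_is_valid(prefix):
    """Mimic the check seen in the traceback: '<prefix>.bwt' must exist."""
    return os.path.isfile(prefix + ".bwt")


def find_bwa_prefix(root):
    """Hypothetical helper: walk `root` and return the first BWA prefix found.

    The prefix is the path of the '.bwt' file with that extension removed.
    Returns None when no '.bwt' file exists under `root`.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(".bwt"):
                return os.path.join(dirpath, name[: -len(".bwt")])
    return None


def demo():
    """Build a nested layout like the one reported and compare both prefixes."""
    with tempfile.TemporaryDirectory() as tmp:
        nested = os.path.join(tmp, "BWAIndex", "BWAIndex")
        os.makedirs(nested)
        # Empty placeholder standing in for a real index file.
        open(os.path.join(nested, "genome.fa.bwt"), "w").close()

        # What the generated YAML points to (one level too high).
        configured = os.path.join(tmp, "BWAIndex", "genome.fa")
        # Where the index files actually live.
        actual = find_bwa_prefix(os.path.join(tmp, "BWAIndex"))
        return (
            prefix_is_valid(configured),
            os.path.relpath(actual, tmp),
            prefix_is_valid(actual),
        )


if __name__ == "__main__":
    print(demo())
```

In this sketch the configured prefix fails the check while the prefix found one directory deeper passes, which matches the "not found" error reported for genome.fa.bwt.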

@BarryDigby
Collaborator

Hey @Lupezes,

I'm in the process of switching to DSL2, can I ask you to use the dev branch? Your command would be:

nextflow pull -r dev nf-core/circrna
nextflow run -r dev nf-core/circrna \
 -profile docker \
 --input test_samples.csv \
 --genome GRCh38 \
 --module 'circrna_discovery,mirna_prediction' \
 --tool 'ciriquant' \
 --bwa "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/" \
 --gtf "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf" \
 --mature "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/SmallRNA/mature.fa"

You shouldn't need bowtie, bowtie2 or STAR for CIRIquant. Pass the hisat2 reference if you have it in your directory! We don't make use of the reference bed12 file either.

I'll hold off on committing to dev for the evening so you get a chance to run it.

Best,

Barry

@Lupezes
Author

Lupezes commented Jan 31, 2023

Hi @BarryDigby!

I was using a previous version, because I wasn't able to make the current one work properly!
As suggested, I pulled the current version (deleted the previous one and downloaded it). I tried the suggested command to run the pipeline, however I got another error.

Invalid method invocation `call` with arguments: /home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex (sun.nio.fs.UnixPath) on _closure8 type

nextflow.log

Here is the Nextflow workflow report:

Workflow execution completed unsuccessfully!

The exit status of the task that caused the workflow execution to fail was: null.

The full error message was:

No signature of method: Script_132045f3$_runScript_closure1$_closure2$_closure8.call() is applicable for argument types: (sun.nio.fs.UnixPath) values: [/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex]
Possible solutions: any(), any(), each(groovy.lang.Closure), tap(groovy.lang.Closure), any(groovy.lang.Closure), each(groovy.lang.Closure)

execution_report_2023-01-31_15-47-19.zip

Just to ensure the files are in the correct directory:

ls /home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex 
genome.fa  genome.fa.amb  genome.fa.ann  genome.fa.bwt  genome.fa.pac  genome.fa.rbwt  genome.fa.rpac  genome.fa.rsa  genome.fa.sa

I believe this is a problem to be solved in another issue; however, I was able to make the 'test' profile of this version work.
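
A listing like the one above can also be checked programmatically. The sketch below assumes the five files that `bwa index` writes next to the FASTA (.amb, .ann, .bwt, .pac, .sa) and reports which of them are absent for a given prefix. The function name and the demo layout are illustrative only.

```python
import os
import tempfile

# The five files that `bwa index` writes next to the FASTA.
BWA_EXTENSIONS = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_files(prefix):
    """Return the expected index files that do not exist for `prefix`."""
    return [prefix + ext for ext in BWA_EXTENSIONS if not os.path.isfile(prefix + ext)]


def demo():
    """Create an incomplete index (no '.sa' file) and list what is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "genome.fa")
        for ext in (".amb", ".ann", ".bwt", ".pac"):
            # Empty placeholders standing in for real index files.
            open(prefix + ext, "w").close()
        return [os.path.basename(p) for p in missing_bwa_files(prefix)]


if __name__ == "__main__":
    print(demo())
```

For the directory listed above, the prefix would be the BWAIndex path followed by genome.fa, and all five expected files appear in the listing.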

@BarryDigby
Collaborator

BarryDigby commented Feb 1, 2023

Ah, you did not surround your BWAIndex path with quotes ;)

edit: that does not appear to solve it, leave it with me and I will sort it out this morning

@BarryDigby
Collaborator

Going to document the cause here for my future self:

The error "Invalid method invocation `call` with arguments: /home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex (sun.nio.fs.UnixPath) on _closure8 type" points to this map call used for CIRIquant YAML generation:

CIRIQUANT_YML( gtf, fasta, bwa_index.map{ meta, index -> return index }, hisat2_index )

BWA indices are output as a [[meta], path] tuple, so when the user specifies the path to previously generated index files, we need to reproduce the [[meta], path] tuple structure instead of passing just the path. This is now handled with logic borrowed from sarek when staging the bwa_index channel:

bwa_index      = params.fasta ? params.bwa ? Channel.fromPath(params.bwa).map{ it -> [[id:it[0].baseName], it] } : PREPARE_GENOME.out.bwa : []
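
To make the shape mismatch concrete, here is a small Python analogy (Python rather than Groovy, so it is not pipeline code). The two-parameter function stands in for the `{ meta, index -> index }` closure: calling it with a bare path fails, while a wrapped [meta, path] pair works. The way the id is derived from the directory name is an assumption for illustration and does not reproduce the `it[0].baseName` expression above.

```python
import os


def take_index(meta, index):
    """Stand-in for the closure `{ meta, index -> index }`."""
    return index


def wrap_with_meta(path):
    """Wrap a bare path as [meta, path]; the id choice here is an assumption."""
    name = os.path.basename(os.path.normpath(path))
    return [{"id": name}, path]


def call_on(item):
    """Call take_index on one channel element, spreading lists into arguments."""
    if isinstance(item, (list, tuple)):
        return take_index(*item)
    # A bare path supplies one argument to a two-parameter function.
    return take_index(item)


user_path = "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex"

try:
    call_on(user_path)
    bare_ok = True
except TypeError:
    # Analogous to the "Invalid method invocation `call`" error in the report.
    bare_ok = False

wrapped = wrap_with_meta(user_path)
index_from_wrapped = call_on(wrapped)
```

The bare path is rejected because one value cannot fill two parameters, whereas the wrapped element yields the path again once the meta map is dropped.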

Check PREPARE_GENOME outputs

Bowtie
/data/github/circrna/work/fb/e5d28f3ff1cd6938ab26ee490c0297/bowtie
Bowtie2
[[id:chrI], /data/github/circrna/work/a3/72d197d22a067c5461fb4a51e71152/bowtie2]
BWA
[[id:chrI], /data/github/circrna/work/61/6c66997160f35498e68d457a318585/bwa]
HISAT2
/data/github/circrna/work/b0/4d79b66f3622747dcde57d4fa4f716/hisat2
STAR
/data/github/circrna/work/20/e64c9b4a063dae0ed545b0c24ddd7a/star
Segemehl
/data/github/circrna/work/92/828329416bcb2eb29efeb9fb5278aa/chrI.idx

Make sure to make a tuple when user supplies path to bwa & bowtie2. Come back to this issue when nf-core standardizes their genome index outputs...
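
Since the outputs listed above mix bare paths and [[meta], path] tuples, one possible way to treat both shapes uniformly is a normalizing step. The following Python sketch is only an analogy for such a step, not pipeline code; `as_meta_tuple` and its `default_id` placeholder are hypothetical names, and the two sample entries are copied from the listing above.

```python
def as_meta_tuple(item, default_id="index"):
    """Return `item` as a (meta, path) pair, wrapping bare paths.

    Elements that already look like [meta, path] are passed through unchanged,
    so applying the function twice gives the same result as applying it once.
    """
    if isinstance(item, (list, tuple)) and len(item) == 2 and isinstance(item[0], dict):
        return (item[0], item[1])
    return ({"id": default_id}, item)


# Two of the shapes shown in the PREPARE_GENOME listing.
outputs = {
    "bowtie": "/data/github/circrna/work/fb/e5d28f3ff1cd6938ab26ee490c0297/bowtie",
    "bwa": [{"id": "chrI"}, "/data/github/circrna/work/61/6c66997160f35498e68d457a318585/bwa"],
}

# The tool name is used as a placeholder id for bare paths (an assumption).
normalized = {tool: as_meta_tuple(item, default_id=tool) for tool, item in outputs.items()}
```

After this step every entry has the same two-element shape, so a downstream `{ meta, index -> index }` style consumer would not need to care where the index came from.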

@BarryDigby
Collaborator

@Lupezes Thanks for raising the issue - good catch.

Just double-checking everything now on the dev branch. If you're able to hang tight, the dev branch should be released as version 1.0.0 later this week.

@Lupezes
Author

Lupezes commented Feb 1, 2023

@BarryDigby Thank you for the attention!

No problem! I'll be waiting for the release then.
